JP2022520428A

JP2022520428A - Enzyme with RUVC domain

Info

Publication number: JP2022520428A
Application number: JP2021547336A
Authority: JP
Inventors: トーマス，ブライアン; ブラウン，クリストファー; カンター，ローズ; デヴォート，オードラ; バターフィールド，クリスティーナ; アレクサンダー，リサ; エス．エー．ゴルツマン，ダニエラ; リュー，ジェイソン
Original assignee: メタゲノミアイピーテクノロジーズ，エルエルシー
Priority date: 2019-02-14
Filing date: 2020-02-14
Publication date: 2022-03-30
Also published as: EP3924482A4; AU2020223370B2; JP7502537B2; JP2023179468A; CN116515797A; CA3241703A1; KR20210139254A; CN113728098A; EP3924482A1; KR20240007322A; WO2020168291A1; MX2021009886A; JP2024133476A; MX2023006575A; AU2023206079A1; US20240117330A1; KR102623312B1; CA3130135A1; AU2020223370A1

Abstract

The present disclosure provides endonuclease enzymes with distinctive domain characteristics, as well as methods of using such enzymes or variants thereof. Representative drawing: Fig. 1

Description

Cross-Reference
This application claims the benefit of U.S. Provisional Application No. 62/805,868, entitled "MG1 ENZYMES WITH RUVC DOMAINS", filed February 14, 2019; U.S. Provisional Application No. 62/874,414, entitled "MG1 ENZYMES WITH RUVC DOMAINS", filed July 15, 2019; U.S. Provisional Application No. 62/805,878, entitled "MG2 ENZYMES CONTAINING RUVC DOMAINS", filed February 14, 2019; and U.S. Provisional Application No. 62/805,899, entitled "MG3 ENZYMES WITH RUVC DOMAINS", filed February 14, 2019, each of which is incorporated herein by reference in its entirety.

Cas enzymes, along with their associated clustered regularly interspaced short palindromic repeats (CRISPR) guide ribonucleic acids (RNAs), appear to be a pervasive component of prokaryotic immune systems (~45% of bacteria, ~84% of archaea), serving to protect such microorganisms against non-self nucleic acids, such as infectious viruses and plasmids, by CRISPR-RNA guided nucleic acid cleavage. While the deoxyribonucleic acid (DNA) elements encoding CRISPR RNA elements may be relatively conserved in structure and length, their CRISPR-associated (Cas) proteins are highly diverse and contain a wide variety of nucleic acid-interacting domains. Although CRISPR DNA elements were observed as early as 1987, the programmable endonuclease cleavage ability of CRISPR/Cas complexes has only been recognized relatively recently, leading to the use of recombinant CRISPR/Cas systems in diverse DNA manipulation and gene editing applications.

Sequence Listing
This application contains a Sequence Listing, which has been submitted electronically in ASCII format and is incorporated herein by reference in its entirety. Said ASCII copy, created on February 13, 2020, is named 55921-703_601_SL.txt and is 23,363,113 bytes in size.

In some aspects, the present disclosure provides an engineered nuclease system comprising: (a) an endonuclease comprising a RuvC_III domain and an HNH domain, wherein the endonuclease is derived from an uncultivated microorganism, and wherein the endonuclease is a class 2, type II Cas endonuclease; and (b) an engineered guide ribonucleic acid structure configured to form a complex with the endonuclease, the engineered guide ribonucleic acid structure comprising: (i) a guide ribonucleic acid sequence configured to hybridize to a target deoxyribonucleic acid sequence; and (ii) a tracr ribonucleic acid sequence configured to bind to the endonuclease. In some embodiments, the RuvC_III domain comprises a sequence having at least 70%, at least 75%, at least 80%, or at least 90% sequence identity to any one of SEQ ID NOs: 1827-3637.

In some aspects, the present disclosure provides an engineered nuclease system comprising: (a) an endonuclease comprising a RuvC_III domain having at least 75% sequence identity to any one of SEQ ID NOs: 1827-3637; and (b) an engineered guide ribonucleic acid structure configured to form a complex with the endonuclease, the engineered guide ribonucleic acid structure comprising: (i) a guide ribonucleic acid sequence configured to hybridize to a target deoxyribonucleic acid sequence; and (ii) a tracr ribonucleic acid sequence configured to bind to the endonuclease.
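The percent-identity thresholds recited above (70%, 75%, 80%, 90%) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some external tool and are supplied as equal-length strings with `-` for gaps; the function names and the choice of alignment length as the denominator are assumptions made for illustration, since this section does not specify how identity is computed.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: identity = identical, non-gap columns divided by the
    number of alignment columns (columns that are gaps in both
    sequences are ignored). Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 10-residue sequences that differ at one position are 90% identical under this convention, which satisfies the "at least 75%" threshold but not a 95% one.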

In some aspects, the present disclosure provides an engineered nuclease system comprising: (a) an endonuclease configured to bind to a protospacer adjacent motif (PAM) sequence comprising SEQ ID NOs: 5512-5537, wherein the endonuclease is a class 2, type II Cas endonuclease; and (b) an engineered guide ribonucleic acid structure configured to form a complex with the endonuclease, the engineered guide ribonucleic acid structure comprising: (i) a guide ribonucleic acid sequence configured to hybridize to a target deoxyribonucleic acid sequence; and (ii) a tracr ribonucleic acid sequence configured to bind to the endonuclease.

In some embodiments, the endonuclease is derived from an uncultivated microorganism. In some embodiments, the endonuclease has not been engineered to bind to a different PAM sequence. In some embodiments, the endonuclease is not a Cas9 endonuclease, a Cas14 endonuclease, a Cas12a endonuclease, a Cas12b endonuclease, a Cas12c endonuclease, a Cas12d endonuclease, a Cas12e endonuclease, a Cas13a endonuclease, a Cas13b endonuclease, a Cas13c endonuclease, or a Cas13d endonuclease. In some embodiments, the endonuclease has less than 80% identity to a Cas9 endonuclease. In some embodiments, the endonuclease further comprises an HNH domain. In some embodiments, the tracr ribonucleic acid sequence comprises a sequence having at least 80% sequence identity to about 60 to 90 consecutive nucleotides selected from any one of SEQ ID NOs: 5476-5511 and SEQ ID NO: 5538.

In some aspects, the present disclosure provides an engineered nuclease system comprising: (a) an engineered guide ribonucleic acid structure comprising: (i) a guide ribonucleic acid sequence configured to hybridize to a target deoxyribonucleic acid sequence; and (ii) a tracr ribonucleic acid sequence configured to bind to an endonuclease, wherein the tracr ribonucleic acid sequence comprises a sequence having at least 80% sequence identity to about 60 to 90 consecutive nucleotides selected from any one of SEQ ID NOs: 5476-5511 and SEQ ID NO: 5538; and (b) a class 2, type II Cas endonuclease configured to bind to the engineered guide ribonucleic acid. In some embodiments, the endonuclease is configured to bind to a protospacer adjacent motif (PAM) sequence selected from the group comprising SEQ ID NOs: 5512-5537.

In some embodiments, the engineered guide ribonucleic acid structure comprises at least two ribonucleic acid polynucleotides. In some embodiments, the engineered guide ribonucleic acid structure comprises one ribonucleic acid polynucleotide comprising the guide ribonucleic acid sequence and the tracr ribonucleic acid sequence.

In some embodiments, the guide ribonucleic acid sequence is complementary to a prokaryotic, bacterial, archaeal, eukaryotic, fungal, plant, mammalian, or human genomic sequence. In some embodiments, the guide ribonucleic acid sequence is 15 to 24 nucleotides in length. In some embodiments, the endonuclease comprises one or more nuclear localization sequences (NLSs) proximal to the N-terminus or C-terminus of the endonuclease. In some embodiments, the NLS comprises a sequence selected from SEQ ID NOs: 5597-5612.
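The two guide-sequence properties described above (a length of 15 to 24 nucleotides and complementarity to a genomic sequence) can be checked with a short sketch. The function names are hypothetical, and the sketch assumes perfect Watson-Crick, antiparallel pairing between an RNA guide and the DNA strand it hybridizes to, with both strings written 5' to 3'.

```python
# RNA base -> the DNA base it pairs with on the target strand.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def guide_length_ok(guide_rna: str, min_len: int = 15, max_len: int = 24) -> bool:
    """True if the guide falls in the 15-24 nt range stated in the text."""
    return min_len <= len(guide_rna) <= max_len


def is_complementary(guide_rna: str, target_dna_strand: str) -> bool:
    """True if the guide pairs perfectly with the given DNA strand.

    Both sequences are written 5'->3'; pairing is antiparallel, so the
    guide is read in reverse before complementing each base.
    """
    guide = guide_rna.upper()
    if any(base not in RNA_TO_DNA_COMPLEMENT for base in guide):
        raise ValueError("guide must contain only A, C, G, U")
    if len(guide) != len(target_dna_strand):
        return False
    expected = "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(guide))
    return expected == target_dna_strand.upper()
```

The example guide used in the tests, `GACGUUACCGGAUCAAGCUU`, is an arbitrary 20-nt sequence chosen for illustration and does not come from the sequence listing.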

In some embodiments, the engineered nuclease system further comprises a single-stranded or double-stranded DNA repair template comprising, from 5' to 3': a first homology arm comprising a sequence of at least 20 nucleotides 5' to the target deoxyribonucleic acid sequence, a synthetic DNA sequence of at least 10 nucleotides, and a second homology arm comprising a sequence of at least 20 nucleotides 3' to the target deoxyribonucleic acid sequence. In some embodiments, the first or second homology arm comprises a sequence of at least 40, 80, 120, 150, 200, 300, 500, or 1,000 nucleotides.
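The 5'-to-3' layout of the repair template described above (first homology arm, synthetic insert, second homology arm) can be sketched as a simple string assembly. This is an illustrative sketch only: the function name, the 0-based half-open coordinates, and the toy genome are assumptions, and the length minimums are the "at least 20" and "at least 10" nucleotide figures from the text.

```python
def build_repair_template(genome: str, target_start: int, target_end: int,
                          insert: str, arm_len: int = 20) -> str:
    """Assemble left arm + synthetic insert + right arm, written 5'->3'.

    genome[target_start:target_end] is the target sequence (0-based,
    end-exclusive). The arms are taken from the flanking genomic sequence.
    """
    if arm_len < 20:
        raise ValueError("homology arms must be at least 20 nt")
    if len(insert) < 10:
        raise ValueError("synthetic insert must be at least 10 nt")
    if target_start - arm_len < 0 or target_end + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the requested arms")
    left_arm = genome[target_start - arm_len:target_start]   # 5' of target
    right_arm = genome[target_end:target_end + arm_len]      # 3' of target
    return left_arm + insert + right_arm
```

Longer arms (40, 80, 120 nucleotides and so on, as listed in the text) only require a larger `arm_len`, provided the genome string contains enough flanking sequence.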

In some embodiments, the engineered nuclease system further comprises a source of Mg²⁺.

In some embodiments, the endonuclease and the tracr ribonucleic acid sequence are derived from different bacterial species within the same phylum. In some embodiments, the endonuclease is derived from a bacterium belonging to the genus Dermabacter. In some embodiments, the endonuclease is derived from a bacterium belonging to the phylum Verrucomicrobia, the phylum Candidatus Peregrinibacteria, or the phylum Candidatus Melainabacteria. In some embodiments, the endonuclease is derived from a bacterium comprising a 16S rRNA gene having at least 90% identity to any one of SEQ ID NOs: 5592-5595.

In some embodiments, the HNH domain comprises a sequence having at least 70% or at least 80% identity to any one of SEQ ID NOs: 5638-5460. In some embodiments, the endonuclease comprises any one of SEQ ID NOs: 1-1826 or a variant thereof having at least 55% identity thereto. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 1827-1830 or SEQ ID NOs: 1827-2140.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 3638-3641 or SEQ ID NOs: 3638-3954. In some embodiments, the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5615-5632. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 1-4 or SEQ ID NOs: 1-319.

In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 5461-5464, SEQ ID NOs: 5476-5479, or SEQ ID NOs: 5476-5489. In some embodiments, the guide RNA structure comprises an RNA sequence predicted to comprise a hairpin consisting of a stem and a loop, wherein the stem comprises at least 10, at least 12, or at least 14 base-paired ribonucleotides and an asymmetric bulge within four base pairs of the loop.

In some embodiments, the endonuclease is configured to bind to a PAM comprising a sequence selected from the group consisting of SEQ ID NOs: 5512-5515 or SEQ ID NOs: 5527-5530.
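PAM recognition is commonly described with degenerate (IUPAC) nucleotide codes, so a candidate site can be screened by simple pattern matching. The sketch below is a generic illustration under that assumption. The actual PAM sequences of SEQ ID NOs: 5512-5515 and 5527-5530 are not reproduced in this section, so the `NGG` pattern used in the example is purely hypothetical and is not taken from this disclosure.

```python
# IUPAC degenerate nucleotide codes -> the concrete bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pam_matches(pam_pattern: str, site: str) -> bool:
    """True if the DNA site satisfies the degenerate PAM pattern."""
    if len(pam_pattern) != len(site):
        return False
    return all(base in IUPAC[code]
               for code, base in zip(pam_pattern.upper(), site.upper()))


def find_pam_sites(pam_pattern: str, dna: str) -> list[int]:
    """0-based start positions in dna where the PAM pattern matches."""
    k = len(pam_pattern)
    return [i for i in range(len(dna) - k + 1)
            if pam_matches(pam_pattern, dna[i:i + k])]
```

Only one strand is scanned here; a fuller search would also scan the reverse complement.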

In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 1827; (b) the guide RNA structure comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 5461 or SEQ ID NO: 5476; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5512 or SEQ ID NO: 5527. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 1828; (b) the guide RNA structure comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 5462 or SEQ ID NO: 5477; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5513 or SEQ ID NO: 5528. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 1829; (b) the guide RNA structure comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 5463 or SEQ ID NO: 5478; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5514 or SEQ ID NO: 5529. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 1830; (b) the guide RNA structure comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 5464 or SEQ ID NO: 5479; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5515 or SEQ ID NO: 5530.
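The four embodiments above each group one endonuclease sequence with its candidate guide RNA and PAM sequences. As a reading aid, those groupings can be written as a lookup table keyed by SEQ ID NO. The dictionary layout and function name are assumptions made for illustration; the numbers are transcribed from the four embodiments, and the sketch ignores the percent-identity qualifiers.

```python
# Endonuclease SEQ ID NO -> guide RNA and PAM SEQ ID NOs recited with it.
LISTED_COMBINATIONS = {
    1827: {"guide": {5461, 5476}, "pam": {5512, 5527}},
    1828: {"guide": {5462, 5477}, "pam": {5513, 5528}},
    1829: {"guide": {5463, 5478}, "pam": {5514, 5529}},
    1830: {"guide": {5464, 5479}, "pam": {5515, 5530}},
}


def is_listed_combination(endonuclease_id: int, guide_id: int, pam_id: int) -> bool:
    """True if the three SEQ ID NOs appear together in one embodiment."""
    entry = LISTED_COMBINATIONS.get(endonuclease_id)
    return (entry is not None
            and guide_id in entry["guide"]
            and pam_id in entry["pam"])
```

For instance, `is_listed_combination(1829, 5463, 5529)` is true, while pairing SEQ ID NO: 1829 with guide SEQ ID NO: 5461 is not one of the listed groupings.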

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 2141-2142 or SEQ ID NOs: 2141-2241. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 3955-3956 or SEQ ID NOs: 3955-4055. In some embodiments, the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5632-5638. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 320-321 or SEQ ID NOs: 320-420. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5465, SEQ ID NOs: 5490-5491, or SEQ ID NOs: 5490-5494. In some embodiments, the guide RNA structure comprises a tracr ribonucleic acid sequence comprising a hairpin comprising at least 8, at least 10, or at least 12 base-paired ribonucleotides. In some embodiments, the endonuclease is configured to bind to a PAM comprising a sequence selected from the group consisting of SEQ ID NO: 5516 and SEQ ID NO: 5531. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2141; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5490; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5531. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2142; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5465 or SEQ ID NO: 5491; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5516.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 2245-2246. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 4059-4060. In some embodiments, the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5639-5648. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 424-425. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 5498-5499 and SEQ ID NO: 5539. In some embodiments, the guide RNA structure comprises a guide ribonucleic acid sequence predicted to comprise a hairpin with an uninterrupted base-paired region comprising at least 8 nucleotides of the guide ribonucleic acid sequence and at least 8 nucleotides of the tracr ribonucleic acid sequence, wherein the tracr ribonucleic acid sequence comprises, from 5' to 3', a first hairpin and a second hairpin, and wherein the first hairpin has a longer stem than the second hairpin.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 2242-2244 or SEQ ID NOs: 2247-2249. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 4056-4058 and SEQ ID NOs: 4061-4063. In some embodiments, the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5639-5648. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 421-423 or SEQ ID NOs: 426-428. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 5466-5467, SEQ ID NOs: 5495-5497, SEQ ID NOs: 5500-5502, and SEQ ID NO: 5539. In some embodiments, the guide RNA structure comprises a guide ribonucleic acid sequence predicted to comprise a hairpin with an uninterrupted base-paired region comprising at least 8 nucleotides of the guide ribonucleic acid sequence and at least 8 nucleotides of the tracr ribonucleic acid sequence, wherein the tracr ribonucleic acid sequence comprises, from 5' to 3', a first hairpin and a second hairpin, and wherein the first hairpin has a longer stem than the second hairpin. In some embodiments, the endonuclease is configured to bind to a PAM comprising a sequence selected from the group consisting of SEQ ID NOs: 5517-5518 or SEQ ID NOs: 5532-5534. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2247; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5500; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5517 or SEQ ID NO: 5532. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2248; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5501; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5518 or SEQ ID NO: 5533. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2249; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5502; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5534.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 2253 or SEQ ID NOs: 2253-2481. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 4067 or SEQ ID NOs: 4067-4295. In some embodiments, the endonuclease comprises the peptide motif of SEQ ID NO: 5649. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 432 or SEQ ID NOs: 432-660. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5468 or SEQ ID NO: 5503. In some embodiments, the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5519. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2253; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5468 or SEQ ID NO: 5503; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5519.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 2482-2489. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 4296-4303. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 661-668. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 2490-2498. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 4304-4312. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 669-677. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5504.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 2499 or SEQ ID NOs: 2499-2750. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 4313 or SEQ ID NOs: 4313-4564. In some embodiments, the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5650-5667. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 678 or SEQ ID NOs: 678-929. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5469 or SEQ ID NO: 5505. In some embodiments, the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5520 or SEQ ID NO: 5535. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2499; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5469 or SEQ ID NO: 5505; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5520 or SEQ ID NO: 5535.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 2751 or SEQ ID NOs: 2751-2913. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 4565 or SEQ ID NOs: 4565-4727. In some embodiments, the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5668-5678. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 930 or SEQ ID NOs: 930-1092. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5470 or SEQ ID NO: 5506. In some embodiments, the endonuclease is configured to bind to a PAM comprising a sequence selected from the group consisting of SEQ ID NO: 5521 or SEQ ID NO: 5536. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2751; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5470 or SEQ ID NO: 5506; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5521 or SEQ ID NO: 5536.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 2914 or SEQ ID NOs: 2914-3174. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 4728 or SEQ ID NOs: 4728-4988. In some embodiments, the endonuclease comprises at least one, at least two, or at least three peptide motifs selected from the group consisting of SEQ ID NOs: 5676-5678. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 1093 or SEQ ID NOs: 1093-1353. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5471, SEQ ID NO: 5507, and SEQ ID NOs: 5540-5542. In some embodiments, the guide RNA structure comprises a tracr ribonucleic acid sequence predicted to comprise at least two hairpins comprising fewer than 5 base-paired ribonucleotides. In some embodiments, the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5522. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2914; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5471 or SEQ ID NO: 5507; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5522.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 3175 or SEQ ID NOs: 3175-3330. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 4989 or SEQ ID NOs: 4989-5146. In some embodiments, the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5679-5686. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 1354 or SEQ ID NOs: 1354-1511. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5472 or SEQ ID NO: 5508. In some embodiments, the endonuclease is configured to bind to a PAM comprising a sequence selected from the group consisting of SEQ ID NO: 5523 or SEQ ID NO: 5537. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 3175; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5472 or SEQ ID NO: 5508; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5523 or SEQ ID NO: 5537.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 3331 or SEQ ID NOs: 3331-3474. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5147 or SEQ ID NOs: 5147-5290. In some embodiments, the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5674-5675 and SEQ ID NOs: 5687-5693. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 1512 or SEQ ID NOs: 1512-1655. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5473 or SEQ ID NO: 5509. In some embodiments, the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5524. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 3331; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5473 or SEQ ID NO: 5509; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5524.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 3475 or SEQ ID NOs: 3475-3568. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5291 or SEQ ID NOs: 5291-5389. In some embodiments, the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5694-5699. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 1656 or SEQ ID NOs: 1656-1755. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5474 or SEQ ID NO: 5510. In some embodiments, the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5525. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 3475; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5474 or SEQ ID NO: 5510; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5525.

In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 3569 or SEQ ID NOs: 3569-3637. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5390 or SEQ ID NOs: 5390-5460. In some embodiments, the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5700-5717. In some embodiments, the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 1756 or SEQ ID NOs: 1756-1826. In some embodiments, the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5475 or SEQ ID NO: 5511. In some embodiments, the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5526. In some embodiments, (a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 3569; (b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5475 or SEQ ID NO: 5511; and (c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5526. In some embodiments, sequence identity is determined by a BLASTP, CLUSTALW, MUSCLE, MAFFT, or Smith-Waterman homology search algorithm. In some embodiments, sequence identity is determined by the BLASTP homology search algorithm using parameters of a wordlength (W) of 3, an expectation (E) of 10, and a BLOSUM62 scoring matrix with gap costs set at an existence of 11 and an extension of 1, and using a conditional compositional score matrix adjustment.
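The BLASTP settings named above can be written down as a command line. The following is a minimal sketch, assuming the NCBI BLAST+ `blastp` program is the tool in use; the function name and the FASTA file paths are hypothetical, and the tabular output format is an illustrative choice rather than anything the disclosure specifies. The sketch only builds the argument list and does not run the search.

```python
from typing import List


def blastp_identity_command(query_fasta: str, subject_fasta: str) -> List[str]:
    """Build a BLAST+ blastp argument list with the parameters named in the text.

    The file paths are placeholders supplied by the caller (hypothetical).
    """
    return [
        "blastp",
        "-query", query_fasta,
        "-subject", subject_fasta,
        "-word_size", "3",         # wordlength (W) of 3
        "-evalue", "10",           # expectation (E) of 10
        "-matrix", "BLOSUM62",     # BLOSUM62 scoring matrix
        "-gapopen", "11",          # gap existence cost of 11
        "-gapextend", "1",         # gap extension cost of 1
        "-comp_based_stats", "2",  # conditional compositional score matrix adjustment
        # Tabular output with percent identity; an illustrative choice.
        "-outfmt", "6 qseqid sseqid pident length",
    ]


if __name__ == "__main__":
    print(" ".join(blastp_identity_command("query.faa", "reference.faa")))
```

The list can be passed to `subprocess.run` on a machine where BLAST+ is installed; the `pident` column of the output then gives the percent identity of each reported alignment.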

In some aspects, the present disclosure provides an engineered guide ribonucleic acid polynucleotide comprising: (a) a DNA-targeting segment comprising a nucleotide sequence that is complementary to a target sequence in a target DNA molecule; and (b) a protein-binding segment comprising two complementary stretches of nucleotides that hybridize to form a double-stranded RNA (dsRNA) duplex, wherein the two complementary stretches of nucleotides are covalently linked to one another by intervening nucleotides, and wherein the engineered guide ribonucleic acid polynucleotide is configured to form a complex with an endonuclease comprising a RuvC_III domain having at least 75% sequence identity to any one of SEQ ID NOs: 1827-3637 and to target the complex to the target sequence of the target DNA molecule. In some embodiments, the DNA-targeting segment is positioned 5' of both of the two complementary stretches of nucleotides.

In some embodiments, (a) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to a sequence selected from the group consisting of SEQ ID NOs: 5476-5479 or SEQ ID NOs: 5476-5489; (b) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to a sequence selected from the group consisting of (SEQ ID NOs: 5490-5491 or SEQ ID NOs: 5490-5494) and SEQ ID NO: 5538; (c) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to a sequence selected from the group consisting of SEQ ID NOs: 5498-5499; (d) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to a sequence selected from the group consisting of SEQ ID NOs: 5495-5497 and SEQ ID NOs: 5500-5502; (e) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5503; (f) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5504; (g) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5505; (h) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5506; (i) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5507; (j) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5508; (k) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5509; (l) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5510; or (m) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5511.

In some embodiments, (a) the guide ribonucleic acid polynucleotide comprises an RNA sequence comprising a hairpin comprising a stem and a loop, wherein the stem comprises at least 10, at least 12, or at least 14 base-paired ribonucleotides and an asymmetric bulge within 4 base pairs of the loop; (b) the guide ribonucleic acid polynucleotide comprises a tracr ribonucleic acid sequence predicted to comprise a hairpin comprising at least 8, at least 10, or at least 12 base-paired ribonucleotides; (c) the guide ribonucleic acid polynucleotide comprises a guide ribonucleic acid sequence predicted to comprise a hairpin with an uninterrupted base-paired region comprising at least 8 nucleotides of the guide ribonucleic acid sequence and at least 8 nucleotides of a tracr ribonucleic acid sequence, wherein the tracr ribonucleic acid sequence comprises, from 5' to 3', a first hairpin and a second hairpin, and wherein the first hairpin has a longer stem than the second hairpin; or (d) the guide ribonucleic acid polynucleotide comprises a tracr ribonucleic acid sequence predicted to comprise at least two hairpins comprising fewer than 5 base-paired ribonucleotides.
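The hairpin criteria above refer to predicted secondary structure. As a minimal sketch, assuming a structure has already been predicted by some external folding tool and is given in dot-bracket notation (an assumption; the text does not name a tool or a notation), the following Python counts the base pairs in the stem of each hairpin, in 5' to 3' order. The function names are hypothetical, and bulges and internal loops are simply skipped over when counting pairs.

```python
from typing import Dict, List, Optional


def hairpin_stem_sizes(structure: str) -> List[int]:
    """Return the number of base pairs in each hairpin stem, 5' to 3'.

    `structure` uses dot-bracket notation: '(' and ')' are paired
    positions, '.' is unpaired. A stem is followed outward from the pair
    closing a hairpin loop until a branch point (a pair that directly
    encloses more than one pair) or the outermost pair is reached.
    """
    stack: List[int] = []
    parent: Dict[int, Optional[int]] = {}  # opening index -> enclosing opening index
    children: Dict[int, int] = {}          # opening index -> pairs directly enclosed
    for pos, ch in enumerate(structure):
        if ch == "(":
            enclosing = stack[-1] if stack else None
            parent[pos] = enclosing
            children[pos] = 0
            if enclosing is not None:
                children[enclosing] += 1
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            stack.pop()
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])

    sizes: List[int] = []
    for start in sorted(parent):
        if children[start] == 0:  # this pair closes a hairpin loop
            count = 1
            cur = parent[start]
            while cur is not None and children[cur] == 1:
                count += 1
                cur = parent[cur]
            sizes.append(count)
    return sizes


def has_two_short_hairpins(structure: str, max_pairs: int = 5) -> bool:
    """True if at least two hairpins have fewer than `max_pairs` base pairs."""
    return sum(1 for n in hairpin_stem_sizes(structure) if n < max_pairs) >= 2


if __name__ == "__main__":
    print(hairpin_stem_sizes("((((....))))..((..))"))
```

Comparing the first two entries of the returned list gives a direct check of the "first hairpin has a longer stem than the second hairpin" condition for a given predicted fold.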

In some aspects, the present disclosure provides a deoxyribonucleic acid polynucleotide encoding any of the engineered guide ribonucleic acid polynucleotides described herein.

In some aspects, the present disclosure provides a nucleic acid comprising an engineered nucleic acid sequence optimized for expression in an organism, wherein the nucleic acid encodes a class 2, type II Cas endonuclease comprising a RuvC_III domain and an HNH domain, and wherein the class 2, type II Cas endonuclease is derived from an uncultivated microorganism.

In some aspects, the present disclosure provides a nucleic acid comprising an engineered nucleic acid sequence optimized for expression in an organism, wherein the nucleic acid encodes an endonuclease comprising a RuvC_III domain having at least 70% sequence identity to any one of SEQ ID NOs: 1827-3637. In some embodiments, the endonuclease comprises an HNH domain having at least 70% or at least 80% sequence identity to any one of SEQ ID NOs: 3638-5460. In some embodiments, the endonuclease comprises SEQ ID NOs: 5572-5591 or a variant thereof having at least 70% sequence identity thereto. In some embodiments, the endonuclease comprises a sequence encoding one or more nuclear localization sequences (NLSs) proximal to an N-terminus or a C-terminus of the endonuclease. In some embodiments, the NLS comprises a sequence selected from SEQ ID NOs: 5597-5612.
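The identity thresholds used throughout (for example, at least 70% identity for the RuvC_III domain) can be illustrated with a small calculation. This is a minimal sketch, assuming the two sequences have already been aligned by one of the named algorithms and that identity is counted as identical columns divided by total alignment columns; other denominators are in common use, and the text does not fix one. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over all columns of a pairwise alignment.

    Both strings must have the same length, with '-' marking gaps.
    A column counts as identical only when both residues match and
    neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy alignment: 9 identical columns out of 10.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```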

In some embodiments, the organism is prokaryotic, bacterial, eukaryotic, fungal, plant, mammalian, rodent, or human. In some embodiments, the organism is E. coli, and: (a) the nucleic acid sequence has at least 70%, 80%, or 90% identity to a sequence selected from the group consisting of SEQ ID NOs: 5572-5575; (b) the nucleic acid sequence has at least 70%, 80%, or 90% identity to a sequence selected from the group consisting of SEQ ID NOs: 5576-5577; (c) the nucleic acid sequence has at least 70%, 80%, or 90% identity to a sequence selected from the group consisting of SEQ ID NOs: 5578-5580; (d) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5581; (e) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5582; (f) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5583; (g) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5584; (h) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5585; (i) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5586; or (j) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5587. In some embodiments, the organism is human, and: (a) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5588 or SEQ ID NO: 5589; or (b) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5590 or SEQ ID NO: 5591.

In some aspects, the present disclosure provides a vector comprising a nucleic acid sequence encoding a class 2, type II Cas endonuclease comprising a RuvC_III domain and an HNH domain, wherein the endonuclease is derived from an uncultivated microorganism.

In some aspects, the present disclosure provides a vector comprising any of the nucleic acids described herein. In some embodiments, the vector further comprises a nucleic acid encoding an engineered guide ribonucleic acid structure configured to form a complex with the endonuclease, the engineered guide ribonucleic acid structure comprising: (a) a guide ribonucleic acid sequence configured to hybridize to a target deoxyribonucleic acid sequence; and (b) a tracr ribonucleic acid sequence configured to bind to the endonuclease. In some embodiments, the vector is a plasmid, a minicircle, a CELiD, an adeno-associated virus (AAV) derived virion, or a lentivirus.

In some aspects, the present disclosure provides a cell comprising any of the vectors described herein.

In some aspects, the present disclosure provides a method of manufacturing an endonuclease, comprising cultivating any of the cells described herein.

In some aspects, the present disclosure provides a method for binding, cleaving, marking, or modifying a double-stranded deoxyribonucleic acid polynucleotide, comprising: (a) contacting the double-stranded deoxyribonucleic acid polynucleotide with a class 2, type II Cas endonuclease in complex with an engineered guide ribonucleic acid structure configured to bind to the endonuclease and to the double-stranded deoxyribonucleic acid polynucleotide; (b) wherein the double-stranded deoxyribonucleic acid polynucleotide comprises a protospacer adjacent motif (PAM); and (c) wherein the PAM comprises a sequence selected from the group consisting of SEQ ID NOs: 5512-5526 or SEQ ID NOs: 5527-5537. In some embodiments, the double-stranded deoxyribonucleic acid polynucleotide comprises a first strand comprising a sequence complementary to a sequence of the engineered guide ribonucleic acid structure and a second strand comprising the PAM. In some embodiments, the PAM is directly adjacent to the 3' end of the sequence complementary to the sequence of the engineered guide ribonucleic acid structure.
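The adjacency requirement between the targeted sequence and the PAM can be sketched as a simple scan. This is a minimal illustration under stated assumptions: the DNA strand is read 5' to 3', the PAM sits immediately 3' of a fixed-length protospacer on that strand, and the PAM is written with IUPAC nucleotide codes. The PAM used in the demo (`NGG`) and the protospacer length are placeholders, since the PAM sequences of SEQ ID NOs: 5512-5537 are not reproduced in this section; the function name is hypothetical.

```python
import re
from typing import List, Tuple

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_pam_sites(strand: str, pam: str,
                   protospacer_len: int = 20) -> List[Tuple[int, str, str]]:
    """Find protospacers directly followed (on their 3' side) by a PAM.

    Returns (start index, protospacer, matched PAM) for each site on the
    given strand. Overlapping sites are reported because the pattern is
    wrapped in a lookahead.
    """
    pam_regex = "".join(IUPAC[base] for base in pam.upper())
    pattern = re.compile(
        r"(?=([ACGT]{%d})(%s))" % (protospacer_len, pam_regex)
    )
    return [
        (m.start(), m.group(1), m.group(2))
        for m in pattern.finditer(strand.upper())
    ]


if __name__ == "__main__":
    # Hypothetical PAM and a short protospacer length, for illustration only.
    for site in find_pam_sites("AACCGTTGGA", "NGG", protospacer_len=5):
        print(site)
```

A full search would also scan the reverse complement, since a PAM can lie on either strand of the double-stranded polynucleotide.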

In some embodiments, the class 2, type II Cas endonuclease is not a Cas9 endonuclease, a Cas14 endonuclease, a Cas12a endonuclease, a Cas12b endonuclease, a Cas12c endonuclease, a Cas12d endonuclease, a Cas12e endonuclease, a Cas13a endonuclease, a Cas13b endonuclease, a Cas13c endonuclease, or a Cas13d endonuclease. In some embodiments, the class 2, type II Cas endonuclease is derived from an uncultivated microorganism. In some embodiments, the double-stranded deoxyribonucleic acid polynucleotide is a eukaryotic, plant, fungal, mammalian, rodent, or human double-stranded deoxyribonucleic acid polynucleotide.

In some embodiments, (a) the PAM comprises a sequence selected from the group consisting of SEQ ID NOs: 5512-5515 and SEQ ID NOs: 5527-5530; (b) the PAM comprises SEQ ID NO: 5516 or SEQ ID NO: 5531; (c) the PAM comprises SEQ ID NO: 5539; (d) the PAM comprises SEQ ID NO: 5517 or SEQ ID NO: 5518; (e) the PAM comprises SEQ ID NO: 5519; (f) the PAM comprises SEQ ID NO: 5520 or SEQ ID NO: 5535; (g) the PAM comprises SEQ ID NO: 5521 or SEQ ID NO: 5536; (h) the PAM comprises SEQ ID NO: 5522; (i) the PAM comprises SEQ ID NO: 5523 or SEQ ID NO: 5537; (j) the PAM comprises SEQ ID NO: 5524; (k) the PAM comprises SEQ ID NO: 5525; or (l) the PAM comprises SEQ ID NO: 5526.

In some aspects, the present disclosure provides a method of modifying a target nucleic acid locus, the method comprising delivering to the target nucleic acid locus any of the engineered nuclease systems described herein, wherein the endonuclease is configured to form a complex with the engineered guide ribonucleic acid structure, and wherein the complex is configured such that, upon binding of the complex to the target nucleic acid locus, the complex modifies the target nucleic acid locus. In some embodiments, modifying the target nucleic acid locus comprises binding, nicking, cleaving, or marking the target nucleic acid locus. In some embodiments, the target nucleic acid locus comprises deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). In some embodiments, the target nucleic acid comprises genomic DNA, viral DNA, viral RNA, or bacterial DNA. In some embodiments, the target nucleic acid locus is in vitro. In some embodiments, the target nucleic acid locus is within a cell. In some embodiments, the cell is a prokaryotic cell, a bacterial cell, a eukaryotic cell, a fungal cell, a plant cell, an animal cell, a mammalian cell, a rodent cell, a primate cell, or a human cell.

In some embodiments, delivering the engineered nuclease system to the target nucleic acid locus comprises delivering the nucleic acid of any one of claims 135-140 or the vector of any one of claims 142-146. In some embodiments, delivering the engineered nuclease system to the target nucleic acid locus comprises delivering a nucleic acid comprising an open reading frame encoding the endonuclease. In some embodiments, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. In some embodiments, delivering the engineered nuclease system to the target nucleic acid locus comprises delivering a capped mRNA containing the open reading frame encoding the endonuclease. In some embodiments, delivering the engineered nuclease system to the target nucleic acid locus comprises delivering a translated polypeptide. In some embodiments, delivering the engineered nuclease system to the target nucleic acid locus comprises delivering a deoxyribonucleic acid (DNA) encoding the engineered guide ribonucleic acid structure operably linked to a ribonucleic acid (RNA) pol III promoter. In some embodiments, the endonuclease induces a single-stranded break or a double-stranded break at or proximal to the target locus.

Additional aspects and advantages of the present disclosure will become readily apparent to those skilled in the art from the following detailed description, in which only illustrative embodiments of the present disclosure are shown and described. As will be realized, the present disclosure is capable of other and different embodiments, and its several details are capable of modification in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.

Incorporation by Reference
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application were specifically and individually indicated to be incorporated by reference.

The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description, which sets forth illustrative embodiments in which the principles of the invention are utilized, and to the accompanying drawings (also referred to herein as "Figure" and "FIG.").

Shows the typical organization of CRISPR/Cas loci of various classes and types.
Shows the structure of a natural Class 2/Type II crRNA/tracrRNA pair compared with a hybrid sgRNA in which the two are joined.
Shows a schematic of the organization of a CRISPR locus encoding an enzyme from the MG1 family.
Shows a schematic of the organization of a CRISPR locus encoding an enzyme from the MG2 family.
Shows a schematic of the organization of a CRISPR locus encoding an enzyme from the MG3 family.
Shows a structure-based alignment of an enzyme of the present disclosure (MG1-1) to Cas9 from Staphylococcus aureus (SEQ ID NO: 5613).
Shows a structure-based alignment of an enzyme of the present disclosure (MG2-1) to Cas9 from Staphylococcus aureus (SEQ ID NO: 5613).
Shows a structure-based alignment of an enzyme of the present disclosure (MG3-1) to Cas9 from Actinomyces naeslundii (SEQ ID NO: 5614).
Shows a structure-based alignment of the MG1 family enzymes MG1-1 to MG1-6 (SEQ ID NOs: 5, 6, 9, 1, 2, and 3). (Eight consecutive drawings share this description.)
Shows in vitro cleavage of DNA by MG1-4 in complex with its corresponding sgRNA containing targeting sequences of various lengths.
Shows in-cell cleavage of E. coli genomic DNA using MG1-4 together with its corresponding sgRNA. A dilution series of cells transformed with MG1-4 along with a target spacer or a non-target spacer is shown (top); the bottom panel shows the quantified data, in which the left bar represents the non-target sgRNA and the right bar represents the target sgRNA.
Shows in-cell indel formation generated by transfection of HEK cells with the MG1-4 or MG1-6 constructs described in Example 11 together with their corresponding sgRNAs containing various different targeting sequences that target various locations in the human genome.
Shows in vitro cleavage of DNA by MG3-6 in complex with its corresponding sgRNA containing targeting sequences of various lengths.
Shows in-cell cleavage of E. coli genomic DNA using MG3-7 together with its corresponding sgRNA. A dilution series of cells transformed with MG3-7 along with a target spacer or a non-target spacer is shown (top); the bottom panel shows the quantified data, in which the left bar represents the non-target sgRNA and the right bar represents the target sgRNA.
Shows in-cell indel formation generated by transfection of HEK cells with the MG3-7 construct described in Example 13 together with its corresponding sgRNAs containing various different targeting sequences that target various locations in the human genome.
Shows in vitro cleavage of DNA by MG15-1 in complex with its corresponding sgRNA containing targeting sequences of various lengths.
Shows an agarose gel depicting the results of PAM vector library cleavage in the presence of TXTL extracts containing various MG family nucleases and their corresponding tracrRNAs or sgRNAs. (Four consecutive drawings share this description.)
Shows the predicted structures (predicted, e.g., as in Example 7) of the corresponding sgRNAs of the MG enzymes described herein. (Six consecutive drawings share this description.)
Shows seqLogo representations of PAM sequences derived via NGS as described herein (e.g., as described in Example 6). (Seven consecutive drawings share this description.)
Shows in-cell cleavage of E. coli genomic DNA using MG2-7 together with its corresponding sgRNA. A dilution series of cells transformed with MG2-7 along with a target spacer or a non-target spacer is shown (top); the bottom panel shows the quantified data, in which the right bar represents the non-target sgRNA and the left bar represents the target sgRNA.
Shows in-cell cleavage of E. coli genomic DNA using MG14-1 together with its corresponding sgRNA. A dilution series of cells transformed with MG14-1 along with a target spacer or a non-target spacer is shown (top); the bottom panel shows the quantified data, in which the right bar represents the non-target sgRNA and the left bar represents the target sgRNA.
Shows in-cell cleavage of E. coli genomic DNA using MG15-1 together with its corresponding sgRNA. A dilution series of cells transformed with MG15-1 along with a target spacer or a non-target spacer is shown (top); the bottom panel shows the quantified data, in which the right bar represents the non-target sgRNA and the left bar represents the target sgRNA.

Brief Description of the Sequence Listing
The Sequence Listing filed herewith provides exemplary polynucleotide and polypeptide sequences for use in the methods, compositions, and systems of the present disclosure. Below are exemplary descriptions of the sequences in the Sequence Listing.

MG1

SEQ ID NOs: 1-319 show the full-length peptide sequences of MG1 nucleases.

SEQ ID NOs: 1827-2140 show the peptide sequences of the RuvC_III domains of the MG1 nucleases above.

SEQ ID NOs: 3638-3955 show the peptide sequences of the HNH domains of the MG1 nucleases above.

SEQ ID NOs: 5476-5479 show the nucleotide sequences of MG1 tracrRNAs derived from the same loci as the MG1 nucleases above (e.g., the same loci as SEQ ID NOs: 1-4, respectively).

SEQ ID NOs: 5461-5464 show the nucleotide sequences of sgRNAs engineered to function with an MG1 nuclease (e.g., SEQ ID NOs: 1-4, respectively), where Ns denote nucleotides of a targeting sequence.

SEQ ID NOs: 5572-5575 show the nucleotide sequences of E. coli codon-optimized coding sequences for MG1 family enzymes (SEQ ID NOs: 1-4).

SEQ ID NOs: 5588-5589 show the nucleotide sequences of human codon-optimized coding sequences for MG1 family enzymes (SEQ ID NOs: 1 and 3).

SEQ ID NOs: 5616-5632 show peptide motifs characteristic of MG1 family enzymes.
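The sgRNA entries above use runs of N to mark where a targeting sequence goes. A minimal sketch of how such a template might be filled in is given below. The function name, the toy template, and the rule that the template contains exactly one run of Ns whose length equals the targeting sequence are all assumptions made for illustration; the actual sgRNA sequences are found only in the Sequence Listing.

```python
import re


def fill_targeting_sequence(sgrna_template: str, targeting_sequence: str) -> str:
    """Replace the single run of N placeholders in an sgRNA template.

    Assumptions (hypothetical, for illustration only): the template is RNA,
    contains exactly one contiguous run of 'N', and the targeting sequence
    has the same length as that run.
    """
    runs = re.findall(r"N+", sgrna_template)
    if len(runs) != 1:
        raise ValueError("template must contain exactly one run of N placeholders")
    if len(runs[0]) != len(targeting_sequence):
        raise ValueError("targeting sequence length must equal the number of Ns")
    if not set(targeting_sequence) <= set("ACGU"):
        raise ValueError("targeting sequence must contain only A, C, G, or U")
    # Substitute the placeholder run with the concrete targeting sequence.
    return sgrna_template.replace(runs[0], targeting_sequence, 1)
```

For example, with a made-up template `"NNNNGUUU"` and targeting sequence `"ACGU"`, the function returns `"ACGUGUUU"`.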

MG2

SEQ ID NOs: 320-420 show the full-length peptide sequences of MG2 nucleases.

SEQ ID NOs: 2141-2241 show the peptide sequences of the RuvC_III domains of the MG2 nucleases above.

SEQ ID NOs: 3955-4055 show the peptide sequences of the HNH domains of the MG2 nucleases above.

SEQ ID NOs: 5490-5494 show the nucleotide sequences of MG2 tracrRNAs derived from the same loci as the MG2 nucleases above (e.g., the same loci as SEQ ID NOs: 320, 321, 323, 325, and 326, respectively).

SEQ ID NO: 5465 shows the nucleotide sequence of an sgRNA engineered to function with an MG2 nuclease (e.g., SEQ ID NO: 321 above).

SEQ ID NOs: 5572-5575 show the nucleotide sequences of E. coli codon-optimized coding sequences for MG2 family enzymes.

SEQ ID NOs: 5631-5638 show peptide sequences characteristic of MG2 family enzymes.

MG3

SEQ ID NOs: 421-431 show the full-length peptide sequences of MG3 nucleases.

SEQ ID NOs: 2242-2251 show the peptide sequences of the RuvC_III domains of the MG3 nucleases above.

SEQ ID NOs: 4056-4066 show the peptide sequences of the HNH domains of the MG3 nucleases above.

SEQ ID NOs: 5495-5502 show the nucleotide sequences of MG3 tracrRNAs derived from the same loci as the MG3 nucleases above (e.g., the same loci as SEQ ID NOs: 421-428, respectively).

SEQ ID NOs: 5466-5467 show the nucleotide sequences of sgRNAs engineered to function with an MG3 nuclease (e.g., SEQ ID NOs: 421-423).

SEQ ID NOs: 5578-5580 show the nucleotide sequences of E. coli codon-optimized coding sequences for MG3 family enzymes.

SEQ ID NOs: 5639-5648 show peptide sequences characteristic of MG3 family enzymes.

MG4

SEQ ID NOs: 432-660 show the full-length peptide sequences of MG4 nucleases.

SEQ ID NOs: 2253-2481 show the peptide sequences of the RuvC_III domains of the MG4 nucleases above.

SEQ ID NOs: 4067-4295 show the peptide sequences of the HNH domains of the MG4 nucleases above.

SEQ ID NO: 5503 shows the nucleotide sequence of an MG4 tracrRNA derived from the same locus as an MG4 nuclease above.

SEQ ID NO: 5468 shows the nucleotide sequence of an sgRNA engineered to function with an MG4 nuclease.

SEQ ID NO: 5649 shows a peptide sequence characteristic of MG4 family enzymes.

MG6

SEQ ID NOs: 661-668 show the full-length peptide sequences of MG6 nucleases.

SEQ ID NOs: 2482-2489 show the peptide sequences of the RuvC_III domains of the MG6 nucleases above.

SEQ ID NOs: 4296-4303 show the peptide sequences of the HNH domains of the MG3 nucleases above.

MG7

SEQ ID NOs: 669-677 show the full-length peptide sequences of MG7 nucleases.

SEQ ID NOs: 2490-2498 show the peptide sequences of the RuvC_III domains of the MG7 nucleases above.

SEQ ID NOs: 4304-4312 show the peptide sequences of the HNH domains of the MG3 nucleases above.

SEQ ID NO: 5504 shows the nucleotide sequence of an MG7 tracrRNA derived from the same locus as an MG7 nuclease above.

MG14

SEQ ID NOs: 678-929 show the full-length peptide sequences of MG14 nucleases.

SEQ ID NOs: 2499-2750 show the peptide sequences of the RuvC_III domains of the MG14 nucleases above.

SEQ ID NOs: 4313-4564 show the peptide sequences of the HNH domains of the MG14 nucleases above.

SEQ ID NO: 5505 shows the nucleotide sequence of an MG14 tracrRNA derived from the same locus as an MG14 nuclease above.

SEQ ID NO: 5581 shows the nucleotide sequence of an E. coli codon-optimized coding sequence for an MG14 family enzyme.

SEQ ID NOs: 5650-5667 show peptide sequences characteristic of MG14 family enzymes.

MG15

SEQ ID NOs: 930-1092 show the full-length peptide sequences of MG15 nucleases.

SEQ ID NOs: 2751-2913 show the peptide sequences of the RuvC_III domains of the MG15 nucleases above.

SEQ ID NOs: 4565-4727 show the peptide sequences of the HNH domains of the MG15 nucleases above.

SEQ ID NO: 5506 shows the nucleotide sequence of an MG15 tracrRNA derived from the same locus as an MG15 nuclease above.

SEQ ID NO: 5470 shows the nucleotide sequence of an sgRNA engineered to function with an MG15 nuclease.

SEQ ID NO: 5582 shows the nucleotide sequence of an E. coli codon-optimized coding sequence for an MG15 family enzyme.

SEQ ID NOs: 5668-5675 show peptide sequences characteristic of MG15 family enzymes.

MG16

SEQ ID NOs: 1093-1353 show the full-length peptide sequences of MG16 nucleases.

SEQ ID NOs: 2914-3174 show the peptide sequences of the RuvC_III domains of the MG16 nucleases above.

SEQ ID NOs: 4728-4988 show the peptide sequences of the HNH domains of the MG16 nucleases above.

SEQ ID NO: 5507 shows the nucleotide sequence of an MG16 tracrRNA derived from the same locus as an MG3 nuclease above.

SEQ ID NO: 5471 shows the nucleotide sequence of an sgRNA engineered to function with an MG16 nuclease.

SEQ ID NO: 5583 shows the nucleotide sequence of an E. coli codon-optimized coding sequence for an MG16 family enzyme.

SEQ ID NOs: 5676-5678 show peptide sequences characteristic of MG16 family enzymes.

MG18

SEQ ID NOs: 1354-1511 show the full-length peptide sequences of MG18 nucleases.

SEQ ID NOs: 3175-3330 show the peptide sequences of the RuvC_III domains of the MG18 nucleases above.

SEQ ID NOs: 4989-5146 show the peptide sequences of the HNH domains of the MG18 nucleases above.

SEQ ID NO: 5508 shows the nucleotide sequence of an MG18 tracrRNA derived from the same locus as an MG18 nuclease above.

SEQ ID NO: 5472 shows the nucleotide sequence of an sgRNA engineered to function with an MG18 nuclease.

SEQ ID NO: 5584 shows the nucleotide sequence of an E. coli codon-optimized coding sequence for an MG18 family enzyme.

SEQ ID NOs: 5679-5686 show peptide sequences characteristic of MG18 family enzymes.

MG21

SEQ ID NOs: 1512-1655 show the full-length peptide sequences of MG21 nucleases.

SEQ ID NOs: 3331-3474 show the peptide sequences of the RuvC_III domains of the MG21 nucleases above.

SEQ ID NOs: 5147-5290 show the peptide sequences of the HNH domains of the MG21 nucleases above.

SEQ ID NO: 5509 shows the nucleotide sequence of an MG21 tracrRNA derived from the same locus as an MG21 nuclease above.

SEQ ID NO: 5473 shows the nucleotide sequence of an sgRNA engineered to function with an MG21 nuclease.

SEQ ID NO: 5585 shows the nucleotide sequence of an E. coli codon-optimized coding sequence for an MG21 family enzyme.

SEQ ID NOs: 5687-5692 and 5674-5675 show peptide sequences characteristic of MG21 family enzymes.

MG22

SEQ ID NOs: 1656-1755 show the full-length peptide sequences of MG22 nucleases.

SEQ ID NOs: 3475-3568 show the peptide sequences of the RuvC_III domains of the MG22 nucleases above.

SEQ ID NOs: 5291-5389 show the peptide sequences of the HNH domains of the MG22 nucleases above.

SEQ ID NO: 5510 shows the nucleotide sequence of an MG22 tracrRNA derived from the same locus as an MG22 nuclease above.

SEQ ID NO: 5474 shows the nucleotide sequence of an sgRNA engineered to function with an MG22 nuclease.

SEQ ID NO: 5586 shows the nucleotide sequence of an E. coli codon-optimized coding sequence for an MG22 family enzyme.

SEQ ID NOs: 5694-5699 show peptide sequences characteristic of MG22 family enzymes.

MG23

SEQ ID NOs: 1756-1826 show the full-length peptide sequences of MG23 nucleases.

SEQ ID NOs: 3569-3637 show the peptide sequences of the RuvC_III domains of the MG23 nucleases above.

SEQ ID NOs: 5390-5460 show the peptide sequences of the HNH domains of the MG23 nucleases above.

SEQ ID NO: 5511 shows the nucleotide sequence of an MG23 tracrRNA derived from the same locus as an MG23 nuclease above.

SEQ ID NO: 5475 shows the nucleotide sequence of an sgRNA engineered to function with an MG23 nuclease.

SEQ ID NO: 5587 shows the nucleotide sequence of an E. coli codon-optimized coding sequence for an MG23 family enzyme.

SEQ ID NOs: 5700-5717 show peptide sequences characteristic of MG23 family enzymes.
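For readers working with the Sequence Listing programmatically, the full-length peptide ranges described above can be captured in a small lookup table. The sketch below is only an illustration: the range boundaries are copied from the descriptions in this section, while the names `FULL_LENGTH_RANGES` and `family_for_seq_id` are hypothetical and do not appear in the disclosure. Only the full-length peptide entries are covered; domain, tracrRNA, sgRNA, and motif entries are not.

```python
from bisect import bisect_right
from typing import Optional

# (first SEQ ID NO, last SEQ ID NO, family) for full-length peptide sequences,
# as listed in the brief description of the Sequence Listing.
FULL_LENGTH_RANGES = [
    (1, 319, "MG1"),
    (320, 420, "MG2"),
    (421, 431, "MG3"),
    (432, 660, "MG4"),
    (661, 668, "MG6"),
    (669, 677, "MG7"),
    (678, 929, "MG14"),
    (930, 1092, "MG15"),
    (1093, 1353, "MG16"),
    (1354, 1511, "MG18"),
    (1512, 1655, "MG21"),
    (1656, 1755, "MG22"),
    (1756, 1826, "MG23"),
]

# Range start values, kept sorted so that bisect can locate a candidate range.
_STARTS = [start for start, _, _ in FULL_LENGTH_RANGES]


def family_for_seq_id(seq_id: int) -> Optional[str]:
    """Return the MG family whose full-length peptide range contains seq_id."""
    i = bisect_right(_STARTS, seq_id) - 1
    if i < 0:
        return None
    start, end, family = FULL_LENGTH_RANGES[i]
    return family if start <= seq_id <= end else None
```

A call such as `family_for_seq_id(321)` returns `"MG2"`, and any number outside 1-1826 returns `None`.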

While various embodiments of the invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed.

The practice of some methods disclosed herein employs, unless otherwise indicated, techniques of immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics, and recombinant DNA. See, for example, Sambrook and Green, Molecular Cloning: A Laboratory Manual, 4th Edition (2012); the series Current Protocols in Molecular Biology (F. M. Ausubel, et al. eds.); the series Methods In Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (M. J. MacPherson, B. D. Hames and G. R. Taylor eds. (1995)); Harlow and Lane, eds. (1988) Antibodies, A Laboratory Manual; and Culture of Animal Cells: A Manual of Basic Technique and Specialized Applications, 6th Edition (R. I. Freshney, ed. (2010)) (each of which is entirely incorporated herein by reference).

As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, to the extent that the terms "including," "includes," "having," "has," "with," or variants thereof are used in either the detailed description and/or the claims, such terms are intended to be inclusive in a manner similar to the term "comprising."

The term "about" or "approximately" means within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, i.e., on the limitations of the measurement system. For example, "about" can mean within one or more than one standard deviation, per the practice in the art. Alternatively, "about" can mean a range of up to 20%, up to 15%, up to 10%, up to 5%, or up to 1% of a given value.
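The percentage-based reading of "about" can be illustrated with a short check. This is only a sketch under the assumption that the tolerance is applied symmetrically around the nominal value; the function name and its default tolerance of 20% (the widest range named above) are illustrative choices and are not prescribed by the disclosure.

```python
def within_about(measured: float, nominal: float, tolerance: float = 0.20) -> bool:
    """Return True if `measured` lies within +/- tolerance of `nominal`.

    `tolerance` is a fraction (0.20 means 20%). Applying it symmetrically
    around the nominal value is an assumption made for this sketch.
    """
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    return abs(measured - nominal) <= tolerance * abs(nominal)
```

For instance, a measured value of 104 counts as "about 100" at a 5% tolerance, whereas 112 does not count as "about 100" at a 10% tolerance.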

As used herein, a "cell" generally refers to a biological cell. A cell may be the basic structural, functional, and/or biological unit of a living organism. A cell may originate from any organism having one or more cells. Some non-limiting examples include: a prokaryotic cell, a eukaryotic cell, a bacterial cell, an archaeal cell, a cell of a single-cell eukaryotic organism, a protozoan cell, a cell from a plant (e.g., cells from plant crops, fruits, vegetables, grains, soybean, corn, maize, wheat, seeds, tomatoes, rice, cassava, sugarcane, pumpkin, hay, potatoes, cotton, cannabis, tobacco, flowering plants, conifers, gymnosperms, ferns, clubmosses, hornworts, liverworts, mosses), an algal cell (e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens C. Agardh, and the like), seaweeds (e.g., kelp), a fungal cell (e.g., a yeast cell, a cell from a mushroom), an animal cell, a cell from an invertebrate animal (e.g., fruit fly, cnidarian, echinoderm, nematode, etc.), a cell from a vertebrate animal (e.g., fish, amphibian, reptile, bird, mammal), a cell from a mammal (e.g., a pig, a cow, a goat, a sheep, a rodent, a rat, a mouse, a non-human primate, a human, etc.), and the like. Sometimes a cell does not originate from a natural organism (e.g., a cell can be synthetically made, sometimes termed an artificial cell).

The term "nucleotide," as used herein, generally refers to a base-sugar-phosphate combination. A nucleotide may comprise a synthetic nucleotide. A nucleotide may comprise a synthetic nucleotide analog. Nucleotides may be monomeric units of a nucleic acid sequence (e.g., deoxyribonucleic acid (DNA) and ribonucleic acid (RNA)). The term nucleotide may include the ribonucleoside triphosphates adenosine triphosphate (ATP), uridine triphosphate (UTP), cytidine triphosphate (CTP), and guanosine triphosphate (GTP), and deoxyribonucleoside triphosphates such as dATP, dCTP, dITP, dUTP, dGTP, dTTP, or derivatives thereof. Such derivatives may include, for example, [αS]dATP, 7-deaza-dGTP and 7-deaza-dATP, and nucleotide derivatives that confer nuclease resistance on the nucleic acid molecule containing them. The term nucleotide, as used herein, may also refer to dideoxyribonucleoside triphosphates (ddNTPs) and their derivatives. Illustrative examples of dideoxyribonucleoside triphosphates include, but are not limited to, ddATP, ddCTP, ddGTP, ddITP, and ddTTP. A nucleotide may be unlabeled or detectably labeled, such as by using moieties comprising optically detectable moieties (e.g., fluorophores). Labeling may also be carried out with quantum dots. Detectable labels may include, for example, radioactive isotopes, fluorescent labels, chemiluminescent labels, bioluminescent labels, and enzyme labels. Fluorescent labels of nucleotides may include, but are not limited to, fluorescein, 5-carboxyfluorescein (FAM), 2'7'-dimethoxy-4'5-dichloro-6-carboxyfluorescein (JOE), rhodamine, 6-carboxyrhodamine (R6G), N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA), 6-carboxy-X-rhodamine (ROX), 4-(4'dimethylaminophenylazo)benzoic acid (DABCYL), Cascade Blue, Oregon Green, Texas Red, cyanine, and 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS). Specific examples of fluorescently labeled nucleotides may include [R6G]dUTP, [TAMRA]dUTP, [R110]dCTP, [R6G]dCTP, [TAMRA]dCTP, [JOE]ddATP, [R6G]ddATP, [FAM]ddCTP, [R110]ddCTP, [TAMRA]ddGTP, [ROX]ddTTP, [dR6G]ddATP, [dR110]ddCTP, [dTAMRA]ddGTP, and [dROX]ddTTP available from Perkin Elmer (Foster City, Calif.); FluoroLink DeoxyNucleotides, FluoroLink Cy3-dCTP, FluoroLink Cy5-dCTP, FluoroLink Fluor X-dCTP, FluoroLink Cy3-dUTP, and FluoroLink Cy5-dUTP available from Amersham (Arlington Heights, Ill.); fluorescein-15-dATP, fluorescein-12-dUTP, tetramethyl-rhodamine-6-dUTP, IR770-9-dATP, fluorescein-12-ddUTP, fluorescein-12-UTP, and fluorescein-15-2'-dATP available from Boehringer Mannheim (Indianapolis, Ind.); and Chromosome Labeled Nucleotides, BODIPY-FL-14-UTP, BODIPY-FL-4-UTP, BODIPY-TMR-14-UTP, BODIPY-TMR-14-dUTP, BODIPY-TR-14-UTP, BODIPY-TR-14-dUTP, Cascade Blue-7-UTP, Cascade Blue-7-dUTP, fluorescein-12-UTP, fluorescein-12-dUTP, Oregon Green 488-5-dUTP, Rhodamine Green-5-UTP, Rhodamine Green-5-dUTP, tetramethylrhodamine-6-UTP, tetramethylrhodamine-6-dUTP, Texas Red-5-UTP, Texas Red-5-dUTP, and Texas Red-12-dUTP available from Molecular Probes (Eugene, Oreg.). Nucleotides may also be labeled or marked by chemical modification. A chemically modified single nucleotide may be biotin-dNTP. Some non-limiting examples of biotinylated dNTPs include biotin-dATP (e.g., bio-N6-ddATP, biotin-14-dATP), biotin-dCTP (e.g., biotin-11-dCTP, biotin-14-dCTP), and biotin-dUTP (e.g., biotin-11-dUTP, biotin-16-dUTP, biotin-20-dUTP).

The terms "polynucleotide," "oligonucleotide," and "nucleic acid" are used interchangeably to generally refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof, in single-, double-, or multi-stranded form. A polynucleotide may be exogenous or endogenous to a cell. A polynucleotide may exist in a cell-free environment. A polynucleotide may be a gene or a fragment thereof. A polynucleotide may be DNA. A polynucleotide may be RNA. A polynucleotide may have any three-dimensional structure and may perform any function. A polynucleotide may comprise one or more analogs (e.g., an altered backbone, sugar, or nucleobase). If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. Some non-limiting examples of analogs include: 5-bromouracil, peptide nucleic acid, xeno nucleic acid, morpholinos, locked nucleic acids, glycol nucleic acids, threose nucleic acids, dideoxynucleotides, cordycepin, 7-deaza-GTP, fluorophores (e.g., rhodamine or fluorescein linked to the sugar), thiol-containing nucleotides, biotin-linked nucleotides, fluorescent base analogs, CpG islands, methyl-7-guanosine, methylated nucleotides, inosine, thiouridine, pseudouridine, dihydrouridine, queuosine, and wyosine. Non-limiting examples of polynucleotides include coding or non-coding regions of a gene or gene fragment, loci (a locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, cell-free polynucleotides including cell-free DNA (cfDNA) and cell-free RNA (cfRNA), nucleic acid probes, and primers. The sequence of nucleotides may be interrupted by non-nucleotide components.

The term "transfection" or "transfected" generally refers to the introduction of a nucleic acid into a cell by non-viral or viral-based methods. The nucleic acid molecule may be a gene sequence encoding a complete protein or a functional portion thereof. See, e.g., Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 18.1-18.88.

The terms "peptide," "polypeptide," and "protein" are used interchangeably herein to generally refer to a polymer of at least two amino acid residues joined by peptide bond(s). The term does not connote a specific length of polymer, nor is it intended to imply or distinguish whether the peptide is produced using recombinant techniques, chemical or enzymatic synthesis, or is naturally occurring. The term applies to naturally occurring amino acid polymers as well as to amino acid polymers comprising at least one modified amino acid. In some cases, the polymer may be interrupted by non-amino acids. The term includes amino acid chains of any length, including full-length proteins, and proteins with or without secondary and/or tertiary structure (e.g., domains). The term also encompasses an amino acid polymer that has been modified, for example, by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, oxidation, or any other manipulation, such as conjugation with a labeling component. The term "amino acid," as used herein, generally refers to natural and non-natural amino acids, including modified amino acids and amino acid analogs. Modified amino acids may include natural amino acids and non-natural amino acids that have been chemically modified to include a group or a chemical moiety not naturally present on the amino acid. Amino acid analogs may refer to amino acid derivatives. The term "amino acid" includes both D-amino acids and L-amino acids.

As used herein, "non-native" generally refers to a nucleic acid or polypeptide sequence that is not found in a native nucleic acid or protein. Non-native may refer to affinity tags. Non-native may refer to fusions. Non-native may refer to a naturally occurring nucleic acid or polypeptide sequence that comprises mutations, insertions, and/or deletions. A non-native sequence may exhibit and/or encode an activity (e.g., enzymatic activity, methyltransferase activity, acetyltransferase activity, kinase activity, ubiquitinating activity, etc.) that may also be exhibited by the nucleic acid and/or polypeptide sequence to which the non-native sequence is fused. A non-native nucleic acid or polypeptide sequence may be linked to a naturally occurring nucleic acid or polypeptide sequence (or a variant thereof) by genetic engineering to generate a chimeric nucleic acid and/or polypeptide sequence encoding a chimeric nucleic acid and/or polypeptide.

The term "promoter," as used herein, generally refers to a regulatory DNA region that controls transcription or expression of a gene and that may be located adjacent to or overlapping a nucleotide or region of nucleotides at which RNA transcription is initiated. A promoter may contain specific DNA sequences that bind protein factors, often referred to as transcription factors, which facilitate binding of RNA polymerase to the DNA, leading to gene transcription. A "basal promoter," also referred to as a "core promoter," generally refers to a promoter that contains all the basic elements necessary to promote transcriptional expression of an operably linked polynucleotide. Eukaryotic basal promoters typically, though not necessarily, contain a TATA box and/or a CAAT box.

The term "expression," as used herein, generally refers to the process by which a nucleic acid sequence or a polynucleotide is transcribed from a DNA template (such as into mRNA or another RNA transcript) and/or the process by which a transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be collectively referred to as "gene product." If the polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell.
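The two steps named in this definition, transcription and translation, can be sketched as follows. This is a minimal illustration under stated assumptions: the input is the coding (sense) DNA strand, the standard genetic code is used, reading starts at the first base, and splicing is not modeled. The function names `transcribe` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Write the mRNA corresponding to a coding-strand DNA sequence (T -> U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate an mRNA from its first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].upper().replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    mrna = transcribe("ATGGCCTAA")
    print(mrna, translate(mrna))
```

The transcript and the resulting peptide together correspond to what the definition calls the "gene product."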

As used herein, "operably linked," "operable linkage," "operatively linked," or grammatical equivalents thereof generally refer to a juxtaposition of genetic elements, e.g., a promoter, an enhancer, a polyadenylation sequence, etc., wherein the elements are in a relationship permitting them to operate in the expected manner. For instance, a regulatory element, which may comprise promoter and/or enhancer sequences, is operably linked to a coding region if the regulatory element helps initiate transcription of the coding sequence. There may be intervening residues between the regulatory element and the coding region so long as this functional relationship is maintained.

A "vector," as used herein, generally refers to a macromolecule or association of macromolecules that comprises or associates with a polynucleotide and that may be used to mediate delivery of the polynucleotide to a cell. Examples of vectors include plasmids, viral vectors, liposomes, and other gene delivery vehicles. A vector generally comprises genetic elements, e.g., regulatory elements, operably linked to a gene to facilitate expression of the gene in a target.

As used herein, "an expression cassette" and "a nucleic acid cassette" are used interchangeably to generally refer to a combination of nucleic acid sequences or elements that are expressed together or are operably linked for expression. In some cases, an expression cassette refers to the combination of regulatory elements and a gene or genes to which they are operably linked for expression.

A "functional fragment" of a DNA or protein sequence generally refers to a fragment that retains a biological activity (either functional or structural) that is substantially similar to a biological activity of the full-length DNA or protein sequence. A biological activity of a DNA sequence may be its ability to influence expression in a manner known to be attributed to the full-length sequence.

As used herein, an "engineered" object generally indicates that the object has been modified by human intervention. According to non-limiting examples: a nucleic acid may be modified by changing its sequence to a sequence that does not occur in nature; a nucleic acid may be modified by ligating it to a nucleic acid with which it does not associate in nature, such that the ligated product possesses a function not present in the original nucleic acid; an engineered nucleic acid may be synthesized in vitro with a sequence that does not exist in nature; a protein may be modified by changing its amino acid sequence to a sequence that does not exist in nature; an engineered protein may acquire a new function or property. An "engineered" system comprises at least one engineered component.

As used herein, "synthetic" and "artificial" are used interchangeably to refer to a protein or a domain thereof that has low sequence identity (e.g., less than 50% sequence identity, less than 25% sequence identity, less than 10% sequence identity, less than 5% sequence identity, or less than 1% sequence identity) to a naturally occurring human protein. For example, VPR and VP64 domains are synthetic transactivation domains.

The term "tracrRNA" or "tracr sequence," as used herein, may generally refer to a nucleic acid with at least about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 100% sequence identity and/or sequence similarity to a wild-type exemplary tracrRNA sequence (e.g., a tracrRNA from S. pyogenes, S. aureus, etc., or SEQ ID NOs: 5476-5511). tracrRNA may refer to a nucleic acid with at most about 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% sequence identity and/or sequence similarity to a wild-type exemplary tracrRNA sequence (e.g., a tracrRNA from S. pyogenes, S. aureus, etc.). tracrRNA may refer to a modified form of a tracrRNA that comprises a nucleotide change such as a deletion, insertion, or substitution, a variant, a mutation, or a chimera. A tracrRNA may refer to a nucleic acid that is at least about 60% identical to a wild-type exemplary tracrRNA sequence (e.g., a tracrRNA from S. pyogenes, S. aureus, etc.) over a stretch of at least 6 contiguous nucleotides. For example, a tracrRNA sequence may be at least about 60% identical, at least about 65% identical, at least about 70% identical, at least about 75% identical, at least about 80% identical, at least about 85% identical, at least about 90% identical, at least about 95% identical, at least about 98% identical, at least about 99% identical, or 100% identical to a wild-type exemplary tracrRNA sequence (e.g., a tracrRNA from S. pyogenes, S. aureus, etc.) over a stretch of at least 6 contiguous nucleotides. Type II tracrRNA sequences can be predicted on a genome sequence by identifying regions with complementarity to part of the repeat sequence in an adjacent CRISPR array.
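The last sentence, predicting a Type II tracrRNA from complementarity to the CRISPR repeat, can be illustrated with a minimal sketch. It assumes exact reverse-complement matching of k-mers on a single DNA strand, which is a simplification: natural anti-repeat regions are often only partially complementary, and a practical search would also scan the opposite strand and restrict itself to the neighborhood of the array. The function names and the default `k` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_anti_repeat_candidates(genome: str, repeat: str, k: int = 10):
    """List (position, k-mer) pairs where the genome carries a k-mer that is
    exactly complementary to some part of the CRISPR repeat.

    Hypothetical helper: exact matching on one strand only.
    """
    genome = genome.upper()
    rc_repeat = reverse_complement(repeat)
    # Every k-mer of the reverse-complemented repeat is a possible anti-repeat seed.
    seeds = {rc_repeat[i:i + k] for i in range(len(rc_repeat) - k + 1)}
    hits = []
    for pos in range(len(genome) - k + 1):
        window = genome[pos:pos + k]
        if window in seeds:
            hits.append((pos, window))
    return hits


if __name__ == "__main__":
    toy_repeat = "GTTTTAGAGCTATGCT"  # made-up repeat used only for illustration
    toy_genome = "CCCC" + reverse_complement(toy_repeat)[:10] + "GGGG"
    print(find_anti_repeat_candidates(toy_genome, toy_repeat))
```

Hits from such a scan would only be candidates; whether a region actually encodes a tracrRNA has to be established separately.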

As used herein, a "guide nucleic acid" may generally refer to a nucleic acid that can hybridize to another nucleic acid. A guide nucleic acid may be RNA. A guide nucleic acid may be DNA. The guide nucleic acid may be programmed to bind to a sequence of nucleic acid site-specifically. The nucleic acid to be targeted, or the target nucleic acid, may comprise nucleotides. The guide nucleic acid may comprise nucleotides. A portion of the target nucleic acid may be complementary to a portion of the guide nucleic acid. The strand of a double-stranded target polynucleotide that is complementary to and hybridizes with the guide nucleic acid may be called the complementary strand. The strand of the double-stranded target polynucleotide that is complementary to the complementary strand, and therefore may not be complementary to the guide nucleic acid, may be called the noncomplementary strand. A guide nucleic acid may comprise one polynucleotide chain and may be called a "single guide nucleic acid." A guide nucleic acid may comprise two polynucleotide chains and may be called a "double guide nucleic acid." Unless otherwise specified, the term "guide nucleic acid" is inclusive and may refer to both single guide nucleic acids and double guide nucleic acids. A guide nucleic acid may comprise a segment that may be referred to as a "nucleic acid-targeting segment" or a "nucleic acid-targeting sequence." A nucleic acid-targeting segment may comprise a sub-segment that may be referred to as a "protein-binding segment," a "protein-binding sequence," or a "Cas protein-binding segment."

The term "sequence identity" or "percent identity," in the context of two or more nucleic acid or polypeptide sequences, generally refers to two (e.g., in a pairwise alignment) or more (e.g., in a multiple sequence alignment) sequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a local or global comparison window, as measured using a sequence comparison algorithm. Suitable sequence comparison algorithms for polypeptide sequences include, e.g., BLASTP using parameters of a wordlength (W) of 3, an expectation (E) of 10, and a BLOSUM62 scoring matrix setting gap costs at an existence of 11 and an extension of 1, and using a conditional compositional score matrix adjustment, for polypeptide sequences longer than 30 residues; BLASTP using parameters of a wordlength (W) of 2, an expectation (E) of 1000000, and the PAM30 scoring matrix setting gap costs at 9 to open gaps and 1 to extend gaps, for sequences of less than 30 residues (these are the default parameters for BLASTP in the BLAST suite available at https://blast.ncbi.nlm.nih.gov); CLUSTALW with parameters; the Smith-Waterman homology search algorithm with parameters of a match of 2, a mismatch of -1, and a gap of -1; MUSCLE with default parameters; MAFFT with parameters of a retree of 2 and maxiterations of 1000; Novafold with default parameters; and HMMER hmmalign with default parameters.
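To make the Smith-Waterman option concrete, the following is a minimal sketch of a local alignment with the scoring named above (match 2, mismatch -1, gap -1) followed by a percent-identity calculation. It assumes a linear gap penalty and defines percent identity as identical columns divided by alignment length; other denominators (e.g., the length of the shorter sequence) are also in common use, and the text does not say which is intended. Function names are hypothetical.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1):
    """Local alignment with a linear gap penalty.

    Returns (best_score, aligned_a, aligned_b), with '-' marking gaps.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the scoring matrix; scores are floored at zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns as a percentage of alignment length."""
    if not aligned_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    score, top, bottom = smith_waterman("ACGTACGT", "ACGAACGT")
    print(score, top, bottom, percent_identity(top, bottom))
```

For protein comparisons with BLOSUM62 or PAM30 and affine gap costs, as in the BLASTP settings above, an established tool would be used rather than this sketch.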

As used herein, the term "RuvC_III domain" generally refers to the third discontinuous segment of a RuvC endonuclease domain (the RuvC nuclease domain being composed of three discontinuous segments: RuvC_I, RuvC_II, and RuvC_III). A RuvC domain or segments thereof can generally be identified by alignment to known domain sequences, by structural alignment to proteins with annotated domains, or by comparison to Hidden Markov Models (HMMs) built from known domain sequences (e.g., Pfam HMM PF18541 for RuvC_III).

As used herein, the term "HNH domain" generally refers to an endonuclease domain having characteristic histidine and asparagine residues. An HNH domain can generally be identified by alignment to known domain sequences, by structural alignment to proteins with annotated domains, or by comparison to Hidden Markov Models (HMMs) built from known domain sequences (e.g., Pfam HMM PF01844 for domain HNH).

Overview

The discovery of new Cas enzymes with unique functionality and structure may offer the potential to further disrupt deoxyribonucleic acid (DNA) editing technologies, improving speed, specificity, functionality, and ease of use. Relative to the predicted prevalence of Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) systems in microbes and the sheer diversity of microbial species, relatively few functionally characterized CRISPR/Cas enzymes exist in the literature. This is partly because a huge number of microbial species may not be readily cultivated under laboratory conditions. Metagenomic sequencing from natural environmental niches that represent large numbers of microbial species may offer the potential to drastically increase the number of new CRISPR/Cas systems known and to speed the discovery of new oligonucleotide editing functionalities. A recent example of the fruitfulness of such an approach is the 2016 discovery of the CasX/CasY CRISPR systems from metagenomic analysis of natural microbial communities.

CRISPR/Cas systems are RNA-directed nuclease complexes that have been described to function as an adaptive immune system in microbes. In their natural context, CRISPR/Cas systems occur in CRISPR (clustered regularly interspaced short palindromic repeats) operons or loci, which generally comprise two parts: (i) an array of short repetitive sequences (30-40 bp) separated by equally short spacer sequences, which encode the RNA-based targeting element; and (ii) ORFs encoding the Cas, i.e., the nuclease polypeptide directed by the RNA-based targeting element, alongside accessory proteins/enzymes. Efficient nuclease targeting of a particular target nucleic acid sequence generally requires both (i) complementary hybridization between the first 6-8 nucleic acids of the target (the target seed) and the crRNA guide; and (ii) the presence of a protospacer-adjacent motif (PAM) sequence within a defined vicinity of the target seed (the PAM usually being a sequence not commonly represented within the host genome). Depending on the exact function and organization of the system, CRISPR-Cas systems are commonly organized into two classes, five types, and 16 subtypes based on shared functional characteristics and evolutionary similarity.
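The two targeting requirements described above, a seed match and a nearby PAM, can be sketched as a simple scan. This is an illustration under several assumptions not fixed by the text: the spacer is written as DNA and compared by identity to one strand (standing in for complementarity to the opposite strand), the seed is taken as the first `seed_len` positions of the protospacer, and the PAM lies immediately 3' of the protospacer with the pattern NGG (the motif known for S. pyogenes Cas9). The function name and parameters are hypothetical.

```python
import re


def find_candidate_targets(dna: str, spacer: str,
                           pam_regex: str = "[ACGT]GG", seed_len: int = 8):
    """Return (start, protospacer, pam) for sites whose seed matches the spacer
    and that are directly followed by a PAM.

    Hypothetical helper: only the seed is required to match, so mismatches
    outside the seed are tolerated by this sketch.
    """
    dna, spacer = dna.upper(), spacer.upper()
    seed = spacer[:seed_len]
    pam = re.compile(pam_regex)
    n = len(spacer)
    sites = []
    for start in range(len(dna) - n + 1):
        protospacer = dna[start:start + n]
        if protospacer[:seed_len] != seed:
            continue  # seed requirement not met
        hit = pam.match(dna, start + n)  # PAM must begin right after the protospacer
        if hit:
            sites.append((start, protospacer, hit.group()))
    return sites


if __name__ == "__main__":
    toy_spacer = "ACGTACGTAAAACCCCGGGG"  # made-up 20-nt spacer
    toy_dna = "TTT" + toy_spacer + "TGG" + "AAA"
    print(find_candidate_targets(toy_dna, toy_spacer))
```

Seed position, PAM sequence, and PAM orientation differ between CRISPR-Cas types, so the defaults here would need to be replaced for any particular enzyme.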

Class I CRISPR-Cas systems have large, multi-subunit effector complexes and comprise Types I, III, and IV.

Type I CRISPR-Cas systems are considered to be of moderate complexity in terms of components. In Type I CRISPR-Cas systems, the array of RNA-targeting elements is transcribed as a long precursor crRNA (pre-crRNA) that is processed at the repeat elements to liberate short, mature crRNAs, which direct the nuclease complex to nucleic acid targets when those targets are followed by a suitable short consensus sequence called a protospacer-adjacent motif (PAM). This processing occurs via an endoribonuclease subunit (Cas6) of a large endonuclease complex called Cascade, which also comprises the nuclease (Cas3) protein component of the crRNA-directed nuclease complex. Cas I nucleases function primarily as DNA nucleases.
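The processing step described above, cutting a long pre-crRNA at its repeat elements to release mature crRNAs, can be sketched as a string operation. The sketch assumes a single cut at a fixed offset inside every exact copy of the repeat; the offset, the function name, and the toy sequences are hypothetical and are not taken from the text.

```python
def process_pre_crrna(pre_crrna: str, repeat: str, cut_offset: int):
    """Cut inside every repeat occurrence and return the fragments that lie
    between consecutive cut sites.

    Each fragment consists of the tail of one repeat, a spacer, and the head
    of the next repeat. Hypothetical helper: exact repeat matching only.
    """
    cuts = []
    pos = pre_crrna.find(repeat)
    while pos != -1:
        cuts.append(pos + cut_offset)
        # Continue searching after the current repeat to avoid overlapping hits.
        pos = pre_crrna.find(repeat, pos + len(repeat))
    return [pre_crrna[a:b] for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    toy_repeat = "GGGAAACCC"  # made-up repeat
    spacers = ["UUUUAUUUUA", "AUAUAUAUAU"]  # made-up spacers
    transcript = toy_repeat + spacers[0] + toy_repeat + spacers[1] + toy_repeat
    for crrna in process_pre_crrna(transcript, toy_repeat, cut_offset=6):
        print(crrna)
```

Real repeats within an array can carry small sequence variations, so an exact-match search like this one is only a starting point.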

Type III CRISPR systems may be characterized by the presence of a central nuclease, known as Cas10, alongside a repeat-associated mysterious protein (RAMP) that comprises Csm or Cmr protein subunits. As in Type I systems, the mature crRNA is processed from a pre-crRNA using a Cas6-like enzyme. Unlike Type I and II systems, Type III systems appear to target and cleave DNA-RNA duplexes (such as DNA strands being used as templates for an RNA polymerase).

Type IV CRISPR-Cas systems possess an effector complex that consists of a highly reduced large subunit nuclease (csf1), two genes for RAMP proteins of the Cas5 (csf3) and Cas7 (csf2) groups, and, in some cases, a gene for a predicted small subunit; such systems are commonly found on endogenous plasmids.

Class II CRISPR-Cas systems generally have single-polypeptide multidomain nuclease effectors, and comprise Types II, V, and VI.

Type II CRISPR-Cas systems are considered the simplest in terms of components. In Type II CRISPR-Cas systems, the processing of the CRISPR array into mature crRNAs does not require the presence of a special endonuclease subunit, but rather a small trans-encoded crRNA (tracrRNA) with a region complementary to the array repeat sequence; the tracrRNA interacts with both its corresponding effector nuclease (e.g., Cas9) and the repeat sequence to form a precursor dsRNA structure, which is cleaved by endogenous RNase III to generate a mature effector enzyme loaded with both tracrRNA and crRNA. Cas II nucleases are known as DNA nucleases. Type II effectors generally exhibit a structure consisting of a RuvC-like endonuclease domain that adopts the RNase H fold, with an unrelated HNH nuclease domain inserted within the folds of the RuvC-like nuclease domain. The RuvC-like domain is responsible for cleavage of the target (e.g., crRNA-complementary) DNA strand, while the HNH domain is responsible for cleavage of the displaced DNA strand.

Type V CRISPR-Cas systems are characterized by a nuclease effector (e.g., Cas12) structure similar to that of Type II effectors, comprising a RuvC-like domain. As with Type II, most (but not all) Type V CRISPR systems use a tracrRNA to process pre-crRNAs into mature crRNAs; however, unlike Type II systems, which require RNase III to cleave the pre-crRNA into multiple crRNAs, Type V systems are capable of using the effector nuclease itself to cleave pre-crRNAs. Like Type II CRISPR-Cas systems, Type V CRISPR-Cas systems are known as DNA nucleases. Unlike Type II CRISPR-Cas systems, some Type V enzymes (e.g., Cas12a) appear to have a robust single-stranded nonspecific deoxyribonuclease activity that is activated by the first crRNA-directed cleavage of a double-stranded target sequence.

Type VI CRISPR-Cas systems have RNA-guided RNA endonucleases. Instead of RuvC-like domains, the single polypeptide effector of Type VI systems (e.g., Cas13) comprises two HEPN ribonuclease domains. Unlike both Type II and Type V systems, Type VI systems also appear not to need a tracrRNA for processing of pre-crRNA into crRNA. Similar to Type V systems, however, some Type VI systems (e.g., C2C2) appear to possess robust single-stranded nonspecific nuclease (ribonuclease) activity that is activated by the first crRNA-directed cleavage of a target RNA.

Because of their simpler architecture, Class II CRISPR-Cas systems have been most widely adopted for engineering and development as designer nuclease/genome editing applications.

One of the early adaptations of such a system for in vitro use can be found in Jinek et al. (Science. 2012 Aug 17; 337(6096): 816-21, which is entirely incorporated herein by reference). The Jinek study first described a system comprising (i) recombinantly expressed, purified full-length Cas9 (e.g., a Class II, Type II Cas enzyme) isolated from S. pyogenes SF370; (ii) purified mature ~42 nt crRNA bearing a ~20 nt 5' sequence complementary to the target DNA sequence desired to be cleaved, followed by a 3' tracr-binding sequence (the whole crRNA being transcribed in vitro from a synthetic DNA template carrying a T7 promoter sequence); (iii) purified tracrRNA transcribed in vitro from a synthetic DNA template carrying a T7 promoter sequence; and (iv) Mg2+. Jinek later described an improved, engineered system in which the crRNA of (ii) is joined to the 5' end of (iii) by a linker (e.g., GAAA) to form a single fused synthetic guide RNA (sgRNA) capable of directing Cas9 to a target by itself (compare the top and bottom panels of FIG. 2).

Mali et al. (Science. 2013 Feb 15; 339(6121): 823-826), which is entirely incorporated herein by reference, later adapted this system for use in mammalian cells by providing DNA vectors encoding (i) an ORF encoding codon-optimized Cas9 (e.g., a Class II, Type II Cas enzyme) under a suitable mammalian promoter, with a C-terminal nuclear localization sequence (e.g., SV40 NLS) and a suitable polyadenylation signal (e.g., TK pA signal); and (ii) an ORF encoding an sgRNA (having a 5' sequence beginning with G, followed by 20 nt of a complementary targeting nucleic acid sequence joined to a 3' tracr-binding sequence, a linker, and the tracrRNA sequence) under a suitable Polymerase III promoter (e.g., the U6 promoter).

MG1 Enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a Class II, Type II Cas endonuclease. The endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 70% sequence identity to any one of SEQ ID NOs: 1827-2140. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1827-2140. In some cases, the endonuclease may comprise a RuvC_III domain substantially identical to any one of SEQ ID NOs: 1827-2140. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 1827-1831. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1827-1831. In some cases, the endonuclease may comprise a RuvC_III domain substantially identical to any one of SEQ ID NOs: 1827-1831. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 1827. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 1828. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 1829. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 1830. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 1831.

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 3638-3955. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 3638-3955. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 3638-3955. The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 3638-3641. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 3638-3641. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 3638-3641. The endonuclease may comprise an HNH domain having at least about 70% identity to SEQ ID NO: 3638. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 3638. The endonuclease may comprise an HNH domain substantially identical to SEQ ID NO: 3638. The endonuclease may comprise an HNH domain having at least about 70% identity to SEQ ID NO: 3639. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 3639. The endonuclease may comprise an HNH domain substantially identical to SEQ ID NO: 3639. The endonuclease may comprise an HNH domain having at least about 70% identity to SEQ ID NO: 3640. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 3640. The endonuclease may comprise an HNH domain substantially identical to SEQ ID NO: 3640. The endonuclease may comprise an HNH domain having at least about 70% identity to SEQ ID NO: 3641. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 3641. The endonuclease may comprise an HNH domain substantially identical to SEQ ID NO: 3641.

In some cases, the endonuclease may comprise a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1-6 or 9-319. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 1-6 or 9-319. In some cases, the endonuclease may comprise a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1-4. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 1-4. In some cases, the endonuclease may comprise a peptide motif substantially identical to any one of SEQ ID NOs: 5615, 5616, or 5617.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or C-terminus of the endonuclease. The NLS may be appended N-terminal or C-terminal to any one of SEQ ID NOs: 1-6 or 9-319, or to a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1-319. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS may comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS may comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS may comprise any of the sequences in Table 1 below, or a combination thereof.

In some cases, the endonuclease may be recombinant (e.g., cloned, expressed, and purified by a suitable method such as expression in E. coli followed by epitope-tag purification). In some cases, the endonuclease may be derived from a bacterium with a 16S rRNA gene having at least about 90% identity to any one of SEQ ID NOs: 5592-5595. The endonuclease may be derived from a species having a 16S rRNA gene that is at least about 80%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identical to any one of SEQ ID NOs: 5592-5595. The endonuclease may be derived from a species having a 16S rRNA gene substantially identical to any one of SEQ ID NOs: 5592-5595. The endonuclease may be derived from a bacterium belonging to the phylum Verrucomicrobia or the phylum Candidatus Peregrinibacteria.

In some cases, sequence identity may be determined by the BLASTP, CLUSTALW, MUSCLE, MAFFT, Novafold, or Smith-Waterman homology search algorithm. Sequence identity may be determined by the BLASTP algorithm using parameters of a wordlength (W) of 3 and an expectation (E) of 10, using a BLOSUM62 scoring matrix with gap costs set at existence of 11 and extension of 1, and using a conditional compositional score matrix adjustment.
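
As a rough illustration of how an identity threshold such as "at least about 70%" might be checked once two sequences have been aligned, the following Python sketch computes percent identity from a pre-computed pairwise alignment. It does not run BLASTP or any of the other listed algorithms. The function names, the dictionary keys, and the choice of denominator (all aligned columns, including gapped ones) are assumptions made for this example, since the text does not specify how identity is normalized.

```python
# Parameter values quoted from the text; the keys are descriptive labels
# chosen for this sketch, not option names of any particular program.
BLASTP_SETTINGS = {
    "wordlength": 3,
    "expectation": 10,
    "matrix": "BLOSUM62",
    "gap_existence": 11,
    "gap_extension": 1,
    "composition_adjustment": "conditional compositional score matrix adjustment",
}


def percent_identity(aligned_a, aligned_b):
    """Percent identity between two rows of a pairwise alignment.

    Gaps are written as '-'. Assumption: the denominator is the number of
    aligned columns (columns where both rows are gaps are ignored).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=70.0):
    """True if the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    print(round(percent_identity("ACDE-G", "ACDQFG"), 2))
    print(meets_threshold("ACDE-G", "ACDQFG", threshold=70.0))
```

Other normalizations (for example, dividing by the length of the shorter sequence) give different percentages for the same alignment, so the convention should be fixed before comparing values against a threshold.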

In some cases, the above-described system may comprise (b) at least one engineered synthetic guide ribonucleic acid (sgRNA) that is capable of forming a complex with the endonuclease and that bears a 5' targeting region complementary to a desired cleavage sequence. In some cases, the 5' targeting region may comprise a PAM sequence compatible with the endonuclease. In some cases, the 5'-most nucleotide of the targeting region may be G. In some cases, the 5' targeting region may be 15-23 nucleotides in length. The guide sequence and the tracr sequence may be supplied as separate ribonucleic acids (RNAs) or as a single ribonucleic acid (RNA). The guide RNA may comprise a crRNA tracrRNA-binding sequence 3' to the targeting region. The guide RNA may comprise a tracrRNA sequence, preceded by a 4-nucleotide linker, 3' to the crRNA tracrRNA-binding region. The sgRNA may comprise, from 5' to 3': a non-natural guide nucleic acid sequence capable of hybridizing to a target sequence in a cell; and a tracr sequence. In some cases, the non-natural guide nucleic acid sequence and the tracr sequence are covalently linked.
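
The 5'-to-3' layout described above (targeting region, crRNA tracrRNA-binding sequence, 4-nucleotide linker, tracrRNA sequence) can be sketched as a simple string assembly. This is a minimal Python illustration only: the function name and arguments are hypothetical, the `GAAA` default is the example linker mentioned for the Jinek system, and the tracr-related sequences used in the demonstration are short placeholders rather than sequences from this disclosure.

```python
RNA_BASES = set("ACGU")


def build_sgrna(targeting, tracr_binding, tracr, linker="GAAA",
                require_5p_g=False):
    """Concatenate sgRNA parts 5'->3': targeting region, tracr-binding
    sequence, linker, tracr sequence. DNA input (T) is converted to RNA (U)."""
    parts = [p.upper().replace("T", "U")
             for p in (targeting, tracr_binding, linker, tracr)]
    for part in parts:
        if not part or set(part) - RNA_BASES:
            raise ValueError("each part must be a non-empty A/C/G/U sequence")
    targeting_rna, _, linker_rna, _ = parts
    # The text gives 15-23 nucleotides for the 5' targeting region.
    if not 15 <= len(targeting_rna) <= 23:
        raise ValueError("targeting region must be 15-23 nucleotides long")
    # The text describes a 4-nucleotide linker.
    if len(linker_rna) != 4:
        raise ValueError("linker must be 4 nucleotides long")
    # Optional check for the case where the 5'-most nucleotide is G.
    if require_5p_g and not targeting_rna.startswith("G"):
        raise ValueError("targeting region must start with G")
    return "".join(parts)


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    guide = build_sgrna("G" + "ACGT" * 4 + "ACG",
                        "GUUUUAGAGCUA", "UAGCAAGUUAAAAU",
                        require_5p_g=True)
    print(len(guide), guide)
```

Keeping the four parts as separate arguments makes it easy to swap the targeting region while reusing the same scaffold, which mirrors how the guide and tracr sequences may be supplied together as a single RNA.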

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of any one of SEQ ID NOs: 5476-5489. In some cases, the tracrRNA may have at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to at least about 60-90 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of any one of SEQ ID NOs: 5476-5489. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of any one of SEQ ID NOs: 5476-5489. The tracrRNA may comprise any of SEQ ID NOs: 5476-5489.

In some cases, the at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to any one of SEQ ID NOs: 5461-5464. The sgRNA may comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 5461-5464. The sgRNA may comprise a sequence substantially identical to any one of SEQ ID NOs: 5461-5464.

In some cases, the system may comprise two different sgRNAs targeting a first region and a second region for cleavage in a target DNA locus, wherein the second site is 3' to the first site. In some cases, the system may comprise a single- or double-stranded DNA repair template comprising, from 5' to 3': a first homology arm comprising a sequence of at least about 20 (e.g., at least about 40, 80, 120, 150, 200, 300, 500, or 1 kb) nucleotides 5' to the first region; a synthetic DNA sequence of at least about 10 nucleotides; and a second homology arm comprising a sequence of at least about 20 (e.g., at least about 40, 80, 120, 150, 200, 300, 500, or 1 kb) nucleotides 3' to the second region.

In another aspect, the present disclosure provides a method for modifying a target nucleic acid locus. The method may comprise delivering to the target nucleic acid locus any of the non-natural systems disclosed herein, including an enzyme and at least one synthetic guide RNA (sgRNA) disclosed herein. The enzyme may form a complex with the at least one sgRNA and, upon binding of the complex to the target nucleic acid locus, may modify the target nucleic acid locus. Delivering the enzyme to the locus may comprise transfecting a cell with the system or with nucleic acids encoding the system. Delivering the nuclease to the locus may comprise electroporating a cell with the system or with nucleic acids encoding the system. Delivering the nuclease to the locus may comprise incubating the system in a buffer with a nucleic acid comprising the locus of interest. In some cases, the target nucleic acid locus comprises deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). The target nucleic acid locus may comprise genomic DNA, viral DNA, viral RNA, or bacterial DNA. The target nucleic acid locus may be within a cell. The target nucleic acid locus may be in vitro. The target nucleic acid locus may be within a eukaryotic cell or a prokaryotic cell. The cell may be an animal cell, a human cell, a bacterial cell, an archaeal cell, or a plant cell. The enzyme may induce a single-stranded or double-stranded break at or proximal to the target locus of interest.

In cases where the target nucleic acid locus may be within a cell, the enzyme may be supplied as a nucleic acid containing an open reading frame encoding an enzyme having a RuvC_III domain with at least about 75% (e.g., at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%) identity to any one of SEQ ID NOs: 1827-2140. The deoxyribonucleic acid (DNA) containing the open reading frame encoding the endonuclease may comprise a sequence substantially identical to any one of SEQ ID NOs: 5572-5575, or to a variant thereof having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 5572-5575. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human beta actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA containing the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as deoxyribonucleic acid (DNA) containing a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be eukaryotic. In some cases, the organism may be fungal. In some cases, the organism may be human.

In some cases, the present disclosure may provide an expression cassette comprising the system disclosed herein, or the nucleic acid described herein. In some cases, the expression cassette or nucleic acid may be supplied as a vector. In some cases, the expression cassette, nucleic acid, or vector may be supplied in a cell. In some cases, the cell is a bacterial cell having a 16S rRNA gene with at least about 90% (e.g., at least about 99%) identity to any one of SEQ ID NOs: 5592-5595.

MG2 Enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a class II, type II Cas endonuclease. The endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 70% sequence identity to any one of SEQ ID NOs: 2141-2241. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 2141-2241. In some cases, the endonuclease may comprise a RuvC_III domain that is substantially identical to any one of SEQ ID NOs: 2141-2142. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 2141-2142. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 2141-2142. In some cases, the endonuclease may comprise a RuvC_III domain substantially identical to any one of SEQ ID NOs: 2141-2142.

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 3955-4055. In some cases, the endonuclease may comprise an HNH domain that is at least about 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to any one of SEQ ID NOs: 3955-4055. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 3955-4055. The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 3955-3956. In some cases, the endonuclease may comprise an HNH domain that is at least about 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to any one of SEQ ID NOs: 3955-3956. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 3955-3956.

In some cases, the endonuclease may comprise a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 320-420. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 320-420. In some cases, the endonuclease may comprise a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 320-321. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 320-321.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or C-terminus of the endonuclease. The NLS may be appended N-terminal or C-terminal to any one of SEQ ID NOs: 320-420, or to a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 320-420. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, 85%, 90%, 95%, or 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.

In some cases, sequence identity may be determined by the BLASTP, CLUSTALW, MUSCLE, MAFFT, Novafold, or Smith-Waterman homology search algorithm. Sequence identity may be determined by the BLASTP algorithm using parameters of a wordlength (W) of 3 and an expectation (E) of 10, a BLOSUM62 scoring matrix with gap costs set at existence of 11 and extension of 1, and conditional compositional score matrix adjustment.
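As an illustration of the BLASTP parameters above, the following minimal sketch assembles a command line for the NCBI BLAST+ `blastp` program. The mapping of the stated parameters onto `blastp` options, the helper name `blastp_identity_command`, and the file names are assumptions made for illustration; the disclosure does not prescribe a particular software invocation.

```python
from typing import List


def blastp_identity_command(query_fasta: str, subject_fasta: str) -> List[str]:
    """Build a blastp argument list mirroring the stated parameters.

    Hypothetical helper: it only constructs the command, it does not run it.
    """
    return [
        "blastp",
        "-query", query_fasta,        # protein query, e.g. a candidate RuvC_III domain
        "-subject", subject_fasta,    # reference sequence(s) to compare against
        "-word_size", "3",            # wordlength (W) of 3
        "-evalue", "10",              # expectation (E) of 10
        "-matrix", "BLOSUM62",        # BLOSUM62 scoring matrix
        "-gapopen", "11",             # gap existence cost of 11
        "-gapextend", "1",            # gap extension cost of 1
        "-comp_based_stats", "2",     # conditional compositional score matrix adjustment
        "-outfmt", "6 qseqid sseqid pident length",  # tabular output with percent identity
    ]


# Example with placeholder file names (assumed, not from the disclosure).
cmd = blastp_identity_command("candidate_domain.faa", "reference_domains.faa")
print(" ".join(cmd))
```

The argument list can be passed to `subprocess.run` on a machine where BLAST+ is installed; the `pident` column of the output then gives the percent identity of each reported alignment.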

In some cases, the system may comprise (b) at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease, the sgRNA having a 5' targeting region complementary to a desired cleavage sequence. In some cases, the 5' targeting region may comprise a PAM sequence compatible with the endonuclease. In some cases, the 5'-most nucleotide of the targeting region may be G. In some cases, the 5' targeting region may be 15-23 nucleotides in length. The guide sequence and the tracr sequence may be supplied as separate ribonucleic acids (RNAs) or as a single ribonucleic acid (RNA). The guide RNA may comprise a crRNA tracrRNA binding sequence 3' to the targeting region. The guide RNA may comprise a tracrRNA sequence preceded by a 4-nucleotide linker 3' to the crRNA tracrRNA binding region. The sgRNA may comprise, from 5' to 3', a non-natural guide nucleic acid sequence capable of hybridizing to a target sequence in a cell, and a tracr sequence. In some cases, the non-natural guide nucleic acid sequence and the tracr sequence are covalently linked.
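The 5'-to-3' arrangement of the single guide RNA described above can be sketched as a simple string assembly. This is a minimal illustration only: the function name `assemble_sgrna` and all sequences used with it are hypothetical placeholders, not sequences from the sequence listing, and the checks encode only the 15-23 nucleotide targeting-region length and the 4-nucleotide linker stated in the text.

```python
def assemble_sgrna(targeting: str, binding: str, linker: str, tracr: str) -> str:
    """Concatenate sgRNA parts in 5'->3' order (hypothetical helper).

    Order: targeting region, crRNA tracrRNA binding sequence,
    4-nucleotide linker, tracr sequence.
    """
    # Normalise to upper-case RNA so DNA-style input (with T) is accepted.
    parts = [p.upper().replace("T", "U") for p in (targeting, binding, linker, tracr)]
    for part in parts:
        if set(part) - set("ACGU"):
            raise ValueError(f"unexpected character in sequence: {part!r}")
    targeting, binding, linker, tracr = parts

    if not 15 <= len(targeting) <= 23:
        raise ValueError("targeting region should be 15-23 nucleotides long")
    if len(linker) != 4:
        raise ValueError("linker should be exactly 4 nucleotides long")
    return targeting + binding + linker + tracr


# Placeholder sequences, assumed for illustration only.
sgrna = assemble_sgrna(
    targeting="GACGTTAGCCATGGATCCAA",  # 20 nt, begins with G
    binding="GUUUUAGAGC",
    linker="GAAA",
    tracr="GCUCUAAAAC",
)
print(sgrna, len(sgrna))
```

Supplying the guide and tracr sequences as separate RNAs would correspond to skipping the linker and keeping two strings rather than one.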

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of any one of SEQ ID NOs: 5490-5494. In some cases, the tracrRNA may have at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to at least about 60-90 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of any one of SEQ ID NOs: 5490-5494. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of any one of SEQ ID NOs: 5490-5494. The tracrRNA may comprise any of SEQ ID NOs: 5490-5494.
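The criterion above, a given percent identity over a run of at least about 60 consecutive nucleotides, can be illustrated with a simplified check. The sketch below is an assumption-laden approximation: it compares ungapped windows only, whereas an alignment tool would also allow gaps, and the function names, window length, and test sequences are hypothetical rather than taken from the disclosure.

```python
def best_window_identity(candidate: str, reference: str, window: int = 60) -> float:
    """Highest ungapped identity between any `window`-long stretch of
    `reference` and any equally long stretch of `candidate`."""
    if len(candidate) < window or len(reference) < window:
        raise ValueError("both sequences must be at least `window` nucleotides long")
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            cand_win = candidate[j:j + window]
            matches = sum(a == b for a, b in zip(ref_win, cand_win))
            best = max(best, matches / window)
    return best


def meets_identity(candidate: str, reference: str,
                   window: int = 60, min_identity: float = 0.80) -> bool:
    """True if some window reaches the minimum identity (e.g. 80%)."""
    return best_window_identity(candidate, reference, window) >= min_identity


# Synthetic placeholder sequences, not real tracrRNAs.
reference = "ACGU" * 20
mutated = list(reference)
for pos in range(0, 60, 10):  # introduce six point substitutions
    mutated[pos] = "C" if mutated[pos] == "A" else "A"
candidate = "".join(mutated)
print(meets_identity(candidate, reference))
```

The brute-force double loop is adequate for sequences of tracrRNA length; for larger inputs a dedicated alignment program would be the more usual choice.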

In some cases, the at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to SEQ ID NO: 5465. The sgRNA may comprise a sequence having at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5465. The sgRNA may comprise a sequence substantially identical to SEQ ID NO: 5465.

In some aspects, the system may comprise two different sgRNAs targeting a first region and a second region for cleavage in a target DNA locus, wherein the second region is 3' to the first region. In some cases, the system may comprise a single- or double-stranded DNA repair template comprising, from 5' to 3': a first homology arm comprising a sequence of at least about 20 (e.g., at least about 40, 80, 120, 150, 200, 300, 500, or 1 kb) nucleotides 5' to the first region; a synthetic DNA sequence of at least about 10 nucleotides; and a second homology arm comprising a sequence of at least about 20 (e.g., at least about 40, 80, 120, 150, 200, 300, 500, or 1 kb) nucleotides 3' to the second region.
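The layout of the repair template described above (first homology arm, synthetic DNA sequence, second homology arm, read 5' to 3') can be sketched as follows. The function `build_repair_template`, its coordinate convention, and the toy locus are assumptions for illustration; only the minimum lengths of about 20 nucleotides per arm and about 10 nucleotides for the synthetic sequence come from the text.

```python
def build_repair_template(locus: str, first_region_start: int, second_region_end: int,
                          synthetic: str, arm_length: int = 40) -> str:
    """Assemble a repair template from a top-strand locus sequence.

    first_region_start: 0-based index of the first nucleotide of the first region.
    second_region_end:  0-based index one past the last nucleotide of the second region.
    """
    if arm_length < 20:
        raise ValueError("homology arms should be at least about 20 nucleotides")
    if len(synthetic) < 10:
        raise ValueError("synthetic sequence should be at least about 10 nucleotides")
    if first_region_start >= second_region_end:
        raise ValueError("the second region must lie 3' of the first region")
    if first_region_start - arm_length < 0 or second_region_end + arm_length > len(locus):
        raise ValueError("locus is too short for the requested arm length")

    # Arm immediately 5' of the first region.
    left_arm = locus[first_region_start - arm_length:first_region_start]
    # Arm immediately 3' of the second region.
    right_arm = locus[second_region_end:second_region_end + arm_length]
    return left_arm + synthetic + right_arm


# Toy locus: 50 nt upstream, a 30 nt span containing both regions, 50 nt downstream.
locus = "A" * 50 + "C" * 30 + "G" * 50
template = build_repair_template(locus, 50, 80, synthetic="T" * 12, arm_length=40)
print(len(template))
```

In this toy case the sequence between the two targeted regions is replaced by the synthetic insert, which is one way such a template might be used with the paired sgRNAs.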

In another aspect, the present disclosure provides a method for modifying a desired target nucleic acid locus. The method may comprise delivering to the target nucleic acid locus any of the non-natural systems disclosed herein, including an enzyme disclosed herein and at least one synthetic guide RNA (sgRNA). The enzyme may form a complex with the at least one sgRNA and, upon binding of the complex to the desired target nucleic acid locus, may modify the desired target nucleic acid locus. Delivering the enzyme to the locus may comprise transfecting a cell with the system or with nucleic acids encoding the system. Delivering the nuclease to the locus may comprise electroporating a cell with the system or with nucleic acids encoding the system. Delivering the nuclease to the locus may comprise incubating the system in a buffer with a nucleic acid comprising the desired locus. In some cases, the target nucleic acid locus comprises deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). The target nucleic acid locus may comprise genomic DNA, viral DNA, viral RNA, or bacterial DNA. The target nucleic acid locus may be within a cell. The target nucleic acid locus may be in vitro. The target nucleic acid locus may be within a eukaryotic cell or a prokaryotic cell. The cell may be an animal cell, a human cell, a bacterial cell, an archaeal cell, or a plant cell. The enzyme may induce a single-stranded or double-stranded break at or near the desired target locus.

In cases where the target nucleic acid locus may be within a cell, the enzyme may be supplied as a nucleic acid containing an open reading frame encoding an enzyme having a RuvC_III domain with at least about 75% (e.g., at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%) identity to any one of SEQ ID NOs: 2141-2241. The deoxyribonucleic acid (DNA) containing the open reading frame encoding the endonuclease may comprise a sequence substantially identical to any one of SEQ ID NOs: 5576-5577, or a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 5576-5577. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human beta actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA containing the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as deoxyribonucleic acid (DNA) containing a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be eukaryotic. In some cases, the organism may be fungal. In some cases, the organism may be human.

MG3 Enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a class II, type II Cas endonuclease. The endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 70% sequence identity to any one of SEQ ID NOs: 2242-2251. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 2242-2251. In some cases, the endonuclease may comprise a RuvC_III domain that is substantially identical to any one of SEQ ID NOs: 2242-2251. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 2242-2244. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 2242-2244. In some cases, the endonuclease may comprise a RuvC_III domain substantially identical to any one of SEQ ID NOs: 2242-2244.

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 4056-4066. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 4056-4066. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 4056-4066. The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 4056-4058. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 4056-4058. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 4056-4058.

In some cases, the endonuclease may comprise a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 421-431. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 421-431. In some cases, the endonuclease may comprise a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 421-423. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 421-423.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or C-terminus of the endonuclease. The NLS may be appended N-terminal or C-terminal to any one of SEQ ID NOs: 421-431, or to a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 421-431. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of any one of SEQ ID NOs: 5495-5502. In some cases, the tracrRNA may have at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to at least about 60-90 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of any one of SEQ ID NOs: 5495-5502. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of any one of SEQ ID NOs: 5495-5502. The tracrRNA may comprise any of SEQ ID NOs: 5495-5502.
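As a non-limiting illustration, identity to a stretch of consecutive nucleotides of a reference tracrRNA can be estimated by sliding a fixed-length window along both sequences. The following is a minimal sketch under stated assumptions: the comparison is ungapped (a simplification, since an alignment-based comparison would also tolerate insertions and deletions), the window length is supplied by the caller (for example, 60), and the function name `best_window_identity` is hypothetical.

```python
def best_window_identity(candidate: str, reference: str, window: int = 60) -> float:
    """Best ungapped percent identity between any `window` consecutive
    nucleotides of `reference` and any same-length stretch of `candidate`.

    Assumption: no gaps are allowed inside the window. Returns 0.0 when
    either sequence is shorter than the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(candidate) < window or len(reference) < window:
        return 0.0
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            cand_win = candidate[j:j + window]
            matches = sum(1 for x, y in zip(ref_win, cand_win) if x == y)
            best = max(best, 100.0 * matches / window)
    return best
```

A candidate would then be compared against a recited threshold, for example `best_window_identity(candidate, reference, 60) >= 80.0`. The exhaustive double loop is adequate for sequences of tracrRNA length but is not intended for long inputs.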

In some cases, at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to any one of SEQ ID NOs: 5466-5467. The sgRNA may comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 5466-5467. The sgRNA may comprise a sequence substantially identical to any one of SEQ ID NOs: 5466-5467.

In some cases, the system may comprise two different sgRNAs targeting a first region and a second region for cleavage in a target DNA locus, wherein the second region is 3' to the first region. In some cases, the system may comprise a single-stranded or double-stranded DNA repair template comprising, from 5' to 3': a first homology arm comprising a sequence of at least about 20 (e.g., at least about 40, 80, 120, 150, 200, 300, 500, or 1 kb) nucleotides 5' to the first region, a synthetic DNA sequence of at least about 10 nucleotides, and a second homology arm comprising a sequence of at least about 20 (e.g., at least about 40, 80, 120, 150, 200, 300, 500, or 1 kb) nucleotides 3' to the second region.
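As a non-limiting illustration, the 5'-to-3' layout of the repair template described above (first homology arm, synthetic DNA sequence, second homology arm) can be assembled from a locus sequence as follows. This is a minimal sketch under stated assumptions: the locus is given as a plain string in 5'-to-3' orientation, coordinates are 0-based with an exclusive end, the hard minimums of 20 and 10 nucleotides stand in for the "at least about" language, and the function name `build_repair_template` and its parameters are hypothetical.

```python
def build_repair_template(locus: str, first_region_start: int,
                          second_region_end: int, insert: str,
                          arm_length: int = 20) -> str:
    """Assemble a repair template: left arm + synthetic insert + right arm.

    locus              : target locus sequence, 5' to 3' (assumed input)
    first_region_start : 0-based index where the first targeted region begins
    second_region_end  : 0-based exclusive index where the second region ends
    insert             : synthetic DNA sequence placed between the arms
    arm_length         : length of each homology arm in nucleotides
    """
    if arm_length < 20:
        raise ValueError("homology arms should be at least about 20 nt")
    if len(insert) < 10:
        raise ValueError("synthetic insert should be at least about 10 nt")
    if first_region_start >= second_region_end:
        raise ValueError("second region must lie 3' of the first region")
    if first_region_start - arm_length < 0:
        raise ValueError("not enough sequence 5' of the first region")
    if second_region_end + arm_length > len(locus):
        raise ValueError("not enough sequence 3' of the second region")
    # Sequence immediately 5' of the first region.
    left_arm = locus[first_region_start - arm_length:first_region_start]
    # Sequence immediately 3' of the second region.
    right_arm = locus[second_region_end:second_region_end + arm_length]
    return left_arm + insert + right_arm
```

For a single-stranded template the returned string would be used directly; a double-stranded template would additionally require its reverse complement, which this sketch does not generate.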

Where the target nucleic acid locus may be within a cell, the enzyme may be supplied as a nucleic acid comprising an open reading frame encoding an enzyme having a RuvC_III domain having at least about 75% (e.g., at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%) identity to any one of SEQ ID NOs: 2242-2251. The deoxyribonucleic acid (DNA) comprising the open reading frame encoding the endonuclease may comprise a sequence substantially identical to any one of SEQ ID NOs: 5578-5580, or a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 5578-5580. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human β-actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA comprising the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as deoxyribonucleic acid (DNA) comprising a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be a eukaryote. In some cases, the organism may be a fungus. In some cases, the organism may be a human.

MG4 Enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a type II, class II Cas endonuclease. The endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 70% sequence identity to any one of SEQ ID NOs: 2253-2481. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 2253-2481. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain is substantially identical to any one of SEQ ID NOs: 2253-2481. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 2253-2481. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 2253-2481. In some cases, the endonuclease may comprise a RuvC_III domain substantially identical to any one of SEQ ID NOs: 2253-2481.

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 4067-4295. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 4067-4295. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 4067-4295.

In some cases, the endonuclease may comprise a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 432-660. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 432-660.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or C-terminus of the endonuclease. The NLS may be appended N-terminal or C-terminal to any one of SEQ ID NOs: 432-660, or to a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 432-660. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5503. In some cases, the tracrRNA may have at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to at least about 60-90 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5503. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5503. The tracrRNA may comprise SEQ ID NO: 5503.

In some cases, at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to SEQ ID NO: 5468. The sgRNA may comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 5468. The sgRNA may comprise a sequence substantially identical to SEQ ID NO: 5468.

Where the target nucleic acid locus may be within a cell, the enzyme may be supplied as a nucleic acid comprising an open reading frame encoding an enzyme having a RuvC_III domain having at least about 75% (e.g., at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%) identity to any one of SEQ ID NOs: 2253-2481. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human β-actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA comprising the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as deoxyribonucleic acid (DNA) comprising a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be a eukaryote. In some cases, the organism may be a fungus. In some cases, the organism may be a human.

MG6 Enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a type II, class II Cas endonuclease. The endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 70% sequence identity to any one of SEQ ID NOs: 2482-2489. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 2482-2489. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain is substantially identical to any one of SEQ ID NOs: 2482-2489.

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 4296-4303. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 4296-4303. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 4056-4066.

In some cases, the endonuclease may comprise a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 661-668. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 661-668.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or C-terminus of the endonuclease. The NLS may be appended N-terminal or C-terminal to any one of SEQ ID NOs: 661-668, or to a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 661-668. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of a natural tracrRNA sequence.

In some cases, the system may comprise two different guide RNAs targeting a first region and a second region for cleavage in a target DNA locus, wherein the second region is 3' to the first region. In some cases, the system may comprise a single-stranded or double-stranded DNA repair template comprising, from 5' to 3': a first homology arm comprising a sequence of at least about 20 (e.g., at least about 40, 80, 120, 150, 200, 300, 500, or 1 kb) nucleotides 5' to the first region, a synthetic DNA sequence of at least about 10 nucleotides, and a second homology arm comprising a sequence of at least about 20 (e.g., at least about 40, 80, 120, 150, 200, 300, 500, or 1 kb) nucleotides 3' to the second region.

Where the target nucleic acid locus may be within a cell, the enzyme may be supplied as a nucleic acid comprising an open reading frame encoding an enzyme having a RuvC_III domain having at least about 75% (e.g., at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%) identity to any one of SEQ ID NOs: 2482-2489. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human β-actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA comprising the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as deoxyribonucleic acid (DNA) comprising a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be a eukaryote. In some cases, the organism may be a fungus. In some cases, the organism may be a human.

MG7 Enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a class II, type II Cas endonuclease. The endonuclease may comprise a RuvC_III domain, wherein said RuvC_III domain has at least about 70% sequence identity to any one of SEQ ID NOs: 2490-2498. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 2490-2498. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain is substantially identical to any one of SEQ ID NOs: 2490-2498. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 2490-2498. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 2490-2498. In some cases, the endonuclease may comprise a RuvC_III domain that is substantially identical to any one of SEQ ID NOs: 2490-2498.
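By way of non-limiting illustration, a percent-identity threshold of the kind recited above could be checked as in the following Python sketch. The sketch assumes the two sequences have already been aligned to equal length by an external alignment tool, counts identity as matching columns divided by alignment columns, and uses hypothetical toy sequences; the function names `percent_identity` and `meets_threshold` are illustrative and are not part of the disclosure or the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: '-' marks a gap. Gaps never count as matches, and
    columns in which both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_query: str, aligned_ref: str,
                    threshold: float = 70.0) -> bool:
    """True if the query reaches the given percent identity to the reference."""
    return percent_identity(aligned_query, aligned_ref) >= threshold


# Hypothetical toy sequences (NOT taken from the sequence listing).
reference = "MKRVLGLDLG"
candidate = "MKRILGLDIG"  # differs from the reference at two of ten positions

print(percent_identity(reference, candidate))       # 80.0
print(meets_threshold(reference, candidate, 70.0))  # True
print(meets_threshold(reference, candidate, 90.0))  # False
```

In practice, the identity value depends on the alignment algorithm and its parameters, so a threshold check of this kind is only meaningful once the alignment method has been fixed.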

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 4304-4312. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 4304-4312. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 4304-4312.

In some cases, the endonuclease may comprise a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 669-677. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 669-677.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or C-terminus of said endonuclease. The NLS may be appended at the N-terminus or C-terminus of any one of SEQ ID NOs: 669-677, or of a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 669-677. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.
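As a hedged illustration of appending an NLS at either terminus, the following Python sketch concatenates an NLS peptide with a protein sequence at the amino-acid level. The two NLS strings are the commonly cited canonical SV40 large T antigen and c-myc NLS peptides; whether they correspond to particular entries of SEQ ID NOs: 5593-5608 or Table 1 is an assumption and is not confirmed here. The helper `add_nls`, the optional linker, and the toy protein sequence are hypothetical.

```python
# Commonly cited canonical NLS peptides (assumed here; not verified
# against SEQ ID NOs: 5593-5608 or Table 1 of the disclosure).
SV40_NLS = "PKKKRKV"     # SV40 large T antigen NLS
CMYC_NLS = "PAAKRVKLD"   # c-myc NLS


def add_nls(protein: str, nls: str, terminus: str = "N", linker: str = "") -> str:
    """Return the protein with an NLS fused at the N- or C-terminus.

    Assumption: simple concatenation at the amino-acid level. A real
    expression construct would also need an upstream start codon when
    the NLS precedes the protein's own initiator methionine.
    """
    if terminus == "N":
        return nls + linker + protein
    if terminus == "C":
        return protein + linker + nls
    raise ValueError("terminus must be 'N' or 'C'")


# Hypothetical toy protein sequence (NOT taken from the sequence listing).
toy_endonuclease = "MKRVLGLDLG"

n_fusion = add_nls(toy_endonuclease, SV40_NLS, terminus="N")
c_fusion = add_nls(toy_endonuclease, CMYC_NLS, terminus="C", linker="GS")
print(n_fusion)  # PKKKRKVMKRVLGLDLG
print(c_fusion)  # MKRVLGLDLGGSPAAKRVKLD
```

The same helper can be applied twice to obtain a variant carrying an NLS at both termini, which is one way to realize "one or more" NLSs.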

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5504. In some cases, the tracrRNA may have at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to at least about 60-90 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5504. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5504. The tracrRNA may comprise SEQ ID NO: 5504.
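One possible reading of "identity to a stretch of consecutive nucleotides" is sketched below in Python, purely for illustration. The sketch slides a window of a chosen length over the reference, compares it without gaps against every equally long window of the candidate, and reports the best identity found. The ungapped comparison is a simplifying assumption (alignment-based identity may give different values), and the function names and toy sequences are hypothetical.

```python
def ungapped_identity(a: str, b: str) -> float:
    """Percent of positions that match between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def best_window_identity(candidate: str, reference: str, window: int) -> float:
    """Best ungapped identity between any `window` consecutive nucleotides
    of the reference and any equally long stretch of the candidate."""
    if window <= 0 or window > len(reference) or window > len(candidate):
        raise ValueError("window must fit inside both sequences")
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            best = max(best, ungapped_identity(candidate[j:j + window], ref_win))
    return best


# Hypothetical toy sequences (NOT SEQ ID NO: 5504); a real check would use
# a window of roughly 60-100 nucleotides rather than 5 or 10.
toy_reference = "AAAAAAAAAA"
toy_candidate = "AAAAACAAAA"

print(best_window_identity(toy_candidate, toy_reference, window=10))  # 90.0
print(best_window_identity(toy_candidate, toy_reference, window=5))   # 100.0
```

The nested loops are quadratic in sequence length, which is adequate for RNAs of around a hundred nucleotides.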

In cases where the target nucleic acid locus may be within a cell, the enzyme may be supplied as a nucleic acid containing an open reading frame encoding an enzyme having a RuvC_III domain with at least about 75% (e.g., at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%) identity to any one of SEQ ID NOs: 2490-2498. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human β-actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA containing said open reading frame encoding said endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as a deoxyribonucleic acid (DNA) containing a gene sequence encoding said at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be eukaryotic. In some cases, the organism may be fungal. In some cases, the organism may be human.

MG14 enzyme

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a class II, type II Cas endonuclease. The endonuclease may comprise a RuvC_III domain, wherein said RuvC_III domain has at least about 70% sequence identity to any one of SEQ ID NOs: 2499-2750. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 2499-2750. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain is substantially identical to any one of SEQ ID NOs: 2499-2750. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 2499-2750. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 2499-2750. In some cases, the endonuclease may comprise a RuvC_III domain that is substantially identical to any one of SEQ ID NOs: 2499-2750.

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 4313-4564. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 4313-4564. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 4313-4564. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 4067-4295.

In some cases, the endonuclease may comprise a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 678-929. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 678-929.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or C-terminus of said endonuclease. The NLS may be appended at the N-terminus or C-terminus of any one of SEQ ID NOs: 678-929, or of a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 678-929. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5505. In some cases, the tracrRNA may have at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to at least about 60-90 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5505. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5505. The tracrRNA may comprise SEQ ID NO: 5505.

In some cases, at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to SEQ ID NO: 5469. The sgRNA may comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 5469. The sgRNA may comprise a sequence substantially identical to SEQ ID NO: 5469.

In cases where the target nucleic acid locus may be within a cell, the enzyme may be supplied as a nucleic acid containing an open reading frame encoding an enzyme having a RuvC_III domain with at least about 75% (e.g., at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%) identity to any one of SEQ ID NOs: 2499-2750. The deoxyribonucleic acid (DNA) containing the open reading frame encoding said endonuclease may comprise a sequence substantially identical to SEQ ID NO: 5581, or a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 5581. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human β-actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA containing said open reading frame encoding said endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as a deoxyribonucleic acid (DNA) containing a gene sequence encoding said at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be eukaryotic. In some cases, the organism may be fungal. In some cases, the organism may be human.

MG15 enzyme

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a class II, type II Cas endonuclease. The endonuclease may comprise a RuvC_III domain, wherein said RuvC_III domain has at least about 70% sequence identity to any one of SEQ ID NOs: 2751-2913. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 2751-2913. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain is substantially identical to any one of SEQ ID NOs: 2751-2913. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 2751-2913. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 2751-2913. In some cases, the endonuclease may comprise a RuvC_III domain that is substantially identical to any one of SEQ ID NOs: 2751-2913.

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 4565-4727. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 4565-4727. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 4565-4727.

In some cases, the endonuclease may comprise a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 930-1092. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 930-1092.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or the C-terminus of the endonuclease. The NLS may be appended at the N-terminus or C-terminus of any one of SEQ ID NOs: 930-1092, or at the N-terminus or C-terminus of a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 930-1092. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.
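As a non-limiting sketch of appending an NLS at either terminus, the following Python builds a fusion protein sequence as a plain string. The two peptide strings are the commonly cited SV40 large T antigen and c-myc NLS sequences; they are assumed here for illustration and are not copied from Table 1 or from SEQ ID NOs: 5593-5608. The function name `add_nls` and the example protein are hypothetical.

```python
# Commonly cited NLS peptides (assumption: used for illustration only,
# not taken from Table 1 of this disclosure).
SV40_NLS = "PKKKRKV"
CMYC_NLS = "PAAKRVKLD"


def add_nls(protein: str, nls: str, terminus: str = "N") -> str:
    """Return the protein sequence with an NLS appended at one terminus.

    terminus: "N" places the NLS before the protein, "C" places it after.
    No linker and no start-codon handling is modeled in this sketch.
    """
    if terminus == "N":
        return nls + protein
    if terminus == "C":
        return protein + nls
    raise ValueError("terminus must be 'N' or 'C'")


# Hypothetical usage: one NLS at each end of a placeholder sequence.
tagged = add_nls(add_nls("MKRTLV", SV40_NLS, "N"), CMYC_NLS, "C")
```

Calling the function twice, as in the last line, models a variant carrying more than one NLS. A real construct would usually also account for the initiator methionine and any linker residues, which this sketch leaves out.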

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5506. In some cases, the tracrRNA may have at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to at least about 60-90 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5506. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5506. The tracrRNA may comprise SEQ ID NO: 5506.
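The "identity over consecutive nucleotides" language above can be illustrated, in a non-limiting way, by scanning fixed-length windows. The Python sketch below assumes an ungapped comparison, which is a simplification; an alignment tool that permits gaps could report a different value. The function name `best_window_identity` and its parameters are hypothetical.

```python
def best_window_identity(candidate: str, reference: str, window: int = 60) -> float:
    """Best ungapped percent identity between any `window`-nt stretch of the
    candidate and any `window`-nt stretch of the reference.

    Returns 0.0 when either sequence is shorter than the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(candidate) < window or len(reference) < window:
        return 0.0
    best_matches = 0
    for i in range(len(reference) - window + 1):
        ref_window = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            cand_window = candidate[j:j + window]
            # Count positions where the two windows carry the same nucleotide.
            matches = sum(1 for a, b in zip(ref_window, cand_window) if a == b)
            best_matches = max(best_matches, matches)
    return 100.0 * best_matches / window


def meets_tracr_threshold(candidate: str, reference: str,
                          window: int = 60, min_identity: float = 80.0) -> bool:
    """True if some window reaches the minimum identity (hard cutoff assumed)."""
    return best_window_identity(candidate, reference, window) >= min_identity
```

With `window=60` and `min_identity=80.0`, the check corresponds to the broadest recitation above; the narrower recitations would use a longer window or a higher cutoff. The reference string would be the RNA sequence of the relevant SEQ ID NO, which is not reproduced here.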

In some cases, the at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to SEQ ID NO: 5470. The sgRNA may comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 5470. The sgRNA may comprise a sequence substantially identical to SEQ ID NO: 5470.

In cases where the target nucleic acid locus may be within a cell, the enzyme may be supplied as a nucleic acid comprising an open reading frame encoding an enzyme having a RuvC_III domain with at least about 75% (e.g., at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%) identity to any one of SEQ ID NOs: 2751-2913. The deoxyribonucleic acid (DNA) comprising the open reading frame encoding the endonuclease may comprise a sequence substantially identical to SEQ ID NO: 5582, or a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 5582. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human β-actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA comprising the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as deoxyribonucleic acid (DNA) comprising a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be a eukaryote. In some cases, the organism may be a fungus. In some cases, the organism may be human.
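The delivery options recited above (DNA with an optional promoter, capped mRNA, or translated polypeptide, with the sgRNA optionally encoded as DNA under an RNA pol III promoter) can be summarized, purely as a non-limiting sketch, in a small Python record. The class name, field names, and validation rules are hypothetical. Restricting the promoter to the listed names is an assumption of this sketch; the text above says only that the promoter "may be" one of them.

```python
from dataclasses import dataclass
from typing import Optional

# Promoter names listed in the paragraph above.
LISTED_PROMOTERS = frozenset({
    "CMV", "EF1a", "SV40", "PGK1", "Ubc",
    "human beta-actin", "CAG", "TRE", "CaMKIIa",
})
# The three forms in which the endonuclease may be supplied.
ENDONUCLEASE_FORMATS = frozenset({"dna", "capped_mrna", "polypeptide"})


@dataclass(frozen=True)
class DeliveryPlan:
    endonuclease_format: str
    promoter: Optional[str] = None       # meaningful only for the DNA format
    sgrna_encoded_as_dna: bool = False   # DNA with an RNA pol III promoter

    def __post_init__(self) -> None:
        if self.endonuclease_format not in ENDONUCLEASE_FORMATS:
            raise ValueError(f"unknown format: {self.endonuclease_format!r}")
        if self.promoter is not None:
            if self.endonuclease_format != "dna":
                raise ValueError("a promoter applies only to the DNA format")
            if self.promoter not in LISTED_PROMOTERS:
                # Assumption of this sketch: accept only the listed promoters.
                raise ValueError(f"promoter not in listed set: {self.promoter!r}")


# Hypothetical usage: DNA delivery under a CMV promoter, sgRNA encoded as DNA.
plan = DeliveryPlan("dna", promoter="CMV", sgrna_encoded_as_dna=True)
```

The record only captures which combination of formats is chosen. It does not model the sequences themselves or any delivery vehicle.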

MG16 Enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a type II, class II Cas endonuclease. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 2914-3174. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 2914-3174. In some cases, the endonuclease may comprise a RuvC_III domain substantially identical to any one of SEQ ID NOs: 2914-3174.

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 4728-4988. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 4728-4988. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 4728-4988.

In some cases, the endonuclease may comprise a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1093-1353. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 1093-1353.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or the C-terminus of the endonuclease. The NLS may be appended N-terminal or C-terminal to any one of SEQ ID NOs: 1093-1353, or to a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1093-1353. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5507. In some cases, the tracrRNA may have at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to at least about 60-90 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5507. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5507. The tracrRNA may comprise SEQ ID NO: 5507.

In some cases, the at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to SEQ ID NO: 5471. The sgRNA may comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 5471. The sgRNA may comprise a sequence substantially identical to SEQ ID NO: 5471.

In cases where the target nucleic acid locus may be within a cell, the enzyme may be supplied as a nucleic acid comprising an open reading frame encoding an enzyme having a RuvC_III domain with at least about 75% (e.g., at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%) identity to any one of SEQ ID NOs: 2914-3174. The deoxyribonucleic acid (DNA) comprising the open reading frame encoding the endonuclease may comprise a sequence substantially identical to SEQ ID NO: 5583, or a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 5583. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human β-actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA comprising the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as deoxyribonucleic acid (DNA) comprising a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be a eukaryote. In some cases, the organism may be a fungus. In some cases, the organism may be human.
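For orientation only, the SEQ ID NO assignments recited above for the MG16 family can be collected in a small lookup. The Python sketch below uses only the numbers stated in this section; the dictionary and function names are hypothetical, and the sequences themselves are not reproduced.

```python
from typing import Optional

# Inclusive SEQ ID NO ranges recited above for MG16 (range end is +1 in Python).
MG16_SEQ_ID_RANGES = {
    "full-length endonuclease": range(1093, 1353 + 1),
    "RuvC_III domain": range(2914, 3174 + 1),
    "HNH domain": range(4728, 4988 + 1),
}

# Single SEQ ID NOs recited above for MG16.
MG16_SINGLE_SEQ_IDS = {
    5507: "tracrRNA",
    5471: "sgRNA",
    5583: "endonuclease-encoding DNA",
}


def classify_mg16_seq_id(seq_id: int) -> Optional[str]:
    """Return the MG16 sequence category for a SEQ ID NO, or None if unlisted."""
    for label, id_range in MG16_SEQ_ID_RANGES.items():
        if seq_id in id_range:
            return label
    return MG16_SINGLE_SEQ_IDS.get(seq_id)
```

Each of the three ranges spans 261 consecutive SEQ ID NOs, which is consistent with one RuvC_III domain and one HNH domain being listed per full-length endonuclease. That pairing is an inference from the counts rather than something stated in this passage.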

MG18 Enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a type II, class II Cas endonuclease. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 3175-3300. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 3175-3300. In some cases, the endonuclease may comprise a RuvC_III domain substantially identical to any one of SEQ ID NOs: 3175-3300.

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 4989-5146. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 4989-5146. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 4989-5146.

In some cases, the endonuclease may comprise a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1354-1511. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 1354-1511.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or the C-terminus of the endonuclease. The NLS may be appended at the N-terminus or C-terminus of any one of SEQ ID NOs: 1354-1511, or at the N-terminus or C-terminus of a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1354-1511. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5508. In some cases, the tracrRNA may have at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to at least about 60-90 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5508. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5508. The tracrRNA may comprise SEQ ID NO: 5508.

In some cases, the at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to SEQ ID NO: 5472. The sgRNA may comprise a sequence having at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5472. The sgRNA may comprise a sequence substantially identical to SEQ ID NO: 5472.

In cases where the target nucleic acid locus is within a cell, the enzyme may be supplied as a nucleic acid containing an open reading frame encoding an enzyme having a RuvC_III domain with at least about 75% (e.g., at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%) identity to any one of SEQ ID NOs: 3175-3300. The deoxyribonucleic acid (DNA) containing the open reading frame encoding the endonuclease may comprise a sequence substantially identical to SEQ ID NO: 5584, or a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5584. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human β-actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA containing the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as DNA containing a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be eukaryotic. In some cases, the organism may be fungal. In some cases, the organism may be human.

MG21 Enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a class 2, type II Cas endonuclease. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 3331-3474. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 3331-3474. In some cases, the endonuclease may comprise a RuvC_III domain substantially identical to any one of SEQ ID NOs: 3331-3474.
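The identity thresholds recited above can be made concrete with a small calculation. The following is a minimal Python sketch, not the disclosure's own method: it assumes the two sequences have already been aligned by a separate tool, and it assumes one common convention (matches divided by alignment columns, with gap columns counted as mismatches). The function names `percent_identity` and `meets_threshold` are hypothetical and introduced only for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumptions (not taken from the disclosure):
    - both inputs are already aligned, equal in length, with '-' for gaps;
    - identity = matching columns / all columns, gaps counted as mismatches;
    - columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_query: str, aligned_reference: str,
                    threshold_percent: float = 70.0) -> bool:
    """True if the aligned pair reaches the given percent-identity cutoff."""
    return percent_identity(aligned_query, aligned_reference) >= threshold_percent
```

Under this convention, `meets_threshold(query, reference, 70.0)` would express an "at least about 70% sequence identity" condition for a candidate RuvC_III domain. Other conventions (for example, dividing by the length of the shorter sequence) give different numbers, so the choice should be stated alongside any reported value.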

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 5147-5290. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 5147-5290. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 5147-5290.

In some cases, the endonuclease may comprise a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 1512-1655. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 1512-1655.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or C-terminus of the endonuclease. The NLS may be appended N-terminal or C-terminal to any one of SEQ ID NOs: 1512-1655, or to a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 1512-1655. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, 85%, 90%, 95%, or 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.
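The N-terminal or C-terminal NLS placement described above can be pictured as a simple operation on amino acid sequences. This is an illustrative Python sketch under stated assumptions: `add_nls` and the placeholder protein are hypothetical, the default NLS is the commonly cited SV40 large T antigen motif PKKKRKV rather than a specific entry of Table 1, and practical details such as linkers, initiator methionine handling, and codon usage are left out.

```python
# Commonly cited SV40 large T antigen NLS motif (assumption: used here only
# as a familiar example, not as a particular SEQ ID NO of the disclosure).
SV40_NLS = "PKKKRKV"


def add_nls(protein: str, nls: str = SV40_NLS,
            n_terminal: bool = True, c_terminal: bool = False,
            linker: str = "") -> str:
    """Return the protein sequence with an NLS appended at either or both ends.

    `linker` is an optional spacer placed between the NLS and the protein.
    """
    if not protein:
        raise ValueError("protein sequence must not be empty")
    fused = protein
    if n_terminal:
        fused = nls + linker + fused
    if c_terminal:
        fused = fused + linker + nls
    return fused


# Hypothetical placeholder sequence, not an enzyme from the disclosure.
placeholder_endonuclease = "MKTAYIAKQR"
both_ends = add_nls(placeholder_endonuclease, n_terminal=True, c_terminal=True)
```

Passing a different `nls` string (for example, a c-myc NLS) or setting only `c_terminal=True` covers the other placements mentioned in the paragraph.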

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of SEQ ID NO: 5509. In some cases, the tracrRNA may have at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to at least about 60-90 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of SEQ ID NO: 5509. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of SEQ ID NO: 5509. The tracrRNA may comprise SEQ ID NO: 5509.
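The tracr condition above combines two numbers: a run of consecutive nucleotides (for example, 60) and an identity cutoff over that run (for example, 80%). One way to read it is sketched below in Python. This is an assumption-laden illustration, not the disclosure's procedure: `best_window_identity` is a hypothetical helper, the comparison is ungapped, and every window of the candidate is compared against every window of the reference by brute force.

```python
def best_window_identity(candidate: str, reference: str, window: int = 60) -> float:
    """Highest ungapped percent identity between any `window`-nucleotide
    stretch of `candidate` and any `window`-nucleotide stretch of `reference`.

    Brute-force comparison; adequate for sequences the length of a tracrRNA.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(candidate) < window or len(reference) < window:
        raise ValueError("both sequences must be at least `window` long")
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            cand_win = candidate[j:j + window]
            matches = sum(a == b for a, b in zip(ref_win, cand_win))
            best = max(best, 100.0 * matches / window)
    return best
```

With real sequences, a call such as `best_window_identity(candidate, reference, window=60) >= 80.0` would correspond to "at least about 80% identity to at least about 60 consecutive nucleotides". An alignment-based tool that permits gaps may report a different value for the same pair.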

In some cases, the at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to SEQ ID NO: 5473. The sgRNA may comprise a sequence having at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5473. The sgRNA may comprise a sequence substantially identical to SEQ ID NO: 5473.

In cases where the target nucleic acid locus is within a cell, the enzyme may be supplied as a nucleic acid containing an open reading frame encoding an enzyme having a RuvC_III domain with at least about 75% (e.g., at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%) identity to any one of SEQ ID NOs: 3331-3474. The deoxyribonucleic acid (DNA) containing the open reading frame encoding the endonuclease may comprise a sequence substantially identical to SEQ ID NO: 5585, or a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5585. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human β-actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA containing the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as DNA containing a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be eukaryotic. In some cases, the organism may be fungal. In some cases, the organism may be human.

MG22 Enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a class 2, type II Cas endonuclease. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 3475-3568. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 3475-3568. In some cases, the endonuclease may comprise a RuvC_III domain substantially identical to any one of SEQ ID NOs: 3475-3568.

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 5291-5389. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 5291-5389. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 5291-5389.

In some cases, the endonuclease may comprise a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 1656-1755. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 1656-1755.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or C-terminus of the endonuclease. The NLS may be appended N-terminal or C-terminal to any one of SEQ ID NOs: 432-660, or to a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any one of SEQ ID NOs: 1656-1755. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, 85%, 90%, 95%, or 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% identity to at least about 60-100 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of a natural tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of SEQ ID NO: 5510. In some cases, the tracrRNA may have at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to at least about 60-90 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of SEQ ID NO: 5510. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, 65, 70, 75, 80, 85, or 90) consecutive nucleotides of SEQ ID NO: 5510. The tracrRNA may comprise SEQ ID NO: 5510.

In some cases, the at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to SEQ ID NO: 5474. The sgRNA may comprise a sequence having at least about 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5474. The sgRNA may comprise a sequence substantially identical to SEQ ID NO: 5474.

In cases where the target nucleic acid locus is within a cell, the enzyme may be supplied as a nucleic acid containing an open reading frame encoding an enzyme having a RuvC_III domain with at least about 75% (e.g., at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%) identity to any one of SEQ ID NOs: 3475-3568. The deoxyribonucleic acid (DNA) containing the open reading frame encoding the endonuclease may comprise a sequence substantially identical to SEQ ID NO: 5586, or a variant having at least about 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 5586. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human β-actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA containing the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as DNA containing a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be eukaryotic. In some cases, the organism may be fungal. In some cases, the organism may be human.

MG23 enzymes

In one aspect, the present disclosure provides an engineered nuclease system comprising (a) an endonuclease. In some cases, the endonuclease is a Cas endonuclease. In some cases, the endonuclease is a class II, type II Cas endonuclease. The endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 70% sequence identity to any one of SEQ ID NOs: 3569-3637. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 3569-3637. In some cases, the endonuclease may comprise a RuvC_III domain, wherein the RuvC_III domain is substantially identical to any one of SEQ ID NOs: 3569-3637. The endonuclease may comprise a RuvC_III domain having at least about 70% sequence identity to any one of SEQ ID NOs: 3569-3637. In some cases, the endonuclease may comprise a RuvC_III domain having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 3569-3637. In some cases, the endonuclease may comprise a RuvC_III domain substantially identical to any one of SEQ ID NOs: 3569-3637.
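The identity thresholds recited above compare a candidate domain against a reference sequence. The following is a minimal sketch of how such a percentage could be computed, assuming the two sequences have already been aligned (gaps written as `-`). The function names and the convention of dividing identical columns by the number of non-empty alignment columns are illustrative assumptions; the disclosure may rely on a specific alignment tool and parameters defined elsewhere.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption (not from the source): columns where both sequences carry a
    gap are ignored, and a column counts as identical only when both
    residues match and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # empty column, contributes nothing
        columns += 1
        if a == b and a != "-":
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_threshold(aligned_query: str, aligned_reference: str,
                    threshold_pct: float = 70.0) -> bool:
    """True when the query reaches the stated identity threshold."""
    return percent_identity(aligned_query, aligned_reference) >= threshold_pct
```

A candidate RuvC_III domain would be aligned to each reference sequence first; `meets_threshold` then reports whether a given cutoff (for example 70%) is reached.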

The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 5390-5640. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 5390-5460. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 5390-5460. The endonuclease may comprise an HNH domain having at least about 70% identity to any one of SEQ ID NOs: 5390-5460. In some cases, the endonuclease may comprise an HNH domain having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 5390-5460. The endonuclease may comprise an HNH domain substantially identical to any one of SEQ ID NOs: 5390-5460.

In some cases, the endonuclease may comprise a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1756-1826. In some cases, the endonuclease may be substantially identical to any one of SEQ ID NOs: 1756-1826.

In some cases, the endonuclease may comprise a variant having one or more nuclear localization sequences (NLSs). The NLS may be proximal to the N-terminus or C-terminus of the endonuclease. The NLS may be appended at the N-terminus or C-terminus of any one of SEQ ID NOs: 1756-1826, or at the N-terminus or C-terminus of a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1756-1826. The NLS may be an SV40 large T antigen NLS. The NLS may be a c-myc NLS. The NLS can comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% identity to any one of SEQ ID NOs: 5593-5608. The NLS can comprise a sequence substantially identical to any one of SEQ ID NOs: 5593-5608. The NLS can comprise any of the sequences in Table 1, or a combination thereof.

In some cases, the tracr sequence may have a particular sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of a native tracrRNA sequence. The tracr sequence may have at least about 80% sequence identity to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5511. In some cases, the tracrRNA may have at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to at least about 60-90 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5511. In some cases, the tracrRNA may be substantially identical to at least about 60-100 (e.g., at least about 60, at least about 65, at least about 70, at least about 75, at least about 80, at least about 85, or at least about 90) consecutive nucleotides of SEQ ID NO: 5511. The tracrRNA may comprise SEQ ID NO: 5511.
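The tracr criterion above is stated over a stretch of consecutive nucleotides rather than over the whole sequence. One way to illustrate this is a sliding-window comparison: every window of the chosen length in the candidate is compared, without gaps, against every window of the same length in the reference, and the best score is kept. This is only a sketch under assumptions: the ungapped comparison and the function names are illustrative, and the sequences in any usage would be placeholders rather than SEQ ID NO: 5511.

```python
def ungapped_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("windows must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def best_window_identity(candidate: str, reference: str, window: int) -> float:
    """Best ungapped identity between any `window`-long stretch of the
    candidate and any `window`-long stretch of the reference.

    Assumption: gaps are not modelled; a real comparison would normally
    use a local alignment tool instead.
    """
    if window <= 0 or window > min(len(candidate), len(reference)):
        raise ValueError("window must fit inside both sequences")
    candidate = candidate.upper()
    reference = reference.upper()
    best = 0.0
    for i in range(len(candidate) - window + 1):
        cand_win = candidate[i:i + window]
        for j in range(len(reference) - window + 1):
            score = ungapped_identity(cand_win, reference[j:j + window])
            if score > best:
                best = score
    return best
```

With `window=60`, a result of at least 80.0 would correspond to the "at least about 80% identity to at least about 60 consecutive nucleotides" condition under these simplifying assumptions.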

In some cases, the at least one engineered synthetic guide ribonucleic acid (sgRNA) capable of forming a complex with the endonuclease may comprise a sequence having at least about 80% identity to SEQ ID NO: 5475. The sgRNA may comprise a sequence having at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 5475. The sgRNA may comprise a sequence substantially identical to SEQ ID NO: 5475.

In cases where the target nucleic acid locus can be within a cell, the enzyme may be supplied as a nucleic acid containing an open reading frame encoding an enzyme having a RuvC_III domain with at least about 75% (e.g., at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%) identity to any one of SEQ ID NOs: 3569-3637. The deoxyribonucleic acid (DNA) containing the open reading frame encoding the endonuclease may comprise a sequence substantially identical to SEQ ID NO: 5587, or a variant having at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to SEQ ID NO: 5587. In some cases, the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked. The promoter may be a CMV, EF1a, SV40, PGK1, Ubc, human beta actin, CAG, TRE, or CaMKIIa promoter. The endonuclease may be supplied as a capped mRNA containing the open reading frame encoding the endonuclease. The endonuclease may be supplied as a translated polypeptide. The at least one engineered sgRNA may be supplied as a deoxyribonucleic acid (DNA) containing a gene sequence encoding the at least one engineered sgRNA operably linked to a ribonucleic acid (RNA) pol III promoter. In some cases, the organism may be eukaryotic. In some cases, the organism may be fungal. In some cases, the organism may be human.

Example 1. - Metagenomic analysis for novel proteins
Metagenomic samples were collected from sediment, soil, and animals. Deoxyribonucleic acid (DNA) was extracted with a Zymobiomics DNA mini-prep kit and sequenced on an Illumina HiSeq® 2500. Samples were collected with the consent of the property owners. Additional raw sequence data from public sources included animal microbiome, sediment, soil, hot spring, hydrothermal vent, marine, peat bog, permafrost, and sewage sequences. The metagenomic sequence data was searched using hidden Markov models generated from known Cas protein sequences, including type II Cas effector proteins. Novel effector proteins identified by the search were aligned to known proteins to identify potential active sites. This metagenomic workflow revealed the MG1, MG2, MG3, MG4, MG6, MG14, MG15, MG16, MG18, MG21, MG22, and MG23 families of class II, type II CRISPR endonucleases described herein.

Example 2A. - Discovery of the MG1 family of CRISPR systems
Analysis of the data from the metagenomic analysis of Example 1 revealed a new cluster of previously undescribed putative CRISPR systems, initially comprising six members (MG1-1, MG1-2, MG1-3, MG1-4, MG1-5, and MG1-6, recorded as SEQ ID NOs: 5, 6, 1, 2, and 3, respectively). This family is characterized by enzymes having HNH and RuvC domains. The RuvC domain of this family has a RuvC_III portion with low homology to previously described Cas9 family members. While the initial family members share at most 56.8% identity, all six enzymes have a divergent RuvC_III portion of the RuvC domain carrying the common motif RHHALDAMV (SEQ ID NO: 5615), KHHALDAMC (SEQ ID NO: 5616), or KHHALDAIC (SEQ ID NO: 5617). These motifs are not found in other described Cas9-like enzymes. The protein and nucleic acid sequences corresponding to these new enzymes and their relevant subdomains are presented in the sequence listing. Putative tracrRNA sequences were identified based on their position relative to other genes and are presented as SEQ ID NOs: 5476-5479. The enzyme systems appear to be derived from the phylum Verrucomicrobia, the phylum Candidatus Peregrinibacteria, or the phylum Candidatus Melainabacteria, based on the 16S rRNA sequences from the genome bins containing the CRISPR systems. The 16S rRNA sequences are presented as SEQ ID NOs: 5592-5596. A detailed domain-level alignment of the CRISPR system sequences, calling out the features described by Shmakov et al. (Mol Cell. 2015 Nov 5;60(3):385-97), is depicted in FIGS. 9A, 9B, 9C, 9D, 9E, 9F, 9G, and 9H. Comparison of MG1-1, 1-2, and 1-3 against additional proprietary protein datasets revealed additional protein sequences with a similar architecture, presented as SEQ ID NOs: 7-319. These MG1 protein sequences led to the discovery of additional MG1 motifs, shown in SEQ ID NOs: 5618-5632.
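The three RuvC_III motifs named in this example (RHHALDAMV, KHHALDAMC, KHHALDAIC) can be located in a candidate protein with a plain exact-match scan. The sketch below is illustrative only: the function name is hypothetical, only exact matches are reported, and any protein string passed in a usage example would be a made-up placeholder rather than a sequence from the sequence listing.

```python
# Motifs and their identifiers as given in Example 2A.
MG1_RUVC_III_MOTIFS = {
    "RHHALDAMV": "SEQ ID NO: 5615",
    "KHHALDAMC": "SEQ ID NO: 5616",
    "KHHALDAIC": "SEQ ID NO: 5617",
}


def find_mg1_motifs(protein: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, motif, identifier) for every exact hit."""
    seq = protein.upper()
    hits = []
    for motif, label in MG1_RUVC_III_MOTIFS.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((start + 1, motif, label))
            start = seq.find(motif, start + 1)  # continue past this hit
    return sorted(hits)
```

An empty result means none of the three motifs occurs exactly; it does not by itself rule out a divergent RuvC_III domain, which would need an alignment-based comparison.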

Example 2B. - Discovery of the MG2 family of CRISPR systems
Analysis of the data from the metagenomic analysis of Example 1 revealed a new cluster of previously undescribed putative CRISPR systems comprising six members (MG2-1, MG2-2, MG2-3, MG2-5, and MG2-6). The protein and nucleic acid sequences corresponding to these new enzymes and exemplary subdomains are presented as SEQ ID NOs: 320 and 322-325. Putative tracrRNA sequences were identified within the operon based on their position relative to other genes and are presented as SEQ ID NOs: 5490, 5492-5494, and 5538. A detailed domain-level alignment of these sequences versus Cas9, as presented by Shmakov et al. (Mol Cell. 2015 Nov 5;60(3):385-97), is depicted in FIG. 7.

Comparison of MG2-1, MG2-2, MG2-3, MG2-5, and MG2-6 against additional proprietary protein datasets revealed additional protein sequences with a similar architecture, presented as SEQ ID NOs: 321 and 326-420. Motifs common to MG2 family members are presented as SEQ ID NOs: 5631-5638.

Example 2C. - Discovery of the MG3 family of CRISPR systems
Analysis of the data from the metagenomic analysis of Example 1 revealed a new, previously undescribed putative CRISPR system: MG3-1. The amino acid sequences corresponding to this new enzyme and its exemplary subdomains are presented as SEQ ID NOs: 424, 2245, and 4059. A putative tracrRNA-containing sequence was identified based on its proximity to other elements of the operon and is included as SEQ ID NO: 5498. A detailed domain-level alignment of this sequence versus Cas9 from Actinomyces naeslundii is depicted in FIG. 8.

Comparison of MG3-1 against additional proprietary protein datasets revealed additional protein sequences with a similar architecture, presented as SEQ ID NOs: 421-423 and 425-431.

Example 2D. - Discovery of the MG4, 7, 14, 15, 16, 18, 21, 22, and 23 families of CRISPR systems
Analysis of the data from the metagenomic analysis of Example 1 revealed new clusters of previously undescribed putative CRISPR systems comprising nine families with one member each (MG4-5, MG7-2, MG14-1, MG15-1, MG16-2, MG18-1, MG21-1, MG22-1, and MG23-1). The protein and nucleic acid sequences corresponding to these new enzymes and their exemplary subdomains are presented as SEQ ID NOs: 432, 669, 678, 930, 1093, 1354, 1512, 1656, and 1756. A putative tracr-containing sequence was identified for each family based on its proximity to other elements of the operon. These sequences are presented in the sequence listing as SEQ ID NOs: 5503-5511, respectively.

Comparison of MG4-5, MG7-2, MG14-1, MG15-1, MG16-2, MG18-1, MG21-1, MG22-1, and MG23-1 against additional proprietary protein datasets revealed additional protein sequences with a similar architecture, presented as SEQ ID NOs: 433-660, 670-677, 679-929, 931-1092, 1094-1353, 1355-1511, 1513-1655, 1657-1755, and 1757-1826. Motifs common to the nucleases of these sets of CRISPR systems are presented as SEQ ID NO: 5649 for MG4, SEQ ID NOs: 5650-5667 for MG14, SEQ ID NOs: 5668-5675 for MG15, SEQ ID NOs: 5676-5678 for MG16, SEQ ID NOs: 5679-5686 for MG18, SEQ ID NOs: 5687-5693 and SEQ ID NOs: 5674-5675 for MG21, SEQ ID NOs: 5694-5699 for MG22, and SEQ ID NOs: 5700-5717 for MG23.

Example 3. - Prophetic - Determination of protospacer-adjacent motif
Experiments are performed as in any of the examples in Karvelis et al. Methods. 2017 May 15;121-122:3-8, which is entirely incorporated by reference herein, to identify the protospacer adjacent motif (PAM) sequence specificity of the novel enzymes described herein, enabling optimal targeting of synthetic sequences.

In one example (in vivo screen), cells bearing plasmids encoding any of the enzymes described herein and a protospacer-targeting guide RNA are co-transformed with a plasmid library containing an antibiotic resistance gene and a protospacer sequence flanked by a randomized PAM sequence. Plasmids containing functional PAMs are cleaved by the enzyme, leading to cell death. Deep sequencing of the enzyme cleavage-resistant plasmid pool isolated from the surviving cells reveals the set of depleted plasmids, which contain the PAMs that permit functional cleavage.
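The read-out of the in vivo screen is a depletion signal: PAMs that permit cleavage become rare among surviving plasmids relative to the input library. The sketch below shows one way such a comparison could be scored. The log2 ratio, the pseudocount, and the cutoff are illustrative assumptions and are not taken from the disclosure or from Karvelis et al.; the inputs are lists of PAM strings already extracted from sequencing reads.

```python
from collections import Counter
from math import log2


def depletion_scores(library_pams, survivor_pams, pseudocount=1.0):
    """log2(survivor frequency / library frequency) for every observed PAM.

    Strongly negative scores indicate PAMs depleted from the surviving pool.
    The pseudocount (an assumption) avoids division by zero for PAMs that
    vanish entirely from one pool.
    """
    lib = Counter(library_pams)
    surv = Counter(survivor_pams)
    all_pams = set(lib) | set(surv)
    lib_total = sum(lib.values()) + pseudocount * len(all_pams)
    surv_total = sum(surv.values()) + pseudocount * len(all_pams)
    scores = {}
    for pam in all_pams:
        lib_freq = (lib[pam] + pseudocount) / lib_total
        surv_freq = (surv[pam] + pseudocount) / surv_total
        scores[pam] = log2(surv_freq / lib_freq)
    return scores


def depleted_pams(scores, cutoff=-1.0):
    """PAMs whose score is at or below the (hypothetical) cutoff."""
    return sorted(pam for pam, score in scores.items() if score <= cutoff)
```

In practice the depleted set would be summarized further, for example as a sequence logo, to read off a consensus PAM.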

In another example (in vitro screen), a PAM library in the form of DNA plasmids or concatemeric repeats is subjected to cleavage by an RNP complex (e.g., comprising the enzyme, tracrRNA, and crRNA, or the enzyme and a hybrid sgRNA) assembled in vitro or in cell lysates. The free DNA ends resulting from successful cleavage events are captured by adapter ligation, followed by PCR amplification of the PAM-side products. The amplified library of functional PAMs is subjected to deep sequencing to identify the PAMs that license DNA cleavage.

Example 4. - Prophetic - Use of the synthetic CRISPR systems described herein in mammalian cells for genome editing
DNA/RNA sequences are prepared that encode (i) an ORF encoding a codon-optimized enzyme under a cell-compatible promoter, with a cell-compatible C-terminal nuclear localization sequence (e.g., SV40 NLS in the case of human cells) and a suitable polyadenylation signal (e.g., TK pA signal in the case of human cells); and (ii) an ORF encoding an sgRNA (having a 5' sequence beginning with G, followed by a 20 nt complementary targeting nucleic acid sequence that targets genomic DNA followed by a corresponding compatible PAM identified via Example 3, and then a 3' tracr-binding sequence, a linker, and a tracrRNA sequence) under a suitable Polymerase III promoter (e.g., the U6 promoter in the case of mammalian cells). In some embodiments, these sequences are prepared on the same or separate plasmid vectors, which are transfected into eukaryotic cells via a suitable technique. In some embodiments, these sequences are prepared as separate DNA sequences, which are transfected or microinjected into cells. In some embodiments, these sequences are prepared as synthetic RNAs or in vitro transcribed RNAs, which are transfected or microinjected into cells. In some embodiments, these sequences are translated into proteins, which are transfected or microinjected into cells.
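The sgRNA layout described in (ii) can be illustrated with a short sketch that finds 20 nt target sites immediately followed by a PAM on one DNA strand and then assembles an sgRNA string as 5' G, spacer, tracr-binding sequence, linker, tracrRNA. Everything concrete here is an assumption for illustration: the PAM passed in, the `tracr_binding` and `tracr` strings, the use of a GAAA linker as the default, and the choice to scan only the given strand. None of these are the sequences of the disclosure.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_protospacers(dna: str, pam: str, length: int = 20):
    """Return (0-based start, protospacer) for each site on the given
    strand that is directly followed by the PAM (overlaps allowed)."""
    pam_pattern = "".join(IUPAC[base] for base in pam.upper())
    pattern = re.compile(rf"(?=([ACGT]{{{length}}}){pam_pattern})")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


def assemble_sgrna(protospacer: str, tracr_binding: str, tracr: str,
                   linker: str = "GAAA") -> str:
    """5' G + spacer (as RNA) + tracr-binding sequence + linker + tracrRNA.

    `tracr_binding` and `tracr` are caller-supplied placeholders.
    """
    spacer = protospacer.upper().replace("T", "U")
    return "G" + spacer + tracr_binding + linker + tracr
```

A complete design tool would also scan the reverse complement strand and screen candidate spacers for off-target matches; both are omitted here to keep the sketch short.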

Whichever transfection method is chosen, (i) and (ii) are introduced into cells. A period of incubation is allowed to pass so that the enzyme and/or sgRNA can be transcribed and/or translated into active form. After the incubation period, genomic DNA in the vicinity of the targeted sequence is analyzed (e.g., by sequencing). Indels are introduced into the genomic DNA in the vicinity of the targeted sequence as a result of enzyme-mediated cleavage and non-homologous end joining.

In some embodiments, (i) and (ii) are introduced into cells together with a third repair nucleotide that encodes regions of the genome flanking the cleavage site, 25 bp or larger in size, which facilitates homology-directed repair. Contained within these flanking sequences may be a single base pair mutation, a functional gene fragment, a foreign or native gene for expression, or several genes composing a biochemical pathway.
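The repair template described above consists of genomic sequence flanking the cleavage site on both sides, each flank being 25 bp or larger, with the desired change carried between the flanks. A minimal sketch of building such a template from a genomic string follows; the function name, the symmetric arm lengths, and the placement of the payload exactly at the cut position are illustrative assumptions.

```python
def build_repair_template(genome: str, cut_index: int, payload: str,
                          arm_length: int = 25) -> str:
    """Left homology arm + payload + right homology arm.

    `cut_index` is the 0-based position of the first base to the right of
    the cleavage site. Arms shorter than 25 bp are rejected, following the
    minimum size stated in the text.
    """
    if arm_length < 25:
        raise ValueError("homology arms should be 25 bp or larger")
    if cut_index - arm_length < 0 or cut_index + arm_length > len(genome):
        raise ValueError("not enough flanking sequence around the cut site")
    left_arm = genome[cut_index - arm_length:cut_index]
    right_arm = genome[cut_index:cut_index + arm_length]
    return left_arm + payload + right_arm
```

Longer arms can be requested through `arm_length`; the payload may be as small as a single substituted base or as large as one or more genes.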

Example 5. - Prophetic - Use of the synthetic CRISPR systems described herein in vitro
Any of the enzymes described herein is cloned into a suitable E. coli expression plasmid containing a purification tag, recombinantly expressed in E. coli, and purified using the recombinant tag. An RNA comprising a 5' G followed by a 20 nt targeting sequence and PAM sequence, a compatible tracrRNA-binding region of a crRNA, a GAAA linker, and a compatible tracrRNA is synthesized by a suitable solid-phase RNA synthesis method. The recombinant enzyme and the sgRNA are combined in a suitable cleavage buffer containing Mg2+ (e.g., 20 mM HEPES pH 7.5, 100 mM KCl, 5 mM MgCl2, 1 mM DTT, 5% glycerol), and the reaction is initiated by introducing a target DNA containing a sequence complementary to the targeting sequence and the PAM sequence. Cleavage of the DNA is monitored by a suitable assay (e.g., agarose gel electrophoresis followed by ethidium bromide staining, or a similarly acting DNA-intercalating agent, and UV visualization).
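The example cleavage buffer (20 mM HEPES pH 7.5, 100 mM KCl, 5 mM MgCl2, 1 mM DTT, 5% glycerol) can be prepared from concentrated stocks using the dilution relation C1·V1 = C2·V2. The sketch below computes the volume of each stock for a given final volume. The stock concentrations (1 M for the salts and DTT, 50% for glycerol) are assumptions chosen for illustration and are not specified in the text.

```python
# Final concentrations from the example buffer (mM, except glycerol in %).
FINAL = {"HEPES pH 7.5": 20.0, "KCl": 100.0, "MgCl2": 5.0,
         "DTT": 1.0, "glycerol": 5.0}

# Hypothetical stock concentrations in the same units as FINAL.
STOCK = {"HEPES pH 7.5": 1000.0, "KCl": 1000.0, "MgCl2": 1000.0,
         "DTT": 1000.0, "glycerol": 50.0}


def buffer_volumes(final_conc, stock_conc, total_volume_ul):
    """Volume of each stock (in microliters) plus water to reach the total.

    Uses V_stock = V_total * C_final / C_stock for every component.
    """
    volumes = {}
    for component, target in final_conc.items():
        stock = stock_conc[component]
        if target > stock:
            raise ValueError(f"stock of {component} is too dilute")
        volumes[component] = total_volume_ul * target / stock
    water = total_volume_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("components exceed the total volume")
    volumes["water"] = water
    return volumes


if __name__ == "__main__":
    for name, vol in buffer_volumes(FINAL, STOCK, 1000.0).items():
        print(f"{name}: {vol:.1f} uL")
```

In an actual reaction the water volume would be reduced further to leave room for the enzyme, the sgRNA, and the target DNA.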

Example 6. - (General protocol) PAM sequence identification/confirmation for the endonucleases described herein
PAM sequences were determined by sequencing plasmids containing randomly generated PAM sequences that could be cleaved by putative endonucleases expressed in an E. coli lysate-based expression system (myTXTL, Arbor Biosciences). In this system, an E. coli codon-optimized nucleotide sequence was transcribed and translated from a PCR fragment under the control of a T7 promoter. A second PCR fragment, carrying a tracr sequence under a T7 promoter and a minimal CRISPR array composed of a T7 promoter followed by a repeat-spacer-repeat sequence, was transcribed in the same reaction. Successful expression of the endonuclease and tracr sequence in the TXTL system, followed by CRISPR array processing, yielded active in vitro CRISPR nuclease complexes.

A library of target plasmids containing a spacer sequence matching that of the minimal array, followed by 8N mixed bases (putative PAM sequences), was incubated with the output of the TXTL reaction. After 1-3 hr, the reaction was stopped and the DNA was recovered with a DNA clean-up kit (e.g., Zymo DCC, AMPure XP beads, QiaQuick). Adapter sequences were blunt-end ligated to DNA carrying active PAM sequences that had been cleaved by the endonuclease, whereas DNA that had not been cleaved was inaccessible for ligation. DNA segments comprising active PAM sequences were then amplified by PCR with primers specific to the library and the adapter sequence. The PCR amplification products were resolved on a gel to identify amplicons corresponding to cleavage events. The amplified segments of the cleavage reaction were also used as templates for preparing an NGS library. Sequencing the resulting library, which was a subset of the starting 8N library, revealed the sequences containing the correct PAM for the active CRISPR complex. For PAM testing with a single RNA construct, the same procedure was repeated except that in vitro transcribed RNA was added along with the plasmid library and the tracr/minimal CRISPR array template was omitted. For endonucleases for which NGS libraries were prepared, seqLogo (see, e.g., Huber et al. Nat Methods. 2015 Feb;12(2):115-21) representations were constructed and are presented in FIGS. 27, 38, 29, 30, 31, 32, 33, 34, and 35. The seqLogo module used to construct these representations takes the position weight matrix of a DNA sequence motif (e.g., a PAM sequence) and plots the corresponding sequence logo as introduced by Schneider and Stephens (see, e.g., Schneider et al. Nucleic Acids Res. 1990 Oct 25;18(20):6097-100). The characters representing the sequence in the seqLogo representations are stacked on top of each other for each position in the aligned sequences (e.g., PAM sequences). The height of each letter is proportional to its frequency, and the letters are sorted so that the most common one is on top.
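
The logo construction described above can be sketched in a few lines. This is a minimal illustration, not the seqLogo implementation: it builds per-position base frequencies from aligned PAM reads and scales each letter by the position's information content (2 bits minus the Shannon entropy), which is one common convention for Schneider-Stephens logos and is assumed here. The function names and the toy PAM list are hypothetical, and the small-sample correction is omitted.

```python
import math
from collections import Counter

BASES = "ACGT"


def position_frequencies(seqs):
    """Per-position base frequencies for equal-length aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        freqs.append({b: counts.get(b, 0) / len(seqs) for b in BASES})
    return freqs


def information_content(column):
    """Information content in bits: 2 minus the Shannon entropy of the column."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return 2.0 - entropy


def logo_stacks(seqs):
    """For each position, (base, height) pairs sorted so the tallest is last (on top)."""
    stacks = []
    for column in position_frequencies(seqs):
        ic = information_content(column)
        letters = [(b, p * ic) for b, p in column.items() if p > 0]
        stacks.append(sorted(letters, key=lambda pair: pair[1]))
    return stacks


if __name__ == "__main__":
    toy_pams = ["AGG", "TGG", "CGG", "GGG"]  # hypothetical reads, not patent data
    for position, stack in enumerate(logo_stacks(toy_pams), start=1):
        print(position, stack)
```

With the toy input, the first position is uninformative (all four bases equally frequent), while the second and third positions each carry the full 2 bits for G.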

Example 7. - (General protocol) RNA folding of tracrRNA and sgRNA structures
Folded structures of guide RNA sequences at 37 °C were computed using the method of Andronescu et al. Bioinformatics. 2007 Jul 1;23(13):i19-28. Predicted structures of exemplary sgRNAs described herein are presented in FIGS. 21, 22, 23, 24, 25, and 26.

Example 8. - (General protocol) In vitro cleavage efficiency of MG CRISPR complexes
Endonucleases were expressed as His-tagged fusion proteins from an inducible T7 promoter in a protease-deficient E. coli B strain. Cells expressing the His-tagged proteins were lysed by sonication, and the His-tagged proteins were purified by Ni-NTA affinity chromatography on a HisTrap FF column (GE Lifescience) on an AKTA Avant FPLC (GE Lifescience). The eluate was resolved by SDS-PAGE on acrylamide gels (Bio-Rad) and stained with InstantBlue Ultrafast Coomassie (Sigma-Aldrich). Purity was determined by densitometry of the protein bands with ImageLab software (Bio-Rad). Purified endonucleases were dialyzed into a storage buffer composed of 50 mM Tris-HCl, 300 mM NaCl, 1 mM TCEP, 5% glycerol; pH 7.5, and stored at -80 °C.

Target DNAs containing spacer sequences and PAM sequences (determined, e.g., as in Example 6) were constructed by DNA synthesis. When the PAM had degenerate bases, a single representative PAM was chosen for testing. The target DNAs comprised 2200 bp of linear DNA derived from a plasmid by PCR amplification, with the PAM and spacer located 700 bp from one end. Successful cleavage resulted in fragments of 700 bp and 1500 bp. The target DNA, in vitro transcribed single RNA, and purified recombinant protein were combined in cleavage buffer (10 mM Tris, 100 mM NaCl, 10 mM MgCl₂), with the protein and RNA in excess, and incubated for 5 minutes to 3 hours, usually 1 hr. The reaction was stopped by adding RNase A and incubating for 60 minutes. The reaction was then resolved on a 1.2% TAE agarose gel, and the fraction of cleaved target DNA was quantified with ImageLab software.
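
The arithmetic behind this readout can be sketched as follows. The 2200 bp substrate and the 700 bp offset are taken from the text; the band intensities, the function names, and the choice to compute the cleaved fraction as product signal over total signal are illustrative assumptions rather than the ImageLab procedure itself.

```python
def expected_fragments(total_bp, cut_offset_bp):
    """Sizes of the two products when a linear substrate is cut once."""
    if not 0 < cut_offset_bp < total_bp:
        raise ValueError("cut site must lie inside the substrate")
    return sorted((cut_offset_bp, total_bp - cut_offset_bp))


def fraction_cleaved(uncut_intensity, product_intensities):
    """Cleaved fraction from background-subtracted band intensities.

    Intercalating-dye signal scales roughly with DNA mass, and the two
    products together have the mass of one substrate molecule, so the
    summed product signal over the total signal approximates the molar
    fraction of substrate that was cut.
    """
    cut = sum(product_intensities)
    total = cut + uncut_intensity
    if total <= 0:
        raise ValueError("no signal to quantify")
    return cut / total


if __name__ == "__main__":
    print(expected_fragments(2200, 700))            # [700, 1500]
    print(fraction_cleaved(250.0, [250.0, 500.0]))  # 0.75 (made-up intensities)
```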

Example 9. - (General protocol) Testing genome cleavage activity of MG CRISPR complexes in E. coli
E. coli lacks the capacity to efficiently repair double-stranded DNA breaks. Cleavage of genomic DNA can therefore be a lethal event. Exploiting this phenomenon, endonuclease activity was tested in E. coli by recombinantly expressing an endonuclease and a tracrRNA in a target strain with spacer/target and PAM sequences integrated into its genomic DNA.

In this assay, the PAM sequence is specific to the endonuclease being tested, as determined by the method described in Example 6. The sgRNA sequences were determined based on the sequence and predicted structure of the tracrRNA. Repeat-anti-repeat pairings of 8-12 bp (generally 10 bp) were chosen, starting from the 5' end of the repeat. The remaining 3' end of the repeat and the 5' end of the tracrRNA were replaced with a tetraloop. Generally, the tetraloop was GAAA, but other tetraloops can be used, particularly if the GAAA sequence is predicted to interfere with folding. In such cases, a TTCG tetraloop was used.
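
The joining rule described above can be sketched as simple string assembly. This is only an illustration under stated assumptions: the function name `build_sgrna`, the `tracr_trim` parameter (how many 5' nucleotides of the tracrRNA are dropped), and the toy sequences are hypothetical, and in practice the trim point would be chosen from the predicted repeat-anti-repeat structure rather than passed in by hand.

```python
def build_sgrna(spacer, repeat, tracr, pair_len=10, tracr_trim=0, tetraloop="GAAA"):
    """Join spacer, truncated repeat, tetraloop, and trimmed tracrRNA (5' to 3')."""
    if not 8 <= pair_len <= 12:
        raise ValueError("pairing length is expected to be 8-12 nt")
    if pair_len > len(repeat):
        raise ValueError("repeat is shorter than the requested pairing length")
    if not 0 <= tracr_trim < len(tracr):
        raise ValueError("tracr_trim must leave part of the tracrRNA")
    kept_repeat = repeat[:pair_len]   # keep the 5' end of the repeat
    kept_tracr = tracr[tracr_trim:]   # drop the 5' end of the tracrRNA
    return spacer + kept_repeat + tetraloop + kept_tracr


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not from the disclosure.
    sgrna = build_sgrna(
        spacer="N" * 20,
        repeat="ACGTACGTACGTACGT",
        tracr="TTTTGGGGCCCCAAAA",
        pair_len=10,
        tracr_trim=6,
    )
    print(sgrna)
```

Passing `tetraloop="TTCG"` gives the alternative loop mentioned in the text for cases where GAAA is predicted to interfere with folding.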

Engineered strains with PAM sequences integrated into their genomic DNA were transformed with DNA encoding the endonuclease. The transformants were then made chemically competent and transformed with 50 ng of single guide RNA that was either specific to the target sequence ("on target") or non-specific to the target ("non target"). After heat shock, the transformations were recovered in SOC for 2 hr at 37 °C. Nuclease efficiency was then determined by a 5-fold dilution series grown on induction media. Colonies were quantified from the dilution series in triplicate.
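
One way to turn the triplicate colony counts from such a 5-fold dilution series into a single comparison is sketched below. The colony numbers, the function names, and the use of a simple fold-reduction ratio are assumptions for illustration; the text itself only states that colonies were quantified in triplicate.

```python
from statistics import mean


def undiluted_colony_estimate(colony_counts, dilution_step, fold=5):
    """Back-calculate colonies in the undiluted culture from one dilution step.

    dilution_step = 0 is undiluted, 1 is a single `fold`-fold dilution, etc.
    """
    return mean(colony_counts) * fold ** dilution_step


def fold_reduction(non_target_estimate, on_target_estimate):
    """How many times fewer colonies survive with the on-target guide."""
    if on_target_estimate <= 0:
        raise ValueError("on-target estimate must be positive")
    return non_target_estimate / on_target_estimate


if __name__ == "__main__":
    # Made-up triplicate counts for illustration only.
    non_target = undiluted_colony_estimate([40, 50, 60], dilution_step=4)
    on_target = undiluted_colony_estimate([20, 25, 30], dilution_step=1)
    print(non_target, on_target, fold_reduction(non_target, on_target))
```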

Example 10. - (General protocol) Testing genome cleavage activity of MG CRISPR complexes in mammalian cells
To show targeting and cleavage activity in mammalian cells, the MG Cas effector protein sequences were tested in two mammalian expression vectors: (a) one with a C-terminal SV40 NLS and a 2A-GFP tag, and (b) one with no GFP tag and two SV40 NLS sequences, one on the N-terminus and one on the C-terminus. In some instances, the nucleotide sequences encoding the endonucleases were codon-optimized for expression in mammalian cells.

The corresponding single guide RNA sequence (sgRNA) with the targeting sequence attached is cloned into a second mammalian expression vector. The two plasmids are cotransfected into HEK293T cells. 72 hr after co-transfection of the expression plasmid and the sgRNA targeting plasmid into HEK293T cells, the DNA is extracted and used to prepare an NGS library. Percent NHEJ is measured via indels in the sequencing of the target site to demonstrate the targeting efficiency of the enzyme in mammalian cells. At least 10 different target sites were chosen to test each protein's activity.
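
A minimal sketch of how a percent-indel figure could be computed from aligned amplicon reads is given below. It assumes each read has already been aligned to the reference amplicon and is supplied as a (CIGAR string, 0-based reference start) pair; the window size, the cut-site coordinate, and the function names are hypothetical and are not taken from the disclosure.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel_near(cigar, ref_start, cut_site, window=10):
    """True if the alignment has an insertion or deletion near the cut site."""
    pos = ref_start  # reference coordinate of the next aligned base
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "I":
            # Insertions sit between reference bases and consume none.
            if abs(pos - cut_site) <= window:
                return True
        elif op == "D":
            # A deletion removes reference bases pos .. pos + n - 1.
            if pos - window <= cut_site <= pos + n + window:
                return True
            pos += n
        elif op in ("M", "N", "=", "X"):
            pos += n
        # S, H and P do not consume reference bases.
    return False


def percent_indel(alignments, cut_site, window=10):
    """Percentage of reads carrying an indel within `window` of the cut site."""
    if not alignments:
        raise ValueError("no reads supplied")
    edited = sum(has_indel_near(c, s, cut_site, window) for c, s in alignments)
    return 100.0 * edited / len(alignments)


if __name__ == "__main__":
    # Toy alignments for illustration only.
    reads = [("100M", 0), ("48M3D49M", 0), ("50M2I48M", 0), ("10M1D89M", 0)]
    print(percent_indel(reads, cut_site=50))
```

Restricting the count to a window around the expected cut site is a design choice that keeps sequencing errors elsewhere in the amplicon from inflating the estimate.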

Example 11. - Characterization of MG1 family members
PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of MG1 family endonuclease systems was confirmed using the myTXTL system described in Example 6. In this assay, PCR amplification of cleaved target plasmids yields a product that migrates at approximately 170 bp in the gel, as shown in FIGS. 17-20. Amplification products were observed for MG1-4 (dual guide: see gel 1, lane 3; single guide: see gel 6, lane 2), MG1-5 (see gel 2, lane 10), MG1-6 (dual guide: see gel 5, lane 6; single guide: see gel 6, lane 5), and MG1-7 (dual guide: see gel 3, lane 13; single guide: see gel 3, lane 2) (protein SEQ ID NOs: 1-4, respectively). Sequencing the PCR products revealed the active PAM sequences for these enzymes, as shown in Table 2.

Synthetic single guide RNAs (sgRNAs) were designed based on the sequence and predicted structure of the tracrRNAs and are presented as SEQ ID NOs: 5461-5464. The PAM sequence screen of Example 6 was repeated with the sgRNAs. The results of this experiment are also presented in Table 2, which shows that PAM specificity changed slightly when an sgRNA was used.

Targeted endonuclease activity in vitro

In vitro activity of the MG1-4 endonuclease system (protein SEQ ID NO: 1 with sgRNA SEQ ID NO: 5461) on target DNA bearing the PAM sequence CAGGAAGG was demonstrated using the method of Example 8. The single guide sequence reported above (SEQ ID NO: 5461) was used, with the Ns of the sequence replaced so as to vary the length of the spacer/targeting sequence from 18 to 24 nt. The results are shown in FIG. 10. The left panel shows a gel demonstrating DNA cleavage by MG1-4 in combination with the corresponding sgRNAs having different targeting sequence lengths (18-24 nt), and the right panel shows the same data quantified as a bar graph. The data demonstrated that targeting sequences of 18-24 nucleotides were functional in the MG1-4/sgRNA system.
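
The substitution of the N placeholder by targeting sequences of different lengths can be sketched as below. The template and protospacer are toy strings, the function name is hypothetical, and the choice to shorten the targeting sequence from its 5' end (keeping the 3' end fixed) is an assumption; the text does not say which end was varied.

```python
import re


def fill_spacer(sgrna_template, protospacer, length):
    """Replace the first run of Ns in the template with a spacer of `length` nt."""
    run = re.search(r"N+", sgrna_template)
    if run is None:
        raise ValueError("template has no N placeholder")
    if not 0 < length <= len(protospacer):
        raise ValueError("requested length exceeds the protospacer")
    # Assumption: keep the 3' end of the protospacer and trim from the 5' end.
    spacer = protospacer[-length:]
    return sgrna_template[:run.start()] + spacer + sgrna_template[run.end():]


if __name__ == "__main__":
    template = "G" + "N" * 20 + "GTTTTAGAGC"    # toy scaffold, not SEQ ID NO: 5461
    protospacer = "ACGTACGTACGTACGTACGTACGT"    # toy 24 nt target
    for n in range(18, 25):
        print(n, fill_spacer(template, protospacer, n))
```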

Targeted endonuclease activity in bacterial cells

In vivo activity of the MG1-4 endonuclease system (protein SEQ ID NO: 1, sgRNA SEQ ID NO: 5461) was tested with the PAM sequence CAGGAAGG as in Example 9. Transformed E. coli were plated in serial dilution, and the results (E. coli serial dilutions in the left panel and quantified growth in the right panel) are presented in FIG. 11. The much reduced growth of E. coli expressing on-target sgRNA compared to E. coli expressing non-target sgRNA indicates that genomic DNA was specifically cleaved by the endonuclease in E. coli cells.

Targeted endonuclease activity in mammalian cells

The method of Example 10 was used to demonstrate targeting and cleavage activity in mammalian cells. Open reading frames encoding the MG1-4 (protein SEQ ID NO: 5527) and MG1-6 (protein SEQ ID NO: 5529) sequences were cloned into two mammalian expression vectors, one with a C-terminal SV40 NLS and a 2A-GFP tag (E. coli MG-BB) and one with no GFP tag and two NLS sequences, one on the N-terminus and one on the C-terminus (E. coli pMG5-BB). For MG1-6, the open reading frame was additionally codon-optimized for mammalian expression (SEQ ID NO: 5589) and cloned into the 2-NLS plasmid backbone (MG-16hs). The results of this experiment are shown in FIG. 12. The endonuclease expression vectors were cotransfected into HEK293T cells together with a second vector for expressing an sgRNA (e.g., SEQ ID NO: 5512 or 5515) having a tracr sequence specific to the endonuclease and a guide sequence selected from Tables 3-4. 72 hr after co-transfection, DNA was extracted and used to prepare an NGS library. Cleavage activity was detected by the appearance of internal deletions (NHEJ remnants) proximal to the target site sequence. To demonstrate the targeting efficiency of the enzymes in mammalian cells, percent NHEJ was measured via indels in the sequencing of the target sites and is presented in FIG. 12.

Example 12. - Characterization of MG2 family members
PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of MG2 family members was confirmed using the myTXTL system as described in Example 6. The results of this assay are shown in FIGS. 17-20. In the assay shown in FIGS. 17-20, active proteins that successfully cleave the library yield a band of approximately 170 bp in the gel. Amplification products were observed for MG2-1 (see gel 2, lane 11, and gel 4, lane 6) and MG2-7 (see gel 11, lane 10) (SEQ ID NOs: 320 and 321, respectively). Sequencing the PCR products revealed the active PAM sequences shown in Table 5 below.

In vivo activity of the MG2-7 endonuclease system with an sgRNA (endonuclease SEQ ID NO: 321; sgRNA SEQ ID NO: 5465) and an AGCGTAAG PAM sequence was confirmed using the method described in Example 9. Transformed E. coli were plated in serial dilution, and the results (E. coli serial dilutions in the left panel and quantified growth in the right panel) are presented in FIG. 34. The much reduced growth of E. coli expressing on-target sgRNA compared to E. coli expressing non-target sgRNA indicates that genomic DNA was specifically cleaved by the MG2-7 endonuclease in E. coli cells.

Example 13. - Characterization of MG3 family members

PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of MG3 family members was confirmed using the myTXTL system as described in Example 6, using tracr sequences and CRISPR arrays. In this assay, PCR amplification of cleaved target plasmids yields a product that migrates at approximately 170 bp in the gel, as shown in FIGS. 17-20. Amplification products were observed for MG3-6 (dual guide: see gel 2, lane 8; single guide: see gel 3, lane 3), MG3-7 (dual guide: see gel 2, lane 3; single guide: see gel 3, lane 4), and MG3-8 (dual guide: see gel 9, lane 5) (SEQ ID NOs: 421, 422, and 423, respectively). Sequencing the PCR products revealed the active PAM sequences shown in Table 6 below.

Synthetic single guide RNAs (sgRNAs) were designed based on the sequence and predicted structure of the tracrRNAs and are presented as SEQ ID NOs: 5466-5467. The PAM sequence screen of Example 6 was repeated with the sgRNAs. The results of this experiment are also presented in Table 6, which shows that PAM specificity changed slightly when an sgRNA was used.

In vitro activity of MG3-6 (endonuclease SEQ ID NO: 421) was demonstrated with the PAM sequence GTGGGTTA using the method of Example 8. The single guide sequence reported above (SEQ ID NO: 5466) was used, with the Ns of the sequence replaced so as to vary the length of the spacer/targeting sequence from 18 to 24 nt. The results are shown in FIG. 13. The top panel shows a gel demonstrating DNA cleavage by MG3-6 in combination with various sgRNAs having different targeting sequence lengths (18-24 nt), and the bottom panel shows the same data quantified as a bar graph. The data demonstrated that targeting sequences of 18-24 nucleotides were functional in the MG3-6/sgRNA system.

In vivo activity of the MG3-7 endonuclease system (protein SEQ ID NO: 422, sgRNA SEQ ID NO: 5467) was tested with the PAM sequence TGGACCTG using the method of Example 9. Transformed E. coli were plated in serial dilution, and the results (E. coli serial dilutions in the top panel and quantified growth in the bottom panel) are presented in FIG. 14. The much reduced growth of E. coli expressing on-target sgRNA compared to E. coli expressing non-target sgRNA indicates that genomic DNA was specifically cleaved by the MG3-7 endonuclease system.

The method of Example 10 was used to demonstrate targeting and cleavage activity in mammalian cells. The open reading frame encoding the MG3-7 sequence (protein SEQ ID NO: 422) was cloned into two mammalian expression vectors, one with a C-terminal SV40 NLS and a 2A-GFP tag (E. coli MG-BB) and one with no GFP tag and two NLS sequences, one on the N-terminus and one on the C-terminus (E. coli pMG5-BB). The endonuclease expression vectors were cotransfected into HEK293T cells with a second vector for expressing the sgRNA described above with a guide sequence selected from Table 7. The results of this experiment are shown in FIG. 12. 72 hr after co-transfection, DNA was extracted and used to prepare an NGS library. Cleavage activity was detected by the appearance of internal deletions (NHEJ remnants) in the vicinity of the target site. The results are shown in FIG. 15.

The target sites encoded on the sgRNA plasmids are shown in Table 7 below.

Example 13. - Characterization of MG4 family members
PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of MG4 family endonuclease systems was confirmed using the myTXTL system as described in Example 6. In this assay, PCR amplification of cleaved target plasmids yields a product that migrates at approximately 170 bp in the gel, as shown in FIGS. 17-20. Amplification products were observed for MG4-2 (dual guide: see gel 2, lane 9; single guide: see gel 10, lane 7) (SEQ ID NO: 432). Sequencing the PCR products revealed the active PAM sequences shown in Table 8 below.

Example 14. - Characterization of MG14 family members
PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of MG14 family members was confirmed using the myTXTL system as described in Example 6. In this assay, PCR amplification of cleaved target plasmids yields a product that migrates at approximately 170 bp in the gel, as shown in FIGS. 17-20. Amplification products were observed for MG14-1 (dual guide: see gel 2, 1 lane 4; single guide: see gel 3, lane 8) (SEQ ID NO: 678). Sequencing the PCR products revealed the active PAM sequence specificity shown in Table 9 below.

In vivo activity of the MG14-1 endonuclease system with an sgRNA (endonuclease SEQ ID NO: 678; sgRNA SEQ ID NO: 5469) and a GGCGGGGA PAM sequence was confirmed using the method described in Example 9. Transformed E. coli were plated in serial dilution, and the results (E. coli serial dilutions in the left panel and quantified growth in the right panel) are presented in FIG. 35. The much reduced growth of E. coli expressing on-target sgRNA compared to E. coli expressing non-target sgRNA indicates that genomic DNA was specifically cleaved by the MG14-1 endonuclease in E. coli cells.

Example 15. - Characterization of MG15 family members
PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of MG15 family members was confirmed using the myTXTL system as described in Example 6. In this assay, PCR amplification of cleaved target plasmids yields a product that migrates at approximately 170 bp in the gel, as shown in FIGS. 17-20. Amplification products were observed for MG15-1 (dual guide: see gel 2, lane 7; single guide: see gel 3, lane 9) (SEQ ID NO: 930). Sequencing the PCR products revealed the active PAM sequence specificity detailed in Table 10 below.

In vitro activity

In vitro activity of the MG15-1 endonuclease system (protein SEQ ID NO: 930, sgRNA SEQ ID NO: 5470) was tested with the PAM sequence GGGTCAAA using the method of Example 8. The single guide sequence reported above (SEQ ID NO: 5470) was used, with the Ns of the sequence replaced so as to vary the length of the spacer/targeting sequence from 18 to 24 nt. The results are shown in FIG. 16. The top panel shows a gel demonstrating DNA cleavage by MG15-1 in combination with various sgRNAs having different targeting sequence lengths (18-24 nt), and the bottom panel shows the same data quantified as a bar graph. The data demonstrated that targeting sequences of 18-24 nucleotides were functional in the MG15-1/sgRNA system.

In vivo activity of the MG15-1 endonuclease system with an sgRNA (endonuclease SEQ ID NO: 930; sgRNA SEQ ID NO: 5470) and a GGGTCAAA PAM sequence was confirmed using the method described in Example 9. Transformed E. coli were plated in serial dilution, and the results (E. coli serial dilutions in the left panel and quantified growth in the right panel) are presented in FIG. 35. The much reduced growth of E. coli expressing on-target sgRNA compared to E. coli expressing non-target sgRNA indicates that genomic DNA was specifically cleaved by the MG15-1 endonuclease in E. coli cells.

Example 16. - Characterization of MG16 family members
PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of MG16 family members was confirmed using the myTXTL system as described in Example 6. In this assay, PCR amplification of cleaved target plasmids yields a product that migrates at approximately 170 bp in the gel, as shown in FIGS. 17-20. Amplification products were observed for MG16-2 (see gel 11, lane 17) (SEQ ID NO: 1093). Sequencing the PCR products revealed the active PAM sequence specificity detailed in Table 11 below.

Example 17. - Characterization of MG18 family members
PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of MG18 family members was confirmed using the myTXTL system as described in Example 6. In this assay, PCR amplification of cleaved target plasmids yields a product that migrates at approximately 170 bp in the gel, as shown in FIGS. 17-20. Amplification products were observed for MG18-1 (dual guide: see gel 2, lane 9; single guide: see gel 11, lane 12) (SEQ ID NO: 1354). Sequencing the PCR products revealed the active PAM sequence specificity detailed in Table 12 below.

Example 18. - Characterization of MG21 family members
PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of the MG21 family was confirmed using the myTXTL system as described in Example 6. In this assay, PCR amplification of cleaved target plasmids yields a product that migrates at approximately 170 bp in the gel, as shown in FIGS. 17-20. Amplification products were observed for MG21-1 (see gel 11, lane 2) (SEQ ID NO: 1512). Sequencing the PCR products revealed the active PAM sequence specificity detailed in Table 13 below.

Example 19. - Characterization of MG22 family members
PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of MG22 family members was confirmed using the myTXTL system as described in Example 6. In this assay, PCR amplification of cleaved target plasmids yields a product that migrates at approximately 170 bp in the gel, as shown in FIGS. 17-20. In the assay shown in FIGS. 17-20, active proteins that successfully cleave the library yield a band of approximately 170 bp in the gel. Amplification products were observed for MG22-1 (see gel 11, lane 3) (protein SEQ ID NO: 1656). Sequencing the PCR products revealed the active PAM sequence specificity detailed in Table 14 below.

Example 20. - Characterization of MG23 family members
PAM specificity, tracrRNA/sgRNA validation

The targeted endonuclease activity of MG23 family members was confirmed using the myTXTL system as described in Example 6. In this assay, PCR amplification of cleaved target plasmids yields a product that migrates at approximately 170 bp in the gel, as shown in FIGS. 17-20. An amplification product was observed for MG23-1 (see gel 11, lane 4) (SEQ ID NO: 1756). Sequencing of the PCR products revealed the active PAM sequence specificities for these enzymes, detailed in Table 15 below.
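As one way to picture how PAM specificity might be summarized from sequenced PCR products, the sketch below tallies per-position nucleotide frequencies across a set of recovered PAM sequences and calls a simple consensus. It is a minimal illustration only: the function names, the 0.75 threshold, and the example reads are hypothetical assumptions, and nothing here is taken from this disclosure or from the analysis actually performed.

```python
from collections import Counter


def pam_position_frequencies(pams):
    """Return per-position nucleotide frequencies for equal-length PAM sequences."""
    if not pams:
        raise ValueError("no PAM sequences supplied")
    length = len(pams[0])
    if any(len(p) != length for p in pams):
        raise ValueError("all PAM sequences must have the same length")
    freqs = []
    for i in range(length):
        # Count the bases observed at position i across all sequences.
        counts = Counter(p[i].upper() for p in pams)
        total = sum(counts.values())
        freqs.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return freqs


def consensus(freqs, threshold=0.75):
    """Call a base where its frequency meets the threshold, otherwise 'N'."""
    called = []
    for position in freqs:
        base, fraction = max(position.items(), key=lambda kv: kv[1])
        called.append(base if fraction >= threshold else "N")
    return "".join(called)


# Hypothetical, made-up sequences for illustration only (not data from this disclosure).
example_pams = ["AGGA", "TGGA", "CGGC", "GGGA"]
example_consensus = consensus(pam_position_frequencies(example_pams))
print(example_consensus)
```

With these made-up inputs the first position is evenly split and is reported as N, while the remaining positions meet the assumed threshold. A real analysis would typically weight reads by abundance and compare against an uncleaved library control.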

The systems of the present disclosure may be used for a variety of applications, such as nucleic acid editing (e.g., gene editing) and binding to a nucleic acid molecule (e.g., sequence-specific binding). Such systems may be used, for example: to address (e.g., remove or replace) a genetically inherited mutation that may cause a disease in a subject; to inactivate a gene in order to ascertain its function in a cell; as a diagnostic tool to detect disease-causing genetic elements (e.g., via cleavage of reverse-transcribed viral RNA or of an amplified DNA sequence encoding a disease-causing mutation); as deactivated enzymes in combination with a probe to target and detect a specific nucleotide sequence (e.g., a sequence encoding antibiotic resistance in bacteria); to render viruses inactive or incapable of infecting host cells by targeting viral genomes; to add genes or amend metabolic pathways in order to engineer organisms to produce valuable small molecules, macromolecules, or secondary metabolites; to establish a gene drive element for evolutionary selection; or as a biosensor to detect cell perturbations by foreign small molecules and nucleotides.

While preferred embodiments of the present invention have been shown and described herein, it will be apparent to those skilled in the art that such embodiments are provided by way of example only. It is not intended that the invention be limited by the specific examples provided within the specification. While the invention has been described with reference to the foregoing specification, the descriptions and illustrations of the embodiments herein are not meant to be construed in a limiting sense. Numerous variations, changes, and substitutions will occur to those skilled in the art without departing from the invention. Furthermore, it shall be understood that all aspects of the invention are not limited to the specific depictions, configurations, or relative proportions set forth herein, which depend upon a variety of conditions and variables. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is therefore contemplated that the invention shall also cover any such alternatives, modifications, variations, or equivalents. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

Claims

An engineered nuclease system, comprising:
(a) an endonuclease comprising a RuvC_III domain and an HNH domain, wherein the endonuclease is derived from an uncultivated microorganism, and wherein the endonuclease is a class 2, type II Cas endonuclease; and
(b) an engineered guide ribonucleic acid structure configured to form a complex with the endonuclease, the engineered guide ribonucleic acid structure comprising:
(i) a guide ribonucleic acid sequence configured to hybridize to a target deoxyribonucleic acid sequence; and
(ii) a tracr ribonucleic acid sequence configured to bind to the endonuclease.

The engineered nuclease system of claim 1, wherein the RuvC_III domain comprises a sequence having at least 70%, at least 75%, at least 80%, or at least 90% sequence identity to any one of SEQ ID NOs: 1827-3637.

An engineered nuclease system, comprising:
(a) an endonuclease comprising a RuvC_III domain having at least 75% sequence identity to any one of SEQ ID NOs: 1827-3637; and
(b) an engineered guide ribonucleic acid structure configured to form a complex with the endonuclease, the engineered guide ribonucleic acid structure comprising:
(i) a guide ribonucleic acid sequence configured to hybridize to a target deoxyribonucleic acid sequence; and
(ii) a tracr ribonucleic acid sequence configured to bind to the endonuclease.

An engineered nuclease system, comprising:
(a) an endonuclease configured to bind to a protospacer adjacent motif (PAM) sequence comprising SEQ ID NOs: 5512-5537, wherein the endonuclease is a class 2, type II Cas endonuclease; and
(b) an engineered guide ribonucleic acid structure configured to form a complex with the endonuclease, the engineered guide ribonucleic acid structure comprising:
(i) a guide ribonucleic acid sequence configured to hybridize to a target deoxyribonucleic acid sequence; and
(ii) a tracr ribonucleic acid sequence configured to bind to the endonuclease.

The engineered nuclease system of claim 4, wherein the endonuclease is derived from an uncultivated microorganism.

The engineered nuclease system of any one of claims 4-5, wherein the endonuclease has not been engineered to bind to a different PAM sequence.

The engineered nuclease system of claim 4, wherein the endonuclease is not a Cas9 endonuclease, a Cas14 endonuclease, a Cas12a endonuclease, a Cas12b endonuclease, a Cas12c endonuclease, a Cas12d endonuclease, a Cas12e endonuclease, a Cas13a endonuclease, a Cas13b endonuclease, a Cas13c endonuclease, or a Cas13d endonuclease.

The engineered nuclease system of claim 4, wherein the endonuclease has less than 80% identity to Cas9 endonuclease.

The engineered nuclease system of any one of claims 3-8, wherein the endonuclease further comprises an HNH domain.

The engineered nuclease system of any one of claims 1-9, wherein the tracr ribonucleic acid sequence comprises a sequence having at least 80% sequence identity to about 60-90 consecutive nucleotides selected from any one of SEQ ID NOs: 5476-5511 and SEQ ID NO: 5538.

An engineered nuclease system, comprising:
(a) an engineered guide ribonucleic acid structure comprising:
(i) a guide ribonucleic acid sequence configured to hybridize to a target deoxyribonucleic acid sequence; and
(ii) a tracr ribonucleic acid sequence configured to bind to an endonuclease, wherein the tracr ribonucleic acid sequence comprises a sequence having at least 80% sequence identity to about 60-90 consecutive nucleotides selected from any one of SEQ ID NOs: 5476-5511 and SEQ ID NO: 5538; and
(b) a class 2, type II Cas endonuclease configured to bind to the engineered guide ribonucleic acid.

The engineered nuclease system of any one of claims 1-3 or 11, wherein the endonuclease is configured to bind to a protospacer adjacent motif (PAM) sequence selected from the group comprising SEQ ID NOs: 5512-5537.

The engineered nuclease system of any one of claims 1-11, wherein the engineered guide ribonucleic acid structure comprises at least two ribonucleic acid polynucleotides.

The engineered nuclease system of any one of claims 1-11, wherein the engineered guide ribonucleic acid structure comprises one ribonucleic acid polynucleotide comprising said guide ribonucleic acid sequence and said tracr ribonucleic acid sequence.

The engineered nuclease system of any one of claims 1-14, wherein the guide ribonucleic acid sequence is complementary to a prokaryotic, bacterial, archaeal, eukaryotic, fungal, plant, mammalian, or human genomic sequence.

The engineered nuclease system according to any one of claims 1-15, wherein the guide ribonucleic acid sequence is 15 to 24 nucleotides in length.

The engineered nuclease system of any one of claims 1-16, wherein the endonuclease comprises one or more nuclear localization sequences (NLSs) proximal to an N-terminus or C-terminus of the endonuclease.

The engineered nuclease system of any one of claims 1-17, wherein the NLS comprises a sequence selected from SEQ ID NO: 5597-5612.

The engineered nuclease system of any one of claims 1-18, further comprising a single-stranded or double-stranded DNA repair template comprising, from 5' to 3': a first homology arm comprising a sequence of at least 20 nucleotides 5' to the target deoxyribonucleic acid sequence, a synthetic DNA sequence of at least 10 nucleotides, and a second homology arm comprising a sequence of at least 20 nucleotides 3' to the target deoxyribonucleic acid sequence.

The engineered nuclease system of claim 19, wherein the first homology arm or the second homology arm comprises a sequence of at least 40, 80, 120, 150, 200, 300, 500, or 1,000 nucleotides.

The engineered nuclease system according to any one of claims 1-20, wherein the engineered nuclease system further comprises a source of Mg ²⁺ .

The engineered nuclease system according to any one of claims 1-21, wherein the endonuclease and the tracr ribonucleic acid sequence are derived from distinct bacterial species within the same phylum.

The engineered nuclease system according to any one of claims 1-22, wherein the endonuclease is derived from a bacterium belonging to the genus Dermabacter.

The engineered nuclease system according to any one of claims 1-22, wherein the endonuclease is derived from a bacterium belonging to the phylum Verrucomicrobiota, the phylum Candidatus Peregrinibacteria, or the phylum Candidatus Melainabacteria.

The engineered nuclease system of any one of claims 1-22, wherein the endonuclease is derived from a bacterium comprising a 16S rRNA gene having at least 90% identity to any one of SEQ ID NOs: 5592-5595.

The engineered nuclease system of any one of claims 1-25, wherein the HNH domain comprises a sequence having at least 70% or at least 80% identity to any one of SEQ ID NOs: 5638-5460.

The engineered nuclease system of any one of claims 1-26, wherein the endonuclease comprises SEQ ID NO: 1-1826 or a variant thereof having at least 55% identity to it.

The engineered nuclease system of any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 1827-1830 or SEQ ID NOs: 1827-2140.

The engineered nuclease system of any one of claims 1-28, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 3638-3461 or SEQ ID NOs: 3638-3954.

The engineered nuclease system of any one of claims 1-29, wherein the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5615-5632.

The engineered nuclease system of any one of claims 1-30, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 1-4 or SEQ ID NOs: 1-319.

The engineered nuclease system of any one of claims 1-31, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 5461-5464, SEQ ID NOs: 5476-5479, or SEQ ID NOs: 5476-5489.

The engineered nuclease system of any one of claims 1-32, wherein the guide RNA structure comprises an RNA sequence predicted to comprise a hairpin consisting of a stem and a loop, wherein the stem comprises at least 10, at least 12, or at least 14 base-paired ribonucleotides and an asymmetric bulge within four base pairs of the loop.

The engineered nuclease system of any one of claims 1-33, wherein the endonuclease is configured to bind to a PAM comprising a sequence selected from the group consisting of SEQ ID NOs: 5512-5515 or SEQ ID NOs: 5527-5530.

The engineered nuclease system of any one of claims 1-34, wherein:
a) the endonuclease comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 1827;
b) the guide RNA structure comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 5461 or SEQ ID NO: 5476; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5512 or SEQ ID NO: 5527.

The engineered nuclease system of any one of claims 1-34, wherein:
a) the endonuclease comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 1828;
b) the guide RNA structure comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 5462 or SEQ ID NO: 5477; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5513 or SEQ ID NO: 5528.

The engineered nuclease system of any one of claims 1-34, wherein:
a) the endonuclease comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 1829;
b) the guide RNA structure comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 5436 or SEQ ID NO: 5478; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5514 or SEQ ID NO: 5259.

The engineered nuclease system of any one of claims 1-34, wherein:
a) the endonuclease comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to SEQ ID NO: 1830;
b) the guide RNA structure comprises a sequence that is at least 70%, at least 80%, or at least 90% identical to at least one of SEQ ID NO: 5464 or SEQ ID NO: 5479; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5515 or SEQ ID NO: 5530.

The engineered nuclease system of any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 2141-2142 or SEQ ID NOs: 2141-2241.

The engineered nuclease system of any one of claims 1-27 or 39, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 3955-3965 or SEQ ID NOs: 3955-4055.

The engineered nuclease system of any one of claims 1-27 or 39-40, wherein the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5632-5638.

The engineered nuclease system of any one of claims 1-27 or 39-41, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 320-321 or SEQ ID NOs: 320-420.

The engineered nuclease system of any one of claims 1-27 or 39-42, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5465, SEQ ID NOs: 5490-5491, or SEQ ID NOs: 5490-5494.

The engineered nuclease system of any one of claims 1-27 or 39-43, wherein the guide RNA structure comprises a tracr ribonucleic acid sequence comprising a hairpin comprising at least 8, at least 10, or at least 12 base-paired ribonucleotides.

13. Manipulated nuclease system.

The engineered nuclease system of any one of claims 1-27 or 39-45, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2141;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5490; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5531.

The engineered nuclease system of any one of claims 1-27 or 39-45, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2142;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5465 or SEQ ID NO: 5491; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5516.

The engineered nuclease system of any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 2245-2246.

The engineered nuclease system of any one of claims 1-27 or 48, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 4059-4060.

The engineered nuclease system of any one of claims 1-27 or 48-49, wherein the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5639-5648.

The engineered nuclease system of any one of claims 1-27 or 48-50, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 424-425.

The engineered nuclease system of any one of claims 1-27 or 48-51, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 5498-5499 and SEQ ID NO: 5539.

The engineered nuclease system of any one of claims 1-27 or 48-52, wherein the guide RNA structure comprises a guide ribonucleic acid sequence predicted to comprise a hairpin having an uninterrupted base-paired region comprising at least 8 nucleotides of the guide ribonucleic acid sequence and at least 8 nucleotides of the tracr ribonucleic acid sequence, and wherein the tracr ribonucleic acid sequence comprises, from 5' to 3', a first hairpin and a second hairpin, the first hairpin having a longer stem than the second hairpin.

The engineered nuclease system of any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 2242-2244 or SEQ ID NOs: 2247-2249.

The engineered nuclease system of any one of claims 1-27 or 54, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 4056-4058 and SEQ ID NOs: 4061-4063.

The engineered nuclease system of any one of claims 1-27 or 54-55, wherein the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5639-5648.

The engineered nuclease system of any one of claims 1-27 or 54-56, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 421-423 or SEQ ID NOs: 426-428.

The engineered nuclease system of any one of claims 1-27 or 54-57, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 5466-5467, SEQ ID NOs: 5495-5497, SEQ ID NOs: 5500-5502, and SEQ ID NO: 5539.

The engineered nuclease system of any one of claims 1-27 or 54-58, wherein the guide RNA structure comprises a guide ribonucleic acid sequence predicted to comprise a hairpin having an uninterrupted base-paired region comprising at least 8 nucleotides of the guide ribonucleic acid sequence and at least 8 nucleotides of the tracr ribonucleic acid sequence, and wherein the tracr ribonucleic acid sequence comprises, from 5' to 3', a first hairpin and a second hairpin, the first hairpin having a longer stem than the second hairpin.

The engineered nuclease system of any one of claims 1-27 or 54-59, wherein the endonuclease is configured to bind to a PAM comprising a sequence selected from the group consisting of SEQ ID NOs: 5517-5518 or SEQ ID NOs: 5532-5534.

The engineered nuclease system of any one of claims 1-27 or 54-60, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2247;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5500; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5517 or SEQ ID NO: 5532.

The engineered nuclease system of any one of claims 1-27 or 54-60, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2248;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5501; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5518 or SEQ ID NO: 5533.

The engineered nuclease system of any one of claims 1-27 or 54-60, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2249;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5502; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5534.

The endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 2253 or SEQ ID NO: 2253-2481. The engineered nuclease system according to any one.

The engineered nuclease system of any one of claims 1-27 or 64, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 4067 or SEQ ID NOs: 4067-4295.

The engineered nuclease system of any one of claims 1-27 or 64-65, wherein the endonuclease comprises a peptide motif of SEQ ID NO: 5649.

The engineered nuclease system of any one of claims 1-27 or 64-66, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 432 or SEQ ID NOs: 432-660.

The engineered nuclease system of any one of claims 1-27 or 64-67, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5468 or SEQ ID NO: 5503.

The engineered nuclease system of any one of claims 1-27 or 64-68, wherein the endonuclease is configured to bind to a PAM comprising a sequence selected from the group consisting of SEQ ID NO: 5519.

The engineered nuclease system of any one of claims 1-27 or 64-69, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2253;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5468 or SEQ ID NO: 5503; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5519.

The engineered nuclease system of any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 2482-2489.

The engineered nuclease system of any one of claims 1-27 or 71, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 4296-4303.

The engineered nuclease system of any one of claims 1-27 or 71-72, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 661-668.

The engineered nuclease system of any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 2490-2498.

The engineered nuclease system of any one of claims 1-27 or 74, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 4304-4321.

The engineered nuclease system of any one of claims 1-27 or 74-75, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 669-677.

The engineered nuclease system of any one of claims 1-27 or 74-76, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5504.

The engineered nuclease system of any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 2499 or SEQ ID NOs: 2499-2750.

The engineered nuclease system of any one of claims 1-27 or 78, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 4313 or SEQ ID NOs: 4313-4564.

The engineered nuclease system of any one of claims 1-27 or 78-79, wherein the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5650-5667.

The engineered nuclease system of any one of claims 1-27 or 78-80, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 678 or SEQ ID NOs: 678-929.

The engineered nuclease system of any one of claims 1-27 or 78-81, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5469 or SEQ ID NO: 5505.

The engineered nuclease system of any one of claims 1-27 or 78-82, wherein the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5520 or SEQ ID NO: 5535.

The engineered nuclease system of any one of claims 1-27 or 78-83, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2499;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5469 or SEQ ID NO: 5505; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5520 or SEQ ID NO: 5535.

The engineered nuclease system of any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 2751 or SEQ ID NOs: 2751-2913.

The engineered nuclease system of any one of claims 1-27 or 85, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 4565 or SEQ ID NOs: 4565-4727.

The engineered nuclease system of any one of claims 1-27 or 85-86, wherein the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NOs: 5668-5678.

The engineered nuclease system of any one of claims 1-27 or 85-87, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 930 or SEQ ID NOs: 930-1092.

The engineered nuclease system of any one of claims 1-27 or 85-88, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5470 or SEQ ID NO: 5506.

The engineered nuclease system of any one of claims 1-27 or 85-89, wherein the endonuclease is configured to bind to a PAM comprising a sequence selected from the group consisting of SEQ ID NO: 5521 or SEQ ID NO: 5536.

The engineered nuclease system of any one of claims 1-27 or 85-90, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2751;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5470 or SEQ ID NO: 5506; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5521 or SEQ ID NO: 5536.

The engineered nuclease system of any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 2914 or SEQ ID NOs: 2914-3174.

The engineered nuclease system of any one of claims 1-27 or 92, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 4728 or SEQ ID NOs: 4728-4988.

The engineered nuclease system of any one of claims 1-27 or 92-93, wherein the endonuclease comprises at least one, at least two, or at least three peptide motifs selected from the group consisting of SEQ ID NOs: 5676-5678.

The engineered nuclease system of any one of claims 1-27 or 92-94, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 1093 or SEQ ID NOs: 1093-1353.

The engineered nuclease system of any one of claims 1-27 or 92-95, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5471, SEQ ID NO: 5507, and SEQ ID NOs: 5540-5542.

13. Manipulated nuclease system.

The engineered nuclease system of any one of claims 1-27 or 92-97, wherein the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5522.

The engineered nuclease system of any one of claims 1-27 or 92-98, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 2914;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5471 or SEQ ID NO: 5507; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5522.

The engineered nuclease system according to any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 3175 or SEQ ID NO: 3175-3330.

The engineered nuclease system according to any one of claims 1-27 or 100, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 4989 or SEQ ID NO: 4989-5146.

The engineered nuclease system according to any one of claims 1-27 or 100-101, wherein the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NO: 5679-5686.

The engineered nuclease system according to any one of claims 1-27 or 100-102, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 1354 or SEQ ID NO: 1354-1511.

The engineered nuclease system according to any one of claims 1-27 or 100-103, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5472 or SEQ ID NO: 5508.

The engineered nuclease system according to any one of claims 1-27 or 100-104, wherein the endonuclease is configured to bind to a PAM comprising a sequence selected from the group consisting of SEQ ID NO: 5523 or SEQ ID NO: 5537.

The engineered nuclease system of any one of claims 1-27 or 100-105, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 3175;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5472 or SEQ ID NO: 5508; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5523 or SEQ ID NO: 5537.

The engineered nuclease system according to any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 3331 or SEQ ID NO: 3331-3474.

The engineered nuclease system according to any one of claims 1-27 or 107, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5147 or SEQ ID NO: 5147-5290.

The engineered nuclease system according to any one of claims 1-27 or 107-108, wherein the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NO: 5674-5675 and SEQ ID NO: 5687-5693.

The engineered nuclease system according to any one of claims 1-27 or 107-109, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 1512 or SEQ ID NO: 1512-1655.

The engineered nuclease system according to any one of claims 1-27 or 107-110, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5473 or SEQ ID NO: 5509.

The engineered nuclease system of any one of claims 1-27 or 107-111, wherein the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5524.

The engineered nuclease system of any one of claims 1-27 or 107-112, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 3331;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5473 or SEQ ID NO: 5509; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5524.

The engineered nuclease system according to any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 3475 or SEQ ID NO: 3475-3568.

The engineered nuclease system according to any one of claims 1-27 or 114, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5291 or SEQ ID NO: 5291-5389.

The engineered nuclease system according to any one of claims 1-27 or 114-115, wherein the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NO: 5694-569.

The engineered nuclease system according to any one of claims 1-27 or 114-116, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 1656 or SEQ ID NO: 1656-1755.

The engineered nuclease system according to any one of claims 1-27 or 114-117, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5474 or SEQ ID NO: 5510.

The engineered nuclease system of any one of claims 1-27 or 114-118, wherein the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5525.

The engineered nuclease system of any one of claims 1-27 or 114-119, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 3475;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5474 or SEQ ID NO: 5510; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5525.

The engineered nuclease system according to any one of claims 1-27, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 3569 or SEQ ID NO: 3569-3637.

The engineered nuclease system according to any one of claims 1-27 or 121, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 5390 or SEQ ID NO: 5390-5460.

The engineered nuclease system according to any one of claims 1-27 or 121-122, wherein the endonuclease comprises at least one, at least two, at least three, at least four, or at least five peptide motifs selected from the group consisting of SEQ ID NO: 5700-5717.

The engineered nuclease system according to any one of claims 1-27 or 121-123, wherein the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to a sequence selected from the group consisting of SEQ ID NO: 1756 or SEQ ID NO: 1756-1826.

The engineered nuclease system according to any one of claims 1-27 or 121-124, wherein the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5475 or SEQ ID NO: 5511.

The engineered nuclease system of any one of claims 1-27 or 121-125, wherein the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5526.

The engineered nuclease system of any one of claims 1-27 or 121-126, wherein:
a) the endonuclease comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 3569;
b) the guide RNA structure comprises a sequence that is at least 70%, 80%, or 90% identical to SEQ ID NO: 5475 or SEQ ID NO: 5511; and
c) the endonuclease is configured to bind to a PAM comprising SEQ ID NO: 5526.

The engineered nuclease system of any one of claims 1-127, wherein the sequence identity is determined by BLASTP, CLUSTALW, MUSCLE, MAFFT, or the Smith-Waterman homology search algorithm.
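For illustration only, the Smith-Waterman option named above can be sketched as a local alignment followed by a percent-identity calculation. The scoring values (match 2, mismatch -1, linear gap -2), the function name, and the choice to report identity as identical columns divided by aligned columns are assumptions made for this sketch; the claims do not fix them, and production tools typically use a substitution matrix and affine gap costs.

```python
def smith_waterman_identity(a: str, b: str, match: int = 2,
                            mismatch: int = -1, gap: int = -2) -> float:
    """Percent identity over the best-scoring local alignment of a and b.

    Hypothetical helper: simple match/mismatch scoring with a linear gap
    penalty, not the scoring scheme of any particular tool.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; local alignment floors scores at 0.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            score[i][j] = cell
            if cell > best:
                best, best_pos = cell, (i, j)

    # Trace back from the best cell, counting aligned and identical columns.
    i, j = best_pos
    identical = aligned = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        aligned += 1
    return 100.0 * identical / aligned if aligned else 0.0


def meets_threshold(a: str, b: str, threshold: float) -> bool:
    """True if the local-alignment identity is at least `threshold` percent."""
    return smith_waterman_identity(a, b) >= threshold
```

With this helper, a candidate sequence could be screened against the 70%, 80%, and 90% thresholds recited in the claims by calling `meets_threshold(candidate, reference, 70.0)` and so on; the toy scoring here would need to be replaced before drawing any real conclusion about a sequence.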

The engineered nuclease system of claim 128, wherein the sequence identity is determined by the BLASTP homology search algorithm using parameters of a wordlength (W) of 3, an expectation (E) of 10, and a BLOSUM62 scoring matrix setting gap costs at existence of 11 and extension of 1, and using a conditional compositional score matrix adjustment.
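As a hedged illustration, the BLASTP parameters recited above map onto command-line options of the NCBI BLAST+ `blastp` program roughly as follows. The function name, the file paths, and the tabular output columns are assumptions made for this sketch; the snippet only assembles the argument list and does not run BLAST.

```python
def blastp_command(query_fasta: str, subject_fasta: str) -> list[str]:
    """Build a blastp argument list using the parameters recited in the claim.

    Hypothetical helper; the paths are placeholders supplied by the caller.
    """
    return [
        "blastp",
        "-query", query_fasta,
        "-subject", subject_fasta,
        "-word_size", "3",          # wordlength (W) of 3
        "-evalue", "10",            # expectation (E) of 10
        "-matrix", "BLOSUM62",      # BLOSUM62 scoring matrix
        "-gapopen", "11",           # gap existence cost of 11
        "-gapextend", "1",          # gap extension cost of 1
        "-comp_based_stats", "2",   # conditional compositional score matrix adjustment
        # Tabular output; `pident` is the percent-identity column (assumed choice).
        "-outfmt", "6 qseqid sseqid pident length evalue bitscore",
    ]
```

The resulting list could be passed to `subprocess.run(...)` on a machine where BLAST+ is installed; the `pident` column would then be compared against the 70%, 80%, or 90% thresholds.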

An engineered guide ribonucleic acid polynucleotide comprising:
a) a DNA-targeting segment comprising a nucleotide sequence that is complementary to a target sequence in a target DNA molecule; and
b) a protein-binding segment comprising two complementary stretches of nucleotides that hybridize to form a double-stranded RNA (dsRNA) duplex,
wherein the two complementary stretches of nucleotides are covalently linked to one another by intervening nucleotides, and
wherein the engineered guide ribonucleic acid polynucleotide is configured to form a complex with an endonuclease comprising a RuvC_III domain having at least 75% sequence identity to any one of SEQ ID NOs: 1827-3637 and to target the complex to the target sequence of the target DNA molecule.

The engineered guide ribonucleic acid polynucleotide according to claim 130, wherein the DNA-targeting segment is positioned 5' of both of the two complementary stretches of nucleotides.

The engineered guide ribonucleic acid polynucleotide according to any of claims 130-131, wherein:
a) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to a sequence selected from the group consisting of SEQ ID NO: 5476-5479 or SEQ ID NO: 5476-5489;
b) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to a sequence selected from the group consisting of (SEQ ID NO: 5490-5491 or SEQ ID NO: 5490-5494) and SEQ ID NO: 5538;
c) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to a sequence selected from the group consisting of SEQ ID NO: 5498-5499;
d) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to a sequence selected from the group consisting of SEQ ID NO: 5495-5497 and SEQ ID NO: 5500-5502;
e) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5503;
f) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5504;
g) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5505;
h) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5506;
i) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5507;
j) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5508;
k) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5509;
l) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5510; or
m) the protein-binding segment comprises a sequence having at least 70%, at least 80%, or at least 90% identity to SEQ ID NO: 5511.

The engineered guide ribonucleic acid polynucleotide according to any of claims 130-132, wherein:
a) the guide ribonucleic acid polynucleotide comprises an RNA sequence comprising a hairpin comprising a stem and a loop, wherein the stem comprises at least 10, at least 12, or at least 14 base-paired ribonucleotides and an asymmetric bulge within 4 base pairs of the loop;
b) the guide ribonucleic acid polynucleotide comprises a tracr ribonucleic acid sequence predicted to comprise a hairpin comprising at least 8, at least 10, or at least 12 base-paired ribonucleotides;
c) the guide ribonucleic acid polynucleotide is predicted to comprise a hairpin having an uninterrupted base-paired region comprising at least 8 nucleotides of a guide ribonucleic acid sequence and at least 8 nucleotides of a tracr ribonucleic acid sequence, wherein the tracr ribonucleic acid sequence comprises, from 5' to 3', a first hairpin and a second hairpin, and wherein the first hairpin has a longer stem than the second hairpin; or
d) the guide ribonucleic acid polynucleotide comprises a tracr ribonucleic acid sequence predicted to comprise at least two hairpins comprising less than five base-paired ribonucleotides.

A deoxyribonucleic acid polynucleotide encoding the engineered guide ribonucleic acid polynucleotide according to any one of claims 130-133.

A nucleic acid comprising an engineered nucleic acid sequence optimized for expression in an organism, wherein the nucleic acid encodes a class 2 type II Cas endonuclease comprising a RuvC_III domain and an HNH domain, and wherein the class 2 type II Cas endonuclease is derived from an uncultivated microorganism.

A nucleic acid comprising an engineered nucleic acid sequence optimized for expression in an organism, wherein the nucleic acid encodes an endonuclease comprising a RuvC_III domain having at least 70% sequence identity to any one of SEQ ID NOs: 1827-3637.

13. Nucleic acid.

The nucleic acid according to any one of claims 135-137, wherein the endonuclease comprises SEQ ID NO: 5571-5591 or a variant thereof having at least 70% sequence identity thereto.

The nucleic acid according to any one of claims 135-138, wherein the endonuclease comprises a sequence encoding one or more nuclear localization sequences (NLSs) proximal to the N-terminus or C-terminus of the endonuclease.

The nucleic acid of claim 139, wherein the NLS comprises a sequence selected from SEQ ID NO: 5597-5612.

The nucleic acid according to any one of claims 135-140, wherein the organism is a prokaryote, a bacterium, a eukaryote, a fungus, a plant, a mammal, a rodent, or a human.

The nucleic acid of claim 141, wherein the organism is E. coli, and wherein:
a) the nucleic acid sequence has at least 70%, 80%, or 90% identity to a sequence selected from the group consisting of SEQ ID NO: 5571-5575;
b) the nucleic acid sequence has at least 70%, 80%, or 90% identity to a sequence selected from the group consisting of SEQ ID NO: 5576-5571;
c) the nucleic acid sequence has at least 70%, 80%, or 90% identity to a sequence selected from the group consisting of SEQ ID NO: 5578-5580;
d) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5581;
e) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5582;
f) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5583;
g) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5584;
h) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5585;
i) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5586; or
j) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5587.

The nucleic acid of claim 141, wherein the organism is human, and wherein:
a) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5588 or SEQ ID NO: 5589; or
b) the nucleic acid sequence has at least 70%, 80%, or 90% identity to SEQ ID NO: 5590 or SEQ ID NO: 5591.

A vector comprising a nucleic acid sequence encoding a class 2 type II Cas endonuclease comprising a RuvC_III domain and an HNH domain, wherein the class 2 type II Cas endonuclease is derived from an uncultivated microorganism.

A vector comprising the nucleic acid according to any one of claims 135-143.

The vector according to any one of claims 144-145, further comprising a nucleic acid encoding an engineered guide ribonucleic acid structure configured to form a complex with the class 2 type II Cas endonuclease, wherein the engineered guide ribonucleic acid structure comprises:
a) a guide ribonucleic acid sequence configured to hybridize to a target deoxyribonucleic acid sequence; and
b) a tracr ribonucleic acid sequence configured to bind to the class 2 type II Cas endonuclease.

The vector according to any one of claims 144-146, wherein the vector is a plasmid, a minicircle, a CELiD, an adeno-associated virus (AAV)-derived virion, or a lentivirus.

A cell comprising the vector according to any one of claims 144-147.

A method for producing an endonuclease, wherein the method comprises the step of culturing the cells of claim 146.

A method for binding, cleaving, labeling, or modifying a double-stranded deoxyribonucleic acid polynucleotide, the method comprising:
(A) contacting the double-stranded deoxyribonucleic acid polynucleotide with a class 2 type II Cas endonuclease in complex with an engineered guide ribonucleic acid structure configured to bind to the class 2 type II Cas endonuclease and to the double-stranded deoxyribonucleic acid polynucleotide,
(B) wherein the double-stranded deoxyribonucleic acid polynucleotide comprises a protospacer adjacent motif (PAM), and
(C) wherein the PAM comprises a sequence selected from the group consisting of SEQ ID NO: 5512-5526 or SEQ ID NO: 5527-5537.

The method of claim 149, wherein the double-stranded deoxyribonucleic acid polynucleotide comprises a first strand comprising a sequence complementary to a sequence of the engineered guide ribonucleic acid structure and a second strand comprising the PAM.

The method of claim 151, wherein the PAM is directly adjacent to the 3' end of the sequence complementary to the sequence of the engineered guide ribonucleic acid structure.
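Purely as an illustrative sketch, the "directly adjacent" relationship can be checked computationally by scanning one strand of a DNA sequence for a targeted stretch that is immediately followed, at its 3' end, by a motif matching a degenerate PAM. The function names, the example PAM `NGG`, and the simplification of working on a single strand written 5' to 3' are all assumptions for this example; the PAMs of the claims are defined by SEQ ID NOs that are not reproduced in this section.

```python
# IUPAC degenerate nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pam_matches(site: str, pam: str) -> bool:
    """True if `site` (plain bases) satisfies the degenerate `pam` pattern."""
    return len(site) == len(pam) and all(
        base in IUPAC[code] for base, code in zip(site, pam)
    )


def find_pam_adjacent_sites(strand: str, target: str, pam: str) -> list[int]:
    """Return 0-based start positions where `target` occurs in `strand`
    (written 5' to 3') with a PAM match immediately 3' of it.

    Hypothetical helper for illustration; exact string matching only.
    """
    hits = []
    start = strand.find(target)
    while start != -1:
        pam_start = start + len(target)
        if pam_matches(strand[pam_start:pam_start + len(pam)], pam):
            hits.append(start)
        start = strand.find(target, start + 1)
    return hits
```

For example, `find_pam_adjacent_sites(strand, target, "NGG")` reports only those occurrences of `target` whose next three bases fit the placeholder pattern; occurrences followed by any other bases, or truncated at the end of the strand, are skipped.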

The method of any one of claims 149-152, wherein the class 2 type II Cas endonuclease is not a Cas9 endonuclease, a Cas14 endonuclease, a Cas12a endonuclease, a Cas12b endonuclease, a Cas12c endonuclease, a Cas12d endonuclease, a Cas12e endonuclease, a Cas13a endonuclease, a Cas13b endonuclease, a Cas13c endonuclease, or a Cas13d endonuclease.

The method according to any one of claims 149-153, wherein the class 2 type II Cas endonuclease is derived from an uncultivated microorganism.

The method according to any one of claims 149-154, wherein the double-stranded deoxyribonucleic acid polynucleotide is a eukaryotic, plant, fungal, mammalian, rodent, or human double-stranded deoxyribonucleic acid polynucleotide.

The method according to any one of claims 149-155, wherein:
a) the PAM comprises a sequence selected from the group consisting of SEQ ID NO: 5512-5515 and SEQ ID NO: 5527-5530;
b) the PAM comprises SEQ ID NO: 5516 or SEQ ID NO: 5531;
c) the PAM comprises SEQ ID NO: 5539;
d) the PAM comprises SEQ ID NO: 5517 or SEQ ID NO: 5518;
e) the PAM comprises SEQ ID NO: 5519;
f) the PAM comprises SEQ ID NO: 5520 or SEQ ID NO: 5535;
g) the PAM comprises SEQ ID NO: 5521 or SEQ ID NO: 5536;
h) the PAM comprises SEQ ID NO: 5522;
i) the PAM comprises SEQ ID NO: 5523 or SEQ ID NO: 5537;
j) the PAM comprises SEQ ID NO: 5524;
k) the PAM comprises SEQ ID NO: 5525; or
l) the PAM comprises SEQ ID NO: 5526.

A method of modifying a target nucleic acid locus, the method comprising delivering to the target nucleic acid locus the engineered nuclease system according to any one of claims 1-129, wherein the endonuclease is configured to form a complex with the engineered guide ribonucleic acid structure, and wherein the complex is configured such that, upon binding of the complex to the target nucleic acid locus, the complex modifies the target nucleic acid locus.

The method of claim 156, wherein modifying the target nucleic acid locus comprises binding, nicking, cleaving, or labeling the target nucleic acid locus.

The method of any of claims 156-158, wherein the target nucleic acid locus comprises deoxyribonucleic acid (DNA) or ribonucleic acid (RNA).

The method of claim 159, wherein the target nucleic acid comprises genomic DNA, viral DNA, viral RNA, or bacterial DNA.

The method of any one of claims 156-160, wherein the target nucleic acid locus is in vitro.

The method according to any one of claims 156-160, wherein the target nucleic acid locus is intracellular.

The method of claim 162, wherein the cell is a prokaryotic cell, a bacterial cell, a eukaryotic cell, a fungal cell, a plant cell, an animal cell, a mammalian cell, a rodent cell, a primate cell, or a human cell.

The method according to any one of claims 162-163, wherein delivering the engineered nuclease system to the target nucleic acid locus comprises delivering the nucleic acid of any of claims 135-140 or the vector of any of claims 142-146.

162-1. Method.

The method of claim 164, wherein the nucleic acid comprises a promoter to which the open reading frame encoding the endonuclease is operably linked.

The method according to any one of claims 162-163, wherein delivering the engineered nuclease system to the target nucleic acid locus comprises delivering a capped mRNA comprising the open reading frame encoding the endonuclease.

The method of any one of claims 162-163, wherein the step of delivering the engineered nuclease system to the target nucleic acid locus comprises delivering the translated polypeptide.

The method of any one of claims 162-163, wherein delivering the engineered nuclease system to the target nucleic acid locus comprises delivering a deoxyribonucleic acid (DNA) encoding the engineered guide ribonucleic acid structure operably linked to a ribonucleic acid (RNA) pol III promoter.

The method of any one of claims 156-169, wherein the endonuclease causes single-strand breaks or double-strand breaks at or proximal to the target nucleic acid locus.