KR102572449B1 - Further improved AAV vectors produced in insect cells - Google Patents
- Publication number
- KR102572449B1 KR1020167026098A KR20167026098A
- Authority
- KR
- South Korea
- Prior art keywords
- gly
- pro
- asn
- ser
- thr
- Prior art date
Links
- 241000238631 Hexapoda Species 0.000 title claims abstract description 126
- 239000013598 vector Substances 0.000 title claims description 113
- 230000001976 improved effect Effects 0.000 title description 6
- 239000002773 nucleotide Substances 0.000 claims abstract description 163
- 125000003729 nucleotide group Chemical group 0.000 claims abstract description 163
- 108090000565 Capsid Proteins Proteins 0.000 claims abstract description 106
- 102100023321 Ceruloplasmin Human genes 0.000 claims abstract description 106
- 108091081024 Start codon Proteins 0.000 claims abstract description 76
- 108020004705 Codon Proteins 0.000 claims abstract description 56
- 238000004519 manufacturing process Methods 0.000 claims abstract description 54
- 230000014621 translational initiation Effects 0.000 claims abstract description 52
- 125000000539 amino acid group Chemical group 0.000 claims abstract description 51
- 101100524324 Adeno-associated virus 2 (isolate Srivastava/1982) Rep78 gene Proteins 0.000 claims abstract description 28
- 101100524319 Adeno-associated virus 2 (isolate Srivastava/1982) Rep52 gene Proteins 0.000 claims abstract description 24
- 235000004279 alanine Nutrition 0.000 claims abstract description 23
- 108091026890 Coding region Proteins 0.000 claims abstract description 21
- 101100524321 Adeno-associated virus 2 (isolate Srivastava/1982) Rep68 gene Proteins 0.000 claims abstract description 17
- 241000702421 Dependoparvovirus Species 0.000 claims abstract description 16
- 101100524317 Adeno-associated virus 2 (isolate Srivastava/1982) Rep40 gene Proteins 0.000 claims abstract description 13
- 210000004027 cell Anatomy 0.000 claims description 162
- 108090000623 proteins and genes Proteins 0.000 claims description 100
- 150000007523 nucleic acids Chemical class 0.000 claims description 65
- 241001634120 Adeno-associated virus - 5 Species 0.000 claims description 60
- 108020004707 nucleic acids Proteins 0.000 claims description 57
- 102000039446 nucleic acids Human genes 0.000 claims description 57
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 38
- 210000002845 virion Anatomy 0.000 claims description 32
- 241000701447 unidentified baculovirus Species 0.000 claims description 31
- 108700026244 Open Reading Frames Proteins 0.000 claims description 26
- 210000004962 mammalian cell Anatomy 0.000 claims description 26
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 claims description 16
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 16
- 102100022641 Coagulation factor IX Human genes 0.000 claims description 9
- 108010076282 Factor IX Proteins 0.000 claims description 8
- 229960004222 factor ix Drugs 0.000 claims description 7
- 108010054218 Factor VIII Proteins 0.000 claims description 6
- 102000001690 Factor VIII Human genes 0.000 claims description 6
- 238000012258 culturing Methods 0.000 claims description 4
- 101100507655 Canis lupus familiaris HSPA1 gene Proteins 0.000 claims description 3
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 abstract description 31
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 abstract description 28
- 238000013519 translation Methods 0.000 abstract description 16
- 239000004471 Glycine Substances 0.000 abstract description 14
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 abstract description 14
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 abstract description 14
- 235000014393 valine Nutrition 0.000 abstract description 14
- 239000004474 valine Substances 0.000 abstract description 14
- 235000001014 amino acid Nutrition 0.000 abstract description 13
- 150000001413 amino acids Chemical class 0.000 abstract description 12
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 abstract description 11
- 235000003704 aspartic acid Nutrition 0.000 abstract description 11
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 abstract description 11
- 235000013922 glutamic acid Nutrition 0.000 abstract description 11
- 239000004220 glutamic acid Substances 0.000 abstract description 11
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 abstract description 10
- 125000003295 alanine group Chemical group N[C@@H](C)C(=O)* 0.000 abstract description 7
- 239000013603 viral vector Substances 0.000 abstract description 6
- 230000003612 virological effect Effects 0.000 abstract description 6
- 101710132601 Capsid protein Proteins 0.000 description 141
- 101710197658 Capsid protein VP1 Proteins 0.000 description 141
- 101710118046 RNA-directed RNA polymerase Proteins 0.000 description 141
- 101710108545 Viral protein 1 Proteins 0.000 description 141
- 101710081079 Minor spike protein H Proteins 0.000 description 73
- 101000805768 Banna virus (strain Indonesia/JKT-6423/1980) mRNA (guanine-N(7))-methyltransferase Proteins 0.000 description 65
- 101000686790 Chaetoceros protobacilladnavirus 2 Replication-associated protein Proteins 0.000 description 65
- 101000864475 Chlamydia phage 1 Internal scaffolding protein VP3 Proteins 0.000 description 65
- 101000803553 Eumenes pomiformis Venom peptide 3 Proteins 0.000 description 65
- 101000583961 Halorubrum pleomorphic virus 1 Matrix protein Proteins 0.000 description 65
- 102000004169 proteins and genes Human genes 0.000 description 50
- 210000000234 capsid Anatomy 0.000 description 48
- 235000018102 proteins Nutrition 0.000 description 46
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 36
- 108020004414 DNA Proteins 0.000 description 35
- 108010061238 threonyl-glycine Proteins 0.000 description 35
- 108010050848 glycylleucine Proteins 0.000 description 34
- 239000002245 particle Substances 0.000 description 32
- 239000000047 product Substances 0.000 description 32
- 241000282326 Felis catus Species 0.000 description 30
- 238000012054 celltiter-glo Methods 0.000 description 29
- 230000010354 integration Effects 0.000 description 27
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 26
- 241000702423 Adeno-associated virus - 2 Species 0.000 description 25
- 108010079364 N-glycylalanine Proteins 0.000 description 25
- 108010077245 asparaginyl-proline Proteins 0.000 description 25
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 24
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 21
- 108010089804 glycyl-threonine Proteins 0.000 description 21
- 238000000034 method Methods 0.000 description 19
- ZHQWPWQNVRCXAX-XQQFMLRXSA-N Val-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZHQWPWQNVRCXAX-XQQFMLRXSA-N 0.000 description 18
- 108010047857 aspartylglycine Proteins 0.000 description 18
- 108700042769 prolyl-leucyl-glycine Proteins 0.000 description 18
- 108010031719 prolyl-serine Proteins 0.000 description 18
- 230000014616 translation Effects 0.000 description 18
- 241000880493 Leptailurus serval Species 0.000 description 17
- 108010078144 glutaminyl-glycine Proteins 0.000 description 17
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 17
- 108010051242 phenylalanylserine Proteins 0.000 description 17
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 16
- 230000000694 effects Effects 0.000 description 16
- 108010092114 histidylphenylalanine Proteins 0.000 description 16
- 108010057821 leucylproline Proteins 0.000 description 16
- 230000035772 mutation Effects 0.000 description 16
- UZFNHAXYMICTBU-DZKIICNBSA-N Val-Phe-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N UZFNHAXYMICTBU-DZKIICNBSA-N 0.000 description 15
- 230000000875 corresponding effect Effects 0.000 description 15
- 238000000338 in vitro Methods 0.000 description 15
- 238000001727 in vivo Methods 0.000 description 15
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 15
- 108010070409 phenylalanyl-glycyl-glycine Proteins 0.000 description 15
- 108010015666 tryptophyl-leucyl-glutamic acid Proteins 0.000 description 15
- 241000125945 Protoparvovirus Species 0.000 description 14
- LVHHEVGYAZGXDE-KDXUFGMBSA-N Thr-Ala-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(=O)O)N)O LVHHEVGYAZGXDE-KDXUFGMBSA-N 0.000 description 14
- LYERIXUFCYVFFX-GVXVVHGQSA-N Val-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LYERIXUFCYVFFX-GVXVVHGQSA-N 0.000 description 14
- 241000700605 Viruses Species 0.000 description 14
- 108010016686 methionyl-alanyl-serine Proteins 0.000 description 14
- 108010053725 prolylvaline Proteins 0.000 description 14
- 241001164825 Adeno-associated virus - 8 Species 0.000 description 13
- NHCPCLJZRSIDHS-ZLUOBGJFSA-N Ala-Asp-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O NHCPCLJZRSIDHS-ZLUOBGJFSA-N 0.000 description 13
- XEDQMTWEYFBOIK-ACZMJKKPSA-N Asp-Ala-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XEDQMTWEYFBOIK-ACZMJKKPSA-N 0.000 description 13
- BDHUXUFYNUOUIT-SRVKXCTJSA-N His-Asp-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BDHUXUFYNUOUIT-SRVKXCTJSA-N 0.000 description 13
- YRAWWKUTNBILNT-FXQIFTODSA-N Met-Ala-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YRAWWKUTNBILNT-FXQIFTODSA-N 0.000 description 13
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 13
- VPRHDRKAPYZMHL-SZMVWBNQSA-N Trp-Leu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 VPRHDRKAPYZMHL-SZMVWBNQSA-N 0.000 description 13
- 108010092854 aspartyllysine Proteins 0.000 description 13
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Chemical compound NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 13
- 241001655883 Adeno-associated virus - 1 Species 0.000 description 12
- YCTIYBUTCKNOTI-UWJYBYFXSA-N Ala-Tyr-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCTIYBUTCKNOTI-UWJYBYFXSA-N 0.000 description 12
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 12
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 12
- LCMWVZLBCUVDAZ-IUCAKERBSA-N Lys-Gly-Glu Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CCC([O-])=O LCMWVZLBCUVDAZ-IUCAKERBSA-N 0.000 description 12
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 12
- 108010060035 arginylproline Proteins 0.000 description 12
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 12
- 108010059898 glycyl-tyrosyl-lysine Proteins 0.000 description 12
- 108010037850 glycylvaline Proteins 0.000 description 12
- 108010085325 histidylproline Proteins 0.000 description 12
- 208000015181 infectious disease Diseases 0.000 description 12
- 108010034529 leucyl-lysine Proteins 0.000 description 12
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 12
- 230000004048 modification Effects 0.000 description 12
- 238000012986 modification Methods 0.000 description 12
- 108010026333 seryl-proline Proteins 0.000 description 12
- 241000580270 Adeno-associated virus - 4 Species 0.000 description 11
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 11
- PGUYEUCYVNZGGV-QWRGUYRKSA-N Asp-Gly-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PGUYEUCYVNZGGV-QWRGUYRKSA-N 0.000 description 11
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 11
- VEPBEGNDJYANCF-QWRGUYRKSA-N Gly-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN VEPBEGNDJYANCF-QWRGUYRKSA-N 0.000 description 11
- PNUFMLXHOLFRLD-KBPBESRZSA-N Gly-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 PNUFMLXHOLFRLD-KBPBESRZSA-N 0.000 description 11
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 11
- 108010079005 RDV peptide Proteins 0.000 description 11
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 11
- 108700019146 Transgenes Proteins 0.000 description 11
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 11
- 108010087924 alanylproline Proteins 0.000 description 11
- 108010040030 histidinoalanine Proteins 0.000 description 11
- 108010038745 tryptophylglycine Proteins 0.000 description 11
- 238000011144 upstream manufacturing Methods 0.000 description 11
- 239000013607 AAV vector Substances 0.000 description 10
- 241000202702 Adeno-associated virus - 3 Species 0.000 description 10
- GMFAGHNRXPSSJS-SRVKXCTJSA-N Arg-Leu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GMFAGHNRXPSSJS-SRVKXCTJSA-N 0.000 description 10
- IGFJVXOATGZTHD-UHFFFAOYSA-N Arg-Phe-His Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccccc1)C(=O)NC(Cc2c[nH]cn2)C(=O)O IGFJVXOATGZTHD-UHFFFAOYSA-N 0.000 description 10
- KVXVVDFOZNYYKZ-DCAQKATOSA-N Gln-Gln-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KVXVVDFOZNYYKZ-DCAQKATOSA-N 0.000 description 10
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 10
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 10
- 241000699670 Mus sp. Species 0.000 description 10
- XKFJENWJGHMDLI-QWRGUYRKSA-N Ser-Phe-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O XKFJENWJGHMDLI-QWRGUYRKSA-N 0.000 description 10
- NMCBVGFGWSIGSB-NUTKFTJISA-N Trp-Ala-Leu Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N NMCBVGFGWSIGSB-NUTKFTJISA-N 0.000 description 10
- CYDVHRFXDMDMGX-KKUMJFAQSA-N Tyr-Asn-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O CYDVHRFXDMDMGX-KKUMJFAQSA-N 0.000 description 10
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 10
- 108010067390 Viral Proteins Proteins 0.000 description 10
- 108010087823 glycyltyrosine Proteins 0.000 description 10
- 108020004999 messenger RNA Proteins 0.000 description 10
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 10
- 108010015796 prolylisoleucine Proteins 0.000 description 10
- 230000002829 reductive effect Effects 0.000 description 10
- 230000010076 replication Effects 0.000 description 10
- 239000000243 solution Substances 0.000 description 10
- 241000972680 Adeno-associated virus - 6 Species 0.000 description 9
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 9
- WIDVAWAQBRAKTI-YUMQZZPRSA-N Asn-Leu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O WIDVAWAQBRAKTI-YUMQZZPRSA-N 0.000 description 9
- UGIBTKGQVWFTGX-BIIVOSGPSA-N Asp-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)C(=O)O UGIBTKGQVWFTGX-BIIVOSGPSA-N 0.000 description 9
- YIDFBWRHIYOYAA-LKXGYXEUSA-N Asp-Ser-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YIDFBWRHIYOYAA-LKXGYXEUSA-N 0.000 description 9
- 102100026735 Coagulation factor VIII Human genes 0.000 description 9
- SBCYJMOOHUDWDA-NUMRIWBASA-N Glu-Asp-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SBCYJMOOHUDWDA-NUMRIWBASA-N 0.000 description 9
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 9
- 101000911390 Homo sapiens Coagulation factor VIII Proteins 0.000 description 9
- STAVRDQLZOTNKJ-RHYQMDGZSA-N Leu-Arg-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O STAVRDQLZOTNKJ-RHYQMDGZSA-N 0.000 description 9
- PDIDTSZKKFEDMB-UWVGGRQHSA-N Lys-Pro-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O PDIDTSZKKFEDMB-UWVGGRQHSA-N 0.000 description 9
- 108010047562 NGR peptide Proteins 0.000 description 9
- UNLYPPYNDXHGDG-IHRRRGAJSA-N Phe-Gln-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 UNLYPPYNDXHGDG-IHRRRGAJSA-N 0.000 description 9
- APJPXSFJBMMOLW-KBPBESRZSA-N Phe-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 APJPXSFJBMMOLW-KBPBESRZSA-N 0.000 description 9
- IMNVAOPEMFDAQD-NHCYSSNCSA-N Pro-Val-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IMNVAOPEMFDAQD-NHCYSSNCSA-N 0.000 description 9
- NIEWSKWFURSECR-FOHZUACHSA-N Thr-Gly-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NIEWSKWFURSECR-FOHZUACHSA-N 0.000 description 9
- YRSOERSDNRSCBC-XIRDDKMYSA-N Trp-His-Cys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CN=CN3)C(=O)N[C@@H](CS)C(=O)O)N YRSOERSDNRSCBC-XIRDDKMYSA-N 0.000 description 9
- 108010005233 alanylglutamic acid Proteins 0.000 description 9
- 108010038633 aspartylglutamate Proteins 0.000 description 9
- 108010025306 histidylleucine Proteins 0.000 description 9
- 230000007246 mechanism Effects 0.000 description 9
- 238000013518 transcription Methods 0.000 description 9
- 230000035897 transcription Effects 0.000 description 9
- 108010045269 tryptophyltryptophan Proteins 0.000 description 9
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 8
- 241001164823 Adeno-associated virus - 7 Species 0.000 description 8
- GSCLWXDNIMNIJE-ZLUOBGJFSA-N Ala-Asp-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O GSCLWXDNIMNIJE-ZLUOBGJFSA-N 0.000 description 8
- JQFJNGVSGOUQDH-XIRDDKMYSA-N Arg-Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCCN=C(N)N)N)C(O)=O)=CNC2=C1 JQFJNGVSGOUQDH-XIRDDKMYSA-N 0.000 description 8
- FSPQNLYOFCXUCE-BPUTZDHNSA-N Arg-Trp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FSPQNLYOFCXUCE-BPUTZDHNSA-N 0.000 description 8
- GNKVBRYFXYWXAB-WDSKDSINSA-N Asn-Glu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O GNKVBRYFXYWXAB-WDSKDSINSA-N 0.000 description 8
- RDLYUKRPEJERMM-XIRDDKMYSA-N Asn-Trp-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(C)C)C(O)=O RDLYUKRPEJERMM-XIRDDKMYSA-N 0.000 description 8
- XQDGOJPVMSWZSO-SRVKXCTJSA-N Gln-Pro-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N XQDGOJPVMSWZSO-SRVKXCTJSA-N 0.000 description 8
- CUXJIASLBRJOFV-LAEOZQHASA-N Glu-Gly-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUXJIASLBRJOFV-LAEOZQHASA-N 0.000 description 8
- JBRBACJPBZNFMF-YUMQZZPRSA-N Gly-Ala-Lys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN JBRBACJPBZNFMF-YUMQZZPRSA-N 0.000 description 8
- CIMULJZTTOBOPN-WHFBIAKZSA-N Gly-Asn-Asn Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CIMULJZTTOBOPN-WHFBIAKZSA-N 0.000 description 8
- BEQGFMIBZFNROK-JGVFFNPUSA-N Gly-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)CN)C(=O)O BEQGFMIBZFNROK-JGVFFNPUSA-N 0.000 description 8
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 8
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 8
- FXGIMYRVJJEIIM-UWVGGRQHSA-N Pro-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 FXGIMYRVJJEIIM-UWVGGRQHSA-N 0.000 description 8
- VGVCNKSUVSZEIE-IHRRRGAJSA-N Pro-Phe-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O VGVCNKSUVSZEIE-IHRRRGAJSA-N 0.000 description 8
- SBVPYBFMIGDIDX-SRVKXCTJSA-N Pro-Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1N(C(=O)[C@H]2NCCC2)CCC1 SBVPYBFMIGDIDX-SRVKXCTJSA-N 0.000 description 8
- SNGZLPOXVRTNMB-LPEHRKFASA-N Pro-Ser-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N2CCC[C@@H]2C(=O)O SNGZLPOXVRTNMB-LPEHRKFASA-N 0.000 description 8
- 108700008625 Reporter Genes Proteins 0.000 description 8
- LRZLZIUXQBIWTB-KATARQTJSA-N Ser-Lys-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRZLZIUXQBIWTB-KATARQTJSA-N 0.000 description 8
- RCEHMXVEMNXRIW-IRIUXVKKSA-N Thr-Gln-Tyr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N)O RCEHMXVEMNXRIW-IRIUXVKKSA-N 0.000 description 8
- YGCDFAJJCRVQKU-RCWTZXSCSA-N Thr-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)[C@@H](C)O YGCDFAJJCRVQKU-RCWTZXSCSA-N 0.000 description 8
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 8
- JAIZPWVHPQRYOU-ZJDVBMNYSA-N Val-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O JAIZPWVHPQRYOU-ZJDVBMNYSA-N 0.000 description 8
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 8
- 108010047495 alanylglycine Proteins 0.000 description 8
- 108010070944 alanylhistidine Proteins 0.000 description 8
- 238000013320 baculovirus expression vector system Methods 0.000 description 8
- 230000002068 genetic effect Effects 0.000 description 8
- HPAIKDPJURGQLN-UHFFFAOYSA-N glycyl-L-histidyl-L-phenylalanine Natural products C=1C=CC=CC=1CC(C(O)=O)NC(=O)C(NC(=O)CN)CC1=CN=CN1 HPAIKDPJURGQLN-UHFFFAOYSA-N 0.000 description 8
- 108010077435 glycyl-phenylalanyl-glycine Proteins 0.000 description 8
- 108010077515 glycylproline Proteins 0.000 description 8
- 238000009396 hybridization Methods 0.000 description 8
- 239000003999 initiator Substances 0.000 description 8
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 8
- 108010024654 phenylalanyl-prolyl-alanine Proteins 0.000 description 8
- 108010012581 phenylalanylglutamate Proteins 0.000 description 8
- 102000004196 processed proteins & peptides Human genes 0.000 description 8
- 108090000765 processed proteins & peptides Proteins 0.000 description 8
- IKKVASZHTMKJIR-ZKWXMUAHSA-N Ala-Asp-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IKKVASZHTMKJIR-ZKWXMUAHSA-N 0.000 description 7
- BTRULDJUUVGRNE-DCAQKATOSA-N Ala-Pro-Lys Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O BTRULDJUUVGRNE-DCAQKATOSA-N 0.000 description 7
- BHTBAVZSZCQZPT-GUBZILKMSA-N Ala-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N BHTBAVZSZCQZPT-GUBZILKMSA-N 0.000 description 7
- 208000002267 Anti-neutrophil cytoplasmic antibody-associated vasculitis Diseases 0.000 description 7
- LYJXHXGPWDTLKW-HJGDQZAQSA-N Arg-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O LYJXHXGPWDTLKW-HJGDQZAQSA-N 0.000 description 7
- AYZAWXAPBAYCHO-CIUDSAMLSA-N Asn-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N AYZAWXAPBAYCHO-CIUDSAMLSA-N 0.000 description 7
- PHJPKNUWWHRAOC-PEFMBERDSA-N Asn-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N PHJPKNUWWHRAOC-PEFMBERDSA-N 0.000 description 7
- GKKUBLFXKRDMFC-BQBZGAKWSA-N Asn-Pro-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O GKKUBLFXKRDMFC-BQBZGAKWSA-N 0.000 description 7
- UFAQGGZUXVLONR-AVGNSLFASA-N Asp-Gln-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N)O UFAQGGZUXVLONR-AVGNSLFASA-N 0.000 description 7
- PRBLYKYHAJEABA-SRVKXCTJSA-N Gln-Arg-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O PRBLYKYHAJEABA-SRVKXCTJSA-N 0.000 description 7
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 7
- SGVGIVDZLSHSEN-RYUDHWBXSA-N Gln-Tyr-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O SGVGIVDZLSHSEN-RYUDHWBXSA-N 0.000 description 7
- JKDBRTNMYXYLHO-JYJNAYRXSA-N Gln-Tyr-Leu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=C(O)C=C1 JKDBRTNMYXYLHO-JYJNAYRXSA-N 0.000 description 7
- ZMXZGYLINVNTKH-DZKIICNBSA-N Gln-Val-Phe Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ZMXZGYLINVNTKH-DZKIICNBSA-N 0.000 description 7
- DMYACXMQUABZIQ-NRPADANISA-N Glu-Ser-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O DMYACXMQUABZIQ-NRPADANISA-N 0.000 description 7
- QXPRJQPCFXMCIY-NKWVEPMBSA-N Gly-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN QXPRJQPCFXMCIY-NKWVEPMBSA-N 0.000 description 7
- RYAOJUMWLWUGNW-QMMMGPOBSA-N Gly-Val-Gly Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O RYAOJUMWLWUGNW-QMMMGPOBSA-N 0.000 description 7
- 108010065920 Insulin Lispro Proteins 0.000 description 7
- PWWVAXIEGOYWEE-UHFFFAOYSA-N Isophenergan Chemical compound C1=CC=C2N(CC(C)N(C)C)C3=CC=CC=C3SC2=C1 PWWVAXIEGOYWEE-UHFFFAOYSA-N 0.000 description 7
- ARRIJPQRBWRNLT-DCAQKATOSA-N Leu-Met-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N ARRIJPQRBWRNLT-DCAQKATOSA-N 0.000 description 7
- ILDSIMPXNFWKLH-KATARQTJSA-N Leu-Thr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ILDSIMPXNFWKLH-KATARQTJSA-N 0.000 description 7
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 7
- IZJGPPIGYTVXLB-FQUUOJAGSA-N Lys-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IZJGPPIGYTVXLB-FQUUOJAGSA-N 0.000 description 7
- 241001465754 Metazoa Species 0.000 description 7
- YFNOUBWUIIJQHF-LPEHRKFASA-N Pro-Asp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O YFNOUBWUIIJQHF-LPEHRKFASA-N 0.000 description 7
- XYHMFGGWNOFUOU-QXEWZRGKSA-N Pro-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 XYHMFGGWNOFUOU-QXEWZRGKSA-N 0.000 description 7
- BRJGUPWVFXKBQI-XUXIUFHCSA-N Pro-Leu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRJGUPWVFXKBQI-XUXIUFHCSA-N 0.000 description 7
- VVAWNPIOYXAMAL-KJEVXHAQSA-N Pro-Thr-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VVAWNPIOYXAMAL-KJEVXHAQSA-N 0.000 description 7
- KZPRPBLHYMZIMH-MXAVVETBSA-N Ser-Phe-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZPRPBLHYMZIMH-MXAVVETBSA-N 0.000 description 7
- UNURFMVMXLENAZ-KJEVXHAQSA-N Thr-Arg-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UNURFMVMXLENAZ-KJEVXHAQSA-N 0.000 description 7
- VYEHBMMAJFVTOI-JHEQGTHGSA-N Thr-Gly-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O VYEHBMMAJFVTOI-JHEQGTHGSA-N 0.000 description 7
- RVMNUBQWPVOUKH-HEIBUPTGSA-N Thr-Ser-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMNUBQWPVOUKH-HEIBUPTGSA-N 0.000 description 7
- YXONONCLMLHWJX-SZMVWBNQSA-N Trp-Glu-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O)=CNC2=C1 YXONONCLMLHWJX-SZMVWBNQSA-N 0.000 description 7
- LUMQYLVYUIRHHU-YJRXYDGGSA-N Tyr-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LUMQYLVYUIRHHU-YJRXYDGGSA-N 0.000 description 7
- NZYNRRGJJVSSTJ-GUBZILKMSA-N Val-Ser-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O NZYNRRGJJVSSTJ-GUBZILKMSA-N 0.000 description 7
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 7
- 108010093581 aspartyl-proline Proteins 0.000 description 7
- 238000003556 assay Methods 0.000 description 7
- 230000015572 biosynthetic process Effects 0.000 description 7
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 7
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 7
- 108010015792 glycyllysine Proteins 0.000 description 7
- 229920001184 polypeptide Polymers 0.000 description 7
- 108010077112 prolyl-proline Proteins 0.000 description 7
- 239000013608 rAAV vector Substances 0.000 description 7
- 239000004055 small Interfering RNA Substances 0.000 description 7
- HXUVTXPOZRFMOY-NSHDSACASA-N 2-[[(2s)-2-[[2-[(2-aminoacetyl)amino]acetyl]amino]-3-phenylpropanoyl]amino]acetic acid Chemical compound NCC(=O)NCC(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 HXUVTXPOZRFMOY-NSHDSACASA-N 0.000 description 6
- LBJYAILUMSUTAM-ZLUOBGJFSA-N Ala-Asn-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O LBJYAILUMSUTAM-ZLUOBGJFSA-N 0.000 description 6
- XCVRVWZTXPCYJT-BIIVOSGPSA-N Ala-Asn-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N XCVRVWZTXPCYJT-BIIVOSGPSA-N 0.000 description 6
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 6
- FBHOPGDGELNWRH-DRZSPHRISA-N Ala-Glu-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O FBHOPGDGELNWRH-DRZSPHRISA-N 0.000 description 6
- MQIGTEQXYCRLGK-BQBZGAKWSA-N Ala-Gly-Pro Chemical compound C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O MQIGTEQXYCRLGK-BQBZGAKWSA-N 0.000 description 6
- XHNLCGXYBXNRIS-BJDJZHNGSA-N Ala-Lys-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XHNLCGXYBXNRIS-BJDJZHNGSA-N 0.000 description 6
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 6
- LSMDIAAALJJLRO-XQXXSGGOSA-N Ala-Thr-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O LSMDIAAALJJLRO-XQXXSGGOSA-N 0.000 description 6
- ASQYTJJWAMDISW-BPUTZDHNSA-N Arg-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N ASQYTJJWAMDISW-BPUTZDHNSA-N 0.000 description 6
- CGWVCWFQGXOUSJ-ULQDDVLXSA-N Arg-Tyr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O CGWVCWFQGXOUSJ-ULQDDVLXSA-N 0.000 description 6
- HUZGPXBILPMCHM-IHRRRGAJSA-N Asn-Arg-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HUZGPXBILPMCHM-IHRRRGAJSA-N 0.000 description 6
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 6
- JRVABKHPWDRUJF-UBHSHLNASA-N Asn-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N JRVABKHPWDRUJF-UBHSHLNASA-N 0.000 description 6
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 6
- VHQSGALUSWIYOD-QXEWZRGKSA-N Asn-Pro-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O VHQSGALUSWIYOD-QXEWZRGKSA-N 0.000 description 6
- UGXYFDQFLVCDFC-CIUDSAMLSA-N Asn-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O UGXYFDQFLVCDFC-CIUDSAMLSA-N 0.000 description 6
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 6
- DAYDURRBMDCCFL-AAEUAGOBSA-N Asn-Trp-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N DAYDURRBMDCCFL-AAEUAGOBSA-N 0.000 description 6
- IXIWEFWRKIUMQX-DCAQKATOSA-N Asp-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(O)=O IXIWEFWRKIUMQX-DCAQKATOSA-N 0.000 description 6
- XYBJLTKSGFBLCS-QXEWZRGKSA-N Asp-Arg-Val Chemical compound NC(N)=NCCC[C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CC(O)=O XYBJLTKSGFBLCS-QXEWZRGKSA-N 0.000 description 6
- MGSVBZIBCCKGCY-ZLUOBGJFSA-N Asp-Ser-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MGSVBZIBCCKGCY-ZLUOBGJFSA-N 0.000 description 6
- MADFVRSKEIEZHZ-DCAQKATOSA-N Gln-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N MADFVRSKEIEZHZ-DCAQKATOSA-N 0.000 description 6
- IVCOYUURLWQDJQ-LPEHRKFASA-N Gln-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N)C(=O)O IVCOYUURLWQDJQ-LPEHRKFASA-N 0.000 description 6
- HPCOBEHVEHWREJ-DCAQKATOSA-N Gln-Lys-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HPCOBEHVEHWREJ-DCAQKATOSA-N 0.000 description 6
- FNAJNWPDTIXYJN-CIUDSAMLSA-N Gln-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCC(N)=O FNAJNWPDTIXYJN-CIUDSAMLSA-N 0.000 description 6
- FITIQFSXXBKFFM-NRPADANISA-N Gln-Val-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FITIQFSXXBKFFM-NRPADANISA-N 0.000 description 6
- VGUYMZGLJUJRBV-YVNDNENWSA-N Glu-Ile-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O VGUYMZGLJUJRBV-YVNDNENWSA-N 0.000 description 6
- RLFSBAPJTYKSLG-WHFBIAKZSA-N Gly-Ala-Asp Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O RLFSBAPJTYKSLG-WHFBIAKZSA-N 0.000 description 6
- RJIVPOXLQFJRTG-LURJTMIESA-N Gly-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N RJIVPOXLQFJRTG-LURJTMIESA-N 0.000 description 6
- FXLVSYVJDPCIHH-STQMWFEESA-N Gly-Phe-Arg Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FXLVSYVJDPCIHH-STQMWFEESA-N 0.000 description 6
- RIYIFUFFFBIOEU-KBPBESRZSA-N Gly-Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 RIYIFUFFFBIOEU-KBPBESRZSA-N 0.000 description 6
- VUUFXXGKMPLKNH-BZSNNMDCSA-N His-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N VUUFXXGKMPLKNH-BZSNNMDCSA-N 0.000 description 6
- YEKYGQZUBCRNGH-DCAQKATOSA-N His-Pro-Ser Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CN=CN2)N)C(=O)N[C@@H](CO)C(=O)O YEKYGQZUBCRNGH-DCAQKATOSA-N 0.000 description 6
- XHQYFGPIRUHQIB-PBCZWWQYSA-N His-Thr-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC1=CN=CN1 XHQYFGPIRUHQIB-PBCZWWQYSA-N 0.000 description 6
- OVDKXUDMKXAZIV-ZPFDUUQYSA-N Ile-Lys-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OVDKXUDMKXAZIV-ZPFDUUQYSA-N 0.000 description 6
- PXKACEXYLPBMAD-JBDRJPRFSA-N Ile-Ser-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PXKACEXYLPBMAD-JBDRJPRFSA-N 0.000 description 6
- NURNJECQNNCRBK-FLBSBUHZSA-N Ile-Thr-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NURNJECQNNCRBK-FLBSBUHZSA-N 0.000 description 6
- JTBFQNHKNRZJDS-SYWGBEHUSA-N Ile-Trp-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](C)C(=O)O)N JTBFQNHKNRZJDS-SYWGBEHUSA-N 0.000 description 6
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 6
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 6
- DPWGZWUMUUJQDT-IUCAKERBSA-N Leu-Gln-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O DPWGZWUMUUJQDT-IUCAKERBSA-N 0.000 description 6
- LLBQJYDYOLIQAI-JYJNAYRXSA-N Leu-Glu-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LLBQJYDYOLIQAI-JYJNAYRXSA-N 0.000 description 6
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 6
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 6
- KOSWSHVQIVTVQF-ZPFDUUQYSA-N Leu-Ile-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KOSWSHVQIVTVQF-ZPFDUUQYSA-N 0.000 description 6
- FLNPJLDPGMLWAU-UWVGGRQHSA-N Leu-Met-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(C)C FLNPJLDPGMLWAU-UWVGGRQHSA-N 0.000 description 6
- 108010013563 Lipoprotein Lipase Proteins 0.000 description 6
- 102100022119 Lipoprotein lipase Human genes 0.000 description 6
- MPOHDJKRBLVGCT-CIUDSAMLSA-N Lys-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N MPOHDJKRBLVGCT-CIUDSAMLSA-N 0.000 description 6
- GNLJXWBNLAIPEP-MELADBBJSA-N Lys-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCCN)N)C(=O)O GNLJXWBNLAIPEP-MELADBBJSA-N 0.000 description 6
- OTKQHDPECKUDSB-SZMVWBNQSA-N Met-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCSC)C(O)=O)=CNC2=C1 OTKQHDPECKUDSB-SZMVWBNQSA-N 0.000 description 6
- 108010066427 N-valyltryptophan Proteins 0.000 description 6
- DJPXNKUDJKGQEE-BZSNNMDCSA-N Phe-Asp-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DJPXNKUDJKGQEE-BZSNNMDCSA-N 0.000 description 6
- YYKZDTVQHTUKDW-RYUDHWBXSA-N Phe-Gly-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N YYKZDTVQHTUKDW-RYUDHWBXSA-N 0.000 description 6
- HQCSLJFGZYOXHW-KKUMJFAQSA-N Phe-His-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CS)C(=O)O)N HQCSLJFGZYOXHW-KKUMJFAQSA-N 0.000 description 6
- SSSFPISOZOLQNP-GUBZILKMSA-N Pro-Arg-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SSSFPISOZOLQNP-GUBZILKMSA-N 0.000 description 6
- ZCXQTRXYZOSGJR-FXQIFTODSA-N Pro-Asp-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZCXQTRXYZOSGJR-FXQIFTODSA-N 0.000 description 6
- SKICPQLTOXGWGO-GARJFASQSA-N Pro-Gln-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)N)C(=O)N2CCC[C@@H]2C(=O)O SKICPQLTOXGWGO-GARJFASQSA-N 0.000 description 6
- KTFZQPLSPLWLKN-KKUMJFAQSA-N Pro-Gln-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KTFZQPLSPLWLKN-KKUMJFAQSA-N 0.000 description 6
- FYKUEXMZYFIZKA-DCAQKATOSA-N Pro-Pro-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O FYKUEXMZYFIZKA-DCAQKATOSA-N 0.000 description 6
- SEZGGSHLMROBFX-CIUDSAMLSA-N Pro-Ser-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O SEZGGSHLMROBFX-CIUDSAMLSA-N 0.000 description 6
- DMNANGOFEUVBRV-GJZGRUSLSA-N Pro-Trp-Gly Chemical compound N([C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)NCC(=O)O)C(=O)[C@@H]1CCCN1 DMNANGOFEUVBRV-GJZGRUSLSA-N 0.000 description 6
- FIDNSJUXESUDOV-JYJNAYRXSA-N Pro-Tyr-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O FIDNSJUXESUDOV-JYJNAYRXSA-N 0.000 description 6
- ZAUHSLVPDLNTRZ-QXEWZRGKSA-N Pro-Val-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ZAUHSLVPDLNTRZ-QXEWZRGKSA-N 0.000 description 6
- 238000001190 Q-PCR Methods 0.000 description 6
- UGJRQLURDVGULT-LKXGYXEUSA-N Ser-Asn-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UGJRQLURDVGULT-LKXGYXEUSA-N 0.000 description 6
- MMAPOBOTRUVNKJ-ZLUOBGJFSA-N Ser-Asp-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CO)N)C(=O)O MMAPOBOTRUVNKJ-ZLUOBGJFSA-N 0.000 description 6
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 6
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 6
- NNFMANHDYSVNIO-DCAQKATOSA-N Ser-Lys-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NNFMANHDYSVNIO-DCAQKATOSA-N 0.000 description 6
- WFUAUEQXPVNAEF-ZJDVBMNYSA-N Thr-Arg-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CCCN=C(N)N WFUAUEQXPVNAEF-ZJDVBMNYSA-N 0.000 description 6
- GCXFWAZRHBRYEM-NUMRIWBASA-N Thr-Gln-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O GCXFWAZRHBRYEM-NUMRIWBASA-N 0.000 description 6
- SLUWOCTZVGMURC-BFHQHQDPSA-N Thr-Gly-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O SLUWOCTZVGMURC-BFHQHQDPSA-N 0.000 description 6
- IWAVRIPRTCJAQO-HSHDSVGOSA-N Thr-Pro-Trp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O IWAVRIPRTCJAQO-HSHDSVGOSA-N 0.000 description 6
- UQCNIMDPYICBTR-KYNKHSRBSA-N Thr-Thr-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UQCNIMDPYICBTR-KYNKHSRBSA-N 0.000 description 6
- QGVBFDIREUUSHX-IFFSRLJSSA-N Thr-Val-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O QGVBFDIREUUSHX-IFFSRLJSSA-N 0.000 description 6
- QSFJHIRIHOJRKS-ULQDDVLXSA-N Tyr-Leu-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QSFJHIRIHOJRKS-ULQDDVLXSA-N 0.000 description 6
- DMWNPLOERDAHSY-MEYUZBJRSA-N Tyr-Leu-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DMWNPLOERDAHSY-MEYUZBJRSA-N 0.000 description 6
- BYAKMYBZADCNMN-JYJNAYRXSA-N Tyr-Lys-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O BYAKMYBZADCNMN-JYJNAYRXSA-N 0.000 description 6
- WQOHKVRQDLNDIL-YJRXYDGGSA-N Tyr-Thr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O WQOHKVRQDLNDIL-YJRXYDGGSA-N 0.000 description 6
- NXRAUQGGHPCJIB-RCOVLWMOSA-N Val-Gly-Asn Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O NXRAUQGGHPCJIB-RCOVLWMOSA-N 0.000 description 6
- IJGPOONOTBNTFS-GVXVVHGQSA-N Val-Lys-Glu Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O IJGPOONOTBNTFS-GVXVVHGQSA-N 0.000 description 6
- 238000007792 addition Methods 0.000 description 6
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 6
- 230000027455 binding Effects 0.000 description 6
- 238000002474 experimental method Methods 0.000 description 6
- 108010049041 glutamylalanine Proteins 0.000 description 6
- 238000003780 insertion Methods 0.000 description 6
- 230000037431 insertion Effects 0.000 description 6
- 108010003700 lysyl aspartic acid Proteins 0.000 description 6
- 108010029020 prolylglycine Proteins 0.000 description 6
- 238000005406 washing Methods 0.000 description 6
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 5
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 5
- MPLOSMWGDNJSEV-WHFBIAKZSA-N Ala-Gly-Asp Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MPLOSMWGDNJSEV-WHFBIAKZSA-N 0.000 description 5
- LTSBJNNXPBBNDT-HGNGGELXSA-N Ala-His-Gln Chemical compound N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(=O)O LTSBJNNXPBBNDT-HGNGGELXSA-N 0.000 description 5
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 5
- NHWYNIZWLJYZAG-XVYDVKMFSA-N Ala-Ser-His Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N NHWYNIZWLJYZAG-XVYDVKMFSA-N 0.000 description 5
- MMLHRUJLOUSRJX-CIUDSAMLSA-N Ala-Ser-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN MMLHRUJLOUSRJX-CIUDSAMLSA-N 0.000 description 5
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 5
- FRBAHXABMQXSJQ-FXQIFTODSA-N Arg-Ser-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O FRBAHXABMQXSJQ-FXQIFTODSA-N 0.000 description 5
- UZSQXCMNUPKLCC-FJXKBIBVSA-N Arg-Thr-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UZSQXCMNUPKLCC-FJXKBIBVSA-N 0.000 description 5
- WTFIFQWLQXZLIZ-UMPQAUOISA-N Arg-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O WTFIFQWLQXZLIZ-UMPQAUOISA-N 0.000 description 5
- CPTXATAOUQJQRO-GUBZILKMSA-N Arg-Val-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O CPTXATAOUQJQRO-GUBZILKMSA-N 0.000 description 5
- NVGWESORMHFISY-SRVKXCTJSA-N Asn-Asn-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NVGWESORMHFISY-SRVKXCTJSA-N 0.000 description 5
- RAUPFUCUDBQYHE-AVGNSLFASA-N Asn-Phe-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RAUPFUCUDBQYHE-AVGNSLFASA-N 0.000 description 5
- SUIJFTJDTJKSRK-IHRRRGAJSA-N Asn-Pro-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SUIJFTJDTJKSRK-IHRRRGAJSA-N 0.000 description 5
- UQBGYPFHWFZMCD-ZLUOBGJFSA-N Asp-Asn-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O UQBGYPFHWFZMCD-ZLUOBGJFSA-N 0.000 description 5
- IDDMGSKZQDEDGA-SRVKXCTJSA-N Asp-Phe-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 IDDMGSKZQDEDGA-SRVKXCTJSA-N 0.000 description 5
- HJZLUGQGJWXJCJ-CIUDSAMLSA-N Asp-Pro-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O HJZLUGQGJWXJCJ-CIUDSAMLSA-N 0.000 description 5
- BRRPVTUFESPTCP-ACZMJKKPSA-N Asp-Ser-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O BRRPVTUFESPTCP-ACZMJKKPSA-N 0.000 description 5
- 238000011740 C57BL/6 mouse Methods 0.000 description 5
- 108091035707 Consensus sequence Proteins 0.000 description 5
- HBHMVBGGHDMPBF-GARJFASQSA-N Cys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N HBHMVBGGHDMPBF-GARJFASQSA-N 0.000 description 5
- BTSPOOHJBYJRKO-CIUDSAMLSA-N Gln-Asp-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O BTSPOOHJBYJRKO-CIUDSAMLSA-N 0.000 description 5
- QYTKAVBFRUGYAU-ACZMJKKPSA-N Gln-Asp-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QYTKAVBFRUGYAU-ACZMJKKPSA-N 0.000 description 5
- NSORZJXKUQFEKL-JGVFFNPUSA-N Gln-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)N)N)C(=O)O NSORZJXKUQFEKL-JGVFFNPUSA-N 0.000 description 5
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 5
- LPIKVBWNNVFHCQ-GUBZILKMSA-N Gln-Ser-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LPIKVBWNNVFHCQ-GUBZILKMSA-N 0.000 description 5
- UQKVUFGUSVYJMQ-IRIUXVKKSA-N Gln-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)N)N)O UQKVUFGUSVYJMQ-IRIUXVKKSA-N 0.000 description 5
- MKRDNSWGJWTBKZ-GVXVVHGQSA-N Gln-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MKRDNSWGJWTBKZ-GVXVVHGQSA-N 0.000 description 5
- BUZMZDDKFCSKOT-CIUDSAMLSA-N Glu-Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BUZMZDDKFCSKOT-CIUDSAMLSA-N 0.000 description 5
- AUTNXSQEVVHSJK-YVNDNENWSA-N Glu-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O AUTNXSQEVVHSJK-YVNDNENWSA-N 0.000 description 5
- TWYFJOHWGCCRIR-DCAQKATOSA-N Glu-Pro-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYFJOHWGCCRIR-DCAQKATOSA-N 0.000 description 5
- BPCLDCNZBUYGOD-BPUTZDHNSA-N Glu-Trp-Glu Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 BPCLDCNZBUYGOD-BPUTZDHNSA-N 0.000 description 5
- PMSDOVISAARGAV-FHWLQOOXSA-N Glu-Tyr-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 PMSDOVISAARGAV-FHWLQOOXSA-N 0.000 description 5
- XLFHCWHXKSFVIB-BQBZGAKWSA-N Gly-Gln-Gln Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLFHCWHXKSFVIB-BQBZGAKWSA-N 0.000 description 5
- MIIVFRCYJABHTQ-ONGXEEELSA-N Gly-Leu-Val Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O MIIVFRCYJABHTQ-ONGXEEELSA-N 0.000 description 5
- FJWSJWACLMTDMI-WPRPVWTQSA-N Gly-Met-Val Chemical compound [H]NCC(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O FJWSJWACLMTDMI-WPRPVWTQSA-N 0.000 description 5
- CSMYMGFCEJWALV-WDSKDSINSA-N Gly-Ser-Gln Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O CSMYMGFCEJWALV-WDSKDSINSA-N 0.000 description 5
- ZLCLYFGMKFCDCN-XPUUQOCRSA-N Gly-Ser-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CO)NC(=O)CN)C(O)=O ZLCLYFGMKFCDCN-XPUUQOCRSA-N 0.000 description 5
- OCRQUYDOYKCOQG-IRXDYDNUSA-N Gly-Tyr-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 OCRQUYDOYKCOQG-IRXDYDNUSA-N 0.000 description 5
- HYWZHNUGAYVEEW-KKUMJFAQSA-N His-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N HYWZHNUGAYVEEW-KKUMJFAQSA-N 0.000 description 5
- VIJMRAIWYWRXSR-CIUDSAMLSA-N His-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 VIJMRAIWYWRXSR-CIUDSAMLSA-N 0.000 description 5
- 101000600434 Homo sapiens Putative uncharacterized protein encoded by MIR7-3HG Proteins 0.000 description 5
- DFFTXLCCDFYRKD-MBLNEYKQSA-N Ile-Gly-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N DFFTXLCCDFYRKD-MBLNEYKQSA-N 0.000 description 5
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 5
- ZURHXHNAEJJRNU-CIUDSAMLSA-N Leu-Asp-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZURHXHNAEJJRNU-CIUDSAMLSA-N 0.000 description 5
- YFBBUHJJUXXZOF-UWVGGRQHSA-N Leu-Gly-Pro Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O YFBBUHJJUXXZOF-UWVGGRQHSA-N 0.000 description 5
- QLDHBYRUNQZIJQ-DKIMLUQUSA-N Leu-Ile-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QLDHBYRUNQZIJQ-DKIMLUQUSA-N 0.000 description 5
- REPBGZHJKYWFMJ-KKUMJFAQSA-N Leu-Lys-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N REPBGZHJKYWFMJ-KKUMJFAQSA-N 0.000 description 5
- PTRKPHUGYULXPU-KKUMJFAQSA-N Leu-Phe-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O PTRKPHUGYULXPU-KKUMJFAQSA-N 0.000 description 5
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 5
- PWPBLZXWFXJFHE-RHYQMDGZSA-N Leu-Pro-Thr Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O PWPBLZXWFXJFHE-RHYQMDGZSA-N 0.000 description 5
- ZDJQVSIPFLMNOX-RHYQMDGZSA-N Leu-Thr-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZDJQVSIPFLMNOX-RHYQMDGZSA-N 0.000 description 5
- YIRIDPUGZKHMHT-ACRUOGEOSA-N Leu-Tyr-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YIRIDPUGZKHMHT-ACRUOGEOSA-N 0.000 description 5
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 5
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 5
- QUCDKEKDPYISNX-HJGDQZAQSA-N Lys-Asn-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QUCDKEKDPYISNX-HJGDQZAQSA-N 0.000 description 5
- QUYCUALODHJQLK-CIUDSAMLSA-N Lys-Asp-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUYCUALODHJQLK-CIUDSAMLSA-N 0.000 description 5
- ALEVUGKHINJNIF-QEJZJMRPSA-N Lys-Phe-Ala Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 ALEVUGKHINJNIF-QEJZJMRPSA-N 0.000 description 5
- WXHHTBVYQOSYSL-FXQIFTODSA-N Met-Ala-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O WXHHTBVYQOSYSL-FXQIFTODSA-N 0.000 description 5
- IHITVQKJXQQGLJ-LPEHRKFASA-N Met-Asn-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N IHITVQKJXQQGLJ-LPEHRKFASA-N 0.000 description 5
- MPFGIYLYWUCSJG-AVGNSLFASA-N Phe-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MPFGIYLYWUCSJG-AVGNSLFASA-N 0.000 description 5
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 5
- MYQCCQSMKNCNKY-KKUMJFAQSA-N Phe-His-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CO)C(=O)O)N MYQCCQSMKNCNKY-KKUMJFAQSA-N 0.000 description 5
- WKLMCMXFMQEKCX-SLFFLAALSA-N Phe-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC3=CC=CC=C3)N)C(=O)O WKLMCMXFMQEKCX-SLFFLAALSA-N 0.000 description 5
- CVAUVSOFHJKCHN-BZSNNMDCSA-N Phe-Tyr-Cys Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CS)C(O)=O)C1=CC=CC=C1 CVAUVSOFHJKCHN-BZSNNMDCSA-N 0.000 description 5
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 5
- ICTZKEXYDDZZFP-SRVKXCTJSA-N Pro-Arg-Pro Chemical compound N([C@@H](CCCN=C(N)N)C(=O)N1[C@@H](CCC1)C(O)=O)C(=O)[C@@H]1CCCN1 ICTZKEXYDDZZFP-SRVKXCTJSA-N 0.000 description 5
- WVOXLKUUVCCCSU-ZPFDUUQYSA-N Pro-Glu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVOXLKUUVCCCSU-ZPFDUUQYSA-N 0.000 description 5
- UUHXBJHVTVGSKM-BQBZGAKWSA-N Pro-Gly-Asn Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UUHXBJHVTVGSKM-BQBZGAKWSA-N 0.000 description 5
- ZTMLZUNPFDGPKY-VKOGCVSHSA-N Pro-Ile-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@@H]3CCCN3 ZTMLZUNPFDGPKY-VKOGCVSHSA-N 0.000 description 5
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 5
- NAIPAPCKKRCMBL-JYJNAYRXSA-N Pro-Pro-Phe Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H]1N(CCC1)C(=O)[C@H]1NCCC1)C1=CC=CC=C1 NAIPAPCKKRCMBL-JYJNAYRXSA-N 0.000 description 5
- ZMLRZBWCXPQADC-TUAOUCFPSA-N Pro-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ZMLRZBWCXPQADC-TUAOUCFPSA-N 0.000 description 5
- 102100037401 Putative uncharacterized protein encoded by MIR7-3HG Human genes 0.000 description 5
- ADJDNJCSPNFFPI-FXQIFTODSA-N Ser-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO ADJDNJCSPNFFPI-FXQIFTODSA-N 0.000 description 5
- PYTKULIABVRXSC-BWBBJGPYSA-N Ser-Ser-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PYTKULIABVRXSC-BWBBJGPYSA-N 0.000 description 5
- RXUOAOOZIWABBW-XGEHTFHBSA-N Ser-Thr-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RXUOAOOZIWABBW-XGEHTFHBSA-N 0.000 description 5
- ZSDXEKUKQAKZFE-XAVMHZPKSA-N Ser-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N)O ZSDXEKUKQAKZFE-XAVMHZPKSA-N 0.000 description 5
- 108020004459 Small interfering RNA Proteins 0.000 description 5
- YLXAMFZYJTZXFH-OLHMAJIHSA-N Thr-Asn-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O YLXAMFZYJTZXFH-OLHMAJIHSA-N 0.000 description 5
- IMDMLDSVUSMAEJ-HJGDQZAQSA-N Thr-Leu-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IMDMLDSVUSMAEJ-HJGDQZAQSA-N 0.000 description 5
- NDZYTIMDOZMECO-SHGPDSBTSA-N Thr-Thr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O NDZYTIMDOZMECO-SHGPDSBTSA-N 0.000 description 5
- QJIODPFLAASXJC-JHYOHUSXSA-N Thr-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O QJIODPFLAASXJC-JHYOHUSXSA-N 0.000 description 5
- FYBFTPLPAXZBOY-KKHAAJSZSA-N Thr-Val-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O FYBFTPLPAXZBOY-KKHAAJSZSA-N 0.000 description 5
- QNTBGBCOEYNAPV-CWRNSKLLSA-N Trp-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O QNTBGBCOEYNAPV-CWRNSKLLSA-N 0.000 description 5
- GTNCSPKYWCJZAC-XIRDDKMYSA-N Trp-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N GTNCSPKYWCJZAC-XIRDDKMYSA-N 0.000 description 5
- XZLHHHYSWIYXHD-XIRDDKMYSA-N Trp-Gln-Arg Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XZLHHHYSWIYXHD-XIRDDKMYSA-N 0.000 description 5
- UJRIVCPPPMYCNA-HOCLYGCPSA-N Trp-Leu-Gly Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N UJRIVCPPPMYCNA-HOCLYGCPSA-N 0.000 description 5
- SDNVRAKIJVKAGS-LKTVYLICSA-N Tyr-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N SDNVRAKIJVKAGS-LKTVYLICSA-N 0.000 description 5
- HKIUVWMZYFBIHG-KKUMJFAQSA-N Tyr-Arg-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O HKIUVWMZYFBIHG-KKUMJFAQSA-N 0.000 description 5
- PZXUIGWOEWWFQM-SRVKXCTJSA-N Tyr-Asn-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O PZXUIGWOEWWFQM-SRVKXCTJSA-N 0.000 description 5
- LMKKMCGTDANZTR-BZSNNMDCSA-N Tyr-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LMKKMCGTDANZTR-BZSNNMDCSA-N 0.000 description 5
- OKDNSNWJEXAMSU-IRXDYDNUSA-N Tyr-Phe-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)NCC(O)=O)C1=CC=C(O)C=C1 OKDNSNWJEXAMSU-IRXDYDNUSA-N 0.000 description 5
- QPZMOUMNTGTEFR-ZKWXMUAHSA-N Val-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N QPZMOUMNTGTEFR-ZKWXMUAHSA-N 0.000 description 5
- HHSILIQTHXABKM-YDHLFZDLSA-N Val-Asp-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](Cc1ccccc1)C(O)=O HHSILIQTHXABKM-YDHLFZDLSA-N 0.000 description 5
- MGVYZTPLGXPVQB-CYDGBPFRSA-N Val-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](C(C)C)N MGVYZTPLGXPVQB-CYDGBPFRSA-N 0.000 description 5
- GQMNEJMFMCJJTD-NHCYSSNCSA-N Val-Pro-Gln Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O GQMNEJMFMCJJTD-NHCYSSNCSA-N 0.000 description 5
- QIVPZSWBBHRNBA-JYJNAYRXSA-N Val-Pro-Phe Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccccc1)C(O)=O QIVPZSWBBHRNBA-JYJNAYRXSA-N 0.000 description 5
- JXWGBRRVTRAZQA-ULQDDVLXSA-N Val-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N JXWGBRRVTRAZQA-ULQDDVLXSA-N 0.000 description 5
- 108010062796 arginyllysine Proteins 0.000 description 5
- 238000013461 design Methods 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 5
- 239000012634 fragment Substances 0.000 description 5
- 238000001415 gene therapy Methods 0.000 description 5
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 5
- 108010010147 glycylglutamine Proteins 0.000 description 5
- 238000010348 incorporation Methods 0.000 description 5
- 230000001965 increasing effect Effects 0.000 description 5
- 230000000977 initiatory effect Effects 0.000 description 5
- 238000009126 molecular therapy Methods 0.000 description 5
- 238000002360 preparation method Methods 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 210000003705 ribosome Anatomy 0.000 description 5
- 238000012360 testing method Methods 0.000 description 5
- 230000001225 therapeutic effect Effects 0.000 description 5
- 238000010361 transduction Methods 0.000 description 5
- 230000026683 transduction Effects 0.000 description 5
- 230000009261 transgenic effect Effects 0.000 description 5
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 4
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 4
- CBCCCLMNOBLBSC-XVYDVKMFSA-N Ala-His-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O CBCCCLMNOBLBSC-XVYDVKMFSA-N 0.000 description 4
- KLALXKYLOMZDQT-ZLUOBGJFSA-N Ala-Ser-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(N)=O KLALXKYLOMZDQT-ZLUOBGJFSA-N 0.000 description 4
- PEEYDECOOVQKRZ-DLOVCJGASA-N Ala-Ser-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PEEYDECOOVQKRZ-DLOVCJGASA-N 0.000 description 4
- XSLGWYYNOSUMRM-ZKWXMUAHSA-N Ala-Val-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XSLGWYYNOSUMRM-ZKWXMUAHSA-N 0.000 description 4
- VXXHDZKEQNGXNU-QXEWZRGKSA-N Arg-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N VXXHDZKEQNGXNU-QXEWZRGKSA-N 0.000 description 4
- OQCWXQJLCDPRHV-UWVGGRQHSA-N Arg-Gly-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O OQCWXQJLCDPRHV-UWVGGRQHSA-N 0.000 description 4
- YNDLOUMBVDVALC-ZLUOBGJFSA-N Asn-Ala-Ala Chemical compound C[C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CC(=O)N)N YNDLOUMBVDVALC-ZLUOBGJFSA-N 0.000 description 4
- ACRYGQFHAQHDSF-ZLUOBGJFSA-N Asn-Asn-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ACRYGQFHAQHDSF-ZLUOBGJFSA-N 0.000 description 4
- LJUOLNXOWSWGKF-ACZMJKKPSA-N Asn-Asn-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N LJUOLNXOWSWGKF-ACZMJKKPSA-N 0.000 description 4
- PIWWUBYJNONVTJ-ZLUOBGJFSA-N Asn-Asp-Asn Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)C(=O)N PIWWUBYJNONVTJ-ZLUOBGJFSA-N 0.000 description 4
- DDPXDCKYWDGZAL-BQBZGAKWSA-N Asn-Gly-Arg Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N DDPXDCKYWDGZAL-BQBZGAKWSA-N 0.000 description 4
- RAKKBBHMTJSXOY-XVYDVKMFSA-N Asn-His-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O RAKKBBHMTJSXOY-XVYDVKMFSA-N 0.000 description 4
- HFPXZWPUVFVNLL-GUBZILKMSA-N Asn-Leu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFPXZWPUVFVNLL-GUBZILKMSA-N 0.000 description 4
- PPCORQFLAZWUNO-QWRGUYRKSA-N Asn-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N PPCORQFLAZWUNO-QWRGUYRKSA-N 0.000 description 4
- BKFXFUPYETWGGA-XVSYOHENSA-N Asn-Phe-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BKFXFUPYETWGGA-XVSYOHENSA-N 0.000 description 4
- JPSODRNUDXONAS-XIRDDKMYSA-N Asn-Trp-His Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)NC(=O)[C@H](CC(=O)N)N JPSODRNUDXONAS-XIRDDKMYSA-N 0.000 description 4
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 4
- GHODABZPVZMWCE-FXQIFTODSA-N Asp-Glu-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O GHODABZPVZMWCE-FXQIFTODSA-N 0.000 description 4
- WSGVTKZFVJSJOG-RCOVLWMOSA-N Asp-Gly-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O WSGVTKZFVJSJOG-RCOVLWMOSA-N 0.000 description 4
- ZVGRHIRJLWBWGJ-ACZMJKKPSA-N Asp-Ser-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZVGRHIRJLWBWGJ-ACZMJKKPSA-N 0.000 description 4
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 4
- LLRJPYJQNBMOOO-QEJZJMRPSA-N Asp-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N LLRJPYJQNBMOOO-QEJZJMRPSA-N 0.000 description 4
- SFJUYBCDQBAYAJ-YDHLFZDLSA-N Asp-Val-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SFJUYBCDQBAYAJ-YDHLFZDLSA-N 0.000 description 4
- JGLWFWXGOINXEA-YDHLFZDLSA-N Asp-Val-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 JGLWFWXGOINXEA-YDHLFZDLSA-N 0.000 description 4
- YZFCGHIBLBDZDA-ZLUOBGJFSA-N Cys-Asp-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YZFCGHIBLBDZDA-ZLUOBGJFSA-N 0.000 description 4
- XIZWKXATMJODQW-KKUMJFAQSA-N Cys-His-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CS)N XIZWKXATMJODQW-KKUMJFAQSA-N 0.000 description 4
- XLLSMEFANRROJE-GUBZILKMSA-N Cys-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CS)N XLLSMEFANRROJE-GUBZILKMSA-N 0.000 description 4
- 238000002965 ELISA Methods 0.000 description 4
- ULXXDWZMMSQBDC-ACZMJKKPSA-N Gln-Asp-Asp Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ULXXDWZMMSQBDC-ACZMJKKPSA-N 0.000 description 4
- NKCZYEDZTKOFBG-GUBZILKMSA-N Gln-Gln-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NKCZYEDZTKOFBG-GUBZILKMSA-N 0.000 description 4
- VOLVNCMGXWDDQY-LPEHRKFASA-N Gln-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)C(=O)O VOLVNCMGXWDDQY-LPEHRKFASA-N 0.000 description 4
- VSXBYIJUAXPAAL-WDSKDSINSA-N Gln-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O VSXBYIJUAXPAAL-WDSKDSINSA-N 0.000 description 4
- HVQCEQTUSWWFOS-WDSKDSINSA-N Gln-Gly-Cys Chemical compound C(CC(=O)N)[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N HVQCEQTUSWWFOS-WDSKDSINSA-N 0.000 description 4
- SMLDOQHTOAAFJQ-WDSKDSINSA-N Gln-Gly-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SMLDOQHTOAAFJQ-WDSKDSINSA-N 0.000 description 4
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 4
- ITZWDGBYBPUZRG-KBIXCLLPSA-N Gln-Ile-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O ITZWDGBYBPUZRG-KBIXCLLPSA-N 0.000 description 4
- SYZZMPFLOLSMHL-XHNCKOQMSA-N Gln-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N)C(=O)O SYZZMPFLOLSMHL-XHNCKOQMSA-N 0.000 description 4
- UBRQJXFDVZNYJP-AVGNSLFASA-N Gln-Tyr-Ser Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O UBRQJXFDVZNYJP-AVGNSLFASA-N 0.000 description 4
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 4
- WVYJNPCWJYBHJG-YVNDNENWSA-N Glu-Ile-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O WVYJNPCWJYBHJG-YVNDNENWSA-N 0.000 description 4
- QGAJQIGFFIQJJK-IHRRRGAJSA-N Glu-Tyr-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O QGAJQIGFFIQJJK-IHRRRGAJSA-N 0.000 description 4
- WGYHAAXZWPEBDQ-IFFSRLJSSA-N Glu-Val-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGYHAAXZWPEBDQ-IFFSRLJSSA-N 0.000 description 4
- 102000016354 Glucuronosyltransferase Human genes 0.000 description 4
- 108010092364 Glucuronosyltransferase Proteins 0.000 description 4
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 4
- GGEJHJIXRBTJPD-BYPYZUCNSA-N Gly-Asn-Gly Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GGEJHJIXRBTJPD-BYPYZUCNSA-N 0.000 description 4
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 4
- IWAXHBCACVWNHT-BQBZGAKWSA-N Gly-Asp-Arg Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IWAXHBCACVWNHT-BQBZGAKWSA-N 0.000 description 4
- LCNXZQROPKFGQK-WHFBIAKZSA-N Gly-Asp-Ser Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O LCNXZQROPKFGQK-WHFBIAKZSA-N 0.000 description 4
- IXKRSKPKSLXIHN-YUMQZZPRSA-N Gly-Cys-Leu Chemical compound [H]NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O IXKRSKPKSLXIHN-YUMQZZPRSA-N 0.000 description 4
- SSFWXSNOKDZNHY-QXEWZRGKSA-N Gly-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN SSFWXSNOKDZNHY-QXEWZRGKSA-N 0.000 description 4
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 4
- DNAZKGFYFRGZIH-QWRGUYRKSA-N Gly-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 DNAZKGFYFRGZIH-QWRGUYRKSA-N 0.000 description 4
- AASLOGQZZKZWKH-SRVKXCTJSA-N His-Cys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N AASLOGQZZKZWKH-SRVKXCTJSA-N 0.000 description 4
- PLCAEMGSYOYIPP-GUBZILKMSA-N His-Ser-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 PLCAEMGSYOYIPP-GUBZILKMSA-N 0.000 description 4
- LQSBBHNVAVNZSX-GHCJXIJMSA-N Ile-Ala-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N LQSBBHNVAVNZSX-GHCJXIJMSA-N 0.000 description 4
- QYZYJFXHXYUZMZ-UGYAYLCHSA-N Ile-Asn-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N QYZYJFXHXYUZMZ-UGYAYLCHSA-N 0.000 description 4
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 4
- IGUOAYLTQJLPPD-DCAQKATOSA-N Leu-Asn-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IGUOAYLTQJLPPD-DCAQKATOSA-N 0.000 description 4
- DBVWMYGBVFCRBE-CIUDSAMLSA-N Leu-Asn-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DBVWMYGBVFCRBE-CIUDSAMLSA-N 0.000 description 4
- GPICTNQYKHHHTH-GUBZILKMSA-N Leu-Gln-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O GPICTNQYKHHHTH-GUBZILKMSA-N 0.000 description 4
- KGCLIYGPQXUNLO-IUCAKERBSA-N Leu-Gly-Glu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O KGCLIYGPQXUNLO-IUCAKERBSA-N 0.000 description 4
- ONPJGOIVICHWBW-BZSNNMDCSA-N Leu-Lys-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 ONPJGOIVICHWBW-BZSNNMDCSA-N 0.000 description 4
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 4
- XIZQPFCRXLUNMK-BZSNNMDCSA-N Lys-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCCCN)N XIZQPFCRXLUNMK-BZSNNMDCSA-N 0.000 description 4
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 4
- OZVXDDFYCQOPFD-XQQFMLRXSA-N Lys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N OZVXDDFYCQOPFD-XQQFMLRXSA-N 0.000 description 4
- VHGIWFGJIHTASW-FXQIFTODSA-N Met-Ala-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O VHGIWFGJIHTASW-FXQIFTODSA-N 0.000 description 4
- 241000701945 Parvoviridae Species 0.000 description 4
- SEPNOAFMZLLCEW-UBHSHLNASA-N Phe-Ala-Val Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O SEPNOAFMZLLCEW-UBHSHLNASA-N 0.000 description 4
- BRDYYVQTEJVRQT-HRCADAONSA-N Phe-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O BRDYYVQTEJVRQT-HRCADAONSA-N 0.000 description 4
- HXSUFWQYLPKEHF-IHRRRGAJSA-N Phe-Asn-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HXSUFWQYLPKEHF-IHRRRGAJSA-N 0.000 description 4
- NAXPHWZXEXNDIW-JTQLQIEISA-N Phe-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 NAXPHWZXEXNDIW-JTQLQIEISA-N 0.000 description 4
- WFHRXJOZEXUKLV-IRXDYDNUSA-N Phe-Gly-Tyr Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 WFHRXJOZEXUKLV-IRXDYDNUSA-N 0.000 description 4
- DOXQMJCSSYZSNM-BZSNNMDCSA-N Phe-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O DOXQMJCSSYZSNM-BZSNNMDCSA-N 0.000 description 4
- QRUOLOPKCOEZKU-HJWJTTGWSA-N Phe-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CC=CC=C1)N QRUOLOPKCOEZKU-HJWJTTGWSA-N 0.000 description 4
- FKFCKDROTNIVSO-JYJNAYRXSA-N Phe-Pro-Met Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(O)=O FKFCKDROTNIVSO-JYJNAYRXSA-N 0.000 description 4
- QSWKNJAPHQDAAS-MELADBBJSA-N Phe-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O QSWKNJAPHQDAAS-MELADBBJSA-N 0.000 description 4
- RAGOJJCBGXARPO-XVSYOHENSA-N Phe-Thr-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 RAGOJJCBGXARPO-XVSYOHENSA-N 0.000 description 4
- BPIMVBKDLSBKIJ-FCLVOEFKSA-N Phe-Thr-Phe Chemical compound C([C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 BPIMVBKDLSBKIJ-FCLVOEFKSA-N 0.000 description 4
- HFZNNDWPHBRNPV-KZVJFYERSA-N Pro-Ala-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HFZNNDWPHBRNPV-KZVJFYERSA-N 0.000 description 4
- VPEVBAUSTBWQHN-NHCYSSNCSA-N Pro-Glu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O VPEVBAUSTBWQHN-NHCYSSNCSA-N 0.000 description 4
- HATVCTYBNCNMAA-AVGNSLFASA-N Pro-Leu-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O HATVCTYBNCNMAA-AVGNSLFASA-N 0.000 description 4
- GNADVDLLGVSXLS-ULQDDVLXSA-N Pro-Phe-His Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(O)=O GNADVDLLGVSXLS-ULQDDVLXSA-N 0.000 description 4
- ZVEQWRWMRFIVSD-HRCADAONSA-N Pro-Phe-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N3CCC[C@@H]3C(=O)O ZVEQWRWMRFIVSD-HRCADAONSA-N 0.000 description 4
- GZNYIXWOIUFLGO-ZJDVBMNYSA-N Pro-Thr-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZNYIXWOIUFLGO-ZJDVBMNYSA-N 0.000 description 4
- DWUIECHTAMYEFL-XVYDVKMFSA-N Ser-Ala-His Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 DWUIECHTAMYEFL-XVYDVKMFSA-N 0.000 description 4
- YMAWDPHQVABADW-CIUDSAMLSA-N Ser-Gln-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O YMAWDPHQVABADW-CIUDSAMLSA-N 0.000 description 4
- FMDHKPRACUXATF-ACZMJKKPSA-N Ser-Gln-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O FMDHKPRACUXATF-ACZMJKKPSA-N 0.000 description 4
- YRBGKVIWMNEVCZ-WDSKDSINSA-N Ser-Glu-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YRBGKVIWMNEVCZ-WDSKDSINSA-N 0.000 description 4
- UFKPDBLKLOBMRH-XHNCKOQMSA-N Ser-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)C(=O)O UFKPDBLKLOBMRH-XHNCKOQMSA-N 0.000 description 4
- WBINSDOPZHQPPM-AVGNSLFASA-N Ser-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)O WBINSDOPZHQPPM-AVGNSLFASA-N 0.000 description 4
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 4
- NLOAIFSWUUFQFR-CIUDSAMLSA-N Ser-Leu-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O NLOAIFSWUUFQFR-CIUDSAMLSA-N 0.000 description 4
- NUEHQDHDLDXCRU-GUBZILKMSA-N Ser-Pro-Arg Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NUEHQDHDLDXCRU-GUBZILKMSA-N 0.000 description 4
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 4
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 4
- OLKICIBQRVSQMA-SRVKXCTJSA-N Ser-Ser-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OLKICIBQRVSQMA-SRVKXCTJSA-N 0.000 description 4
- OQSQCUWQOIHECT-YJRXYDGGSA-N Ser-Tyr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OQSQCUWQOIHECT-YJRXYDGGSA-N 0.000 description 4
- GZYNMZQXFRWDFH-YTWAJWBKSA-N Thr-Arg-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N)O GZYNMZQXFRWDFH-YTWAJWBKSA-N 0.000 description 4
- LXWZOMSOUAMOIA-JIOCBJNQSA-N Thr-Asn-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N)O LXWZOMSOUAMOIA-JIOCBJNQSA-N 0.000 description 4
- YBXMGKCLOPDEKA-NUMRIWBASA-N Thr-Asp-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YBXMGKCLOPDEKA-NUMRIWBASA-N 0.000 description 4
- JEDIEMIJYSRUBB-FOHZUACHSA-N Thr-Asp-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O JEDIEMIJYSRUBB-FOHZUACHSA-N 0.000 description 4
- XOWKUMFHEZLKLT-CIQUZCHMSA-N Thr-Ile-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O XOWKUMFHEZLKLT-CIQUZCHMSA-N 0.000 description 4
- JWQNAFHCXKVZKZ-UVOCVTCTSA-N Thr-Lys-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWQNAFHCXKVZKZ-UVOCVTCTSA-N 0.000 description 4
- UJQVSMNQMQHVRY-KZVJFYERSA-N Thr-Met-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O UJQVSMNQMQHVRY-KZVJFYERSA-N 0.000 description 4
- AAZOYLQUEQRUMZ-GSSVUCPTSA-N Thr-Thr-Asn Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O AAZOYLQUEQRUMZ-GSSVUCPTSA-N 0.000 description 4
- NHQVWACSJZJCGJ-FLBSBUHZSA-N Thr-Thr-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NHQVWACSJZJCGJ-FLBSBUHZSA-N 0.000 description 4
- ZMYCLHFLHRVOEA-HEIBUPTGSA-N Thr-Thr-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ZMYCLHFLHRVOEA-HEIBUPTGSA-N 0.000 description 4
- FBQHKSPOIAFUEI-OWLDWWDNSA-N Thr-Trp-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O FBQHKSPOIAFUEI-OWLDWWDNSA-N 0.000 description 4
- NLWDSYKZUPRMBJ-IEGACIPQSA-N Thr-Trp-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O NLWDSYKZUPRMBJ-IEGACIPQSA-N 0.000 description 4
- WVHUFSCKCBQKJW-HKUYNNGSSA-N Trp-Gly-Tyr Chemical compound C([C@H](NC(=O)CNC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=C(O)C=C1 WVHUFSCKCBQKJW-HKUYNNGSSA-N 0.000 description 4
- YLRLHDFMMWDYTK-KKUMJFAQSA-N Tyr-Cys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CS)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 YLRLHDFMMWDYTK-KKUMJFAQSA-N 0.000 description 4
- TWAVEIJGFCBWCG-JYJNAYRXSA-N Tyr-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N TWAVEIJGFCBWCG-JYJNAYRXSA-N 0.000 description 4
- PRONOHBTMLNXCZ-BZSNNMDCSA-N Tyr-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PRONOHBTMLNXCZ-BZSNNMDCSA-N 0.000 description 4
- WURLIFOWSMBUAR-SLFFLAALSA-N Tyr-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC3=CC=C(C=C3)O)N)C(=O)O WURLIFOWSMBUAR-SLFFLAALSA-N 0.000 description 4
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 4
- XCCTYIAWTASOJW-XVFCMESISA-N Uridine-5'-Diphosphate Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(=O)OP(O)(O)=O)O[C@H]1N1C(=O)NC(=O)C=C1 XCCTYIAWTASOJW-XVFCMESISA-N 0.000 description 4
- SLLKXDSRVAOREO-KZVJFYERSA-N Val-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N)O SLLKXDSRVAOREO-KZVJFYERSA-N 0.000 description 4
- VCAWFLIWYNMHQP-UKJIMTQDSA-N Val-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N VCAWFLIWYNMHQP-UKJIMTQDSA-N 0.000 description 4
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 4
- ZXYPHBKIZLAQTL-QXEWZRGKSA-N Val-Pro-Asp Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N ZXYPHBKIZLAQTL-QXEWZRGKSA-N 0.000 description 4
- IECQJCJNPJVUSB-IHRRRGAJSA-N Val-Tyr-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(O)=O IECQJCJNPJVUSB-IHRRRGAJSA-N 0.000 description 4
- 230000006978 adaptation Effects 0.000 description 4
- 108010044940 alanylglutamine Proteins 0.000 description 4
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 4
- 108010038850 arginyl-isoleucyl-tyrosine Proteins 0.000 description 4
- 230000008901 benefit Effects 0.000 description 4
- 108010069495 cysteinyltyrosine Proteins 0.000 description 4
- 230000001419 dependent effect Effects 0.000 description 4
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 4
- 230000002458 infectious effect Effects 0.000 description 4
- NOESYZHRGYRDHS-UHFFFAOYSA-N insulin Chemical compound N1C(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(NC(=O)CN)C(C)CC)CSSCC(C(NC(CO)C(=O)NC(CC(C)C)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CCC(N)=O)C(=O)NC(CC(C)C)C(=O)NC(CCC(O)=O)C(=O)NC(CC(N)=O)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CSSCC(NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2C=CC(O)=CC=2)NC(=O)C(CC(C)C)NC(=O)C(C)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2NC=NC=2)NC(=O)C(CO)NC(=O)CNC2=O)C(=O)NCC(=O)NC(CCC(O)=O)C(=O)NC(CCCNC(N)=N)C(=O)NCC(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC(O)=CC=3)C(=O)NC(C(C)O)C(=O)N3C(CCC3)C(=O)NC(CCCCN)C(=O)NC(C)C(O)=O)C(=O)NC(CC(N)=O)C(O)=O)=O)NC(=O)C(C(C)CC)NC(=O)C(CO)NC(=O)C(C(C)O)NC(=O)C1CSSCC2NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)C(C)C)CC1=CN=CN1 NOESYZHRGYRDHS-UHFFFAOYSA-N 0.000 description 4
- 108010012058 leucyltyrosine Proteins 0.000 description 4
- 108010009298 lysylglutamic acid Proteins 0.000 description 4
- 108010054155 lysyllysine Proteins 0.000 description 4
- 239000003550 marker Substances 0.000 description 4
- 108010056582 methionylglutamic acid Proteins 0.000 description 4
- 108010005942 methionylglycine Proteins 0.000 description 4
- 238000004806 packaging method and process Methods 0.000 description 4
- 108010093296 prolyl-prolyl-alanine Proteins 0.000 description 4
- 239000013646 rAAV2 vector Substances 0.000 description 4
- 150000003839 salts Chemical class 0.000 description 4
- 238000003786 synthesis reaction Methods 0.000 description 4
- 210000001519 tissue Anatomy 0.000 description 4
- 108010080629 tryptophan-leucine Proteins 0.000 description 4
- 108010073969 valyllysine Proteins 0.000 description 4
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 3
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 3
- ZEXDYVGDZJBRMO-ACZMJKKPSA-N Ala-Asn-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N ZEXDYVGDZJBRMO-ACZMJKKPSA-N 0.000 description 3
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 3
- BLTRAARCJYVJKV-QEJZJMRPSA-N Ala-Lys-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1ccccc1)C(O)=O BLTRAARCJYVJKV-QEJZJMRPSA-N 0.000 description 3
- OINVDEKBKBCPLX-JXUBOQSCSA-N Ala-Lys-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OINVDEKBKBCPLX-JXUBOQSCSA-N 0.000 description 3
- XUCHENWTTBFODJ-FXQIFTODSA-N Ala-Met-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O XUCHENWTTBFODJ-FXQIFTODSA-N 0.000 description 3
- WQKAQKZRDIZYNV-VZFHVOOUSA-N Ala-Ser-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WQKAQKZRDIZYNV-VZFHVOOUSA-N 0.000 description 3
- XQNRANMFRPCFFW-GCJQMDKQSA-N Ala-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C)N)O XQNRANMFRPCFFW-GCJQMDKQSA-N 0.000 description 3
- KUFVXLQLDHJVOG-SHGPDSBTSA-N Ala-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C)N)O KUFVXLQLDHJVOG-SHGPDSBTSA-N 0.000 description 3
- KBBKCNHWCDJPGN-GUBZILKMSA-N Arg-Gln-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KBBKCNHWCDJPGN-GUBZILKMSA-N 0.000 description 3
- WVNFNPGXYADPPO-BQBZGAKWSA-N Arg-Gly-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O WVNFNPGXYADPPO-BQBZGAKWSA-N 0.000 description 3
- FVBZXNSRIDVYJS-AVGNSLFASA-N Arg-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N FVBZXNSRIDVYJS-AVGNSLFASA-N 0.000 description 3
- HOIFSHOLNKQCSA-FXQIFTODSA-N Asn-Arg-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O HOIFSHOLNKQCSA-FXQIFTODSA-N 0.000 description 3
- DAPLJWATMAXPPZ-CIUDSAMLSA-N Asn-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(N)=O DAPLJWATMAXPPZ-CIUDSAMLSA-N 0.000 description 3
- KXFCBAHYSLJCCY-ZLUOBGJFSA-N Asn-Asn-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O KXFCBAHYSLJCCY-ZLUOBGJFSA-N 0.000 description 3
- XSGBIBGAMKTHMY-WHFBIAKZSA-N Asn-Asp-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O XSGBIBGAMKTHMY-WHFBIAKZSA-N 0.000 description 3
- AYKKKGFJXIDYLX-ACZMJKKPSA-N Asn-Gln-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O AYKKKGFJXIDYLX-ACZMJKKPSA-N 0.000 description 3
- FTSAJSADJCMDHH-CIUDSAMLSA-N Asn-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N FTSAJSADJCMDHH-CIUDSAMLSA-N 0.000 description 3
- VCJCPARXDBEGNE-GUBZILKMSA-N Asn-Pro-Pro Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 VCJCPARXDBEGNE-GUBZILKMSA-N 0.000 description 3
- REQUGIWGOGSOEZ-ZLUOBGJFSA-N Asn-Ser-Asn Chemical compound C([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)C(=O)N REQUGIWGOGSOEZ-ZLUOBGJFSA-N 0.000 description 3
- WLVLIYYBPPONRJ-GCJQMDKQSA-N Asn-Thr-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O WLVLIYYBPPONRJ-GCJQMDKQSA-N 0.000 description 3
- XIDSGDJNUJRUHE-VEVYYDQMSA-N Asn-Thr-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O XIDSGDJNUJRUHE-VEVYYDQMSA-N 0.000 description 3
- KZYSHAMXEBPJBD-JRQIVUDYSA-N Asn-Thr-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KZYSHAMXEBPJBD-JRQIVUDYSA-N 0.000 description 3
- JPPLRQVZMZFOSX-UWJYBYFXSA-N Asn-Tyr-Ala Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=C(O)C=C1 JPPLRQVZMZFOSX-UWJYBYFXSA-N 0.000 description 3
- RTFXPCYMDYBZNQ-SRVKXCTJSA-N Asn-Tyr-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O RTFXPCYMDYBZNQ-SRVKXCTJSA-N 0.000 description 3
- JZLFYAAGGYMRIK-BYULHYEWSA-N Asn-Val-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O JZLFYAAGGYMRIK-BYULHYEWSA-N 0.000 description 3
- ZLGKHJHFYSRUBH-FXQIFTODSA-N Asp-Arg-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O ZLGKHJHFYSRUBH-FXQIFTODSA-N 0.000 description 3
- VBVKSAFJPVXMFJ-CIUDSAMLSA-N Asp-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N VBVKSAFJPVXMFJ-CIUDSAMLSA-N 0.000 description 3
- SVABRQFIHCSNCI-FOHZUACHSA-N Asp-Gly-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O SVABRQFIHCSNCI-FOHZUACHSA-N 0.000 description 3
- KPSHWSWFPUDEGF-FXQIFTODSA-N Asp-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(O)=O KPSHWSWFPUDEGF-FXQIFTODSA-N 0.000 description 3
- QPDUWAUSSWGJSB-NGZCFLSTSA-N Asp-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N QPDUWAUSSWGJSB-NGZCFLSTSA-N 0.000 description 3
- 208000003322 Coinfection Diseases 0.000 description 3
- MJOYUXLETJMQGG-IHRRRGAJSA-N Cys-Tyr-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MJOYUXLETJMQGG-IHRRRGAJSA-N 0.000 description 3
- 108010046649 GDNP peptide Proteins 0.000 description 3
- YJIUYQKQBBQYHZ-ACZMJKKPSA-N Gln-Ala-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YJIUYQKQBBQYHZ-ACZMJKKPSA-N 0.000 description 3
- HHWQMFIGMMOVFK-WDSKDSINSA-N Gln-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O HHWQMFIGMMOVFK-WDSKDSINSA-N 0.000 description 3
- JSYULGSPLTZDHM-NRPADANISA-N Gln-Ala-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O JSYULGSPLTZDHM-NRPADANISA-N 0.000 description 3
- JESJDAAGXULQOP-CIUDSAMLSA-N Gln-Arg-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)CN=C(N)N JESJDAAGXULQOP-CIUDSAMLSA-N 0.000 description 3
- LJEPDHWNQXPXMM-NHCYSSNCSA-N Gln-Arg-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O LJEPDHWNQXPXMM-NHCYSSNCSA-N 0.000 description 3
- OETQLUYCMBARHJ-CIUDSAMLSA-N Gln-Asn-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OETQLUYCMBARHJ-CIUDSAMLSA-N 0.000 description 3
- ODBLJLZVLAWVMS-GUBZILKMSA-N Gln-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N ODBLJLZVLAWVMS-GUBZILKMSA-N 0.000 description 3
- CGVWDTRDPLOMHZ-FXQIFTODSA-N Gln-Glu-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O CGVWDTRDPLOMHZ-FXQIFTODSA-N 0.000 description 3
- PODFFOWWLUPNMN-DCAQKATOSA-N Gln-His-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O PODFFOWWLUPNMN-DCAQKATOSA-N 0.000 description 3
- FALJZCPMTGJOHX-SRVKXCTJSA-N Gln-Met-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O FALJZCPMTGJOHX-SRVKXCTJSA-N 0.000 description 3
- UESYBOXFJWJVSB-AVGNSLFASA-N Gln-Phe-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O UESYBOXFJWJVSB-AVGNSLFASA-N 0.000 description 3
- MFORDNZDKAVNSR-SRVKXCTJSA-N Gln-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCC(N)=O MFORDNZDKAVNSR-SRVKXCTJSA-N 0.000 description 3
- NYCVMJGIJYQWDO-CIUDSAMLSA-N Gln-Ser-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NYCVMJGIJYQWDO-CIUDSAMLSA-N 0.000 description 3
- OXEMJGCAJFFREE-FXQIFTODSA-N Glu-Gln-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O OXEMJGCAJFFREE-FXQIFTODSA-N 0.000 description 3
- AIGROOHQXCACHL-WDSKDSINSA-N Glu-Gly-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O AIGROOHQXCACHL-WDSKDSINSA-N 0.000 description 3
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 3
- UGSVSNXPJJDJKL-SDDRHHMPSA-N Glu-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N UGSVSNXPJJDJKL-SDDRHHMPSA-N 0.000 description 3
- HAGKYCXGTRUUFI-RYUDHWBXSA-N Glu-Tyr-Gly Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)O HAGKYCXGTRUUFI-RYUDHWBXSA-N 0.000 description 3
- YQPFCZVKMUVZIN-AUTRQRHGSA-N Glu-Val-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQPFCZVKMUVZIN-AUTRQRHGSA-N 0.000 description 3
- GWCRIHNSVMOBEQ-BQBZGAKWSA-N Gly-Arg-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O GWCRIHNSVMOBEQ-BQBZGAKWSA-N 0.000 description 3
- JVACNFOPSUPDTK-QWRGUYRKSA-N Gly-Asn-Phe Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JVACNFOPSUPDTK-QWRGUYRKSA-N 0.000 description 3
- GMTXWRIDLGTVFC-IUCAKERBSA-N Gly-Lys-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMTXWRIDLGTVFC-IUCAKERBSA-N 0.000 description 3
- GGLIDLCEPDHEJO-BQBZGAKWSA-N Gly-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)CN GGLIDLCEPDHEJO-BQBZGAKWSA-N 0.000 description 3
- QSQXZZCGPXQBPP-BQBZGAKWSA-N Gly-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)CN)C(=O)N[C@@H](CS)C(=O)O QSQXZZCGPXQBPP-BQBZGAKWSA-N 0.000 description 3
- IALQAMYQJBZNSK-WHFBIAKZSA-N Gly-Ser-Asn Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O IALQAMYQJBZNSK-WHFBIAKZSA-N 0.000 description 3
- TVTZEOHWHUVYCG-KYNKHSRBSA-N Gly-Thr-Thr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O TVTZEOHWHUVYCG-KYNKHSRBSA-N 0.000 description 3
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 3
- HVCRQRQPIIRNLY-IUCAKERBSA-N His-Gln-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N HVCRQRQPIIRNLY-IUCAKERBSA-N 0.000 description 3
- QEYUCKCWTMIERU-SRVKXCTJSA-N His-Lys-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N QEYUCKCWTMIERU-SRVKXCTJSA-N 0.000 description 3
- DVRDRICMWUSCBN-UKJIMTQDSA-N Ile-Gln-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DVRDRICMWUSCBN-UKJIMTQDSA-N 0.000 description 3
- GAZGFPOZOLEYAJ-YTFOTSKYSA-N Ile-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N GAZGFPOZOLEYAJ-YTFOTSKYSA-N 0.000 description 3
- UIEZQYNXCYHMQS-BJDJZHNGSA-N Ile-Lys-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)O)N UIEZQYNXCYHMQS-BJDJZHNGSA-N 0.000 description 3
- IITVUURPOYGCTD-NAKRPEOUSA-N Ile-Pro-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IITVUURPOYGCTD-NAKRPEOUSA-N 0.000 description 3
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 3
- UGTHTQWIQKEDEH-BQBZGAKWSA-N L-alanyl-L-prolylglycine zwitterion Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UGTHTQWIQKEDEH-BQBZGAKWSA-N 0.000 description 3
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 3
- JKGHDYGZRDWHGA-SRVKXCTJSA-N Leu-Asn-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JKGHDYGZRDWHGA-SRVKXCTJSA-N 0.000 description 3
- MDVZJYGNAGLPGJ-KKUMJFAQSA-N Leu-Asn-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MDVZJYGNAGLPGJ-KKUMJFAQSA-N 0.000 description 3
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 3
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 3
- FGZVGOAAROXFAB-IXOXFDKPSA-N Leu-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N)O FGZVGOAAROXFAB-IXOXFDKPSA-N 0.000 description 3
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 3
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 3
- KNKHAVVBVXKOGX-JXUBOQSCSA-N Lys-Ala-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KNKHAVVBVXKOGX-JXUBOQSCSA-N 0.000 description 3
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 3
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 3
- ZJWIXBZTAAJERF-IHRRRGAJSA-N Lys-Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZJWIXBZTAAJERF-IHRRRGAJSA-N 0.000 description 3
- MIFFFXHMAHFACR-KATARQTJSA-N Lys-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN MIFFFXHMAHFACR-KATARQTJSA-N 0.000 description 3
- CUHGAUZONORRIC-HJGDQZAQSA-N Lys-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O CUHGAUZONORRIC-HJGDQZAQSA-N 0.000 description 3
- GIKFNMZSGYAPEJ-HJGDQZAQSA-N Lys-Thr-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O GIKFNMZSGYAPEJ-HJGDQZAQSA-N 0.000 description 3
- IUYCGMNKIZDRQI-BQBZGAKWSA-N Met-Gly-Ala Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O IUYCGMNKIZDRQI-BQBZGAKWSA-N 0.000 description 3
- HZLSUXCMSIBCRV-RVMXOQNASA-N Met-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N HZLSUXCMSIBCRV-RVMXOQNASA-N 0.000 description 3
- GGXZOTSDJJTDGB-GUBZILKMSA-N Met-Ser-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O GGXZOTSDJJTDGB-GUBZILKMSA-N 0.000 description 3
- 241000699666 Mus <mouse, genus> Species 0.000 description 3
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 3
- QMMRHASQEVCJGR-UBHSHLNASA-N Phe-Ala-Pro Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=CC=C1 QMMRHASQEVCJGR-UBHSHLNASA-N 0.000 description 3
- KIEPQOIQHFKQLK-PCBIJLKTSA-N Phe-Asn-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KIEPQOIQHFKQLK-PCBIJLKTSA-N 0.000 description 3
- BEEVXUYVEHXWRQ-YESZJQIVSA-N Phe-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC3=CC=CC=C3)N)C(=O)O BEEVXUYVEHXWRQ-YESZJQIVSA-N 0.000 description 3
- JLLJTMHNXQTMCK-UBHSHLNASA-N Phe-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 JLLJTMHNXQTMCK-UBHSHLNASA-N 0.000 description 3
- NJJBATPLUQHRBM-IHRRRGAJSA-N Phe-Pro-Ser Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CO)C(=O)O NJJBATPLUQHRBM-IHRRRGAJSA-N 0.000 description 3
- 101710182846 Polyhedrin Proteins 0.000 description 3
- VXCHGLYSIOOZIS-GUBZILKMSA-N Pro-Ala-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 VXCHGLYSIOOZIS-GUBZILKMSA-N 0.000 description 3
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 3
- DIFXZGPHVCIVSQ-CIUDSAMLSA-N Pro-Gln-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DIFXZGPHVCIVSQ-CIUDSAMLSA-N 0.000 description 3
- ULWBBFKQBDNGOY-RWMBFGLXSA-N Pro-Lys-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N2CCC[C@@H]2C(=O)O ULWBBFKQBDNGOY-RWMBFGLXSA-N 0.000 description 3
- KDCGOANMDULRCW-UHFFFAOYSA-N Purine Natural products N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 3
- MWMKFWJYRRGXOR-ZLUOBGJFSA-N Ser-Ala-Asn Chemical compound N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)O)CC(N)=O)C)CO MWMKFWJYRRGXOR-ZLUOBGJFSA-N 0.000 description 3
- MMGJPDWSIOAGTH-ACZMJKKPSA-N Ser-Ala-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MMGJPDWSIOAGTH-ACZMJKKPSA-N 0.000 description 3
- JPIDMRXXNMIVKY-VZFHVOOUSA-N Ser-Ala-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPIDMRXXNMIVKY-VZFHVOOUSA-N 0.000 description 3
- FIDMVVBUOCMMJG-CIUDSAMLSA-N Ser-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO FIDMVVBUOCMMJG-CIUDSAMLSA-N 0.000 description 3
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 3
- RDFQNDHEHVSONI-ZLUOBGJFSA-N Ser-Asn-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDFQNDHEHVSONI-ZLUOBGJFSA-N 0.000 description 3
- CRZRTKAVUUGKEQ-ACZMJKKPSA-N Ser-Gln-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CRZRTKAVUUGKEQ-ACZMJKKPSA-N 0.000 description 3
- MUARUIBTKQJKFY-WHFBIAKZSA-N Ser-Gly-Asp Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MUARUIBTKQJKFY-WHFBIAKZSA-N 0.000 description 3
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 3
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 3
- HBTCFCHYALPXME-HTFCKZLJSA-N Ser-Ile-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HBTCFCHYALPXME-HTFCKZLJSA-N 0.000 description 3
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 3
- WNDUPCKKKGSKIQ-CIUDSAMLSA-N Ser-Pro-Gln Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O WNDUPCKKKGSKIQ-CIUDSAMLSA-N 0.000 description 3
- FLMYSKVSDVHLEW-SVSWQMSJSA-N Ser-Thr-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLMYSKVSDVHLEW-SVSWQMSJSA-N 0.000 description 3
- SNXUIBACCONSOH-BWBBJGPYSA-N Ser-Thr-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CO)C(O)=O SNXUIBACCONSOH-BWBBJGPYSA-N 0.000 description 3
- VLMIUSLQONKLDV-HEIBUPTGSA-N Ser-Thr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VLMIUSLQONKLDV-HEIBUPTGSA-N 0.000 description 3
- PIQRHJQWEPWFJG-UWJYBYFXSA-N Ser-Tyr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PIQRHJQWEPWFJG-UWJYBYFXSA-N 0.000 description 3
- PQEQXWRVHQAAKS-SRVKXCTJSA-N Ser-Tyr-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CO)N)CC1=CC=C(O)C=C1 PQEQXWRVHQAAKS-SRVKXCTJSA-N 0.000 description 3
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 3
- GFDUZZACIWNMPE-KZVJFYERSA-N Thr-Ala-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O GFDUZZACIWNMPE-KZVJFYERSA-N 0.000 description 3
- JMZKMSTYXHFYAK-VEVYYDQMSA-N Thr-Arg-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O JMZKMSTYXHFYAK-VEVYYDQMSA-N 0.000 description 3
- SHOMROOOQBDGRL-JHEQGTHGSA-N Thr-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SHOMROOOQBDGRL-JHEQGTHGSA-N 0.000 description 3
- KKPOGALELPLJTL-MEYUZBJRSA-N Thr-Lys-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KKPOGALELPLJTL-MEYUZBJRSA-N 0.000 description 3
- STUAPCLEDMKXKL-LKXGYXEUSA-N Thr-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O STUAPCLEDMKXKL-LKXGYXEUSA-N 0.000 description 3
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 3
- NJGMALCNYAMYCB-JRQIVUDYSA-N Thr-Tyr-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O NJGMALCNYAMYCB-JRQIVUDYSA-N 0.000 description 3
- XVHAUVJXBFGUPC-RPTUDFQQSA-N Thr-Tyr-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XVHAUVJXBFGUPC-RPTUDFQQSA-N 0.000 description 3
- HYVLNORXQGKONN-NUTKFTJISA-N Trp-Ala-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 HYVLNORXQGKONN-NUTKFTJISA-N 0.000 description 3
- NOFFAYIYPAUNRM-HKUYNNGSSA-N Trp-Gly-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC2=CNC3=CC=CC=C32)N NOFFAYIYPAUNRM-HKUYNNGSSA-N 0.000 description 3
- SEXRBCGSZRCIPE-LYSGOOTNSA-N Trp-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O SEXRBCGSZRCIPE-LYSGOOTNSA-N 0.000 description 3
- AYHSJESDFKREAR-KKUMJFAQSA-N Tyr-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AYHSJESDFKREAR-KKUMJFAQSA-N 0.000 description 3
- NMKJPMCEKQHRPD-IRXDYDNUSA-N Tyr-Gly-Tyr Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 NMKJPMCEKQHRPD-IRXDYDNUSA-N 0.000 description 3
- FBHBVXUBTYVCRU-BZSNNMDCSA-N Tyr-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CN=CN1 FBHBVXUBTYVCRU-BZSNNMDCSA-N 0.000 description 3
- NKUGCYDFQKFVOJ-JYJNAYRXSA-N Tyr-Leu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NKUGCYDFQKFVOJ-JYJNAYRXSA-N 0.000 description 3
- LVFZXRQQQDTBQH-IRIUXVKKSA-N Tyr-Thr-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O LVFZXRQQQDTBQH-IRIUXVKKSA-N 0.000 description 3
- LDKDSFQSEUOCOO-RPTUDFQQSA-N Tyr-Thr-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LDKDSFQSEUOCOO-RPTUDFQQSA-N 0.000 description 3
- HZWPGKAKGYJWCI-ULQDDVLXSA-N Tyr-Val-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(C)C)C(O)=O HZWPGKAKGYJWCI-ULQDDVLXSA-N 0.000 description 3
- YFOCMOVJBQDBCE-NRPADANISA-N Val-Ala-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N YFOCMOVJBQDBCE-NRPADANISA-N 0.000 description 3
- ZMDCGGKHRKNWKD-LAEOZQHASA-N Val-Asn-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZMDCGGKHRKNWKD-LAEOZQHASA-N 0.000 description 3
- VUTHNLMCXKLLFI-LAEOZQHASA-N Val-Asp-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VUTHNLMCXKLLFI-LAEOZQHASA-N 0.000 description 3
- UZDHNIJRRTUKKC-DLOVCJGASA-N Val-Gln-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N UZDHNIJRRTUKKC-DLOVCJGASA-N 0.000 description 3
- JTWIMNMUYLQNPI-WPRPVWTQSA-N Val-Gly-Arg Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N JTWIMNMUYLQNPI-WPRPVWTQSA-N 0.000 description 3
- ZEBRMWPTJNHXAJ-JYJNAYRXSA-N Val-Phe-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)O)N ZEBRMWPTJNHXAJ-JYJNAYRXSA-N 0.000 description 3
- KISFXYYRKKNLOP-IHRRRGAJSA-N Val-Phe-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N KISFXYYRKKNLOP-IHRRRGAJSA-N 0.000 description 3
- XBJKAZATRJBDCU-GUBZILKMSA-N Val-Pro-Ala Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O XBJKAZATRJBDCU-GUBZILKMSA-N 0.000 description 3
- 108010070783 alanyltyrosine Proteins 0.000 description 3
- 108010013835 arginine glutamate Proteins 0.000 description 3
- 108010008355 arginyl-glutamine Proteins 0.000 description 3
- 239000008280 blood Substances 0.000 description 3
- 210000004369 blood Anatomy 0.000 description 3
- 239000000872 buffer Substances 0.000 description 3
- 238000004113 cell culture Methods 0.000 description 3
- 230000001413 cellular effect Effects 0.000 description 3
- 239000003795 chemical substances by application Substances 0.000 description 3
- 230000001627 detrimental effect Effects 0.000 description 3
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 3
- 229960000301 factor viii Drugs 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- 238000002347 injection Methods 0.000 description 3
- 239000007924 injection Substances 0.000 description 3
- 108010064235 lysylglycine Proteins 0.000 description 3
- 108010038320 lysylphenylalanine Proteins 0.000 description 3
- 108010017391 lysylvaline Proteins 0.000 description 3
- 239000013612 plasmid Substances 0.000 description 3
- 108010004914 prolylarginine Proteins 0.000 description 3
- 230000001105 regulatory effect Effects 0.000 description 3
- 239000011347 resin Substances 0.000 description 3
- 229920005989 resin Polymers 0.000 description 3
- 108010071207 serylmethionine Proteins 0.000 description 3
- 239000000126 substance Substances 0.000 description 3
- 108010051110 tyrosyl-lysine Proteins 0.000 description 3
- 241000701161 unidentified adenovirus Species 0.000 description 3
- QMOQBVOBWVNSNO-UHFFFAOYSA-N 2-[[2-[[2-[(2-azaniumylacetyl)amino]acetyl]amino]acetyl]amino]acetate Chemical compound NCC(=O)NCC(=O)NCC(=O)NCC(O)=O QMOQBVOBWVNSNO-UHFFFAOYSA-N 0.000 description 2
- 241000300529 Adeno-associated virus 13 Species 0.000 description 2
- 241000649044 Adeno-associated virus 9 Species 0.000 description 2
- DKJPOZOEBONHFS-ZLUOBGJFSA-N Ala-Ala-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O DKJPOZOEBONHFS-ZLUOBGJFSA-N 0.000 description 2
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 2
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 2
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 2
- JAMAWBXXKFGFGX-KZVJFYERSA-N Ala-Arg-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JAMAWBXXKFGFGX-KZVJFYERSA-N 0.000 description 2
- YAXNATKKPOWVCP-ZLUOBGJFSA-N Ala-Asn-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O YAXNATKKPOWVCP-ZLUOBGJFSA-N 0.000 description 2
- GFBLJMHGHAXGNY-ZLUOBGJFSA-N Ala-Asn-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O GFBLJMHGHAXGNY-ZLUOBGJFSA-N 0.000 description 2
- GORKKVHIBWAQHM-GCJQMDKQSA-N Ala-Asn-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GORKKVHIBWAQHM-GCJQMDKQSA-N 0.000 description 2
- GWFSQQNGMPGBEF-GHCJXIJMSA-N Ala-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)N GWFSQQNGMPGBEF-GHCJXIJMSA-N 0.000 description 2
- BUDNAJYVCUHLSV-ZLUOBGJFSA-N Ala-Asp-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O BUDNAJYVCUHLSV-ZLUOBGJFSA-N 0.000 description 2
- CZPAHAKGPDUIPJ-CIUDSAMLSA-N Ala-Gln-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(O)=O CZPAHAKGPDUIPJ-CIUDSAMLSA-N 0.000 description 2
- BGNLUHXLSAQYRQ-FXQIFTODSA-N Ala-Glu-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O BGNLUHXLSAQYRQ-FXQIFTODSA-N 0.000 description 2
- NHLAEBFGWPXFGI-WHFBIAKZSA-N Ala-Gly-Asn Chemical compound C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N NHLAEBFGWPXFGI-WHFBIAKZSA-N 0.000 description 2
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 2
- PNALXAODQKTNLV-JBDRJPRFSA-N Ala-Ile-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O PNALXAODQKTNLV-JBDRJPRFSA-N 0.000 description 2
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 2
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 2
- FFZJHQODAYHGPO-KZVJFYERSA-N Ala-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N FFZJHQODAYHGPO-KZVJFYERSA-N 0.000 description 2
- DYXOFPBJBAHWFY-JBDRJPRFSA-N Ala-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N DYXOFPBJBAHWFY-JBDRJPRFSA-N 0.000 description 2
- SYIFFFHSXBNPMC-UWJYBYFXSA-N Ala-Ser-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N SYIFFFHSXBNPMC-UWJYBYFXSA-N 0.000 description 2
- ARHJJAAWNWOACN-FXQIFTODSA-N Ala-Ser-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O ARHJJAAWNWOACN-FXQIFTODSA-N 0.000 description 2
- WNHNMKOFKCHKKD-BFHQHQDPSA-N Ala-Thr-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O WNHNMKOFKCHKKD-BFHQHQDPSA-N 0.000 description 2
- QOIGKCBMXUCDQU-KDXUFGMBSA-N Ala-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N)O QOIGKCBMXUCDQU-KDXUFGMBSA-N 0.000 description 2
- BHFOJPDOQPWJRN-XDTLVQLUSA-N Ala-Tyr-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCC(N)=O)C(O)=O BHFOJPDOQPWJRN-XDTLVQLUSA-N 0.000 description 2
- LYILPUNCKACNGF-NAKRPEOUSA-N Ala-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N LYILPUNCKACNGF-NAKRPEOUSA-N 0.000 description 2
- 102000005666 Apolipoprotein A-I Human genes 0.000 description 2
- 108010059886 Apolipoprotein A-I Proteins 0.000 description 2
- IASNWHAGGYTEKX-IUCAKERBSA-N Arg-Arg-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(O)=O IASNWHAGGYTEKX-IUCAKERBSA-N 0.000 description 2
- NONSEUUPKITYQT-BQBZGAKWSA-N Arg-Asn-Gly Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N)CN=C(N)N NONSEUUPKITYQT-BQBZGAKWSA-N 0.000 description 2
- IIABBYGHLYWVOS-FXQIFTODSA-N Arg-Asn-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O IIABBYGHLYWVOS-FXQIFTODSA-N 0.000 description 2
- ZTKHZAXGTFXUDD-VEVYYDQMSA-N Arg-Asn-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZTKHZAXGTFXUDD-VEVYYDQMSA-N 0.000 description 2
- HKRXJBBCQBAGIM-FXQIFTODSA-N Arg-Asp-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N)CN=C(N)N HKRXJBBCQBAGIM-FXQIFTODSA-N 0.000 description 2
- BGDILZXXDJCKPF-CIUDSAMLSA-N Arg-Gln-Cys Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CS)C(O)=O BGDILZXXDJCKPF-CIUDSAMLSA-N 0.000 description 2
- VNFWDYWTSHFRRG-SRVKXCTJSA-N Arg-Gln-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O VNFWDYWTSHFRRG-SRVKXCTJSA-N 0.000 description 2
- NKBQZKVMKJJDLX-SRVKXCTJSA-N Arg-Glu-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NKBQZKVMKJJDLX-SRVKXCTJSA-N 0.000 description 2
- OGUPCHKBOKJFMA-SRVKXCTJSA-N Arg-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N OGUPCHKBOKJFMA-SRVKXCTJSA-N 0.000 description 2
- RFXXUWGNVRJTNQ-QXEWZRGKSA-N Arg-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N RFXXUWGNVRJTNQ-QXEWZRGKSA-N 0.000 description 2
- OOIMKQRCPJBGPD-XUXIUFHCSA-N Arg-Ile-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O OOIMKQRCPJBGPD-XUXIUFHCSA-N 0.000 description 2
- FNXCAFKDGBROCU-STECZYCISA-N Arg-Ile-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FNXCAFKDGBROCU-STECZYCISA-N 0.000 description 2
- BTJVOUQWFXABOI-IHRRRGAJSA-N Arg-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCNC(N)=N BTJVOUQWFXABOI-IHRRRGAJSA-N 0.000 description 2
- GSUFZRURORXYTM-STQMWFEESA-N Arg-Phe-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 GSUFZRURORXYTM-STQMWFEESA-N 0.000 description 2
- FIQKRDXFTANIEJ-ULQDDVLXSA-N Arg-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FIQKRDXFTANIEJ-ULQDDVLXSA-N 0.000 description 2
- KZXPVYVSHUJCEO-ULQDDVLXSA-N Arg-Phe-Lys Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 KZXPVYVSHUJCEO-ULQDDVLXSA-N 0.000 description 2
- VUGWHBXPMAHEGZ-SRVKXCTJSA-N Arg-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N VUGWHBXPMAHEGZ-SRVKXCTJSA-N 0.000 description 2
- VRTWYUYCJGNFES-CIUDSAMLSA-N Arg-Ser-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O VRTWYUYCJGNFES-CIUDSAMLSA-N 0.000 description 2
- ZPWMEWYQBWSGAO-ZJDVBMNYSA-N Arg-Thr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZPWMEWYQBWSGAO-ZJDVBMNYSA-N 0.000 description 2
- DRDWXKWUSIKKOB-PJODQICGSA-N Arg-Trp-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O DRDWXKWUSIKKOB-PJODQICGSA-N 0.000 description 2
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 2
- BWMMKQPATDUYKB-IHRRRGAJSA-N Arg-Tyr-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=C(O)C=C1 BWMMKQPATDUYKB-IHRRRGAJSA-N 0.000 description 2
- CTAPSNCVKPOOSM-KKUMJFAQSA-N Arg-Tyr-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O CTAPSNCVKPOOSM-KKUMJFAQSA-N 0.000 description 2
- PTNFNTOBUDWHNZ-GUBZILKMSA-N Asn-Arg-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O PTNFNTOBUDWHNZ-GUBZILKMSA-N 0.000 description 2
- VKCOHFFSTKCXEQ-OLHMAJIHSA-N Asn-Asn-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VKCOHFFSTKCXEQ-OLHMAJIHSA-N 0.000 description 2
- PQAIOUVVZCOLJK-FXQIFTODSA-N Asn-Gln-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N PQAIOUVVZCOLJK-FXQIFTODSA-N 0.000 description 2
- VJTWLBMESLDOMK-WDSKDSINSA-N Asn-Gln-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VJTWLBMESLDOMK-WDSKDSINSA-N 0.000 description 2
- KUYKVGODHGHFDI-ACZMJKKPSA-N Asn-Gln-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O KUYKVGODHGHFDI-ACZMJKKPSA-N 0.000 description 2
- OWUCNXMFJRFOFI-BQBZGAKWSA-N Asn-Gly-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O OWUCNXMFJRFOFI-BQBZGAKWSA-N 0.000 description 2
- UDSVWSUXKYXSTR-QWRGUYRKSA-N Asn-Gly-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UDSVWSUXKYXSTR-QWRGUYRKSA-N 0.000 description 2
- MOHUTCNYQLMARY-GUBZILKMSA-N Asn-His-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N MOHUTCNYQLMARY-GUBZILKMSA-N 0.000 description 2
- IKLAUGBIDCDFOY-SRVKXCTJSA-N Asn-His-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O IKLAUGBIDCDFOY-SRVKXCTJSA-N 0.000 description 2
- PNHQRQTVBRDIEF-CIUDSAMLSA-N Asn-Leu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)N)N PNHQRQTVBRDIEF-CIUDSAMLSA-N 0.000 description 2
- FODVBOKTYKYRFJ-CIUDSAMLSA-N Asn-Lys-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)N)N FODVBOKTYKYRFJ-CIUDSAMLSA-N 0.000 description 2
- COWITDLVHMZSIW-CIUDSAMLSA-N Asn-Lys-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O COWITDLVHMZSIW-CIUDSAMLSA-N 0.000 description 2
- NLDNNZKUSLAYFW-NHCYSSNCSA-N Asn-Lys-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O NLDNNZKUSLAYFW-NHCYSSNCSA-N 0.000 description 2
- HGGIYWURFPGLIU-FXQIFTODSA-N Asn-Met-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(N)=O HGGIYWURFPGLIU-FXQIFTODSA-N 0.000 description 2
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 2
- ZJIFRAPZHAGLGR-MELADBBJSA-N Asn-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC(=O)N)N)C(=O)O ZJIFRAPZHAGLGR-MELADBBJSA-N 0.000 description 2
- YUOXLJYVSZYPBJ-CIUDSAMLSA-N Asn-Pro-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O YUOXLJYVSZYPBJ-CIUDSAMLSA-N 0.000 description 2
- IDUUACUJKUXKKD-VEVYYDQMSA-N Asn-Pro-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O IDUUACUJKUXKKD-VEVYYDQMSA-N 0.000 description 2
- GZXOUBTUAUAVHD-ACZMJKKPSA-N Asn-Ser-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GZXOUBTUAUAVHD-ACZMJKKPSA-N 0.000 description 2
- ZUFPUBYQYWCMDB-NUMRIWBASA-N Asn-Thr-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZUFPUBYQYWCMDB-NUMRIWBASA-N 0.000 description 2
- SYZWMVSXBZCOBZ-QXEWZRGKSA-N Asn-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)N)N SYZWMVSXBZCOBZ-QXEWZRGKSA-N 0.000 description 2
- GBAWQWASNGUNQF-ZLUOBGJFSA-N Asp-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N GBAWQWASNGUNQF-ZLUOBGJFSA-N 0.000 description 2
- XPGVTUBABLRGHY-BIIVOSGPSA-N Asp-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N XPGVTUBABLRGHY-BIIVOSGPSA-N 0.000 description 2
- AKPLMZMNJGNUKT-ZLUOBGJFSA-N Asp-Asp-Cys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CS)C(O)=O AKPLMZMNJGNUKT-ZLUOBGJFSA-N 0.000 description 2
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 2
- FTNVLGCFIJEMQT-CIUDSAMLSA-N Asp-Cys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N FTNVLGCFIJEMQT-CIUDSAMLSA-N 0.000 description 2
- NYQHSUGFEWDWPD-ACZMJKKPSA-N Asp-Gln-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N NYQHSUGFEWDWPD-ACZMJKKPSA-N 0.000 description 2
- YDJVIBMKAMQPPP-LAEOZQHASA-N Asp-Glu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O YDJVIBMKAMQPPP-LAEOZQHASA-N 0.000 description 2
- YNCHFVRXEQFPBY-BQBZGAKWSA-N Asp-Gly-Arg Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N YNCHFVRXEQFPBY-BQBZGAKWSA-N 0.000 description 2
- ZSVJVIOVABDTTL-YUMQZZPRSA-N Asp-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)O)N ZSVJVIOVABDTTL-YUMQZZPRSA-N 0.000 description 2
- KLYPOCBLKMPBIQ-GHCJXIJMSA-N Asp-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N KLYPOCBLKMPBIQ-GHCJXIJMSA-N 0.000 description 2
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 2
- YWLDTBBUHZJQHW-KKUMJFAQSA-N Asp-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N YWLDTBBUHZJQHW-KKUMJFAQSA-N 0.000 description 2
- WWOYXVBGHAHQBG-FXQIFTODSA-N Asp-Met-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O WWOYXVBGHAHQBG-FXQIFTODSA-N 0.000 description 2
- QJHOOKBAHRJPPX-QWRGUYRKSA-N Asp-Phe-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 QJHOOKBAHRJPPX-QWRGUYRKSA-N 0.000 description 2
- UCHSVZYJKJLPHF-BZSNNMDCSA-N Asp-Phe-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O UCHSVZYJKJLPHF-BZSNNMDCSA-N 0.000 description 2
- NWAHPBGBDIFUFD-KKUMJFAQSA-N Asp-Tyr-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O NWAHPBGBDIFUFD-KKUMJFAQSA-N 0.000 description 2
- 241000282465 Canis Species 0.000 description 2
- 101150044789 Cap gene Proteins 0.000 description 2
- ZFHXNNXMNLWKJH-HJPIBITLSA-N Cys-Tyr-Ile Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZFHXNNXMNLWKJH-HJPIBITLSA-N 0.000 description 2
- JRZMCSIUYGSJKP-ZKWXMUAHSA-N Cys-Val-Asn Chemical compound SC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O JRZMCSIUYGSJKP-ZKWXMUAHSA-N 0.000 description 2
- DGQJGBDBFVGLGL-ZKWXMUAHSA-N Cys-Val-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N DGQJGBDBFVGLGL-ZKWXMUAHSA-N 0.000 description 2
- 108010079245 Cystic Fibrosis Transmembrane Conductance Regulator Proteins 0.000 description 2
- 108090000695 Cytokines Proteins 0.000 description 2
- 102000004127 Cytokines Human genes 0.000 description 2
- 108010090461 DFG peptide Proteins 0.000 description 2
- 102000053602 DNA Human genes 0.000 description 2
- 241000255581 Drosophila <fruit fly, genus> Species 0.000 description 2
- 102000001039 Dystrophin Human genes 0.000 description 2
- 108010069091 Dystrophin Proteins 0.000 description 2
- 108010092408 Eosinophil Peroxidase Proteins 0.000 description 2
- 102100028471 Eosinophil peroxidase Human genes 0.000 description 2
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 2
- 102100027581 Forkhead box protein P3 Human genes 0.000 description 2
- 108091010837 Glial cell line-derived neurotrophic factor Proteins 0.000 description 2
- 102000034615 Glial cell line-derived neurotrophic factor Human genes 0.000 description 2
- INKFLNZBTSNFON-CIUDSAMLSA-N Gln-Ala-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O INKFLNZBTSNFON-CIUDSAMLSA-N 0.000 description 2
- SHERTACNJPYHAR-ACZMJKKPSA-N Gln-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O SHERTACNJPYHAR-ACZMJKKPSA-N 0.000 description 2
- SOBBAYVQSNXYPQ-ACZMJKKPSA-N Gln-Asn-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SOBBAYVQSNXYPQ-ACZMJKKPSA-N 0.000 description 2
- WLODHVXYKYHLJD-ACZMJKKPSA-N Gln-Asp-Ser Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CO)C(=O)O)N WLODHVXYKYHLJD-ACZMJKKPSA-N 0.000 description 2
- CITDWMLWXNUQKD-FXQIFTODSA-N Gln-Gln-Asn Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CITDWMLWXNUQKD-FXQIFTODSA-N 0.000 description 2
- GPISLLFQNHELLK-DCAQKATOSA-N Gln-Gln-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N GPISLLFQNHELLK-DCAQKATOSA-N 0.000 description 2
- BLOXULLYFRGYKZ-GUBZILKMSA-N Gln-Glu-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O BLOXULLYFRGYKZ-GUBZILKMSA-N 0.000 description 2
- MAGNEQBFSBREJL-DCAQKATOSA-N Gln-Glu-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N MAGNEQBFSBREJL-DCAQKATOSA-N 0.000 description 2
- JXBZEDIQFFCHPZ-PEFMBERDSA-N Gln-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JXBZEDIQFFCHPZ-PEFMBERDSA-N 0.000 description 2
- MWERYIXRDZDXOA-QEWYBTABSA-N Gln-Ile-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MWERYIXRDZDXOA-QEWYBTABSA-N 0.000 description 2
- LGIKBBLQVSWUGK-DCAQKATOSA-N Gln-Leu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGIKBBLQVSWUGK-DCAQKATOSA-N 0.000 description 2
- BZULIEARJFRINC-IHRRRGAJSA-N Gln-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N BZULIEARJFRINC-IHRRRGAJSA-N 0.000 description 2
- JILRMFFFCHUUTJ-ACZMJKKPSA-N Gln-Ser-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O JILRMFFFCHUUTJ-ACZMJKKPSA-N 0.000 description 2
- YRHZWVKUFWCEPW-GLLZPBPUSA-N Gln-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O YRHZWVKUFWCEPW-GLLZPBPUSA-N 0.000 description 2
- YMCPEHDGTRUOHO-SXNHZJKMSA-N Gln-Trp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CCC(=O)N)N YMCPEHDGTRUOHO-SXNHZJKMSA-N 0.000 description 2
- VYOILACOFPPNQH-UMNHJUIQSA-N Gln-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N VYOILACOFPPNQH-UMNHJUIQSA-N 0.000 description 2
- HNAUFGBKJLTWQE-IFFSRLJSSA-N Gln-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCC(=O)N)N)O HNAUFGBKJLTWQE-IFFSRLJSSA-N 0.000 description 2
- ITYRYNUZHPNCIK-GUBZILKMSA-N Glu-Ala-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O ITYRYNUZHPNCIK-GUBZILKMSA-N 0.000 description 2
- KEBACWCLVOXFNC-DCAQKATOSA-N Glu-Arg-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O KEBACWCLVOXFNC-DCAQKATOSA-N 0.000 description 2
- WOSRKEJQESVHGA-CIUDSAMLSA-N Glu-Arg-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O WOSRKEJQESVHGA-CIUDSAMLSA-N 0.000 description 2
- SBYVDRJAXWSXQL-AVGNSLFASA-N Glu-Asn-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SBYVDRJAXWSXQL-AVGNSLFASA-N 0.000 description 2
- RDDSZZJOKDVPAE-ACZMJKKPSA-N Glu-Asn-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDDSZZJOKDVPAE-ACZMJKKPSA-N 0.000 description 2
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 2
- QPRZKNOOOBWXSU-CIUDSAMLSA-N Glu-Asp-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N QPRZKNOOOBWXSU-CIUDSAMLSA-N 0.000 description 2
- WATXSTJXNBOHKD-LAEOZQHASA-N Glu-Asp-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O WATXSTJXNBOHKD-LAEOZQHASA-N 0.000 description 2
- ZXQPJYWZSFGWJB-AVGNSLFASA-N Glu-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N ZXQPJYWZSFGWJB-AVGNSLFASA-N 0.000 description 2
- HUFCEIHAFNVSNR-IHRRRGAJSA-N Glu-Gln-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HUFCEIHAFNVSNR-IHRRRGAJSA-N 0.000 description 2
- NKLRYVLERDYDBI-FXQIFTODSA-N Glu-Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKLRYVLERDYDBI-FXQIFTODSA-N 0.000 description 2
- XOIATPHFYVWFEU-DCAQKATOSA-N Glu-His-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XOIATPHFYVWFEU-DCAQKATOSA-N 0.000 description 2
- JGHNIWVNCAOVRO-DCAQKATOSA-N Glu-His-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O JGHNIWVNCAOVRO-DCAQKATOSA-N 0.000 description 2
- DVLZZEPUNFEUBW-AVGNSLFASA-N Glu-His-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N DVLZZEPUNFEUBW-AVGNSLFASA-N 0.000 description 2
- INGJLBQKTRJLFO-UKJIMTQDSA-N Glu-Ile-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O INGJLBQKTRJLFO-UKJIMTQDSA-N 0.000 description 2
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 2
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 2
- MFNUFCFRAZPJFW-JYJNAYRXSA-N Glu-Lys-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MFNUFCFRAZPJFW-JYJNAYRXSA-N 0.000 description 2
- SUIAHERNFYRBDZ-GVXVVHGQSA-N Glu-Lys-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O SUIAHERNFYRBDZ-GVXVVHGQSA-N 0.000 description 2
- QNJNPKSWAHPYGI-JYJNAYRXSA-N Glu-Phe-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 QNJNPKSWAHPYGI-JYJNAYRXSA-N 0.000 description 2
- UDEPRBFQTWGLCW-CIUDSAMLSA-N Glu-Pro-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O UDEPRBFQTWGLCW-CIUDSAMLSA-N 0.000 description 2
- BPLNJYHNAJVLRT-ACZMJKKPSA-N Glu-Ser-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O BPLNJYHNAJVLRT-ACZMJKKPSA-N 0.000 description 2
- MWTGQXBHVRTCOR-GLLZPBPUSA-N Glu-Thr-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MWTGQXBHVRTCOR-GLLZPBPUSA-N 0.000 description 2
- XAXJIUAWAFVADB-VJBMBRPKSA-N Glu-Trp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XAXJIUAWAFVADB-VJBMBRPKSA-N 0.000 description 2
- NTHIHAUEXVTXQG-KKUMJFAQSA-N Glu-Tyr-Arg Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O NTHIHAUEXVTXQG-KKUMJFAQSA-N 0.000 description 2
- HQTDNEZTGZUWSY-XVKPBYJWSA-N Glu-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)NCC(O)=O HQTDNEZTGZUWSY-XVKPBYJWSA-N 0.000 description 2
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 2
- RMWAOBGCZZSJHE-UMNHJUIQSA-N Glu-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N RMWAOBGCZZSJHE-UMNHJUIQSA-N 0.000 description 2
- PYTZFYUXZZHOAD-WHFBIAKZSA-N Gly-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)CN PYTZFYUXZZHOAD-WHFBIAKZSA-N 0.000 description 2
- FKJQNJCQTKUBCD-XPUUQOCRSA-N Gly-Ala-His Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O FKJQNJCQTKUBCD-XPUUQOCRSA-N 0.000 description 2
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 2
- JRDYDYXZKFNNRQ-XPUUQOCRSA-N Gly-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN JRDYDYXZKFNNRQ-XPUUQOCRSA-N 0.000 description 2
- JXYMPBCYRKWJEE-BQBZGAKWSA-N Gly-Arg-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JXYMPBCYRKWJEE-BQBZGAKWSA-N 0.000 description 2
- OGCIHJPYKVSMTE-YUMQZZPRSA-N Gly-Arg-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O OGCIHJPYKVSMTE-YUMQZZPRSA-N 0.000 description 2
- KRRMJKMGWWXWDW-STQMWFEESA-N Gly-Arg-Phe Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KRRMJKMGWWXWDW-STQMWFEESA-N 0.000 description 2
- XUORRGAFUQIMLC-STQMWFEESA-N Gly-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)CN)O XUORRGAFUQIMLC-STQMWFEESA-N 0.000 description 2
- UXJHNZODTMHWRD-WHFBIAKZSA-N Gly-Asn-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O UXJHNZODTMHWRD-WHFBIAKZSA-N 0.000 description 2
- XCLCVBYNGXEVDU-WHFBIAKZSA-N Gly-Asn-Ser Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O XCLCVBYNGXEVDU-WHFBIAKZSA-N 0.000 description 2
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 2
- BYYNJRSNDARRBX-YFKPBYRVSA-N Gly-Gln-Gly Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O BYYNJRSNDARRBX-YFKPBYRVSA-N 0.000 description 2
- AQLHORCVPGXDJW-IUCAKERBSA-N Gly-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN AQLHORCVPGXDJW-IUCAKERBSA-N 0.000 description 2
- NPSWCZIRBAYNSB-JHEQGTHGSA-N Gly-Gln-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NPSWCZIRBAYNSB-JHEQGTHGSA-N 0.000 description 2
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 2
- BUEFQXUHTUZXHR-LURJTMIESA-N Gly-Gly-Pro zwitterion Chemical compound NCC(=O)NCC(=O)N1CCC[C@H]1C(O)=O BUEFQXUHTUZXHR-LURJTMIESA-N 0.000 description 2
- UQJNXZSSGQIPIQ-FBCQKBJTSA-N Gly-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)CN UQJNXZSSGQIPIQ-FBCQKBJTSA-N 0.000 description 2
- OLPPXYMMIARYAL-QMMMGPOBSA-N Gly-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)CN OLPPXYMMIARYAL-QMMMGPOBSA-N 0.000 description 2
- HPAIKDPJURGQLN-KBPBESRZSA-N Gly-His-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CNC=N1 HPAIKDPJURGQLN-KBPBESRZSA-N 0.000 description 2
- HMHRTKOWRUPPNU-RCOVLWMOSA-N Gly-Ile-Gly Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O HMHRTKOWRUPPNU-RCOVLWMOSA-N 0.000 description 2
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 2
- ULZCYBYDTUMHNF-IUCAKERBSA-N Gly-Leu-Glu Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ULZCYBYDTUMHNF-IUCAKERBSA-N 0.000 description 2
- CLNSYANKYVMZNM-UWVGGRQHSA-N Gly-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N CLNSYANKYVMZNM-UWVGGRQHSA-N 0.000 description 2
- FHQRLHFYVZAQHU-IUCAKERBSA-N Gly-Lys-Gln Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O FHQRLHFYVZAQHU-IUCAKERBSA-N 0.000 description 2
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 2
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 2
- SCJJPCQUJYPHRZ-BQBZGAKWSA-N Gly-Pro-Asn Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O SCJJPCQUJYPHRZ-BQBZGAKWSA-N 0.000 description 2
- FFJQHWKSGAWSTJ-BFHQHQDPSA-N Gly-Thr-Ala Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O FFJQHWKSGAWSTJ-BFHQHQDPSA-N 0.000 description 2
- YXTFLTJYLIAZQG-FJXKBIBVSA-N Gly-Thr-Arg Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YXTFLTJYLIAZQG-FJXKBIBVSA-N 0.000 description 2
- FOKISINOENBSDM-WLTAIBSBSA-N Gly-Thr-Tyr Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O FOKISINOENBSDM-WLTAIBSBSA-N 0.000 description 2
- CUVBTVWFVIIDOC-YEPSODPASA-N Gly-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)CN CUVBTVWFVIIDOC-YEPSODPASA-N 0.000 description 2
- JKSMZVCGQWVTBW-STQMWFEESA-N Gly-Trp-Asn Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O JKSMZVCGQWVTBW-STQMWFEESA-N 0.000 description 2
- DNVDEMWIYLVIQU-RCOVLWMOSA-N Gly-Val-Asp Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O DNVDEMWIYLVIQU-RCOVLWMOSA-N 0.000 description 2
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 2
- LDTJBEOANMQRJE-CIUDSAMLSA-N His-Cys-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N LDTJBEOANMQRJE-CIUDSAMLSA-N 0.000 description 2
- VBOFRJNDIOPNDO-YUMQZZPRSA-N His-Gly-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N VBOFRJNDIOPNDO-YUMQZZPRSA-N 0.000 description 2
- CMMBEMZGNGYJRJ-IHRRRGAJSA-N His-Met-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N CMMBEMZGNGYJRJ-IHRRRGAJSA-N 0.000 description 2
- LNDVNHOSZQPJGI-AVGNSLFASA-N His-Pro-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CN=CN1 LNDVNHOSZQPJGI-AVGNSLFASA-N 0.000 description 2
- DGLAHESNTJWGDO-SRVKXCTJSA-N His-Ser-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N DGLAHESNTJWGDO-SRVKXCTJSA-N 0.000 description 2
- MRVZCDSYLJXKKX-ACRUOGEOSA-N His-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CN=CN3)N MRVZCDSYLJXKKX-ACRUOGEOSA-N 0.000 description 2
- 241000282412 Homo Species 0.000 description 2
- 101000861452 Homo sapiens Forkhead box protein P3 Proteins 0.000 description 2
- 101000599951 Homo sapiens Insulin-like growth factor I Proteins 0.000 description 2
- 101000959820 Homo sapiens Interferon alpha-1/13 Proteins 0.000 description 2
- 101001067140 Homo sapiens Porphobilinogen deaminase Proteins 0.000 description 2
- 101000629622 Homo sapiens Serine-pyruvate aminotransferase Proteins 0.000 description 2
- 101001104102 Homo sapiens X-linked retinitis pigmentosa GTPase regulator Proteins 0.000 description 2
- GRRNUXAQVGOGFE-UHFFFAOYSA-N Hygromycin-B Natural products OC1C(NC)CC(N)C(O)C1OC1C2OC3(C(C(O)C(O)C(C(N)CO)O3)O)OC2C(O)C(CO)O1 GRRNUXAQVGOGFE-UHFFFAOYSA-N 0.000 description 2
- JRHFQUPIZOYKQP-KBIXCLLPSA-N Ile-Ala-Glu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O JRHFQUPIZOYKQP-KBIXCLLPSA-N 0.000 description 2
- DMHGKBGOUAJRHU-RVMXOQNASA-N Ile-Arg-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N DMHGKBGOUAJRHU-RVMXOQNASA-N 0.000 description 2
- DMHGKBGOUAJRHU-UHFFFAOYSA-N Ile-Arg-Pro Natural products CCC(C)C(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O DMHGKBGOUAJRHU-UHFFFAOYSA-N 0.000 description 2
- AMSYMDIIIRJRKZ-HJPIBITLSA-N Ile-His-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N AMSYMDIIIRJRKZ-HJPIBITLSA-N 0.000 description 2
- AKOYRLRUFBZOSP-BJDJZHNGSA-N Ile-Lys-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)O)N AKOYRLRUFBZOSP-BJDJZHNGSA-N 0.000 description 2
- FJWALBCCVIHZBS-QXEWZRGKSA-N Ile-Met-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)NCC(=O)O)N FJWALBCCVIHZBS-QXEWZRGKSA-N 0.000 description 2
- MSASLZGZQAXVFP-PEDHHIEDSA-N Ile-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N MSASLZGZQAXVFP-PEDHHIEDSA-N 0.000 description 2
- NPAYJTAXWXJKLO-NAKRPEOUSA-N Ile-Met-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N NPAYJTAXWXJKLO-NAKRPEOUSA-N 0.000 description 2
- OTSVBELRDMSPKY-PCBIJLKTSA-N Ile-Phe-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OTSVBELRDMSPKY-PCBIJLKTSA-N 0.000 description 2
- SAVXZJYTTQQQDD-QEWYBTABSA-N Ile-Phe-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SAVXZJYTTQQQDD-QEWYBTABSA-N 0.000 description 2
- UAELWXJFLZBKQS-WHOFXGATSA-N Ile-Phe-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O UAELWXJFLZBKQS-WHOFXGATSA-N 0.000 description 2
- BJECXJHLUJXPJQ-PYJNHQTQSA-N Ile-Pro-His Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N BJECXJHLUJXPJQ-PYJNHQTQSA-N 0.000 description 2
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 2
- NAFIFZNBSPWYOO-RWRJDSDZSA-N Ile-Thr-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NAFIFZNBSPWYOO-RWRJDSDZSA-N 0.000 description 2
- WCNWGAUZWWSYDG-SVSWQMSJSA-N Ile-Thr-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)O)N WCNWGAUZWWSYDG-SVSWQMSJSA-N 0.000 description 2
- RTSQPLLOYSGMKM-DSYPUSFNSA-N Ile-Trp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(C)C)C(=O)O)N RTSQPLLOYSGMKM-DSYPUSFNSA-N 0.000 description 2
- XDVKZSJODLMNLJ-GGQYPGDFSA-N Ile-Trp-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CC=3C4=CC=CC=C4NC=3)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 XDVKZSJODLMNLJ-GGQYPGDFSA-N 0.000 description 2
- ZUWSVOYKBCHLRR-MGHWNKPDSA-N Ile-Tyr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZUWSVOYKBCHLRR-MGHWNKPDSA-N 0.000 description 2
- DLEBSGAVWRPTIX-PEDHHIEDSA-N Ile-Val-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)[C@@H](C)CC DLEBSGAVWRPTIX-PEDHHIEDSA-N 0.000 description 2
- JZBVBOKASHNXAD-NAKRPEOUSA-N Ile-Val-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N JZBVBOKASHNXAD-NAKRPEOUSA-N 0.000 description 2
- APQYGMBHIVXFML-OSUNSFLBSA-N Ile-Val-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N APQYGMBHIVXFML-OSUNSFLBSA-N 0.000 description 2
- 108090001061 Insulin Proteins 0.000 description 2
- 102000004877 Insulin Human genes 0.000 description 2
- 102100037852 Insulin-like growth factor I Human genes 0.000 description 2
- 102100040019 Interferon alpha-1/13 Human genes 0.000 description 2
- 102000003814 Interleukin-10 Human genes 0.000 description 2
- 108090000174 Interleukin-10 Proteins 0.000 description 2
- 108010063738 Interleukins Proteins 0.000 description 2
- 102000015696 Interleukins Human genes 0.000 description 2
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 2
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 2
- CZCSUZMIRKFFFA-CIUDSAMLSA-N Leu-Ala-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O CZCSUZMIRKFFFA-CIUDSAMLSA-N 0.000 description 2
- ZRLUISBDKUWAIZ-CIUDSAMLSA-N Leu-Ala-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O ZRLUISBDKUWAIZ-CIUDSAMLSA-N 0.000 description 2
- YKNBJXOJTURHCU-DCAQKATOSA-N Leu-Asp-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKNBJXOJTURHCU-DCAQKATOSA-N 0.000 description 2
- ZDSNOSQHMJBRQN-SRVKXCTJSA-N Leu-Asp-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZDSNOSQHMJBRQN-SRVKXCTJSA-N 0.000 description 2
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 2
- CQGSYZCULZMEDE-UHFFFAOYSA-N Leu-Gln-Pro Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)N1CCCC1C(O)=O CQGSYZCULZMEDE-UHFFFAOYSA-N 0.000 description 2
- HFBCHNRFRYLZNV-GUBZILKMSA-N Leu-Glu-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HFBCHNRFRYLZNV-GUBZILKMSA-N 0.000 description 2
- BABSVXFGKFLIGW-UWVGGRQHSA-N Leu-Gly-Arg Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N BABSVXFGKFLIGW-UWVGGRQHSA-N 0.000 description 2
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 2
- AUBMZAMQCOYSIC-MNXVOIDGSA-N Leu-Ile-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O AUBMZAMQCOYSIC-MNXVOIDGSA-N 0.000 description 2
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 2
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 2
- UCXQIIIFOOGYEM-ULQDDVLXSA-N Leu-Pro-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UCXQIIIFOOGYEM-ULQDDVLXSA-N 0.000 description 2
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 2
- BGGTYDNTOYRTTR-MEYUZBJRSA-N Leu-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC(C)C)N)O BGGTYDNTOYRTTR-MEYUZBJRSA-N 0.000 description 2
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 2
- IXHKPDJKKCUKHS-GARJFASQSA-N Lys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IXHKPDJKKCUKHS-GARJFASQSA-N 0.000 description 2
- FUKDBQGFSJUXGX-RWMBFGLXSA-N Lys-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)C(=O)O FUKDBQGFSJUXGX-RWMBFGLXSA-N 0.000 description 2
- NTSPQIONFJUMJV-AVGNSLFASA-N Lys-Arg-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O NTSPQIONFJUMJV-AVGNSLFASA-N 0.000 description 2
- JBRWKVANRYPCAF-XIRDDKMYSA-N Lys-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N JBRWKVANRYPCAF-XIRDDKMYSA-N 0.000 description 2
- IBQMEXQYZMVIFU-SRVKXCTJSA-N Lys-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N IBQMEXQYZMVIFU-SRVKXCTJSA-N 0.000 description 2
- MWVUEPNEPWMFBD-SRVKXCTJSA-N Lys-Cys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@H](C(O)=O)CCCCN MWVUEPNEPWMFBD-SRVKXCTJSA-N 0.000 description 2
- MQMIRLVJXQNTRJ-SDDRHHMPSA-N Lys-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O MQMIRLVJXQNTRJ-SDDRHHMPSA-N 0.000 description 2
- HEWWNLVEWBJBKA-WDCWCFNPSA-N Lys-Gln-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN HEWWNLVEWBJBKA-WDCWCFNPSA-N 0.000 description 2
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 2
- GHOIOYHDDKXIDX-SZMVWBNQSA-N Lys-Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 GHOIOYHDDKXIDX-SZMVWBNQSA-N 0.000 description 2
- KEPWSUPUFAPBRF-DKIMLUQUSA-N Lys-Ile-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KEPWSUPUFAPBRF-DKIMLUQUSA-N 0.000 description 2
- ONPDTSFZAIWMDI-AVGNSLFASA-N Lys-Leu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ONPDTSFZAIWMDI-AVGNSLFASA-N 0.000 description 2
- UQRZFMQQXXJTTF-AVGNSLFASA-N Lys-Lys-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O UQRZFMQQXXJTTF-AVGNSLFASA-N 0.000 description 2
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 2
- VSTNAUBHKQPVJX-IHRRRGAJSA-N Lys-Met-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O VSTNAUBHKQPVJX-IHRRRGAJSA-N 0.000 description 2
- KFSALEZVQJYHCE-AVGNSLFASA-N Lys-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCCCN)N KFSALEZVQJYHCE-AVGNSLFASA-N 0.000 description 2
- IPTUBUUIFRZMJK-ACRUOGEOSA-N Lys-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 IPTUBUUIFRZMJK-ACRUOGEOSA-N 0.000 description 2
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 2
- PLOUVAYOMTYJRG-JXUBOQSCSA-N Lys-Thr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PLOUVAYOMTYJRG-JXUBOQSCSA-N 0.000 description 2
- JHNOXVASMSXSNB-WEDXCCLWSA-N Lys-Thr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JHNOXVASMSXSNB-WEDXCCLWSA-N 0.000 description 2
- RPWTZTBIFGENIA-VOAKCMCISA-N Lys-Thr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RPWTZTBIFGENIA-VOAKCMCISA-N 0.000 description 2
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 2
- HUKLXYYPZWPXCC-KZVJFYERSA-N Met-Ala-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HUKLXYYPZWPXCC-KZVJFYERSA-N 0.000 description 2
- SJDQOYTYNGZZJX-SRVKXCTJSA-N Met-Glu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SJDQOYTYNGZZJX-SRVKXCTJSA-N 0.000 description 2
- DGNZGCQSVGGYJS-BQBZGAKWSA-N Met-Gly-Asp Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O DGNZGCQSVGGYJS-BQBZGAKWSA-N 0.000 description 2
- UZWMJZSOXGOVIN-LURJTMIESA-N Met-Gly-Gly Chemical compound CSCC[C@H](N)C(=O)NCC(=O)NCC(O)=O UZWMJZSOXGOVIN-LURJTMIESA-N 0.000 description 2
- ORRNBLTZBBESPN-HJWJTTGWSA-N Met-Ile-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ORRNBLTZBBESPN-HJWJTTGWSA-N 0.000 description 2
- VSJAPSMRFYUOKS-IUCAKERBSA-N Met-Pro-Gly Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O VSJAPSMRFYUOKS-IUCAKERBSA-N 0.000 description 2
- HLZORBMOISUNIV-DCAQKATOSA-N Met-Ser-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C HLZORBMOISUNIV-DCAQKATOSA-N 0.000 description 2
- FIZZULTXMVEIAA-IHRRRGAJSA-N Met-Ser-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 FIZZULTXMVEIAA-IHRRRGAJSA-N 0.000 description 2
- RIIFMEBFDDXGCV-VEVYYDQMSA-N Met-Thr-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O RIIFMEBFDDXGCV-VEVYYDQMSA-N 0.000 description 2
- KYXDADPHSNFWQX-VEVYYDQMSA-N Met-Thr-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O KYXDADPHSNFWQX-VEVYYDQMSA-N 0.000 description 2
- QAVZUKIPOMBLMC-AVGNSLFASA-N Met-Val-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C QAVZUKIPOMBLMC-AVGNSLFASA-N 0.000 description 2
- 101001055320 Myxine glutinosa Insulin-like growth factor Proteins 0.000 description 2
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 2
- NEHSHYOUIWBYSA-DCPHZVHLSA-N Phe-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=CC=C3)N NEHSHYOUIWBYSA-DCPHZVHLSA-N 0.000 description 2
- HHOOEUSPFGPZFP-QWRGUYRKSA-N Phe-Asn-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HHOOEUSPFGPZFP-QWRGUYRKSA-N 0.000 description 2
- OYQBFWWQSVIHBN-FHWLQOOXSA-N Phe-Glu-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O OYQBFWWQSVIHBN-FHWLQOOXSA-N 0.000 description 2
- QPVFUAUFEBPIPT-CDMKHQONSA-N Phe-Gly-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O QPVFUAUFEBPIPT-CDMKHQONSA-N 0.000 description 2
- BYAIIACBWBOJCU-URLPEUOOSA-N Phe-Ile-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BYAIIACBWBOJCU-URLPEUOOSA-N 0.000 description 2
- GRVMHFCZUIYNKQ-UFYCRDLUSA-N Phe-Phe-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O GRVMHFCZUIYNKQ-UFYCRDLUSA-N 0.000 description 2
- YVXPUUOTMVBKDO-IHRRRGAJSA-N Phe-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CS)C(=O)O YVXPUUOTMVBKDO-IHRRRGAJSA-N 0.000 description 2
- ZVRJWDUPIDMHDN-ULQDDVLXSA-N Phe-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 ZVRJWDUPIDMHDN-ULQDDVLXSA-N 0.000 description 2
- ZJPGOXWRFNKIQL-JYJNAYRXSA-N Phe-Pro-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CC=CC=C1 ZJPGOXWRFNKIQL-JYJNAYRXSA-N 0.000 description 2
- AFNJAQVMTIQTCB-DLOVCJGASA-N Phe-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 AFNJAQVMTIQTCB-DLOVCJGASA-N 0.000 description 2
- XDMMOISUAHXXFD-SRVKXCTJSA-N Phe-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O XDMMOISUAHXXFD-SRVKXCTJSA-N 0.000 description 2
- CXMSESHALPOLRE-MEYUZBJRSA-N Phe-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O CXMSESHALPOLRE-MEYUZBJRSA-N 0.000 description 2
- SHUFSZDAIPLZLF-BEAPCOKYSA-N Phe-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O SHUFSZDAIPLZLF-BEAPCOKYSA-N 0.000 description 2
- VGTJSEYTVMAASM-RPTUDFQQSA-N Phe-Thr-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VGTJSEYTVMAASM-RPTUDFQQSA-N 0.000 description 2
- MSSXKZBDKZAHCX-UNQGMJICSA-N Phe-Thr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O MSSXKZBDKZAHCX-UNQGMJICSA-N 0.000 description 2
- QUUCAHIYARMNBL-FHWLQOOXSA-N Phe-Tyr-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N QUUCAHIYARMNBL-FHWLQOOXSA-N 0.000 description 2
- AGTHXWTYCLLYMC-FHWLQOOXSA-N Phe-Tyr-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=CC=C1 AGTHXWTYCLLYMC-FHWLQOOXSA-N 0.000 description 2
- GCFNFKNPCMBHNT-IRXDYDNUSA-N Phe-Tyr-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)NCC(=O)O)N GCFNFKNPCMBHNT-IRXDYDNUSA-N 0.000 description 2
- ZOGICTVLQDWPER-UFYCRDLUSA-N Phe-Tyr-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O ZOGICTVLQDWPER-UFYCRDLUSA-N 0.000 description 2
- GOUWCZRDTWTODO-YDHLFZDLSA-N Phe-Val-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O GOUWCZRDTWTODO-YDHLFZDLSA-N 0.000 description 2
- JSGWNFKWZNPDAV-YDHLFZDLSA-N Phe-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JSGWNFKWZNPDAV-YDHLFZDLSA-N 0.000 description 2
- 102100034391 Porphobilinogen deaminase Human genes 0.000 description 2
- 241000288906 Primates Species 0.000 description 2
- DBALDZKOTNSBFM-FXQIFTODSA-N Pro-Ala-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DBALDZKOTNSBFM-FXQIFTODSA-N 0.000 description 2
- DRVIASBABBMZTF-GUBZILKMSA-N Pro-Ala-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@@H]1CCCN1 DRVIASBABBMZTF-GUBZILKMSA-N 0.000 description 2
- FUVBEZJCRMHWEM-FXQIFTODSA-N Pro-Asn-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FUVBEZJCRMHWEM-FXQIFTODSA-N 0.000 description 2
- RETPETNFPLNLRV-JYJNAYRXSA-N Pro-Asn-Trp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O RETPETNFPLNLRV-JYJNAYRXSA-N 0.000 description 2
- MLQVJYMFASXBGZ-IHRRRGAJSA-N Pro-Asn-Tyr Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O MLQVJYMFASXBGZ-IHRRRGAJSA-N 0.000 description 2
- ZYBUKTMPPFQSHL-JYJNAYRXSA-N Pro-Asp-Trp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O ZYBUKTMPPFQSHL-JYJNAYRXSA-N 0.000 description 2
- ZBAGOWGNNAXMOY-IHRRRGAJSA-N Pro-Cys-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZBAGOWGNNAXMOY-IHRRRGAJSA-N 0.000 description 2
- WGAQWMRJUFQXMF-ZPFDUUQYSA-N Pro-Gln-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WGAQWMRJUFQXMF-ZPFDUUQYSA-N 0.000 description 2
- UEHYFUCOGHWASA-HJGDQZAQSA-N Pro-Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 UEHYFUCOGHWASA-HJGDQZAQSA-N 0.000 description 2
- FEPSEIDIPBMIOS-QXEWZRGKSA-N Pro-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEPSEIDIPBMIOS-QXEWZRGKSA-N 0.000 description 2
- UIMCLYYSUCIUJM-UWVGGRQHSA-N Pro-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 UIMCLYYSUCIUJM-UWVGGRQHSA-N 0.000 description 2
- FEVDNIBDCRKMER-IUCAKERBSA-N Pro-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEVDNIBDCRKMER-IUCAKERBSA-N 0.000 description 2
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 2
- BODDREDDDRZUCF-QTKMDUPCSA-N Pro-His-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@@H]2CCCN2)O BODDREDDDRZUCF-QTKMDUPCSA-N 0.000 description 2
- VTFXTWDFPTWNJY-RHYQMDGZSA-N Pro-Leu-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VTFXTWDFPTWNJY-RHYQMDGZSA-N 0.000 description 2
- SUENWIFTSTWUKD-AVGNSLFASA-N Pro-Leu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SUENWIFTSTWUKD-AVGNSLFASA-N 0.000 description 2
- JUJCUYWRJMFJJF-AVGNSLFASA-N Pro-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 JUJCUYWRJMFJJF-AVGNSLFASA-N 0.000 description 2
- DWGFLKQSGRUQTI-IHRRRGAJSA-N Pro-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 DWGFLKQSGRUQTI-IHRRRGAJSA-N 0.000 description 2
- WIPAMEKBSHNFQE-IUCAKERBSA-N Pro-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@@H]1CCCN1 WIPAMEKBSHNFQE-IUCAKERBSA-N 0.000 description 2
- AJBQTGZIZQXBLT-STQMWFEESA-N Pro-Phe-Gly Chemical compound C([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 AJBQTGZIZQXBLT-STQMWFEESA-N 0.000 description 2
- FHZJRBVMLGOHBX-GUBZILKMSA-N Pro-Pro-Asp Chemical compound OC(=O)C[C@H](NC(=O)[C@@H]1CCCN1C(=O)[C@@H]1CCCN1)C(O)=O FHZJRBVMLGOHBX-GUBZILKMSA-N 0.000 description 2
- GMJDSFYVTAMIBF-FXQIFTODSA-N Pro-Ser-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GMJDSFYVTAMIBF-FXQIFTODSA-N 0.000 description 2
- MKGIILKDUGDRRO-FXQIFTODSA-N Pro-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 MKGIILKDUGDRRO-FXQIFTODSA-N 0.000 description 2
- KWMZPPWYBVZIER-XGEHTFHBSA-N Pro-Ser-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KWMZPPWYBVZIER-XGEHTFHBSA-N 0.000 description 2
- 241000169446 Promethis Species 0.000 description 2
- 108020005067 RNA Splice Sites Proteins 0.000 description 2
- 229920002684 Sepharose Polymers 0.000 description 2
- BKOKTRCZXRIQPX-ZLUOBGJFSA-N Ser-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N BKOKTRCZXRIQPX-ZLUOBGJFSA-N 0.000 description 2
- IDQFQFVEWMWRQQ-DLOVCJGASA-N Ser-Ala-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IDQFQFVEWMWRQQ-DLOVCJGASA-N 0.000 description 2
- HQTKVSCNCDLXSX-BQBZGAKWSA-N Ser-Arg-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O HQTKVSCNCDLXSX-BQBZGAKWSA-N 0.000 description 2
- VQBLHWSPVYYZTB-DCAQKATOSA-N Ser-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N VQBLHWSPVYYZTB-DCAQKATOSA-N 0.000 description 2
- XVAUJOAYHWWNQF-ZLUOBGJFSA-N Ser-Asn-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O XVAUJOAYHWWNQF-ZLUOBGJFSA-N 0.000 description 2
- OBXVZEAMXFSGPU-FXQIFTODSA-N Ser-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N)CN=C(N)N OBXVZEAMXFSGPU-FXQIFTODSA-N 0.000 description 2
- UBRXAVQWXOWRSJ-ZLUOBGJFSA-N Ser-Asn-Asp Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CO)N)C(=O)N UBRXAVQWXOWRSJ-ZLUOBGJFSA-N 0.000 description 2
- DKKGAAJTDKHWOD-BIIVOSGPSA-N Ser-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N)C(=O)O DKKGAAJTDKHWOD-BIIVOSGPSA-N 0.000 description 2
- ICHZYBVODUVUKN-SRVKXCTJSA-N Ser-Asn-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ICHZYBVODUVUKN-SRVKXCTJSA-N 0.000 description 2
- CDVFZMOFNJPUDD-ACZMJKKPSA-N Ser-Gln-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CDVFZMOFNJPUDD-ACZMJKKPSA-N 0.000 description 2
- XWCYBVBLJRWOFR-WDSKDSINSA-N Ser-Gln-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O XWCYBVBLJRWOFR-WDSKDSINSA-N 0.000 description 2
- BQWCDDAISCPDQV-XHNCKOQMSA-N Ser-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CO)N)C(=O)O BQWCDDAISCPDQV-XHNCKOQMSA-N 0.000 description 2
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 2
- GZBKRJVCRMZAST-XKBZYTNZSA-N Ser-Glu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZBKRJVCRMZAST-XKBZYTNZSA-N 0.000 description 2
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 2
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 2
- SFTZWNJFZYOLBD-ZDLURKLDSA-N Ser-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO SFTZWNJFZYOLBD-ZDLURKLDSA-N 0.000 description 2
- IXZHZUGGKLRHJD-DCAQKATOSA-N Ser-Leu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IXZHZUGGKLRHJD-DCAQKATOSA-N 0.000 description 2
- FKYWFUYPVKLJLP-DCAQKATOSA-N Ser-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FKYWFUYPVKLJLP-DCAQKATOSA-N 0.000 description 2
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 2
- WLJPJRGQRNCIQS-ZLUOBGJFSA-N Ser-Ser-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O WLJPJRGQRNCIQS-ZLUOBGJFSA-N 0.000 description 2
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 2
- XQJCEKXQUJQNNK-ZLUOBGJFSA-N Ser-Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O XQJCEKXQUJQNNK-ZLUOBGJFSA-N 0.000 description 2
- UYLKOSODXYSWMQ-XGEHTFHBSA-N Ser-Thr-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CO)N)O UYLKOSODXYSWMQ-XGEHTFHBSA-N 0.000 description 2
- OJFFAQFRCVPHNN-JYBASQMISA-N Ser-Thr-Trp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O OJFFAQFRCVPHNN-JYBASQMISA-N 0.000 description 2
- BDMWLJLPPUCLNV-XGEHTFHBSA-N Ser-Thr-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BDMWLJLPPUCLNV-XGEHTFHBSA-N 0.000 description 2
- ZVBCMFDJIMUELU-BZSNNMDCSA-N Ser-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CO)N ZVBCMFDJIMUELU-BZSNNMDCSA-N 0.000 description 2
- VVKVHAOOUGNDPJ-SRVKXCTJSA-N Ser-Tyr-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O VVKVHAOOUGNDPJ-SRVKXCTJSA-N 0.000 description 2
- PCMZJFMUYWIERL-ZKWXMUAHSA-N Ser-Val-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PCMZJFMUYWIERL-ZKWXMUAHSA-N 0.000 description 2
- JZRYFUGREMECBH-XPUUQOCRSA-N Ser-Val-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O JZRYFUGREMECBH-XPUUQOCRSA-N 0.000 description 2
- HNDMFDBQXYZSRM-IHRRRGAJSA-N Ser-Val-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HNDMFDBQXYZSRM-IHRRRGAJSA-N 0.000 description 2
- 102100026842 Serine-pyruvate aminotransferase Human genes 0.000 description 2
- 241000977068 Simian Adeno-associated virus Species 0.000 description 2
- 108020004682 Single-Stranded DNA Proteins 0.000 description 2
- 108091027967 Small hairpin RNA Proteins 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 241000256251 Spodoptera frugiperda Species 0.000 description 2
- VIBXMCZWVUOZLA-OLHMAJIHSA-N Thr-Asn-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O VIBXMCZWVUOZLA-OLHMAJIHSA-N 0.000 description 2
- TZKPNGDGUVREEB-FOHZUACHSA-N Thr-Asn-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O TZKPNGDGUVREEB-FOHZUACHSA-N 0.000 description 2
- JTEICXDKGWKRRV-HJGDQZAQSA-N Thr-Asn-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O JTEICXDKGWKRRV-HJGDQZAQSA-N 0.000 description 2
- QNJZOAHSYPXTAB-VEVYYDQMSA-N Thr-Asn-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O QNJZOAHSYPXTAB-VEVYYDQMSA-N 0.000 description 2
- JBHMLZSKIXMVFS-XVSYOHENSA-N Thr-Asn-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JBHMLZSKIXMVFS-XVSYOHENSA-N 0.000 description 2
- VUVCRYXYUUPGSB-GLLZPBPUSA-N Thr-Gln-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O VUVCRYXYUUPGSB-GLLZPBPUSA-N 0.000 description 2
- UHBPFYOQQPFKQR-JHEQGTHGSA-N Thr-Gln-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O UHBPFYOQQPFKQR-JHEQGTHGSA-N 0.000 description 2
- DIPIPFHFLPTCLK-LOKLDPHHSA-N Thr-Gln-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N)O DIPIPFHFLPTCLK-LOKLDPHHSA-N 0.000 description 2
- FHDLKMFZKRUQCE-HJGDQZAQSA-N Thr-Glu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHDLKMFZKRUQCE-HJGDQZAQSA-N 0.000 description 2
- VULNJDORNLBPNG-SWRJLBSHSA-N Thr-Glu-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O VULNJDORNLBPNG-SWRJLBSHSA-N 0.000 description 2
- KBBRNEDOYWMIJP-KYNKHSRBSA-N Thr-Gly-Thr Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O KBBRNEDOYWMIJP-KYNKHSRBSA-N 0.000 description 2
- NCXVJIQMWSGRHY-KXNHARMFSA-N Thr-Leu-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O NCXVJIQMWSGRHY-KXNHARMFSA-N 0.000 description 2
- MGJLBZFUXUGMML-VOAKCMCISA-N Thr-Lys-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MGJLBZFUXUGMML-VOAKCMCISA-N 0.000 description 2
- ABWNZPOIUJMNKT-IXOXFDKPSA-N Thr-Phe-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O ABWNZPOIUJMNKT-IXOXFDKPSA-N 0.000 description 2
- SGAOHNPSEPVAFP-ZDLURKLDSA-N Thr-Ser-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SGAOHNPSEPVAFP-ZDLURKLDSA-N 0.000 description 2
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 2
- WKGAAMOJPMBBMC-IXOXFDKPSA-N Thr-Ser-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WKGAAMOJPMBBMC-IXOXFDKPSA-N 0.000 description 2
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 2
- QYDKSNXSBXZPFK-ZJDVBMNYSA-N Thr-Thr-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYDKSNXSBXZPFK-ZJDVBMNYSA-N 0.000 description 2
- BBPCSGKKPJUYRB-UVOCVTCTSA-N Thr-Thr-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O BBPCSGKKPJUYRB-UVOCVTCTSA-N 0.000 description 2
- JAWUQFCGNVEDRN-MEYUZBJRSA-N Thr-Tyr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O JAWUQFCGNVEDRN-MEYUZBJRSA-N 0.000 description 2
- OGOYMQWIWHGTGH-KZVJFYERSA-N Thr-Val-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O OGOYMQWIWHGTGH-KZVJFYERSA-N 0.000 description 2
- QNXZCKMXHPULME-ZNSHCXBVSA-N Thr-Val-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N)O QNXZCKMXHPULME-ZNSHCXBVSA-N 0.000 description 2
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 2
- 239000004473 Threonine Substances 0.000 description 2
- 108010022394 Threonine synthase Proteins 0.000 description 2
- 108700009124 Transcription Initiation Site Proteins 0.000 description 2
- HJWVPKJHHLZCNH-DVXDUOKCSA-N Trp-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CC=3C4=CC=CC=C4NC=3)C)C(O)=O)=CNC2=C1 HJWVPKJHHLZCNH-DVXDUOKCSA-N 0.000 description 2
- TUUXFNQXSFNFLX-XIRDDKMYSA-N Trp-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N TUUXFNQXSFNFLX-XIRDDKMYSA-N 0.000 description 2
- RERRMBXDSFMBQE-ZFWWWQNUSA-N Trp-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N RERRMBXDSFMBQE-ZFWWWQNUSA-N 0.000 description 2
- GEGYPBOPIGNZIF-CWRNSKLLSA-N Trp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O GEGYPBOPIGNZIF-CWRNSKLLSA-N 0.000 description 2
- YCQXZDHDSUHUSG-FJHTZYQYSA-N Trp-Thr-Ala Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O)=CNC2=C1 YCQXZDHDSUHUSG-FJHTZYQYSA-N 0.000 description 2
- HHPSUFUXXBOFQY-AQZXSJQPSA-N Trp-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O HHPSUFUXXBOFQY-AQZXSJQPSA-N 0.000 description 2
- VCXWRWYFJLXITF-AUTRQRHGSA-N Tyr-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 VCXWRWYFJLXITF-AUTRQRHGSA-N 0.000 description 2
- ZWZOCUWOXSDYFZ-CQDKDKBSSA-N Tyr-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZWZOCUWOXSDYFZ-CQDKDKBSSA-N 0.000 description 2
- HTHCZRWCFXMENJ-KKUMJFAQSA-N Tyr-Arg-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HTHCZRWCFXMENJ-KKUMJFAQSA-N 0.000 description 2
- XHALUUQSNXSPLP-UFYCRDLUSA-N Tyr-Arg-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 XHALUUQSNXSPLP-UFYCRDLUSA-N 0.000 description 2
- QARCDOCCDOLJSF-HJPIBITLSA-N Tyr-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QARCDOCCDOLJSF-HJPIBITLSA-N 0.000 description 2
- OLYXUGBVBGSZDN-ACRUOGEOSA-N Tyr-Leu-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 OLYXUGBVBGSZDN-ACRUOGEOSA-N 0.000 description 2
- GITNQBVCEQBDQC-KKUMJFAQSA-N Tyr-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O GITNQBVCEQBDQC-KKUMJFAQSA-N 0.000 description 2
- CNNVVEPJTFOGHI-ACRUOGEOSA-N Tyr-Lys-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CNNVVEPJTFOGHI-ACRUOGEOSA-N 0.000 description 2
- DJIJBQYBDKGDIS-JYJNAYRXSA-N Tyr-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(C)C)C(O)=O DJIJBQYBDKGDIS-JYJNAYRXSA-N 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- UEOOXDLMQZBPFR-ZKWXMUAHSA-N Val-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N UEOOXDLMQZBPFR-ZKWXMUAHSA-N 0.000 description 2
- UDLYXGYWTVOIKU-QXEWZRGKSA-N Val-Asn-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UDLYXGYWTVOIKU-QXEWZRGKSA-N 0.000 description 2
- GNWUWQAVVJQREM-NHCYSSNCSA-N Val-Asn-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N GNWUWQAVVJQREM-NHCYSSNCSA-N 0.000 description 2
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 2
- ZQGPWORGSNRQLN-NHCYSSNCSA-N Val-Asp-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZQGPWORGSNRQLN-NHCYSSNCSA-N 0.000 description 2
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 2
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 2
- VVZDBPBZHLQPPB-XVKPBYJWSA-N Val-Glu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VVZDBPBZHLQPPB-XVKPBYJWSA-N 0.000 description 2
- MHAHQDBEIDPFQS-NHCYSSNCSA-N Val-Glu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)C(C)C MHAHQDBEIDPFQS-NHCYSSNCSA-N 0.000 description 2
- WDIGUPHXPBMODF-UMNHJUIQSA-N Val-Glu-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N WDIGUPHXPBMODF-UMNHJUIQSA-N 0.000 description 2
- FXVDGDZRYLFQKY-WPRPVWTQSA-N Val-Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C FXVDGDZRYLFQKY-WPRPVWTQSA-N 0.000 description 2
- YTPLVNUZZOBFFC-SCZZXKLOSA-N Val-Gly-Pro Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N1CCC[C@@H]1C(O)=O YTPLVNUZZOBFFC-SCZZXKLOSA-N 0.000 description 2
- LAYSXAOGWHKNED-XPUUQOCRSA-N Val-Gly-Ser Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LAYSXAOGWHKNED-XPUUQOCRSA-N 0.000 description 2
- JVYIGCARISMLMV-HOCLYGCPSA-N Val-Gly-Trp Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N JVYIGCARISMLMV-HOCLYGCPSA-N 0.000 description 2
- XBRMBDFYOFARST-AVGNSLFASA-N Val-His-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C(C)C)C(=O)O)N XBRMBDFYOFARST-AVGNSLFASA-N 0.000 description 2
- KDKLLPMFFGYQJD-CYDGBPFRSA-N Val-Ile-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N KDKLLPMFFGYQJD-CYDGBPFRSA-N 0.000 description 2
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 2
- YMTOEGGOCHVGEH-IHRRRGAJSA-N Val-Lys-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O YMTOEGGOCHVGEH-IHRRRGAJSA-N 0.000 description 2
- JAKHAONCJJZVHT-DCAQKATOSA-N Val-Lys-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)O)N JAKHAONCJJZVHT-DCAQKATOSA-N 0.000 description 2
- YKNOJPJWNVHORX-UNQGMJICSA-N Val-Phe-Thr Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YKNOJPJWNVHORX-UNQGMJICSA-N 0.000 description 2
- QSPOLEBZTMESFY-SRVKXCTJSA-N Val-Pro-Val Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O QSPOLEBZTMESFY-SRVKXCTJSA-N 0.000 description 2
- RYHUIHUOYRNNIE-NRPADANISA-N Val-Ser-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RYHUIHUOYRNNIE-NRPADANISA-N 0.000 description 2
- QTPQHINADBYBNA-DCAQKATOSA-N Val-Ser-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN QTPQHINADBYBNA-DCAQKATOSA-N 0.000 description 2
- UJMCYJKPDFQLHX-XGEHTFHBSA-N Val-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N)O UJMCYJKPDFQLHX-XGEHTFHBSA-N 0.000 description 2
- HTONZBWRYUKUKC-RCWTZXSCSA-N Val-Thr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HTONZBWRYUKUKC-RCWTZXSCSA-N 0.000 description 2
- NGXQOQNXSGOYOI-BQFCYCMXSA-N Val-Trp-Gln Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O)=CNC2=C1 NGXQOQNXSGOYOI-BQFCYCMXSA-N 0.000 description 2
- DFQZDQPLWBSFEJ-LSJOCFKGSA-N Val-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N DFQZDQPLWBSFEJ-LSJOCFKGSA-N 0.000 description 2
- JVGDAEKKZKKZFO-RCWTZXSCSA-N Val-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)N)O JVGDAEKKZKKZFO-RCWTZXSCSA-N 0.000 description 2
- 102000005789 Vascular Endothelial Growth Factors Human genes 0.000 description 2
- 108010019530 Vascular Endothelial Growth Factors Proteins 0.000 description 2
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 2
- 102100040092 X-linked retinitis pigmentosa GTPase regulator Human genes 0.000 description 2
- 230000001594 aberrant effect Effects 0.000 description 2
- 238000001261 affinity purification Methods 0.000 description 2
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 2
- 108010041407 alanylaspartic acid Proteins 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 2
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 2
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 2
- 238000003149 assay kit Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 2
- OWMVSZAMULFTJU-UHFFFAOYSA-N bis-tris Chemical compound OCCN(CCO)C(CO)(CO)CO OWMVSZAMULFTJU-UHFFFAOYSA-N 0.000 description 2
- 239000013592 cell lysate Substances 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 230000000295 complement effect Effects 0.000 description 2
- 102000004419 dihydrofolate reductase Human genes 0.000 description 2
- 201000010099 disease Diseases 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 238000001962 electrophoresis Methods 0.000 description 2
- 230000008030 elimination Effects 0.000 description 2
- 238000003379 elimination reaction Methods 0.000 description 2
- 238000005538 encapsulation Methods 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 230000009368 gene silencing by RNA Effects 0.000 description 2
- 238000010353 genetic engineering Methods 0.000 description 2
- 108010008237 glutamyl-valyl-glycine Proteins 0.000 description 2
- 108010027668 glycyl-alanyl-valine Proteins 0.000 description 2
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 2
- 108010081551 glycylphenylalanine Proteins 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- GRRNUXAQVGOGFE-NZSRVPFOSA-N hygromycin B Chemical compound O[C@@H]1[C@@H](NC)C[C@@H](N)[C@H](O)[C@H]1O[C@H]1[C@H]2O[C@@]3([C@@H]([C@@H](O)[C@@H](O)[C@@H](C(N)CO)O3)O)O[C@H]2[C@@H](O)[C@@H](CO)O1 GRRNUXAQVGOGFE-NZSRVPFOSA-N 0.000 description 2
- 229940097277 hygromycin b Drugs 0.000 description 2
- 238000011534 incubation Methods 0.000 description 2
- 230000001939 inductive effect Effects 0.000 description 2
- 229940125396 insulin Drugs 0.000 description 2
- 229940047122 interleukins Drugs 0.000 description 2
- 102000008371 intracellularly ATP-gated chloride channel activity proteins Human genes 0.000 description 2
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 2
- 238000004020 luminescence type Methods 0.000 description 2
- 239000006166 lysate Substances 0.000 description 2
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 2
- 108010076718 lysyl-glutamyl-tryptophan Proteins 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 230000003278 mimic effect Effects 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 231100000219 mutagenic Toxicity 0.000 description 2
- 230000003505 mutagenic effect Effects 0.000 description 2
- 239000013642 negative control Substances 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 229920002401 polyacrylamide Polymers 0.000 description 2
- 229940002612 prodrug Drugs 0.000 description 2
- 239000000651 prodrug Substances 0.000 description 2
- 108010070643 prolylglutamic acid Proteins 0.000 description 2
- 238000001243 protein synthesis Methods 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 230000003362 replicative effect Effects 0.000 description 2
- 239000010979 ruby Substances 0.000 description 2
- 229910001750 ruby Inorganic materials 0.000 description 2
- 239000006228 supernatant Substances 0.000 description 2
- 230000008685 targeting Effects 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 238000001890 transfection Methods 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 230000032258 transport Effects 0.000 description 2
- 230000010415 tropism Effects 0.000 description 2
- 108010035534 tyrosyl-leucyl-alanine Proteins 0.000 description 2
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 2
- 108010015385 valyl-prolyl-proline Proteins 0.000 description 2
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 1
- PQFMROVJTOPVDF-JBDRJPRFSA-N (2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-amino-3-carboxypropanoyl]amino]-3-carboxypropanoyl]amino]-4-carboxybutanoyl]amino]butanedioic acid Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O PQFMROVJTOPVDF-JBDRJPRFSA-N 0.000 description 1
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 1
- 101150084750 1 gene Proteins 0.000 description 1
- ZIIUUSVHCHPIQD-UHFFFAOYSA-N 2,4,6-trimethyl-N-[3-(trifluoromethyl)phenyl]benzenesulfonamide Chemical compound CC1=CC(C)=CC(C)=C1S(=O)(=O)NC1=CC=CC(C(F)(F)F)=C1 ZIIUUSVHCHPIQD-UHFFFAOYSA-N 0.000 description 1
- 241000649045 Adeno-associated virus 10 Species 0.000 description 1
- 241000649046 Adeno-associated virus 11 Species 0.000 description 1
- 241000649047 Adeno-associated virus 12 Species 0.000 description 1
- 241000425548 Adeno-associated virus 3A Species 0.000 description 1
- 241000958487 Adeno-associated virus 3B Species 0.000 description 1
- 241000256173 Aedes albopictus Species 0.000 description 1
- BYXHQQCXAJARLQ-ZLUOBGJFSA-N Ala-Ala-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O BYXHQQCXAJARLQ-ZLUOBGJFSA-N 0.000 description 1
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 1
- WYPUMLRSQMKIJU-BPNCWPANSA-N Ala-Arg-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WYPUMLRSQMKIJU-BPNCWPANSA-N 0.000 description 1
- YSMPVONNIWLJML-FXQIFTODSA-N Ala-Asp-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(O)=O YSMPVONNIWLJML-FXQIFTODSA-N 0.000 description 1
- DAEFQZCYZKRTLR-ZLUOBGJFSA-N Ala-Cys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O DAEFQZCYZKRTLR-ZLUOBGJFSA-N 0.000 description 1
- OQCPATDFWYYDDX-HGNGGELXSA-N Ala-Gln-His Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O OQCPATDFWYYDDX-HGNGGELXSA-N 0.000 description 1
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 1
- VGPWRRFOPXVGOH-BYPYZUCNSA-N Ala-Gly-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)NCC(O)=O VGPWRRFOPXVGOH-BYPYZUCNSA-N 0.000 description 1
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 1
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 1
- SUMYEVXWCAYLLJ-GUBZILKMSA-N Ala-Leu-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O SUMYEVXWCAYLLJ-GUBZILKMSA-N 0.000 description 1
- MNZHHDPWDWQJCQ-YUMQZZPRSA-N Ala-Leu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O MNZHHDPWDWQJCQ-YUMQZZPRSA-N 0.000 description 1
- AWZKCUCQJNTBAD-SRVKXCTJSA-N Ala-Leu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN AWZKCUCQJNTBAD-SRVKXCTJSA-N 0.000 description 1
- FCXAUASCMJOFEY-NDKCEZKHSA-N Ala-Leu-Thr-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O FCXAUASCMJOFEY-NDKCEZKHSA-N 0.000 description 1
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 1
- OMDNCNKNEGFOMM-BQBZGAKWSA-N Ala-Met-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O OMDNCNKNEGFOMM-BQBZGAKWSA-N 0.000 description 1
- DXTYEWAQOXYRHZ-KKXDTOCCSA-N Ala-Phe-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N DXTYEWAQOXYRHZ-KKXDTOCCSA-N 0.000 description 1
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 1
- YNOCMHZSWJMGBB-GCJQMDKQSA-N Ala-Thr-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O YNOCMHZSWJMGBB-GCJQMDKQSA-N 0.000 description 1
- AAWLEICNDUHIJM-MBLNEYKQSA-N Ala-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](C)N)O AAWLEICNDUHIJM-MBLNEYKQSA-N 0.000 description 1
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 1
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 1
- SFPRJVVDZNLUTG-OWLDWWDNSA-N Ala-Trp-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SFPRJVVDZNLUTG-OWLDWWDNSA-N 0.000 description 1
- AOAKQKVICDWCLB-UWJYBYFXSA-N Ala-Tyr-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AOAKQKVICDWCLB-UWJYBYFXSA-N 0.000 description 1
- NLYYHIKRBRMAJV-AEJSXWLSSA-N Ala-Val-Pro Chemical compound C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N NLYYHIKRBRMAJV-AEJSXWLSSA-N 0.000 description 1
- 108010088751 Albumins Proteins 0.000 description 1
- 102000009027 Albumins Human genes 0.000 description 1
- 102100022712 Alpha-1-antitrypsin Human genes 0.000 description 1
- 235000002198 Annona diversifolia Nutrition 0.000 description 1
- 102100029470 Apolipoprotein E Human genes 0.000 description 1
- 101710095339 Apolipoprotein E Proteins 0.000 description 1
- DCGLNNVKIZXQOJ-FXQIFTODSA-N Arg-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N DCGLNNVKIZXQOJ-FXQIFTODSA-N 0.000 description 1
- BVBKBQRPOJFCQM-DCAQKATOSA-N Arg-Asn-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BVBKBQRPOJFCQM-DCAQKATOSA-N 0.000 description 1
- OCOZPTHLDVSFCZ-BPUTZDHNSA-N Arg-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N OCOZPTHLDVSFCZ-BPUTZDHNSA-N 0.000 description 1
- PQWTZSNVWSOFFK-FXQIFTODSA-N Arg-Asp-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)CN=C(N)N PQWTZSNVWSOFFK-FXQIFTODSA-N 0.000 description 1
- ALOVURZCXKYKJC-NAKRPEOUSA-N Arg-Asp-Gln-Ser Chemical compound N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O ALOVURZCXKYKJC-NAKRPEOUSA-N 0.000 description 1
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 1
- BQBPFMNVOWDLHO-XIRDDKMYSA-N Arg-Gln-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N BQBPFMNVOWDLHO-XIRDDKMYSA-N 0.000 description 1
- UFBURHXMKFQVLM-CIUDSAMLSA-N Arg-Glu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O UFBURHXMKFQVLM-CIUDSAMLSA-N 0.000 description 1
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 1
- NKNILFJYKKHBKE-WPRPVWTQSA-N Arg-Gly-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O NKNILFJYKKHBKE-WPRPVWTQSA-N 0.000 description 1
- RKQRHMKFNBYOTN-IHRRRGAJSA-N Arg-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N RKQRHMKFNBYOTN-IHRRRGAJSA-N 0.000 description 1
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 1
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 1
- JEOCWTUOMKEEMF-RHYQMDGZSA-N Arg-Leu-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JEOCWTUOMKEEMF-RHYQMDGZSA-N 0.000 description 1
- FSNVAJOPUDVQAR-AVGNSLFASA-N Arg-Lys-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FSNVAJOPUDVQAR-AVGNSLFASA-N 0.000 description 1
- BSYKSCBTTQKOJG-GUBZILKMSA-N Arg-Pro-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BSYKSCBTTQKOJG-GUBZILKMSA-N 0.000 description 1
- WKPXXXUSUHAXDE-SRVKXCTJSA-N Arg-Pro-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O WKPXXXUSUHAXDE-SRVKXCTJSA-N 0.000 description 1
- UULLJGQFCDXVTQ-CYDGBPFRSA-N Arg-Pro-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UULLJGQFCDXVTQ-CYDGBPFRSA-N 0.000 description 1
- NGYHSXDNNOFHNE-AVGNSLFASA-N Arg-Pro-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O NGYHSXDNNOFHNE-AVGNSLFASA-N 0.000 description 1
- AWMAZIIEFPFHCP-RCWTZXSCSA-N Arg-Pro-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O AWMAZIIEFPFHCP-RCWTZXSCSA-N 0.000 description 1
- JOTRDIXZHNQYGP-DCAQKATOSA-N Arg-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N JOTRDIXZHNQYGP-DCAQKATOSA-N 0.000 description 1
- ICRHGPYYXMWHIE-LPEHRKFASA-N Arg-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ICRHGPYYXMWHIE-LPEHRKFASA-N 0.000 description 1
- XMGVWQWEWWULNS-BPUTZDHNSA-N Arg-Trp-Ser Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N XMGVWQWEWWULNS-BPUTZDHNSA-N 0.000 description 1
- WOZDCBHUGJVJPL-AVGNSLFASA-N Arg-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WOZDCBHUGJVJPL-AVGNSLFASA-N 0.000 description 1
- HZPSDHRYYIORKR-WHFBIAKZSA-N Asn-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O HZPSDHRYYIORKR-WHFBIAKZSA-N 0.000 description 1
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 1
- IARGXWMWRFOQPG-GCJQMDKQSA-N Asn-Ala-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IARGXWMWRFOQPG-GCJQMDKQSA-N 0.000 description 1
- VDCIPFYVCICPEC-FXQIFTODSA-N Asn-Arg-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O VDCIPFYVCICPEC-FXQIFTODSA-N 0.000 description 1
- CIBWFJFMOBIFTE-CIUDSAMLSA-N Asn-Arg-Gln Chemical compound C(C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N CIBWFJFMOBIFTE-CIUDSAMLSA-N 0.000 description 1
- BVLIJXXSXBUGEC-SRVKXCTJSA-N Asn-Asn-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BVLIJXXSXBUGEC-SRVKXCTJSA-N 0.000 description 1
- IYVSIZAXNLOKFQ-BYULHYEWSA-N Asn-Asp-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IYVSIZAXNLOKFQ-BYULHYEWSA-N 0.000 description 1
- QYXNFROWLZPWPC-FXQIFTODSA-N Asn-Glu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QYXNFROWLZPWPC-FXQIFTODSA-N 0.000 description 1
- DXVMJJNAOVECBA-WHFBIAKZSA-N Asn-Gly-Asn Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O DXVMJJNAOVECBA-WHFBIAKZSA-N 0.000 description 1
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 1
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 1
- FTCGGKNCJZOPNB-WHFBIAKZSA-N Asn-Gly-Ser Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FTCGGKNCJZOPNB-WHFBIAKZSA-N 0.000 description 1
- RAQMSGVCGSJKCL-FOHZUACHSA-N Asn-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(N)=O RAQMSGVCGSJKCL-FOHZUACHSA-N 0.000 description 1
- OOWSBIOUKIUWLO-RCOVLWMOSA-N Asn-Gly-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O OOWSBIOUKIUWLO-RCOVLWMOSA-N 0.000 description 1
- WQLJRNRLHWJIRW-KKUMJFAQSA-N Asn-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N)O WQLJRNRLHWJIRW-KKUMJFAQSA-N 0.000 description 1
- FVKHEKVYFTZWDX-GHCJXIJMSA-N Asn-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)N)N FVKHEKVYFTZWDX-GHCJXIJMSA-N 0.000 description 1
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 1
- UBGGJTMETLEXJD-DCAQKATOSA-N Asn-Leu-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(O)=O UBGGJTMETLEXJD-DCAQKATOSA-N 0.000 description 1
- JLNFZLNDHONLND-GARJFASQSA-N Asn-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N JLNFZLNDHONLND-GARJFASQSA-N 0.000 description 1
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 1
- PLTGTJAZQRGMPP-FXQIFTODSA-N Asn-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O PLTGTJAZQRGMPP-FXQIFTODSA-N 0.000 description 1
- BYLSYQASFJJBCL-DCAQKATOSA-N Asn-Pro-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BYLSYQASFJJBCL-DCAQKATOSA-N 0.000 description 1
- JWQWPRCDYWNVNM-ACZMJKKPSA-N Asn-Ser-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N JWQWPRCDYWNVNM-ACZMJKKPSA-N 0.000 description 1
- MKJBPDLENBUHQU-CIUDSAMLSA-N Asn-Ser-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O MKJBPDLENBUHQU-CIUDSAMLSA-N 0.000 description 1
- VLDRQOHCMKCXLY-SRVKXCTJSA-N Asn-Ser-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VLDRQOHCMKCXLY-SRVKXCTJSA-N 0.000 description 1
- SNYCNNPOFYBCEK-ZLUOBGJFSA-N Asn-Ser-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O SNYCNNPOFYBCEK-ZLUOBGJFSA-N 0.000 description 1
- QUMKPKWYDVMGNT-NUMRIWBASA-N Asn-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QUMKPKWYDVMGNT-NUMRIWBASA-N 0.000 description 1
- AMGQTNHANMRPOE-LKXGYXEUSA-N Asn-Thr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O AMGQTNHANMRPOE-LKXGYXEUSA-N 0.000 description 1
- QNNBHTFDFFFHGC-KKUMJFAQSA-N Asn-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QNNBHTFDFFFHGC-KKUMJFAQSA-N 0.000 description 1
- CGYKCTPUGXFPMG-IHPCNDPISA-N Asn-Tyr-Trp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O CGYKCTPUGXFPMG-IHPCNDPISA-N 0.000 description 1
- LRCIOEVFVGXZKB-BZSNNMDCSA-N Asn-Tyr-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LRCIOEVFVGXZKB-BZSNNMDCSA-N 0.000 description 1
- ZAESWDKAMDVHLL-RCOVLWMOSA-N Asn-Val-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O ZAESWDKAMDVHLL-RCOVLWMOSA-N 0.000 description 1
- HPNDBHLITCHRSO-WHFBIAKZSA-N Asp-Ala-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)NCC(O)=O HPNDBHLITCHRSO-WHFBIAKZSA-N 0.000 description 1
- YNQIDCRRTWGHJD-ZLUOBGJFSA-N Asp-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(O)=O YNQIDCRRTWGHJD-ZLUOBGJFSA-N 0.000 description 1
- RDRMWJBLOSRRAW-BYULHYEWSA-N Asp-Asn-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O RDRMWJBLOSRRAW-BYULHYEWSA-N 0.000 description 1
- VPSHHQXIWLGVDD-ZLUOBGJFSA-N Asp-Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VPSHHQXIWLGVDD-ZLUOBGJFSA-N 0.000 description 1
- WCFCYFDBMNFSPA-ACZMJKKPSA-N Asp-Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O WCFCYFDBMNFSPA-ACZMJKKPSA-N 0.000 description 1
- FANQWNCPNFEPGZ-WHFBIAKZSA-N Asp-Asp-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FANQWNCPNFEPGZ-WHFBIAKZSA-N 0.000 description 1
- ZCKYZTGLXIEOKS-CIUDSAMLSA-N Asp-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N ZCKYZTGLXIEOKS-CIUDSAMLSA-N 0.000 description 1
- CELPEWWLSXMVPH-CIUDSAMLSA-N Asp-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O CELPEWWLSXMVPH-CIUDSAMLSA-N 0.000 description 1
- VZNOVQKGJQJOCS-SRVKXCTJSA-N Asp-Asp-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VZNOVQKGJQJOCS-SRVKXCTJSA-N 0.000 description 1
- LJRPYAZQQWHEEV-FXQIFTODSA-N Asp-Gln-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O LJRPYAZQQWHEEV-FXQIFTODSA-N 0.000 description 1
- DXQOQMCLWWADMU-ACZMJKKPSA-N Asp-Gln-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DXQOQMCLWWADMU-ACZMJKKPSA-N 0.000 description 1
- XDGBFDYXZCMYEX-NUMRIWBASA-N Asp-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)O XDGBFDYXZCMYEX-NUMRIWBASA-N 0.000 description 1
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 1
- POTCZYQVVNXUIG-BQBZGAKWSA-N Asp-Gly-Pro Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O POTCZYQVVNXUIG-BQBZGAKWSA-N 0.000 description 1
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 1
- GBSUGIXJAAKZOW-GMOBBJLQSA-N Asp-Ile-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GBSUGIXJAAKZOW-GMOBBJLQSA-N 0.000 description 1
- RTXQQDVBACBSCW-CFMVVWHZSA-N Asp-Ile-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RTXQQDVBACBSCW-CFMVVWHZSA-N 0.000 description 1
- PAYPSKIBMDHZPI-CIUDSAMLSA-N Asp-Leu-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PAYPSKIBMDHZPI-CIUDSAMLSA-N 0.000 description 1
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 1
- LIVXPXUVXFRWNY-CIUDSAMLSA-N Asp-Lys-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O LIVXPXUVXFRWNY-CIUDSAMLSA-N 0.000 description 1
- LTCKTLYKRMCFOC-KKUMJFAQSA-N Asp-Phe-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LTCKTLYKRMCFOC-KKUMJFAQSA-N 0.000 description 1
- USNJAPJZSGTTPX-XVSYOHENSA-N Asp-Phe-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O USNJAPJZSGTTPX-XVSYOHENSA-N 0.000 description 1
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 1
- VNXQRBXEQXLERQ-CIUDSAMLSA-N Asp-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N VNXQRBXEQXLERQ-CIUDSAMLSA-N 0.000 description 1
- JSHWXQIZOCVWIA-ZKWXMUAHSA-N Asp-Ser-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O JSHWXQIZOCVWIA-ZKWXMUAHSA-N 0.000 description 1
- UTLCRGFJFSZWAW-OLHMAJIHSA-N Asp-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O UTLCRGFJFSZWAW-OLHMAJIHSA-N 0.000 description 1
- IWLZBRTUIVXZJD-OLHMAJIHSA-N Asp-Thr-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O IWLZBRTUIVXZJD-OLHMAJIHSA-N 0.000 description 1
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 1
- MRYDJCIIVRXVGG-QEJZJMRPSA-N Asp-Trp-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(O)=O MRYDJCIIVRXVGG-QEJZJMRPSA-N 0.000 description 1
- IHZFGJLKDYINPV-XIRDDKMYSA-N Asp-Trp-His Chemical compound C([C@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@H](CC(O)=O)N)C(O)=O)C1=CN=CN1 IHZFGJLKDYINPV-XIRDDKMYSA-N 0.000 description 1
- CZIVKMOEXPILDK-SRVKXCTJSA-N Asp-Tyr-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O CZIVKMOEXPILDK-SRVKXCTJSA-N 0.000 description 1
- XQFLFQWOBXPMHW-NHCYSSNCSA-N Asp-Val-His Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O XQFLFQWOBXPMHW-NHCYSSNCSA-N 0.000 description 1
- GXIUDSXIUSTSLO-QXEWZRGKSA-N Asp-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)O)N GXIUDSXIUSTSLO-QXEWZRGKSA-N 0.000 description 1
- 102100022005 B-lymphocyte antigen CD20 Human genes 0.000 description 1
- 102100021277 Beta-secretase 2 Human genes 0.000 description 1
- 101710150190 Beta-secretase 2 Proteins 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 101100028791 Caenorhabditis elegans pbs-5 gene Proteins 0.000 description 1
- 101100315624 Caenorhabditis elegans tyr-1 gene Proteins 0.000 description 1
- 241000282832 Camelidae Species 0.000 description 1
- 241000282472 Canis lupus familiaris Species 0.000 description 1
- 108020004638 Circular DNA Proteins 0.000 description 1
- LBOLGUYQEPZSKM-YUMQZZPRSA-N Cys-Gly-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CS)N LBOLGUYQEPZSKM-YUMQZZPRSA-N 0.000 description 1
- 201000003883 Cystic fibrosis Diseases 0.000 description 1
- 241000701022 Cytomegalovirus Species 0.000 description 1
- 108010080611 Cytosine Deaminase Proteins 0.000 description 1
- 230000004543 DNA replication Effects 0.000 description 1
- 230000006820 DNA synthesis Effects 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 241000255925 Diptera Species 0.000 description 1
- 241000283086 Equidae Species 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 108091029865 Exogenous DNA Proteins 0.000 description 1
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 1
- MLZRSFQRBDNJON-GUBZILKMSA-N Gln-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MLZRSFQRBDNJON-GUBZILKMSA-N 0.000 description 1
- PGPJSRSLQNXBDT-YUMQZZPRSA-N Gln-Arg-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O PGPJSRSLQNXBDT-YUMQZZPRSA-N 0.000 description 1
- RRYLMJWPWBJFPZ-ACZMJKKPSA-N Gln-Asn-Asp Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N RRYLMJWPWBJFPZ-ACZMJKKPSA-N 0.000 description 1
- ZPDVKYLJTOFQJV-WDSKDSINSA-N Gln-Asn-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O ZPDVKYLJTOFQJV-WDSKDSINSA-N 0.000 description 1
- RMOCFPBLHAOTDU-ACZMJKKPSA-N Gln-Asn-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RMOCFPBLHAOTDU-ACZMJKKPSA-N 0.000 description 1
- MGJMFSBEMSNYJL-AVGNSLFASA-N Gln-Asn-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MGJMFSBEMSNYJL-AVGNSLFASA-N 0.000 description 1
- PKVWNYGXMNWJSI-CIUDSAMLSA-N Gln-Gln-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O PKVWNYGXMNWJSI-CIUDSAMLSA-N 0.000 description 1
- KCJJFESQRXGTGC-BQBZGAKWSA-N Gln-Glu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O KCJJFESQRXGTGC-BQBZGAKWSA-N 0.000 description 1
- VGTDBGYFVWOQTI-RYUDHWBXSA-N Gln-Gly-Phe Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VGTDBGYFVWOQTI-RYUDHWBXSA-N 0.000 description 1
- JXFLPKSDLDEOQK-JHEQGTHGSA-N Gln-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O JXFLPKSDLDEOQK-JHEQGTHGSA-N 0.000 description 1
- YRWWJCDWLVXTHN-LAEOZQHASA-N Gln-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N YRWWJCDWLVXTHN-LAEOZQHASA-N 0.000 description 1
- QKCZZAZNMMVICF-DCAQKATOSA-N Gln-Leu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O QKCZZAZNMMVICF-DCAQKATOSA-N 0.000 description 1
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 1
- IULKWYSYZSURJK-AVGNSLFASA-N Gln-Leu-Lys Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O IULKWYSYZSURJK-AVGNSLFASA-N 0.000 description 1
- UWKPRVKWEKEMSY-DCAQKATOSA-N Gln-Lys-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWKPRVKWEKEMSY-DCAQKATOSA-N 0.000 description 1
- KLKYKPXITJBSNI-CIUDSAMLSA-N Gln-Met-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O KLKYKPXITJBSNI-CIUDSAMLSA-N 0.000 description 1
- AQPZYBSRDRZBAG-AVGNSLFASA-N Gln-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N AQPZYBSRDRZBAG-AVGNSLFASA-N 0.000 description 1
- DRNMNLKUUKKPIA-HTUGSXCWSA-N Gln-Phe-Thr Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)CCC(N)=O)C(O)=O DRNMNLKUUKKPIA-HTUGSXCWSA-N 0.000 description 1
- DOQUICBEISTQHE-CIUDSAMLSA-N Gln-Pro-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O DOQUICBEISTQHE-CIUDSAMLSA-N 0.000 description 1
- FQCILXROGNOZON-YUMQZZPRSA-N Gln-Pro-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O FQCILXROGNOZON-YUMQZZPRSA-N 0.000 description 1
- WLRYGVYQFXRJDA-DCAQKATOSA-N Gln-Pro-Pro Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 WLRYGVYQFXRJDA-DCAQKATOSA-N 0.000 description 1
- OKARHJKJTKFQBM-ACZMJKKPSA-N Gln-Ser-Asn Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OKARHJKJTKFQBM-ACZMJKKPSA-N 0.000 description 1
- VOUSELYGTNGEPB-NUMRIWBASA-N Gln-Thr-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O VOUSELYGTNGEPB-NUMRIWBASA-N 0.000 description 1
- CGYFDYFOAWDTPI-VJBMBRPKSA-N Gln-Trp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)O)NC(=O)[C@H](CCC(=O)N)N CGYFDYFOAWDTPI-VJBMBRPKSA-N 0.000 description 1
- 108010044091 Globulins Proteins 0.000 description 1
- 102000006395 Globulins Human genes 0.000 description 1
- PBEQPAZRHDVJQI-SRVKXCTJSA-N Glu-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)O)N PBEQPAZRHDVJQI-SRVKXCTJSA-N 0.000 description 1
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 1
- MLCPTRRNICEKIS-FXQIFTODSA-N Glu-Asn-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MLCPTRRNICEKIS-FXQIFTODSA-N 0.000 description 1
- NTBDVNJIWCKURJ-ACZMJKKPSA-N Glu-Asp-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NTBDVNJIWCKURJ-ACZMJKKPSA-N 0.000 description 1
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 1
- WPLGNDORMXTMQS-FXQIFTODSA-N Glu-Gln-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O WPLGNDORMXTMQS-FXQIFTODSA-N 0.000 description 1
- ILGFBUGLBSAQQB-GUBZILKMSA-N Glu-Glu-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ILGFBUGLBSAQQB-GUBZILKMSA-N 0.000 description 1
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 1
- KUTPGXNAAOQSPD-LPEHRKFASA-N Glu-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O KUTPGXNAAOQSPD-LPEHRKFASA-N 0.000 description 1
- QJCKNLPMTPXXEM-AUTRQRHGSA-N Glu-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O QJCKNLPMTPXXEM-AUTRQRHGSA-N 0.000 description 1
- UHVIQGKBMXEVGN-WDSKDSINSA-N Glu-Gly-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UHVIQGKBMXEVGN-WDSKDSINSA-N 0.000 description 1
- CAVMESABQIKFKT-IUCAKERBSA-N Glu-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N CAVMESABQIKFKT-IUCAKERBSA-N 0.000 description 1
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 1
- HPJLZFTUUJKWAJ-JHEQGTHGSA-N Glu-Gly-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HPJLZFTUUJKWAJ-JHEQGTHGSA-N 0.000 description 1
- QIQABBIDHGQXGA-ZPFDUUQYSA-N Glu-Ile-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QIQABBIDHGQXGA-ZPFDUUQYSA-N 0.000 description 1
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 1
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 1
- QDMVXRNLOPTPIE-WDCWCFNPSA-N Glu-Lys-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QDMVXRNLOPTPIE-WDCWCFNPSA-N 0.000 description 1
- AOCARQDSFTWWFT-DCAQKATOSA-N Glu-Met-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AOCARQDSFTWWFT-DCAQKATOSA-N 0.000 description 1
- FQFWFZWOHOEVMZ-IHRRRGAJSA-N Glu-Phe-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O FQFWFZWOHOEVMZ-IHRRRGAJSA-N 0.000 description 1
- WZAYJXZPSJOXCP-QAETUUGQSA-N Glu-Phe-Gln-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](NC(=O)[C@H](CCC(O)=O)N)CC1=CC=CC=C1 WZAYJXZPSJOXCP-QAETUUGQSA-N 0.000 description 1
- SYWCGQOIIARSIX-SRVKXCTJSA-N Glu-Pro-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O SYWCGQOIIARSIX-SRVKXCTJSA-N 0.000 description 1
- DCBSZJJHOTXMHY-DCAQKATOSA-N Glu-Pro-Pro Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DCBSZJJHOTXMHY-DCAQKATOSA-N 0.000 description 1
- SWDNPSMMEWRNOH-HJGDQZAQSA-N Glu-Pro-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWDNPSMMEWRNOH-HJGDQZAQSA-N 0.000 description 1
- NNQDRRUXFJYCCJ-NHCYSSNCSA-N Glu-Pro-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O NNQDRRUXFJYCCJ-NHCYSSNCSA-N 0.000 description 1
- SYAYROHMAIHWFB-KBIXCLLPSA-N Glu-Ser-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYAYROHMAIHWFB-KBIXCLLPSA-N 0.000 description 1
- GPSHCSTUYOQPAI-JHEQGTHGSA-N Glu-Thr-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O GPSHCSTUYOQPAI-JHEQGTHGSA-N 0.000 description 1
- DDXZHOHEABQXSE-NKIYYHGXSA-N Glu-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O DDXZHOHEABQXSE-NKIYYHGXSA-N 0.000 description 1
- UGVQELHRNUDMAA-BYPYZUCNSA-N Gly-Ala-Gly Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)NCC([O-])=O UGVQELHRNUDMAA-BYPYZUCNSA-N 0.000 description 1
- PHONXOACARQMPM-BQBZGAKWSA-N Gly-Ala-Met Chemical compound [H]NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O PHONXOACARQMPM-BQBZGAKWSA-N 0.000 description 1
- CLODWIOAKCSBAN-BQBZGAKWSA-N Gly-Arg-Asp Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O CLODWIOAKCSBAN-BQBZGAKWSA-N 0.000 description 1
- DWUKOTKSTDWGAE-BQBZGAKWSA-N Gly-Asn-Arg Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DWUKOTKSTDWGAE-BQBZGAKWSA-N 0.000 description 1
- FMVLWTYYODVFRG-BQBZGAKWSA-N Gly-Asn-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN FMVLWTYYODVFRG-BQBZGAKWSA-N 0.000 description 1
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 1
- XQHSBNVACKQWAV-WHFBIAKZSA-N Gly-Asp-Asn Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XQHSBNVACKQWAV-WHFBIAKZSA-N 0.000 description 1
- PMNHJLASAAWELO-FOHZUACHSA-N Gly-Asp-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PMNHJLASAAWELO-FOHZUACHSA-N 0.000 description 1
- JPWIMMUNWUKOAD-STQMWFEESA-N Gly-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN JPWIMMUNWUKOAD-STQMWFEESA-N 0.000 description 1
- KTSZUNRRYXPZTK-BQBZGAKWSA-N Gly-Gln-Glu Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KTSZUNRRYXPZTK-BQBZGAKWSA-N 0.000 description 1
- YZPVGIVFMZLQMM-YUMQZZPRSA-N Gly-Gln-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN YZPVGIVFMZLQMM-YUMQZZPRSA-N 0.000 description 1
- PABFFPWEJMEVEC-JGVFFNPUSA-N Gly-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)CN)C(=O)O PABFFPWEJMEVEC-JGVFFNPUSA-N 0.000 description 1
- QPDUVFSVVAOUHE-XVKPBYJWSA-N Gly-Gln-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)CN)C(O)=O QPDUVFSVVAOUHE-XVKPBYJWSA-N 0.000 description 1
- QSVCIFZPGLOZGH-WDSKDSINSA-N Gly-Glu-Ser Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QSVCIFZPGLOZGH-WDSKDSINSA-N 0.000 description 1
- KMSGYZQRXPUKGI-BYPYZUCNSA-N Gly-Gly-Asn Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O KMSGYZQRXPUKGI-BYPYZUCNSA-N 0.000 description 1
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 1
- ALOBJFDJTMQQPW-ONGXEEELSA-N Gly-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)CN ALOBJFDJTMQQPW-ONGXEEELSA-N 0.000 description 1
- SXJHOPPTOJACOA-QXEWZRGKSA-N Gly-Ile-Arg Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SXJHOPPTOJACOA-QXEWZRGKSA-N 0.000 description 1
- NSTUFLGQJCOCDL-UWVGGRQHSA-N Gly-Leu-Arg Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NSTUFLGQJCOCDL-UWVGGRQHSA-N 0.000 description 1
- TWTPDFFBLQEBOE-IUCAKERBSA-N Gly-Leu-Gln Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O TWTPDFFBLQEBOE-IUCAKERBSA-N 0.000 description 1
- CCBIBMKQNXHNIN-ZETCQYMHSA-N Gly-Leu-Gly Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CCBIBMKQNXHNIN-ZETCQYMHSA-N 0.000 description 1
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 1
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 1
- CVFOYJJOZYYEPE-KBPBESRZSA-N Gly-Lys-Tyr Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CVFOYJJOZYYEPE-KBPBESRZSA-N 0.000 description 1
- BBTCXWTXOXUNFX-IUCAKERBSA-N Gly-Met-Arg Chemical compound CSCC[C@H](NC(=O)CN)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O BBTCXWTXOXUNFX-IUCAKERBSA-N 0.000 description 1
- OMOZPGCHVWOXHN-BQBZGAKWSA-N Gly-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)CN OMOZPGCHVWOXHN-BQBZGAKWSA-N 0.000 description 1
- IXHQLZIWBCQBLQ-STQMWFEESA-N Gly-Pro-Phe Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IXHQLZIWBCQBLQ-STQMWFEESA-N 0.000 description 1
- OOCFXNOVSLSHAB-IUCAKERBSA-N Gly-Pro-Pro Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 OOCFXNOVSLSHAB-IUCAKERBSA-N 0.000 description 1
- OHUKZZYSJBKFRR-WHFBIAKZSA-N Gly-Ser-Asp Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O OHUKZZYSJBKFRR-WHFBIAKZSA-N 0.000 description 1
- SOEGEPHNZOISMT-BYPYZUCNSA-N Gly-Ser-Gly Chemical compound NCC(=O)N[C@@H](CO)C(=O)NCC(O)=O SOEGEPHNZOISMT-BYPYZUCNSA-N 0.000 description 1
- WNGHUXFWEWTKAO-YUMQZZPRSA-N Gly-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN WNGHUXFWEWTKAO-YUMQZZPRSA-N 0.000 description 1
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 1
- LCRDMSSAKLTKBU-ZDLURKLDSA-N Gly-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN LCRDMSSAKLTKBU-ZDLURKLDSA-N 0.000 description 1
- FKESCSGWBPUTPN-FOHZUACHSA-N Gly-Thr-Asn Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O FKESCSGWBPUTPN-FOHZUACHSA-N 0.000 description 1
- JQFILXICXLDTRR-FBCQKBJTSA-N Gly-Thr-Gly Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)NCC(O)=O JQFILXICXLDTRR-FBCQKBJTSA-N 0.000 description 1
- ZZWUYQXMIFTIIY-WEDXCCLWSA-N Gly-Thr-Leu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O ZZWUYQXMIFTIIY-WEDXCCLWSA-N 0.000 description 1
- XHVONGZZVUUORG-WEDXCCLWSA-N Gly-Thr-Lys Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCCN XHVONGZZVUUORG-WEDXCCLWSA-N 0.000 description 1
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 1
- RCHFYMASWAZQQZ-ZANVPECISA-N Gly-Trp-Ala Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)CN)=CNC2=C1 RCHFYMASWAZQQZ-ZANVPECISA-N 0.000 description 1
- YJDALMUYJIENAG-QWRGUYRKSA-N Gly-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN)O YJDALMUYJIENAG-QWRGUYRKSA-N 0.000 description 1
- WRFOZIJRODPLIA-QWRGUYRKSA-N Gly-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN)O WRFOZIJRODPLIA-QWRGUYRKSA-N 0.000 description 1
- GJHWILMUOANXTG-WPRPVWTQSA-N Gly-Val-Arg Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GJHWILMUOANXTG-WPRPVWTQSA-N 0.000 description 1
- YDIDLLVFCYSXNY-RCOVLWMOSA-N Gly-Val-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)CN YDIDLLVFCYSXNY-RCOVLWMOSA-N 0.000 description 1
- MUGLKCQHTUFLGF-WPRPVWTQSA-N Gly-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)CN MUGLKCQHTUFLGF-WPRPVWTQSA-N 0.000 description 1
- IZVICCORZOSGPT-JSGCOSHPSA-N Gly-Val-Tyr Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IZVICCORZOSGPT-JSGCOSHPSA-N 0.000 description 1
- KZTLOHBDLMIFSH-XVYDVKMFSA-N His-Ala-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O KZTLOHBDLMIFSH-XVYDVKMFSA-N 0.000 description 1
- ZJSMFRTVYSLKQU-DJFWLOJKSA-N His-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CN=CN1)N ZJSMFRTVYSLKQU-DJFWLOJKSA-N 0.000 description 1
- PMWSGVRIMIFXQH-KKUMJFAQSA-N His-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1NC=NC=1)C1=CN=CN1 PMWSGVRIMIFXQH-KKUMJFAQSA-N 0.000 description 1
- JENKOCSDMSVWPY-SRVKXCTJSA-N His-Leu-Asn Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O JENKOCSDMSVWPY-SRVKXCTJSA-N 0.000 description 1
- KHUFDBQXGLEIHC-BZSNNMDCSA-N His-Leu-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 KHUFDBQXGLEIHC-BZSNNMDCSA-N 0.000 description 1
- BFOGZWSSGMLYKV-DCAQKATOSA-N His-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC1=CN=CN1)N BFOGZWSSGMLYKV-DCAQKATOSA-N 0.000 description 1
- GIRSNERMXCMDBO-GARJFASQSA-N His-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O GIRSNERMXCMDBO-GARJFASQSA-N 0.000 description 1
- VXZZUXWAOMWWJH-QTKMDUPCSA-N His-Thr-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O VXZZUXWAOMWWJH-QTKMDUPCSA-N 0.000 description 1
- 101000897405 Homo sapiens B-lymphocyte antigen CD20 Proteins 0.000 description 1
- 241000701085 Human alphaherpesvirus 3 Species 0.000 description 1
- WUEIUSDAECDLQO-NAKRPEOUSA-N Ile-Ala-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)O)N WUEIUSDAECDLQO-NAKRPEOUSA-N 0.000 description 1
- FVEWRQXNISSYFO-ZPFDUUQYSA-N Ile-Arg-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N FVEWRQXNISSYFO-ZPFDUUQYSA-N 0.000 description 1
- QTUSJASXLGLJSR-OSUNSFLBSA-N Ile-Arg-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N QTUSJASXLGLJSR-OSUNSFLBSA-N 0.000 description 1
- UKTUOMWSJPXODT-GUDRVLHUSA-N Ile-Asn-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N UKTUOMWSJPXODT-GUDRVLHUSA-N 0.000 description 1
- HVWXAQVMRBKKFE-UGYAYLCHSA-N Ile-Asp-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HVWXAQVMRBKKFE-UGYAYLCHSA-N 0.000 description 1
- UDLAWRKOVFDKFL-PEFMBERDSA-N Ile-Asp-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N UDLAWRKOVFDKFL-PEFMBERDSA-N 0.000 description 1
- REJKOQYVFDEZHA-SLBDDTMCSA-N Ile-Asp-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N REJKOQYVFDEZHA-SLBDDTMCSA-N 0.000 description 1
- LKACSKJPTFSBHR-MNXVOIDGSA-N Ile-Gln-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N LKACSKJPTFSBHR-MNXVOIDGSA-N 0.000 description 1
- HTDRTKMNJRRYOJ-SIUGBPQLSA-N Ile-Gln-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HTDRTKMNJRRYOJ-SIUGBPQLSA-N 0.000 description 1
- DFJJAVZIHDFOGQ-MNXVOIDGSA-N Ile-Glu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DFJJAVZIHDFOGQ-MNXVOIDGSA-N 0.000 description 1
- SPQWWEZBHXHUJN-KBIXCLLPSA-N Ile-Glu-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O SPQWWEZBHXHUJN-KBIXCLLPSA-N 0.000 description 1
- LEHPJMKVGFPSSP-ZQINRCPSSA-N Ile-Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 LEHPJMKVGFPSSP-ZQINRCPSSA-N 0.000 description 1
- CDGLBYSAZFIIJO-RCOVLWMOSA-N Ile-Gly-Gly Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O CDGLBYSAZFIIJO-RCOVLWMOSA-N 0.000 description 1
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 1
- CKRFDMPBSWYOBT-PPCPHDFISA-N Ile-Lys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CKRFDMPBSWYOBT-PPCPHDFISA-N 0.000 description 1
- FFJQAEYLAQMGDL-MGHWNKPDSA-N Ile-Lys-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FFJQAEYLAQMGDL-MGHWNKPDSA-N 0.000 description 1
- HQEPKOFULQTSFV-JURCDPSOSA-N Ile-Phe-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)O)N HQEPKOFULQTSFV-JURCDPSOSA-N 0.000 description 1
- VEPIBPGLTLPBDW-URLPEUOOSA-N Ile-Phe-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N VEPIBPGLTLPBDW-URLPEUOOSA-N 0.000 description 1
- JODPUDMBQBIWCK-GHCJXIJMSA-N Ile-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O JODPUDMBQBIWCK-GHCJXIJMSA-N 0.000 description 1
- WLRJHVNFGAOYPS-HJPIBITLSA-N Ile-Ser-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N WLRJHVNFGAOYPS-HJPIBITLSA-N 0.000 description 1
- NXRNRBOKDBIVKQ-CXTHYWKRSA-N Ile-Tyr-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N NXRNRBOKDBIVKQ-CXTHYWKRSA-N 0.000 description 1
- YJRSIJZUIUANHO-NAKRPEOUSA-N Ile-Val-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(=O)O)N YJRSIJZUIUANHO-NAKRPEOUSA-N 0.000 description 1
- 108020005350 Initiator Codon Proteins 0.000 description 1
- 108010002350 Interleukin-2 Proteins 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- 108010025815 Kanamycin Kinase Proteins 0.000 description 1
- 241000242362 Kordia Species 0.000 description 1
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 1
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- FBOZXECLQNJBKD-ZDUSSCGKSA-N L-methotrexate Chemical compound C=1N=C2N=C(N)N=C(N)C2=NC=1CN(C)C1=CC=C(C(=O)N[C@@H](CCC(O)=O)C(O)=O)C=C1 FBOZXECLQNJBKD-ZDUSSCGKSA-N 0.000 description 1
- 241000282838 Lama Species 0.000 description 1
- IBMVEYRWAWIOTN-RWMBFGLXSA-N Leu-Arg-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(O)=O IBMVEYRWAWIOTN-RWMBFGLXSA-N 0.000 description 1
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 1
- PIHFVNPEAHFNLN-KKUMJFAQSA-N Leu-Cys-Tyr Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N PIHFVNPEAHFNLN-KKUMJFAQSA-N 0.000 description 1
- VPKIQULSKFVCSM-SRVKXCTJSA-N Leu-Gln-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VPKIQULSKFVCSM-SRVKXCTJSA-N 0.000 description 1
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 1
- FQZPTCNSNPWHLJ-AVGNSLFASA-N Leu-Gln-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O FQZPTCNSNPWHLJ-AVGNSLFASA-N 0.000 description 1
- NEEOBPIXKWSBRF-IUCAKERBSA-N Leu-Glu-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O NEEOBPIXKWSBRF-IUCAKERBSA-N 0.000 description 1
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 1
- LAPSXOAUPNOINL-YUMQZZPRSA-N Leu-Gly-Asp Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O LAPSXOAUPNOINL-YUMQZZPRSA-N 0.000 description 1
- APFJUBGRZGMQFF-QWRGUYRKSA-N Leu-Gly-Lys Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN APFJUBGRZGMQFF-QWRGUYRKSA-N 0.000 description 1
- KEVYYIMVELOXCT-KBPBESRZSA-N Leu-Gly-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KEVYYIMVELOXCT-KBPBESRZSA-N 0.000 description 1
- JRJLGNFWYFSJHB-HOCLYGCPSA-N Leu-Gly-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O JRJLGNFWYFSJHB-HOCLYGCPSA-N 0.000 description 1
- HGFGEMSVBMCFKK-MNXVOIDGSA-N Leu-Ile-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HGFGEMSVBMCFKK-MNXVOIDGSA-N 0.000 description 1
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 1
- WXUOJXIGOPMDJM-SRVKXCTJSA-N Leu-Lys-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O WXUOJXIGOPMDJM-SRVKXCTJSA-N 0.000 description 1
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 1
- RTIRBWJPYJYTLO-MELADBBJSA-N Leu-Lys-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N RTIRBWJPYJYTLO-MELADBBJSA-N 0.000 description 1
- DDVHDMSBLRAKNV-IHRRRGAJSA-N Leu-Met-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O DDVHDMSBLRAKNV-IHRRRGAJSA-N 0.000 description 1
- WMIOEVKKYIMVKI-DCAQKATOSA-N Leu-Pro-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WMIOEVKKYIMVKI-DCAQKATOSA-N 0.000 description 1
- YRRCOJOXAJNSAX-IHRRRGAJSA-N Leu-Pro-Lys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N YRRCOJOXAJNSAX-IHRRRGAJSA-N 0.000 description 1
- IDGZVZJLYFTXSL-DCAQKATOSA-N Leu-Ser-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IDGZVZJLYFTXSL-DCAQKATOSA-N 0.000 description 1
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 1
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 1
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 1
- CNWDWAMPKVYJJB-NUTKFTJISA-N Leu-Trp-Ala Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](C)C(O)=O)=CNC2=C1 CNWDWAMPKVYJJB-NUTKFTJISA-N 0.000 description 1
- HOMFINRJHIIZNJ-HOCLYGCPSA-N Leu-Trp-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O HOMFINRJHIIZNJ-HOCLYGCPSA-N 0.000 description 1
- WQWZXKWOEVSGQM-DCAQKATOSA-N Lys-Ala-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN WQWZXKWOEVSGQM-DCAQKATOSA-N 0.000 description 1
- NLOZZWJNIKKYSC-WDSOQIARSA-N Lys-Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 NLOZZWJNIKKYSC-WDSOQIARSA-N 0.000 description 1
- NCTDKZKNBDZDOL-GARJFASQSA-N Lys-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O NCTDKZKNBDZDOL-GARJFASQSA-N 0.000 description 1
- GGNOBVSOZPHLCE-GUBZILKMSA-N Lys-Gln-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O GGNOBVSOZPHLCE-GUBZILKMSA-N 0.000 description 1
- MRWXLRGAFDOILG-DCAQKATOSA-N Lys-Gln-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MRWXLRGAFDOILG-DCAQKATOSA-N 0.000 description 1
- DRCILAJNUJKAHC-SRVKXCTJSA-N Lys-Glu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DRCILAJNUJKAHC-SRVKXCTJSA-N 0.000 description 1
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 1
- NKKFVJRLCCUJNA-QWRGUYRKSA-N Lys-Gly-Lys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN NKKFVJRLCCUJNA-QWRGUYRKSA-N 0.000 description 1
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 1
- PINHPJWGVBKQII-SRVKXCTJSA-N Lys-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N PINHPJWGVBKQII-SRVKXCTJSA-N 0.000 description 1
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 1
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 1
- WBSCNDJQPKSPII-KKUMJFAQSA-N Lys-Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O WBSCNDJQPKSPII-KKUMJFAQSA-N 0.000 description 1
- IPSDPDAOSAEWCN-RHYQMDGZSA-N Lys-Met-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IPSDPDAOSAEWCN-RHYQMDGZSA-N 0.000 description 1
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 1
- OBZHNHBAAVEWKI-DCAQKATOSA-N Lys-Pro-Asn Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O OBZHNHBAAVEWKI-DCAQKATOSA-N 0.000 description 1
- LUTDBHBIHHREDC-IHRRRGAJSA-N Lys-Pro-Lys Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O LUTDBHBIHHREDC-IHRRRGAJSA-N 0.000 description 1
- YTJFXEDRUOQGSP-DCAQKATOSA-N Lys-Pro-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YTJFXEDRUOQGSP-DCAQKATOSA-N 0.000 description 1
- HKXSZKJMDBHOTG-CIUDSAMLSA-N Lys-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN HKXSZKJMDBHOTG-CIUDSAMLSA-N 0.000 description 1
- SBQDRNOLGSYHQA-YUMQZZPRSA-N Lys-Ser-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SBQDRNOLGSYHQA-YUMQZZPRSA-N 0.000 description 1
- DLCAXBGXGOVUCD-PPCPHDFISA-N Lys-Thr-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DLCAXBGXGOVUCD-PPCPHDFISA-N 0.000 description 1
- RMOKGALPSPOYKE-KATARQTJSA-N Lys-Thr-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMOKGALPSPOYKE-KATARQTJSA-N 0.000 description 1
- PELXPRPDQRFBGQ-KKUMJFAQSA-N Lys-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N)O PELXPRPDQRFBGQ-KKUMJFAQSA-N 0.000 description 1
- XYLSGAWRCZECIQ-JYJNAYRXSA-N Lys-Tyr-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 XYLSGAWRCZECIQ-JYJNAYRXSA-N 0.000 description 1
- PSVAVKGDUAKZKU-BZSNNMDCSA-N Lys-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCCCN)N)O PSVAVKGDUAKZKU-BZSNNMDCSA-N 0.000 description 1
- MIMXMVDLMDMOJD-BZSNNMDCSA-N Lys-Tyr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O MIMXMVDLMDMOJD-BZSNNMDCSA-N 0.000 description 1
- OHXUUQDOBQKSNB-AVGNSLFASA-N Lys-Val-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OHXUUQDOBQKSNB-AVGNSLFASA-N 0.000 description 1
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- ZAJNRWKGHWGPDQ-SDDRHHMPSA-N Met-Arg-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N ZAJNRWKGHWGPDQ-SDDRHHMPSA-N 0.000 description 1
- MDXAULHWGWETHF-SRVKXCTJSA-N Met-Arg-Val Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CCCNC(N)=N MDXAULHWGWETHF-SRVKXCTJSA-N 0.000 description 1
- UZVWDRPUTHXQAM-FXQIFTODSA-N Met-Asp-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O UZVWDRPUTHXQAM-FXQIFTODSA-N 0.000 description 1
- HWROAFGWPQUPTE-OSUNSFLBSA-N Met-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CCSC)N HWROAFGWPQUPTE-OSUNSFLBSA-N 0.000 description 1
- WUYLWZRHRLLEGB-AVGNSLFASA-N Met-Met-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O WUYLWZRHRLLEGB-AVGNSLFASA-N 0.000 description 1
- NLDXSXDCNZIQCN-ULQDDVLXSA-N Met-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CCSC)CC1=CC=CC=C1 NLDXSXDCNZIQCN-ULQDDVLXSA-N 0.000 description 1
- CIDICGYKRUTYLE-FXQIFTODSA-N Met-Ser-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O CIDICGYKRUTYLE-FXQIFTODSA-N 0.000 description 1
- RDLSEGZJMYGFNS-FXQIFTODSA-N Met-Ser-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RDLSEGZJMYGFNS-FXQIFTODSA-N 0.000 description 1
- LXCSZPUQKMTXNW-BQBZGAKWSA-N Met-Ser-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O LXCSZPUQKMTXNW-BQBZGAKWSA-N 0.000 description 1
- GMMLGMFBYCFCCX-KZVJFYERSA-N Met-Thr-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O GMMLGMFBYCFCCX-KZVJFYERSA-N 0.000 description 1
- KLGIQJRMFHIGCQ-ZFWWWQNUSA-N Met-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCSC)C(=O)NCC(O)=O)=CNC2=C1 KLGIQJRMFHIGCQ-ZFWWWQNUSA-N 0.000 description 1
- HOTNHEUETJELDL-BPNCWPANSA-N Met-Tyr-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCSC)N HOTNHEUETJELDL-BPNCWPANSA-N 0.000 description 1
- VYDLZDRMOFYOGV-TUAOUCFPSA-N Met-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N VYDLZDRMOFYOGV-TUAOUCFPSA-N 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 1
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 1
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 208000009869 Neu-Laxova syndrome Diseases 0.000 description 1
- 101100342977 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) leu-1 gene Proteins 0.000 description 1
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 1
- 108091081548 Palindromic sequence Proteins 0.000 description 1
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 1
- UHRNIXJAGGLKHP-DLOVCJGASA-N Phe-Ala-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O UHRNIXJAGGLKHP-DLOVCJGASA-N 0.000 description 1
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 1
- HCTXJGRYAACKOB-SRVKXCTJSA-N Phe-Asn-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HCTXJGRYAACKOB-SRVKXCTJSA-N 0.000 description 1
- MRNRMSDVVSKPGM-AVGNSLFASA-N Phe-Asn-Gln Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MRNRMSDVVSKPGM-AVGNSLFASA-N 0.000 description 1
- KAHUBGWSIQNZQQ-KKUMJFAQSA-N Phe-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KAHUBGWSIQNZQQ-KKUMJFAQSA-N 0.000 description 1
- CDNPIRSCAFMMBE-SRVKXCTJSA-N Phe-Asn-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O CDNPIRSCAFMMBE-SRVKXCTJSA-N 0.000 description 1
- VLZGUAUYZGQKPM-DRZSPHRISA-N Phe-Gln-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O VLZGUAUYZGQKPM-DRZSPHRISA-N 0.000 description 1
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 1
- OPEVYHFJXLCCRT-AVGNSLFASA-N Phe-Gln-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O OPEVYHFJXLCCRT-AVGNSLFASA-N 0.000 description 1
- KYYMILWEGJYPQZ-IHRRRGAJSA-N Phe-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KYYMILWEGJYPQZ-IHRRRGAJSA-N 0.000 description 1
- MGECUMGTSHYHEJ-QEWYBTABSA-N Phe-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MGECUMGTSHYHEJ-QEWYBTABSA-N 0.000 description 1
- PSKRILMFHNIUAO-JYJNAYRXSA-N Phe-Glu-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N PSKRILMFHNIUAO-JYJNAYRXSA-N 0.000 description 1
- NPLGQVKZFGJWAI-QWHCGFSZSA-N Phe-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O NPLGQVKZFGJWAI-QWHCGFSZSA-N 0.000 description 1
- WEMYTDDMDBLPMI-DKIMLUQUSA-N Phe-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N WEMYTDDMDBLPMI-DKIMLUQUSA-N 0.000 description 1
- PEFJUUYFEGBXFA-BZSNNMDCSA-N Phe-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 PEFJUUYFEGBXFA-BZSNNMDCSA-N 0.000 description 1
- KLXQWABNAWDRAY-ACRUOGEOSA-N Phe-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 KLXQWABNAWDRAY-ACRUOGEOSA-N 0.000 description 1
- OKQQWSNUSQURLI-JYJNAYRXSA-N Phe-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CC=CC=C1)N OKQQWSNUSQURLI-JYJNAYRXSA-N 0.000 description 1
- PBWNICYZGJQKJV-BZSNNMDCSA-N Phe-Phe-Cys Chemical compound N[C@@H](Cc1ccccc1)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CS)C(O)=O PBWNICYZGJQKJV-BZSNNMDCSA-N 0.000 description 1
- RVEVENLSADZUMS-IHRRRGAJSA-N Phe-Pro-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O RVEVENLSADZUMS-IHRRRGAJSA-N 0.000 description 1
- MMJJFXWMCMJMQA-STQMWFEESA-N Phe-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CC=CC=C1 MMJJFXWMCMJMQA-STQMWFEESA-N 0.000 description 1
- YMIZSYUAZJSOFL-SRVKXCTJSA-N Phe-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O YMIZSYUAZJSOFL-SRVKXCTJSA-N 0.000 description 1
- HBXAOEBRGLCLIW-AVGNSLFASA-N Phe-Ser-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HBXAOEBRGLCLIW-AVGNSLFASA-N 0.000 description 1
- MCIXMYKSPQUMJG-SRVKXCTJSA-N Phe-Ser-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MCIXMYKSPQUMJG-SRVKXCTJSA-N 0.000 description 1
- IAOZOFPONWDXNT-IXOXFDKPSA-N Phe-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IAOZOFPONWDXNT-IXOXFDKPSA-N 0.000 description 1
- XNMYNGDKJNOKHH-BZSNNMDCSA-N Phe-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XNMYNGDKJNOKHH-BZSNNMDCSA-N 0.000 description 1
- PTDAGKJHZBGDKD-OEAJRASXSA-N Phe-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N)O PTDAGKJHZBGDKD-OEAJRASXSA-N 0.000 description 1
- 108010064785 Phospholipases Proteins 0.000 description 1
- 102000015439 Phospholipases Human genes 0.000 description 1
- 102000004160 Phosphoric Monoester Hydrolases Human genes 0.000 description 1
- 108090000608 Phosphoric Monoester Hydrolases Proteins 0.000 description 1
- APKRGYLBSCWJJP-FXQIFTODSA-N Pro-Ala-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O APKRGYLBSCWJJP-FXQIFTODSA-N 0.000 description 1
- IWNOFCGBMSFTBC-CIUDSAMLSA-N Pro-Ala-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IWNOFCGBMSFTBC-CIUDSAMLSA-N 0.000 description 1
- FYQSMXKJYTZYRP-DCAQKATOSA-N Pro-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 FYQSMXKJYTZYRP-DCAQKATOSA-N 0.000 description 1
- NHDVNAKDACFHPX-GUBZILKMSA-N Pro-Arg-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O NHDVNAKDACFHPX-GUBZILKMSA-N 0.000 description 1
- CYQQWUPHIZVCNY-GUBZILKMSA-N Pro-Arg-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CYQQWUPHIZVCNY-GUBZILKMSA-N 0.000 description 1
- INXAPZFIOVGHSV-CIUDSAMLSA-N Pro-Asn-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 INXAPZFIOVGHSV-CIUDSAMLSA-N 0.000 description 1
- UTAUEDINXUMHLG-FXQIFTODSA-N Pro-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 UTAUEDINXUMHLG-FXQIFTODSA-N 0.000 description 1
- YKQNVTOIYFQMLW-IHRRRGAJSA-N Pro-Cys-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 YKQNVTOIYFQMLW-IHRRRGAJSA-N 0.000 description 1
- UPJGUQPLYWTISV-GUBZILKMSA-N Pro-Gln-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UPJGUQPLYWTISV-GUBZILKMSA-N 0.000 description 1
- NMELOOXSGDRBRU-YUMQZZPRSA-N Pro-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1 NMELOOXSGDRBRU-YUMQZZPRSA-N 0.000 description 1
- CLNJSLSHKJECME-BQBZGAKWSA-N Pro-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H]1CCCN1 CLNJSLSHKJECME-BQBZGAKWSA-N 0.000 description 1
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 1
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 1
- QEWBZBLXDKIQPS-STQMWFEESA-N Pro-Gly-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QEWBZBLXDKIQPS-STQMWFEESA-N 0.000 description 1
- INDVYIOKMXFQFM-SRVKXCTJSA-N Pro-Lys-Gln Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O INDVYIOKMXFQFM-SRVKXCTJSA-N 0.000 description 1
- ZZCJYPLMOPTZFC-SRVKXCTJSA-N Pro-Met-Met Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCSC)C(O)=O ZZCJYPLMOPTZFC-SRVKXCTJSA-N 0.000 description 1
- WLJYLAQSUSIQNH-GUBZILKMSA-N Pro-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@@H]1CCCN1 WLJYLAQSUSIQNH-GUBZILKMSA-N 0.000 description 1
- RFWXYTJSVDUBBZ-DCAQKATOSA-N Pro-Pro-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 RFWXYTJSVDUBBZ-DCAQKATOSA-N 0.000 description 1
- QAAYIXYLEMRULP-SRVKXCTJSA-N Pro-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 QAAYIXYLEMRULP-SRVKXCTJSA-N 0.000 description 1
- BGWKULMLUIUPKY-BQBZGAKWSA-N Pro-Ser-Gly Chemical compound OC(=O)CNC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BGWKULMLUIUPKY-BQBZGAKWSA-N 0.000 description 1
- ITUDDXVFGFEKPD-NAKRPEOUSA-N Pro-Ser-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ITUDDXVFGFEKPD-NAKRPEOUSA-N 0.000 description 1
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 1
- AIOWVDNPESPXRB-YTWAJWBKSA-N Pro-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2)O AIOWVDNPESPXRB-YTWAJWBKSA-N 0.000 description 1
- CXGLFEOYCJFKPR-RCWTZXSCSA-N Pro-Thr-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O CXGLFEOYCJFKPR-RCWTZXSCSA-N 0.000 description 1
- YHUBAXGAAYULJY-ULQDDVLXSA-N Pro-Tyr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O YHUBAXGAAYULJY-ULQDDVLXSA-N 0.000 description 1
- XDKKMRPRRCOELJ-GUBZILKMSA-N Pro-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 XDKKMRPRRCOELJ-GUBZILKMSA-N 0.000 description 1
- IIRBTQHFVNGPMQ-AVGNSLFASA-N Pro-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 IIRBTQHFVNGPMQ-AVGNSLFASA-N 0.000 description 1
- FIODMZKLZFLYQP-GUBZILKMSA-N Pro-Val-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FIODMZKLZFLYQP-GUBZILKMSA-N 0.000 description 1
- WDVSHHCDHLJJJR-UHFFFAOYSA-N Proflavine Chemical compound C1=CC(N)=CC2=NC3=CC(N)=CC=C3C=C21 WDVSHHCDHLJJJR-UHFFFAOYSA-N 0.000 description 1
- 101710150114 Protein rep Proteins 0.000 description 1
- 238000012228 RNA interference-mediated gene silencing Methods 0.000 description 1
- 108091030071 RNAI Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- 108020005091 Replication Origin Proteins 0.000 description 1
- 101710152114 Replication protein Proteins 0.000 description 1
- FIXILCYTSAUERA-FXQIFTODSA-N Ser-Ala-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FIXILCYTSAUERA-FXQIFTODSA-N 0.000 description 1
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 1
- YQHZVYJAGWMHES-ZLUOBGJFSA-N Ser-Ala-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YQHZVYJAGWMHES-ZLUOBGJFSA-N 0.000 description 1
- WDXYVIIVDIDOSX-DCAQKATOSA-N Ser-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N WDXYVIIVDIDOSX-DCAQKATOSA-N 0.000 description 1
- OYEDZGNMSBZCIM-XGEHTFHBSA-N Ser-Arg-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OYEDZGNMSBZCIM-XGEHTFHBSA-N 0.000 description 1
- VAUMZJHYZQXZBQ-WHFBIAKZSA-N Ser-Asn-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O VAUMZJHYZQXZBQ-WHFBIAKZSA-N 0.000 description 1
- YMEXHZTVKDAKIY-GHCJXIJMSA-N Ser-Asn-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO)C(O)=O YMEXHZTVKDAKIY-GHCJXIJMSA-N 0.000 description 1
- QPFJSHSJFIYDJZ-GHCJXIJMSA-N Ser-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO QPFJSHSJFIYDJZ-GHCJXIJMSA-N 0.000 description 1
- BGOWRLSWJCVYAQ-CIUDSAMLSA-N Ser-Asp-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BGOWRLSWJCVYAQ-CIUDSAMLSA-N 0.000 description 1
- IXUGADGDCQDLSA-FXQIFTODSA-N Ser-Gln-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N IXUGADGDCQDLSA-FXQIFTODSA-N 0.000 description 1
- YPUSXTWURJANKF-KBIXCLLPSA-N Ser-Gln-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YPUSXTWURJANKF-KBIXCLLPSA-N 0.000 description 1
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 1
- CLKKNZQUQMZDGD-SRVKXCTJSA-N Ser-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CN=CN1 CLKKNZQUQMZDGD-SRVKXCTJSA-N 0.000 description 1
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 1
- QYSFWUIXDFJUDW-DCAQKATOSA-N Ser-Leu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYSFWUIXDFJUDW-DCAQKATOSA-N 0.000 description 1
- GJFYFGOEWLDQGW-GUBZILKMSA-N Ser-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GJFYFGOEWLDQGW-GUBZILKMSA-N 0.000 description 1
- HEUVHBXOVZONPU-BJDJZHNGSA-N Ser-Leu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HEUVHBXOVZONPU-BJDJZHNGSA-N 0.000 description 1
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 1
- PTWIYDNFWPXQSD-GARJFASQSA-N Ser-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CO)N)C(=O)O PTWIYDNFWPXQSD-GARJFASQSA-N 0.000 description 1
- PMCMLDNPAZUYGI-DCAQKATOSA-N Ser-Lys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMCMLDNPAZUYGI-DCAQKATOSA-N 0.000 description 1
- NQZFFLBPNDLTPO-DLOVCJGASA-N Ser-Phe-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CO)N NQZFFLBPNDLTPO-DLOVCJGASA-N 0.000 description 1
- RRVFEDGUXSYWOW-BZSNNMDCSA-N Ser-Phe-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RRVFEDGUXSYWOW-BZSNNMDCSA-N 0.000 description 1
- CKDXFSPMIDSMGV-GUBZILKMSA-N Ser-Pro-Val Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O CKDXFSPMIDSMGV-GUBZILKMSA-N 0.000 description 1
- FZXOPYUEQGDGMS-ACZMJKKPSA-N Ser-Ser-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O FZXOPYUEQGDGMS-ACZMJKKPSA-N 0.000 description 1
- SQHKXWODKJDZRC-LKXGYXEUSA-N Ser-Thr-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQHKXWODKJDZRC-LKXGYXEUSA-N 0.000 description 1
- WUXCHQZLUHBSDJ-LKXGYXEUSA-N Ser-Thr-Asp Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WUXCHQZLUHBSDJ-LKXGYXEUSA-N 0.000 description 1
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 1
- NERYDXBVARJIQS-JYBASQMISA-N Ser-Trp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CO)N)O NERYDXBVARJIQS-JYBASQMISA-N 0.000 description 1
- QYBRQMLZDDJBSW-AVGNSLFASA-N Ser-Tyr-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O QYBRQMLZDDJBSW-AVGNSLFASA-N 0.000 description 1
- UKKROEYWYIHWBD-ZKWXMUAHSA-N Ser-Val-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O UKKROEYWYIHWBD-ZKWXMUAHSA-N 0.000 description 1
- LLSLRQOEAFCZLW-NRPADANISA-N Ser-Val-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LLSLRQOEAFCZLW-NRPADANISA-N 0.000 description 1
- BEBVVQPDSHHWQL-NRPADANISA-N Ser-Val-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BEBVVQPDSHHWQL-NRPADANISA-N 0.000 description 1
- ANOQEBQWIAYIMV-AEJSXWLSSA-N Ser-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ANOQEBQWIAYIMV-AEJSXWLSSA-N 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 108010034546 Serratia marcescens nuclease Proteins 0.000 description 1
- 241000700584 Simplexvirus Species 0.000 description 1
- 229930006000 Sucrose Natural products 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- DFTCYYILCSQGIZ-GCJQMDKQSA-N Thr-Ala-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFTCYYILCSQGIZ-GCJQMDKQSA-N 0.000 description 1
- YRNBANYVJJBGDI-VZFHVOOUSA-N Thr-Ala-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(=O)O)N)O YRNBANYVJJBGDI-VZFHVOOUSA-N 0.000 description 1
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 1
- CAJFZCICSVBOJK-SHGPDSBTSA-N Thr-Ala-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAJFZCICSVBOJK-SHGPDSBTSA-N 0.000 description 1
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 1
- PQLXHSACXPGWPD-GSSVUCPTSA-N Thr-Asn-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PQLXHSACXPGWPD-GSSVUCPTSA-N 0.000 description 1
- VXMHQKHDKCATDV-VEVYYDQMSA-N Thr-Asp-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VXMHQKHDKCATDV-VEVYYDQMSA-N 0.000 description 1
- APIQKJYZDWVOCE-VEVYYDQMSA-N Thr-Asp-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O APIQKJYZDWVOCE-VEVYYDQMSA-N 0.000 description 1
- OHAJHDJOCKKJLV-LKXGYXEUSA-N Thr-Asp-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O OHAJHDJOCKKJLV-LKXGYXEUSA-N 0.000 description 1
- LIXBDERDAGNVAV-XKBZYTNZSA-N Thr-Gln-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O LIXBDERDAGNVAV-XKBZYTNZSA-N 0.000 description 1
- KGKWKSSSQGGYAU-SUSMZKCASA-N Thr-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N)O KGKWKSSSQGGYAU-SUSMZKCASA-N 0.000 description 1
- GKWNLDNXMMLRMC-GLLZPBPUSA-N Thr-Glu-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O GKWNLDNXMMLRMC-GLLZPBPUSA-N 0.000 description 1
- XFTYVCHLARBHBQ-FOHZUACHSA-N Thr-Gly-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XFTYVCHLARBHBQ-FOHZUACHSA-N 0.000 description 1
- IMULJHHGAUZZFE-MBLNEYKQSA-N Thr-Gly-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IMULJHHGAUZZFE-MBLNEYKQSA-N 0.000 description 1
- DJDSEDOKJTZBAR-ZDLURKLDSA-N Thr-Gly-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O DJDSEDOKJTZBAR-ZDLURKLDSA-N 0.000 description 1
- KRGDDWVBBDLPSJ-CUJWVEQBSA-N Thr-His-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O KRGDDWVBBDLPSJ-CUJWVEQBSA-N 0.000 description 1
- RRRRCRYTLZVCEN-HJGDQZAQSA-N Thr-Leu-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O RRRRCRYTLZVCEN-HJGDQZAQSA-N 0.000 description 1
- YOOAQCZYZHGUAZ-KATARQTJSA-N Thr-Leu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YOOAQCZYZHGUAZ-KATARQTJSA-N 0.000 description 1
- CJXURNZYNHCYFD-WDCWCFNPSA-N Thr-Lys-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O CJXURNZYNHCYFD-WDCWCFNPSA-N 0.000 description 1
- SPVHQURZJCUDQC-VOAKCMCISA-N Thr-Lys-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O SPVHQURZJCUDQC-VOAKCMCISA-N 0.000 description 1
- QNCFWHZVRNXAKW-OEAJRASXSA-N Thr-Lys-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNCFWHZVRNXAKW-OEAJRASXSA-N 0.000 description 1
- WRUWXBBEFUTJOU-XGEHTFHBSA-N Thr-Met-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N)O WRUWXBBEFUTJOU-XGEHTFHBSA-N 0.000 description 1
- NZRUWPIYECBYRK-HTUGSXCWSA-N Thr-Phe-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O NZRUWPIYECBYRK-HTUGSXCWSA-N 0.000 description 1
- MXDOAJQRJBMGMO-FJXKBIBVSA-N Thr-Pro-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O MXDOAJQRJBMGMO-FJXKBIBVSA-N 0.000 description 1
- FWTFAZKJORVTIR-VZFHVOOUSA-N Thr-Ser-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O FWTFAZKJORVTIR-VZFHVOOUSA-N 0.000 description 1
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 1
- MFMGPEKYBXFIRF-SUSMZKCASA-N Thr-Thr-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MFMGPEKYBXFIRF-SUSMZKCASA-N 0.000 description 1
- PJCYRZVSACOYSN-ZJDVBMNYSA-N Thr-Thr-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O PJCYRZVSACOYSN-ZJDVBMNYSA-N 0.000 description 1
- LECUEEHKUFYOOV-ZJDVBMNYSA-N Thr-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)[C@@H](C)O LECUEEHKUFYOOV-ZJDVBMNYSA-N 0.000 description 1
- ZEJBJDHSQPOVJV-UAXMHLISSA-N Thr-Trp-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZEJBJDHSQPOVJV-UAXMHLISSA-N 0.000 description 1
- CYCGARJWIQWPQM-YJRXYDGGSA-N Thr-Tyr-Ser Chemical compound C[C@@H](O)[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CO)C([O-])=O)CC1=CC=C(O)C=C1 CYCGARJWIQWPQM-YJRXYDGGSA-N 0.000 description 1
- LVRFMARKDGGZMX-IZPVPAKOSA-N Thr-Tyr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=C(O)C=C1 LVRFMARKDGGZMX-IZPVPAKOSA-N 0.000 description 1
- BKIOKSLLAAZYTC-KKHAAJSZSA-N Thr-Val-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O BKIOKSLLAAZYTC-KKHAAJSZSA-N 0.000 description 1
- 102000006601 Thymidine Kinase Human genes 0.000 description 1
- 108020004440 Thymidine kinase Proteins 0.000 description 1
- AUYYCJSJGJYCDS-LBPRGKRZSA-N Thyrolar Chemical class IC1=CC(C[C@H](N)C(O)=O)=CC(I)=C1OC1=CC=C(O)C(I)=C1 AUYYCJSJGJYCDS-LBPRGKRZSA-N 0.000 description 1
- 102000002248 Thyroxine-Binding Globulin Human genes 0.000 description 1
- 108010000259 Thyroxine-Binding Globulin Proteins 0.000 description 1
- 108091023040 Transcription factor Proteins 0.000 description 1
- 102000040945 Transcription factor Human genes 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- VZBWRZGNEPBRDE-HZUKXOBISA-N Trp-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N VZBWRZGNEPBRDE-HZUKXOBISA-N 0.000 description 1
- SSNGFWKILJLTQM-QEJZJMRPSA-N Trp-Gln-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SSNGFWKILJLTQM-QEJZJMRPSA-N 0.000 description 1
- OBWQLWYNNZPWGX-QEJZJMRPSA-N Trp-Gln-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O OBWQLWYNNZPWGX-QEJZJMRPSA-N 0.000 description 1
- FNOQJVHFVLVMOS-AAEUAGOBSA-N Trp-Gly-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N FNOQJVHFVLVMOS-AAEUAGOBSA-N 0.000 description 1
- NWQCKAPDGQMZQN-IHPCNDPISA-N Trp-Lys-Leu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O NWQCKAPDGQMZQN-IHPCNDPISA-N 0.000 description 1
- CSOBBJWWODOYGW-ILWGZMRPSA-N Trp-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC3=CNC4=CC=CC=C43)N)C(=O)O CSOBBJWWODOYGW-ILWGZMRPSA-N 0.000 description 1
- SUEGAFMNTXXNLR-WFBYXXMGSA-N Trp-Ser-Ala Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O SUEGAFMNTXXNLR-WFBYXXMGSA-N 0.000 description 1
- UIRPULWLRODAEQ-QEJZJMRPSA-N Trp-Ser-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 UIRPULWLRODAEQ-QEJZJMRPSA-N 0.000 description 1
- COLXBVRHSKPKIE-NYVOZVTQSA-N Trp-Trp-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O COLXBVRHSKPKIE-NYVOZVTQSA-N 0.000 description 1
- UGFOSENEZHEQKX-PJODQICGSA-N Trp-Val-Ala Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(=O)N[C@@H](C)C(O)=O UGFOSENEZHEQKX-PJODQICGSA-N 0.000 description 1
- NMOIRIIIUVELLY-WDSOQIARSA-N Trp-Val-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)C(C)C)=CNC2=C1 NMOIRIIIUVELLY-WDSOQIARSA-N 0.000 description 1
- BURPTJBFWIOHEY-UWJYBYFXSA-N Tyr-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 BURPTJBFWIOHEY-UWJYBYFXSA-N 0.000 description 1
- NOXKHHXSHQFSGJ-FQPOAREZSA-N Tyr-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NOXKHHXSHQFSGJ-FQPOAREZSA-N 0.000 description 1
- GFHYISDTIWZUSU-QWRGUYRKSA-N Tyr-Asn-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GFHYISDTIWZUSU-QWRGUYRKSA-N 0.000 description 1
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 1
- NSTPFWRAIDTNGH-BZSNNMDCSA-N Tyr-Asn-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NSTPFWRAIDTNGH-BZSNNMDCSA-N 0.000 description 1
- IXTQGBGHWQEEDE-AVGNSLFASA-N Tyr-Asp-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 IXTQGBGHWQEEDE-AVGNSLFASA-N 0.000 description 1
- RYSNTWVRSLCAJZ-RYUDHWBXSA-N Tyr-Gln-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 RYSNTWVRSLCAJZ-RYUDHWBXSA-N 0.000 description 1
- JWGXUKHIKXZWNG-RYUDHWBXSA-N Tyr-Gly-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O JWGXUKHIKXZWNG-RYUDHWBXSA-N 0.000 description 1
- QAYSODICXVZUIA-WLTAIBSBSA-N Tyr-Gly-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O QAYSODICXVZUIA-WLTAIBSBSA-N 0.000 description 1
- MVFQLSPDMMFCMW-KKUMJFAQSA-N Tyr-Leu-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O MVFQLSPDMMFCMW-KKUMJFAQSA-N 0.000 description 1
- NSGZILIDHCIZAM-KKUMJFAQSA-N Tyr-Leu-Ser Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NSGZILIDHCIZAM-KKUMJFAQSA-N 0.000 description 1
- QFXVAFIHVWXXBJ-AVGNSLFASA-N Tyr-Ser-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O QFXVAFIHVWXXBJ-AVGNSLFASA-N 0.000 description 1
- HRHYJNLMIJWGLF-BZSNNMDCSA-N Tyr-Ser-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 HRHYJNLMIJWGLF-BZSNNMDCSA-N 0.000 description 1
- PLVVHGFEMSDRET-IHPCNDPISA-N Tyr-Ser-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC3=CC=C(C=C3)O)N PLVVHGFEMSDRET-IHPCNDPISA-N 0.000 description 1
- BIVIUZRBCAUNPW-JRQIVUDYSA-N Tyr-Thr-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O BIVIUZRBCAUNPW-JRQIVUDYSA-N 0.000 description 1
- SMUWZUSWMWVOSL-JYJNAYRXSA-N Tyr-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N SMUWZUSWMWVOSL-JYJNAYRXSA-N 0.000 description 1
- IZFVRRYRMQFVGX-NRPADANISA-N Val-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N IZFVRRYRMQFVGX-NRPADANISA-N 0.000 description 1
- KKHRWGYHBZORMQ-NHCYSSNCSA-N Val-Arg-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKHRWGYHBZORMQ-NHCYSSNCSA-N 0.000 description 1
- COYSIHFOCOMGCF-WPRPVWTQSA-N Val-Arg-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-WPRPVWTQSA-N 0.000 description 1
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 1
- QGFPYRPIUXBYGR-YDHLFZDLSA-N Val-Asn-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N QGFPYRPIUXBYGR-YDHLFZDLSA-N 0.000 description 1
- IDKGBVZGNTYYCC-QXEWZRGKSA-N Val-Asn-Pro Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(O)=O IDKGBVZGNTYYCC-QXEWZRGKSA-N 0.000 description 1
- JLFKWDAZBRYCGX-ZKWXMUAHSA-N Val-Asn-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N JLFKWDAZBRYCGX-ZKWXMUAHSA-N 0.000 description 1
- DBOXBUDEAJVKRE-LSJOCFKGSA-N Val-Asn-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N DBOXBUDEAJVKRE-LSJOCFKGSA-N 0.000 description 1
- XQVRMLRMTAGSFJ-QXEWZRGKSA-N Val-Asp-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XQVRMLRMTAGSFJ-QXEWZRGKSA-N 0.000 description 1
- XLDYBRXERHITNH-QSFUFRPTSA-N Val-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)C(C)C XLDYBRXERHITNH-QSFUFRPTSA-N 0.000 description 1
- OVLIFGQSBSNGHY-KKHAAJSZSA-N Val-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N)O OVLIFGQSBSNGHY-KKHAAJSZSA-N 0.000 description 1
- CPTQYHDSVGVGDZ-UKJIMTQDSA-N Val-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N CPTQYHDSVGVGDZ-UKJIMTQDSA-N 0.000 description 1
- JXGWQYWDUOWQHA-DZKIICNBSA-N Val-Gln-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N JXGWQYWDUOWQHA-DZKIICNBSA-N 0.000 description 1
- AGKDVLSDNSTLFA-UMNHJUIQSA-N Val-Gln-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N AGKDVLSDNSTLFA-UMNHJUIQSA-N 0.000 description 1
- XGJLNBNZNMVJRS-NRPADANISA-N Val-Glu-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O XGJLNBNZNMVJRS-NRPADANISA-N 0.000 description 1
- UEHRGZCNLSWGHK-DLOVCJGASA-N Val-Glu-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UEHRGZCNLSWGHK-DLOVCJGASA-N 0.000 description 1
- BEGDZYNDCNEGJZ-XVKPBYJWSA-N Val-Gly-Gln Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O BEGDZYNDCNEGJZ-XVKPBYJWSA-N 0.000 description 1
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 1
- FEFZWCSXEMVSPO-LSJOCFKGSA-N Val-His-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](C)C(O)=O FEFZWCSXEMVSPO-LSJOCFKGSA-N 0.000 description 1
- DAVNYIUELQBTAP-XUXIUFHCSA-N Val-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N DAVNYIUELQBTAP-XUXIUFHCSA-N 0.000 description 1
- QRVPEKJBBRYISE-XUXIUFHCSA-N Val-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N QRVPEKJBBRYISE-XUXIUFHCSA-N 0.000 description 1
- XPKCFQZDQGVJCX-RHYQMDGZSA-N Val-Lys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N)O XPKCFQZDQGVJCX-RHYQMDGZSA-N 0.000 description 1
- YLRAFVVWZRSZQC-DZKIICNBSA-N Val-Phe-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N YLRAFVVWZRSZQC-DZKIICNBSA-N 0.000 description 1
- VIKZGAUAKQZDOF-NRPADANISA-N Val-Ser-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O VIKZGAUAKQZDOF-NRPADANISA-N 0.000 description 1
- UQMPYVLTQCGRSK-IFFSRLJSSA-N Val-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N)O UQMPYVLTQCGRSK-IFFSRLJSSA-N 0.000 description 1
- YQYFYUSYEDNLSD-YEPSODPASA-N Val-Thr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O YQYFYUSYEDNLSD-YEPSODPASA-N 0.000 description 1
- NLNCNKIVJPEFBC-DLOVCJGASA-N Val-Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O NLNCNKIVJPEFBC-DLOVCJGASA-N 0.000 description 1
- 108020005202 Viral DNA Proteins 0.000 description 1
- 239000012190 activator Substances 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N adenyl group Chemical group N1=CN=C2N=CNC2=C1N GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- UCTWMZQNUQWSLP-UHFFFAOYSA-N adrenaline Chemical compound CNCC(O)C1=CC=C(O)C(O)=C1 UCTWMZQNUQWSLP-UHFFFAOYSA-N 0.000 description 1
- 108010011559 alanylphenylalanine Proteins 0.000 description 1
- 229950006993 alipogene tiparvovec Drugs 0.000 description 1
- 108010050122 alpha 1-Antitrypsin Proteins 0.000 description 1
- 229940024142 alpha 1-antitrypsin Drugs 0.000 description 1
- 102000006646 aminoglycoside phosphotransferase Human genes 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 206010003246 arthritis Diseases 0.000 description 1
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 1
- 108010021908 aspartyl-aspartyl-glutamyl-aspartic acid Proteins 0.000 description 1
- 108010068265 aspartyltyrosine Proteins 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 244000309464 bull Species 0.000 description 1
- 210000004899 c-terminal region Anatomy 0.000 description 1
- 230000022131 cell cycle Effects 0.000 description 1
- 230000005101 cell tropism Effects 0.000 description 1
- 108091092356 cellular DNA Proteins 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 230000002759 chromosomal effect Effects 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000011109 contamination Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 208000035250 cutaneous malignant susceptibility to 1 melanoma Diseases 0.000 description 1
- 108010016616 cysteinylglycine Proteins 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 238000009510 drug design Methods 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 210000001163 endosome Anatomy 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 230000017188 evasion or tolerance of host immune response Effects 0.000 description 1
- 230000001747 exhibiting effect Effects 0.000 description 1
- 239000013604 expression vector Substances 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 210000002950 fibroblast Anatomy 0.000 description 1
- 239000012467 final product Substances 0.000 description 1
- 102000034287 fluorescent proteins Human genes 0.000 description 1
- 108091006047 fluorescent proteins Proteins 0.000 description 1
- 235000021588 free fatty acids Nutrition 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 108010006664 gamma-glutamyl-glycyl-glycine Proteins 0.000 description 1
- IRSCQMHQWWYFCW-UHFFFAOYSA-N ganciclovir Chemical compound O=C1NC(N)=NC2=C1N=CN2COC(CO)CO IRSCQMHQWWYFCW-UHFFFAOYSA-N 0.000 description 1
- 229960002963 ganciclovir Drugs 0.000 description 1
- 230000004077 genetic alteration Effects 0.000 description 1
- 231100000118 genetic alteration Toxicity 0.000 description 1
- 230000009395 genetic defect Effects 0.000 description 1
- 125000000291 glutamic acid group Chemical group N[C@@H](CCC(O)=O)C(=O)* 0.000 description 1
- 108010079547 glutamylmethionine Proteins 0.000 description 1
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 1
- 108010001064 glycyl-glycyl-glycyl-glycine Proteins 0.000 description 1
- 108010051307 glycyl-glycyl-proline Proteins 0.000 description 1
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 1
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- 238000003306 harvesting Methods 0.000 description 1
- 230000035876 healing Effects 0.000 description 1
- 238000010438 heat treatment Methods 0.000 description 1
- 208000009429 hemophilia B Diseases 0.000 description 1
- 108010028295 histidylhistidine Proteins 0.000 description 1
- 108010018006 histidylserine Proteins 0.000 description 1
- 210000005260 human cell Anatomy 0.000 description 1
- 230000003301 hydrolyzing effect Effects 0.000 description 1
- 230000028993 immune response Effects 0.000 description 1
- 238000003018 immunoassay Methods 0.000 description 1
- 230000001771 impaired effect Effects 0.000 description 1
- 239000000411 inducer Substances 0.000 description 1
- 230000010189 intracellular transport Effects 0.000 description 1
- 108010027338 isoleucylcysteine Proteins 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 230000002147 killing effect Effects 0.000 description 1
- 238000011031 large-scale manufacturing process Methods 0.000 description 1
- 125000001909 leucine group Chemical group [H]N(*)C(C(*)=O)C([H])([H])C(C([H])([H])[H])C([H])([H])[H] 0.000 description 1
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 1
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 1
- 230000000670 limiting effect Effects 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 239000012160 loading buffer Substances 0.000 description 1
- 230000007774 longterm Effects 0.000 description 1
- 239000012139 lysis buffer Substances 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 125000001360 methionine group Chemical group N[C@@H](CCSC)C(=O)* 0.000 description 1
- 108010034507 methionyltryptophan Proteins 0.000 description 1
- 229960000485 methotrexate Drugs 0.000 description 1
- 238000002156 mixing Methods 0.000 description 1
- 238000007479 molecular analysis Methods 0.000 description 1
- 210000005087 mononuclear cell Anatomy 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 208000014500 neuronal tumor Diseases 0.000 description 1
- 230000003472 neutralizing effect Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 108010084572 phenylalanyl-valine Proteins 0.000 description 1
- 108010024607 phenylalanylalanine Proteins 0.000 description 1
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 1
- 239000008055 phosphate buffer solution Substances 0.000 description 1
- 150000003904 phospholipids Chemical class 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 108091033319 polynucleotide Proteins 0.000 description 1
- 102000040430 polynucleotide Human genes 0.000 description 1
- 239000002157 polynucleotide Substances 0.000 description 1
- 230000003389 potentiating effect Effects 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 108010090894 prolylleucine Proteins 0.000 description 1
- 125000000561 purinyl group Chemical group N1=C(N=C2N=CNC2=C1)* 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 239000000523 sample Substances 0.000 description 1
- 238000002804 saturated mutagenesis Methods 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 125000003607 serino group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C(O[H])([H])[H] 0.000 description 1
- 238000004904 shortening Methods 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 230000009870 specific binding Effects 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 239000005720 sucrose Substances 0.000 description 1
- 125000000341 threoninyl group Chemical group [H]OC([H])(C([H])([H])[H])C([H])(N([H])[H])C(*)=O 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 239000005495 thyroid hormone Substances 0.000 description 1
- 229940036555 thyroid hormone Drugs 0.000 description 1
- XUIIKFGFIJCVMT-UHFFFAOYSA-N thyroxine-binding globulin Natural products IC1=CC(CC([NH3+])C([O-])=O)=CC(I)=C1OC1=CC(I)=C(O)C(I)=C1 XUIIKFGFIJCVMT-UHFFFAOYSA-N 0.000 description 1
- 231100000331 toxic Toxicity 0.000 description 1
- 231100000167 toxic agent Toxicity 0.000 description 1
- 230000002588 toxic effect Effects 0.000 description 1
- 239000003440 toxic substance Substances 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 230000001052 transient effect Effects 0.000 description 1
- 108010029384 tryptophyl-histidine Proteins 0.000 description 1
- 108010078580 tyrosylleucine Proteins 0.000 description 1
- 238000000108 ultra-filtration Methods 0.000 description 1
- 230000024275 uncoating of virus Effects 0.000 description 1
- 241001529453 unidentified herpesvirus Species 0.000 description 1
- 238000011870 unpaired t-test Methods 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 210000003462 vein Anatomy 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/86—Viral vectors
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/005—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N7/00—Viruses; Bacteriophages; Compositions thereof; Preparation or purification thereof
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2710/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
- C12N2710/00011—Details
- C12N2710/14011—Baculoviridae
- C12N2710/14041—Use of virus, viral particle or viral elements as a vector
- C12N2710/14043—Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2750/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
- C12N2750/00011—Details
- C12N2750/14011—Parvoviridae
- C12N2750/14111—Dependovirus, e.g. adenoassociated viruses
- C12N2750/14121—Viruses as such, e.g. new isolates, mutants or their genomic sequences
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2750/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
- C12N2750/00011—Details
- C12N2750/14011—Parvoviridae
- C12N2750/14111—Dependovirus, e.g. adenoassociated viruses
- C12N2750/14122—New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2750/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
- C12N2750/00011—Details
- C12N2750/14011—Parvoviridae
- C12N2750/14111—Dependovirus, e.g. adenoassociated viruses
- C12N2750/14141—Use of virus, viral particle or viral elements as a vector
- C12N2750/14143—Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2750/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
- C12N2750/00011—Details
- C12N2750/14011—Parvoviridae
- C12N2750/14111—Dependovirus, e.g. adenoassociated viruses
- C12N2750/14151—Methods of production or purification of viral material
- C12N2750/14152—Methods of production or purification of viral material relating to complementing cells and packaging systems for producing virus or viral particles
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2799/00—Uses of viruses
- C12N2799/02—Uses of viruses as vector
- C12N2799/021—Uses of viruses as vector for the expression of a heterologous nucleic acid
- C12N2799/026—Uses of viruses as vector for the expression of a heterologous nucleic acid where the vector is derived from a baculovirus
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
- C12N2830/008—Vector systems having a special element relevant for transcription cell type or tissue specific enhancer/promoter combination
Abstract
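The present invention relates to the production of adeno-associated viral vectors in insect cells. The insect cells comprise a first nucleotide sequence encoding adeno-associated virus (AAV) capsid proteins, wherein the initiation codon for translation of the AAV VP1 capsid protein is a non-ATG, suboptimal initiation codon, and wherein a coding sequence for one or more amino acid residues, the first of which is alanine, glycine, valine, aspartic acid or glutamic acid, is inserted between the suboptimal translation initiation codon and the codon encoding the amino acid residue that corresponds to the residue at position 2 of the wild-type capsid protein amino acid sequence. The insect cell further comprises a second nucleotide sequence comprising at least one AAV inverted terminal repeat (ITR) nucleotide sequence; a third nucleotide sequence comprising a Rep52 or Rep40 coding sequence operably linked to expression control sequences for expression in an insect cell; and a fourth nucleotide sequence comprising a Rep78 or Rep68 coding sequence operably linked to expression control sequences for expression in an insect cell. The invention also relates to adeno-associated viral vectors having an altered ratio of the viral capsid proteins.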
Description
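The present invention relates to the production of adeno-associated virus in insect cells and to adeno-associated virus that provides improved infectivity.
Adeno-associated virus (AAV) may be considered one of the most promising viral vectors for human gene therapy. AAV efficiently infects dividing as well as non-dividing human cells, the AAV viral genome integrates into a single chromosomal site in the genome of the host cell, and, most importantly, even though AAV is present in many humans it has never been associated with any disease. In view of these advantages, recombinant adeno-associated virus (rAAV) is being evaluated in gene therapy clinical trials for hemophilia B, malignant melanoma, cystic fibrosis and other diseases. Numerous clinical trials and the recent approval in Europe of the first gene therapy medicine, Alipogene tiparvovec (Glybera®, uniQure), support the prospect of AAV becoming a mainstay of clinical practice.
In general, there are two main types of production system for recombinant AAV. On the one hand there are conventional production systems in mammalian cell types (such as 293 cells, COS cells, HeLa cells and KB cells); on the other hand, production systems using insect cells have been developed more recently.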
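The mammalian production system suffers from several drawbacks, the most important of which for therapeutic use is the limited number of rAAV particles generated per cell (in the order of 10⁴ particles; reviewed in Clark, 2002, Kidney Int. 61(Suppl. 1): 9-15). For a clinical study, 10¹⁵ or more rAAV particles may be required. Producing this number of rAAV particles would require transfection and culture of approximately 10¹¹ cultured human 293 cells, the equivalent of 5,000 175-cm² flasks of cells, which means transfecting up to 10¹¹ 293 cells. Large-scale production of rAAV using mammalian cell culture systems to obtain material for clinical trials is therefore already problematic, and production at commercial scale has proven not to be feasible at all. Furthermore, there is always a risk that a vector for clinical use produced in mammalian cell culture will be contaminated with undesirable, perhaps pathogenic, material present in the mammalian host cell.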
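The scale argument above is plain arithmetic and can be checked directly. The following minimal sketch uses only the orders of magnitude quoted in the text; the variable names are illustrative, and the per-flask figure is derived here rather than stated in the source.

```python
# Orders of magnitude quoted in the text (not measured values).
particles_needed = 10**15    # rAAV particles that a clinical study may require
particles_per_cell = 10**4   # approximate rAAV yield per mammalian cell
flasks = 5_000               # number of 175-cm^2 flasks mentioned in the text

# Cells required to reach the particle target at the quoted per-cell yield.
cells_needed = particles_needed // particles_per_cell

# Derived figure (an assumption of this sketch): cells each flask must hold.
cells_per_flask = cells_needed // flasks

print(f"cells needed:    {cells_needed:.0e}")
print(f"cells per flask: {cells_per_flask:.0e}")
```

Under these assumptions the cell count comes out at the 10¹¹ figure given in the text.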
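To overcome these problems of mammalian production systems, an AAV production system using insect cells has been developed (Urabe et al., 2002, Hum. Gene Ther. 13: 1935-1943; US 20030148506 and US 20040197895). For production of AAV in insect cells, some modifications were necessary in order to achieve the correct stoichiometry of the three AAV capsid proteins (VP1, VP2 and VP3). In the natural situation this stoichiometry relies on a combination of alternate usage of two splice acceptor sites and suboptimal utilisation of an ACG initiation codon for VP2, which is not accurately reproduced by insect cells. To mimic the correct stoichiometry of the capsid proteins in insect cells, Urabe et al. (2002, supra) use a construct that is transcribed into a single polycistronic messenger that is able to express all three VP proteins without requiring splicing, and in which the most upstream initiation codon is replaced by the suboptimal initiation codon ACG.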
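WO2007/046703 discloses a further improvement of the infectivity of baculovirus-produced rAAV vectors, based on optimisation of the stoichiometry of the AAV capsid proteins as produced in insect cells.
Kohlbrenner et al. (2005, Mol. Ther. 12: 1217-25) reported that the baculovirus construct for expression of the two Rep proteins, as used by Urabe et al., suffers from an inherent instability. By splitting the palindromic orientation of the two Rep genes in Urabe's original vector and designing two separate baculovirus vectors expressing Rep52 and Rep78, Kohlbrenner et al. (2005, supra) increased the passaging stability of the vector. However, despite consistent expression of Rep78 and Rep52 from the two independent baculovirus-Rep constructs in insect cells over at least 5 passages, the rAAV vector yield is 5- to 10-fold lower than with the original baculovirus-Rep construct designed by Urabe et al. (2002, supra).
WO2009/014445 provides an alternative for improving stability during baculovirus-based rAAV vector production by using repeated coding sequences with differential codon biases.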
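Urabe et al. (J. Virol., 2006, 80(4):1874-1885) report that AAV5 particles produced in the baculovirus system using ACG as the initiation codon of the VP1 capsid protein have poor infectivity and that, in contrast to AAV2 with VP1 expressed from an ACG initiation codon, mutating the +4 position in the AAV5 VP1 coding sequence to a G residue did not improve infectivity. Urabe et al. solved this problem by constructing chimeric AAV2/5 VP1 proteins in which at least 49 amino acids of AAV5 VP1 were replaced by the corresponding part of AAV2 VP1 in order to improve the infectivity of the virions. There is thus still a need in the art for AAV5 VP1 expressed from an ACG initiation codon that retains infectivity without extensive modification.
However, the present inventors have found that AAV vectors, in particular AAV5 vectors such as non-chimeric AAV5 vectors produced in the baculovirus system and modified according to Urabe (Urabe et al., 2002, Hum. Gene Ther. 13: 1935-1943), WO2007/046703 or WO2009/014445, show reduced infectivity in in vitro tests and in in vivo tests in mice when compared with, for example, the corresponding AAV vectors produced in conventional mammalian 293 cells. Hence, there is still a need for a baculovirus-based production system for rAAV vectors with improved infectivity.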
Description of the invention
Brief description of the invention
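In a first aspect, the invention relates to a nucleic acid molecule having a nucleotide sequence comprising an open reading frame, wherein the reading frame comprises or consists of, in 5' to 3' order:
(i) a first codon, which is a suboptimal translation initiation codon selected from the group consisting of CTG, ACG, TTG and GTG;
(ii) a second codon encoding an amino acid residue selected from the group consisting of alanine, glycine, valine, aspartic acid and glutamic acid;
(iii) optionally, one or more codons encoding additional amino acid residues following the second codon; and
(iv) a sequence encoding adeno-associated virus (AAV) capsid proteins, whereby the sequence lacks only the VP1 translation initiation codon.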
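The 5' to 3' arrangement of elements (i) to (iv) can be sketched as a small sequence-assembly helper. This is a minimal illustration under stated assumptions, not the inventors' cloning procedure: the function name `build_modified_orf` and the toy wild-type sequence are hypothetical, and the second-codon check relies on the fact that, in the standard genetic code, the sixteen GNN codons are exactly the codons for alanine, glycine, valine, aspartic acid and glutamic acid.

```python
# Suboptimal translation initiation codons listed in element (i).
SUBOPTIMAL_START_CODONS = {"CTG", "ACG", "TTG", "GTG"}


def build_modified_orf(wild_type_orf, start_codon="CTG",
                       second_codon="GCT", extra_codons=""):
    """Assemble elements (i)-(iv) in 5' to 3' order (hypothetical helper)."""
    wt = wild_type_orf.upper()
    if not wt.startswith("ATG"):
        raise ValueError("wild-type ORF is expected to start with ATG")
    if start_codon not in SUBOPTIMAL_START_CODONS:
        raise ValueError("first codon must be CTG, ACG, TTG or GTG")
    # Standard genetic code: GCN=Ala, GGN=Gly, GTN=Val, GAY=Asp, GAR=Glu,
    # so any valid DNA codon starting with G satisfies element (ii).
    if (len(second_codon) != 3 or second_codon[0] != "G"
            or set(second_codon) - set("ACGT")):
        raise ValueError("second codon must encode Ala, Gly, Val, Asp or Glu")
    # Element (iii): optional extra codons must keep the reading frame.
    if len(extra_codons) % 3 != 0:
        raise ValueError("additional codons must be in frame")
    # Element (iv): wild-type ORF lacking only its ATG initiation codon.
    return start_codon + second_codon + extra_codons + wt[3:]


# Toy sequence for illustration only; it is not an AAV capsid gene.
toy_wild_type = "ATGTCTTTTGTTTAA"
modified = build_modified_orf(toy_wild_type)
print(modified)
```

With the defaults, the helper reproduces the preferred CTG start codon followed by a GCT (alanine) codon; it does not check the optional extra codons for stop codons.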
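In a preferred embodiment, the AAV capsid proteins are AAV serotype 5, AAV serotype 8 or AAV serotype 9 capsid proteins, and more preferably the AAV capsid proteins have an amino acid sequence selected from the group consisting of SEQ ID NO: 22, 28, 30, 71 and 73.
Alternatively, or in combination with any previous embodiment, in a more preferred embodiment the second codon encodes alanine.
Alternatively, or in combination with any previous embodiment, in a more preferred embodiment the second codon is selected from the group consisting of GCT, GCC, GCA, GCG and GGU, preferably wherein the codon is GCT.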
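In a second aspect, the invention relates to a nucleic acid construct comprising a nucleic acid molecule according to the invention, wherein the nucleotide sequence of the reading frame encoding the adeno-associated virus (AAV) capsid proteins is operably linked to expression control sequences for expression in an insect cell.
Alternatively, or in combination with any previous embodiment, in a more preferred embodiment the nucleotide sequence of the reading frame is operably linked to a promoter selected from the group consisting of a polyhedron promoter, p10 promoter, 4xHsp27 EcRE+minimal Hsp70 promoter, deltaE1 promoter and E1 promoter. In a preferred embodiment of the invention, the construct is an insect-compatible vector, preferably a baculoviral vector.
Alternatively, or in combination with any previous embodiment, the nucleic acid molecule comprises an open reading frame selected from the group consisting of SEQ ID NO: 51, 69, 42, 47, 48 and 50, preferably the open reading frame of SEQ ID NO: 51 or SEQ ID NO: 69, more preferably that of SEQ ID NO: 51.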
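In a third aspect, the invention relates to an insect cell comprising a nucleic acid construct according to the invention.
Alternatively, or in combination with any previous embodiment, in a more preferred embodiment the insect cell further comprises (a) a second nucleotide sequence comprising at least one AAV inverted terminal repeat (ITR) nucleotide sequence; (b) a third nucleotide sequence comprising a Rep78 or Rep68 coding sequence operably linked to expression control sequences for expression in an insect cell; and (c) optionally, a fourth nucleotide sequence comprising a Rep52 or Rep40 coding sequence operably linked to expression control sequences for expression in an insect cell.
Alternatively, or in combination with any previous embodiment, in a more preferred embodiment the insect cell comprises (a) a first nucleic acid construct according to the invention, whereby the first nucleic acid construct further comprises the third and fourth nucleotide sequences as defined above; and (b) a second nucleic acid construct comprising the second nucleotide sequence as defined above, wherein the second nucleic acid construct is preferably an insect cell-compatible vector, more preferably a baculoviral vector.
Alternatively, or in combination with any previous embodiment, in a more preferred embodiment the second nucleotide sequence further comprises at least one nucleotide sequence encoding a gene product of interest (for expression in a mammalian cell), whereby the at least one nucleotide sequence encoding the gene product of interest becomes incorporated into the genome of the AAV serotype 5 produced in the insect cell.
Alternatively, or in combination with any previous embodiment, in a more preferred embodiment the second nucleotide sequence comprises two AAV ITR nucleotide sequences, and the at least one nucleotide sequence encoding the gene product of interest is located between the two AAV ITR nucleotide sequences.
Alternatively, or in combination with any previous embodiment, in a more preferred embodiment the first nucleotide sequence, the second nucleotide sequence, the third nucleotide sequence and optionally the fourth nucleotide sequence are stably integrated into the genome of the insect cell.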
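In a fourth aspect, the invention relates to an AAV virion comprising in its genome at least one nucleotide sequence encoding a gene product of interest, whereby the at least one nucleotide sequence is preferably not a native AAV nucleotide sequence, and wherein the AAV VP1 capsid protein comprises or consists of, from N terminus to C terminus:
(i) a first amino acid residue, which is encoded by a translation initiation codon, preferably by a suboptimal translation initiation codon selected from the group consisting of CTG, ACG, TTG and GTG;
(ii) a second amino acid residue selected from the group consisting of alanine, glycine, valine, aspartic acid and glutamic acid;
(iii) optionally, one or more additional amino acid residues following the second amino acid residue; and
(iv) an amino acid sequence of an AAV VP1 capsid protein, whereby the sequence lacks only the amino acid residue encoded by the VP1 translation initiation codon.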
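Preferably, an AAV virion according to the invention comprises a nucleotide sequence encoding a gene product of interest that is a Factor IX or a Factor VIII protein.
In a fifth aspect, the invention relates to a method for producing an AAV in an insect cell, comprising the steps of (a) culturing an insect cell according to the invention under conditions such that AAV is produced; and, optionally, (b) recovering the AAV.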
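Figure 1: Various mutant capsids harbouring the reporter transgene SEAP were purified and analysed on a NuPage gel. The three capsid proteins VP1 (87 kDa), VP2 (72 kDa) and VP3 (62 kDa) are shown.
Figure 2: In vitro potency assay in HeLa cells using various AAV5 capsid mutants carrying the SEAP expression cassette. Activity of the reporter gene is measured indirectly as light emission and is expressed in RLU (relative light units). NTC = negative control.
Figure 3: In vitro potency assay in Huh7 cells using various AAV5 capsid mutants carrying the SEAP expression cassette. Activity of the reporter gene is measured indirectly as light emission and is expressed in RLU (relative light units). NTC = negative control.
Figure 4: In vivo potency assay in C57BL/6 mice of various capsid mutants carrying the SEAP expression cassette. Activity of the reporter gene is measured indirectly as light emission and is expressed in RLU (relative light units).
Figure 5: In vivo potency assay in C57BL/6 mice of various AAV5 capsid mutants carrying the FIX expression cassette. FIX expression was monitored in mice after administration of two different vectors, capsid variants 160 and 765. Both capsids carry the FIX expression cassette. FIX is measured in plasma at 1, 2 and 4 weeks after injection using a specific ELISA. IU/ml denotes the international units of FIX protein found in 1 ml of plasma. PBS = phosphate-buffered saline.
Figure 6: Mutant capsids harbouring the reporter transgene SEAP were purified and analysed on a NuPage gel, showing the three capsid proteins VP1, VP2 and VP3. Three clones of construct 43 are shown.
Figure 7: In vitro potency assay in HeLa cells (A) and Huh7 cells (B) using various AAV5 capsid mutants carrying the SEAP expression cassette. Activity of the reporter gene is measured indirectly as light emission and is expressed in RLU (relative light units) (a.u.: arbitrary units).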
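Definitions
As used herein, the term "operably linked" refers to a linkage of polynucleotide (or polypeptide) elements in a functional relationship. A nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence. For instance, a transcription regulatory sequence is operably linked to a coding sequence if it affects the transcription of the coding sequence. Operably linked means that the DNA sequences being linked are typically contiguous and, where it is necessary to join two protein coding regions, contiguous and in reading frame.
"Expression control sequence" refers to a nucleic acid sequence that regulates the expression of a nucleotide sequence to which it is operably linked. An expression control sequence is "operably linked" to a nucleotide sequence when the expression control sequence controls and regulates the transcription and/or the translation of the nucleotide sequence. Thus, an expression control sequence can include promoters, enhancers, internal ribosome entry sites (IRES), transcription terminators, a start codon in front of a protein-encoding gene, splicing signals for introns, and stop codons. The term "expression control sequence" is intended to include, at a minimum, a sequence whose presence is designed to influence expression, and can also include additional advantageous components. For example, leader sequences and fusion partner sequences are expression control sequences. The term can also include the design of the nucleic acid sequence such that undesirable potential initiation codons, in and out of frame, are removed from the sequence. It can also include the design of the nucleic acid sequence such that undesirable potential splice sites are removed. It includes sequences or polyadenylation sequences (pA) which direct the addition of a polyA tail, i.e. a string of adenine residues at the 3' end of an mRNA, sequences referred to as polyA sequences. It can also be designed to enhance mRNA stability. Expression control sequences that affect transcription and translation stability, e.g. promoters, as well as sequences that affect translation, e.g. Kozak sequences, are known in insect cells. Expression control sequences can be of such a nature as to modulate the nucleotide sequence to which they are operably linked such that lower or higher expression levels are achieved.
As used herein, the term "promoter" or "transcription regulatory sequence" refers to a nucleic acid fragment that functions to control the transcription of one or more coding sequences, is located upstream, with respect to the direction of transcription, of the transcription initiation site of the coding sequence, and is structurally identified by the presence of a binding site for DNA-dependent RNA polymerase, transcription initiation sites and any other DNA sequences, including, but not limited to, transcription factor binding sites, repressor and activator protein binding sites, and any other sequences known to one of skill in the art to act directly or indirectly to regulate the amount of transcription from the promoter. A "constitutive" promoter is a promoter that is active in most tissues under most physiological and developmental conditions. An "inducible" promoter is a promoter that is physiologically or developmentally regulated, e.g. by the application of a chemical inducer. A "tissue specific" promoter is a promoter that is active only in specific types of tissues or cells.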
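The terms "substantially identical", "substantial identity", "essentially similar" or "essential similarity" mean that two peptide or two nucleotide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default parameters, share at least a certain percentage of sequence identity as defined elsewhere herein. GAP uses the Needleman and Wunsch global alignment algorithm to align two sequences over their entire length, maximising the number of matches and minimising the number of gaps. Generally, the GAP default parameters are used, with a gap creation penalty = 50 (nucleotides) / 8 (proteins) and a gap extension penalty = 3 (nucleotides) / 2 (proteins). For nucleotides the default scoring matrix used is nwsgapdna and for proteins the default scoring matrix is Blosum62 (Henikoff & Henikoff, 1992, PNAS 89, 915-919). It is clear that when RNA sequences are said to be essentially identical to, or to have a certain degree of sequence identity with, DNA sequences, thymine (T) in the DNA sequence is considered equal to uracil (U) in the RNA sequence. Sequence alignments and scores for percentage sequence identity may be determined using computer programs such as the GCG Wisconsin Package, Version 10.3 (Accelrys Inc., 9685 Scranton Road, San Diego, CA 92121-3752 USA) or the open-source software Emboss for Windows (current version 2.7.1-07). Alternatively, percent similarity or identity may be determined by searching against databases such as FASTA, BLAST, etc.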
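The idea of scoring percent identity over a global alignment can be illustrated with a compact Needleman-Wunsch implementation. This is a simplified sketch and not the GAP program: it assumes a flat match/mismatch score and a linear gap penalty (chosen here for brevity) instead of the nwsgapdna or Blosum62 matrices and the separate gap creation and extension penalties named above, and it reports identity over the full alignment length. The function names are illustrative.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of a and b with a linear gap penalty (sketch)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    ali_a, ali_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            ali_a.append(a[i - 1]); ali_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            ali_a.append(a[i - 1]); ali_b.append("-"); i -= 1
        else:
            ali_a.append("-"); ali_b.append(b[j - 1]); j -= 1
    return "".join(reversed(ali_a)), "".join(reversed(ali_b))


def percent_identity(seq1, seq2):
    """Percent identical columns over the whole global alignment."""
    if not seq1 or not seq2:
        raise ValueError("both sequences must be non-empty")
    # Treat uracil (U) in RNA as equal to thymine (T) in DNA, as in the text.
    a = seq1.upper().replace("U", "T")
    b = seq2.upper().replace("U", "T")
    ali_a, ali_b = needleman_wunsch(a, b)
    matches = sum(1 for x, y in zip(ali_a, ali_b) if x == y and x != "-")
    return 100.0 * matches / len(ali_a)


print(percent_identity("ACGU", "ACGT"))
print(percent_identity("ACGT", "AGT"))
```

Because the scoring scheme differs from the GAP defaults, the percentages from this sketch are not expected to match GAP or EMBOSS output for real sequences.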
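Nucleotide sequences encoding parvoviral Rep proteins of the invention may also be defined by their capability to hybridise with the nucleotide sequence of SEQ ID NO: 1 under moderate, or preferably under stringent, hybridisation conditions. Stringent hybridisation conditions are herein defined as conditions that allow a nucleic acid sequence of at least about 25, preferably about 50, 75 or 100, and most preferably about 200 or more nucleotides to hybridise at a temperature of about 65°C in a solution comprising about 1 M salt, preferably 6 x SSC, or any other solution having a comparable ionic strength, with washing at 65°C in a solution comprising about 0.1 M salt or less, preferably 0.2 x SSC, or any other solution having a comparable ionic strength. Preferably, the hybridisation is performed overnight, i.e. for at least 10 hours, and preferably the washing is performed for at least one hour with at least two changes of the washing solution. These conditions will usually allow the specific hybridisation of sequences having about 90% or more sequence identity.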
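Moderate conditions are herein defined as conditions that allow a nucleic acid sequence of at least 50 nucleotides, preferably about 200 or more nucleotides, to hybridise at a temperature of about 45°C in a solution comprising about 1 M salt, preferably 6 x SSC, or any other solution having a comparable ionic strength, with washing at room temperature in a solution comprising about 1 M salt, preferably 6 x SSC, or any other solution having a comparable ionic strength. Preferably, the hybridisation is performed overnight, i.e. for at least 10 hours, and preferably the washing is performed for at least one hour with at least two changes of the washing solution. These conditions will usually allow the specific hybridisation of sequences having about 50% or more sequence identity. The person skilled in the art will be able to modify these hybridisation conditions in order to specifically identify sequences varying in identity between 50% and 90%.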
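The salt figures quoted for the hybridisation and wash solutions can be cross-checked against the usual composition of SSC buffer. The sketch below assumes the standard recipe in which 1 x SSC contains 0.15 M sodium chloride and 0.015 M sodium citrate (that is, 20 x SSC is 3 M NaCl and 0.3 M sodium citrate); this recipe is general laboratory knowledge rather than something stated in the text, and the function name is illustrative.

```python
# Assumed standard recipe: molarity of each component in 1 x SSC.
NACL_M_PER_1X = 0.15
CITRATE_M_PER_1X = 0.015


def ssc_molarity(strength):
    """Return NaCl and sodium citrate molarity for a given SSC strength."""
    return {
        "NaCl_M": round(strength * NACL_M_PER_1X, 4),
        "citrate_M": round(strength * CITRATE_M_PER_1X, 4),
    }


# 6 x SSC is used for hybridisation; 0.2 x SSC for the stringent wash.
hybridisation = ssc_molarity(6)
stringent_wash = ssc_molarity(0.2)
print("6 x SSC:  ", hybridisation)
print("0.2 x SSC:", stringent_wash)
```

Under this assumption 6 x SSC is in line with the "about 1 M salt" figure and 0.2 x SSC falls below the "about 0.1 M salt or less" figure.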
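Detailed description of the invention
The present invention relates to the use of animal parvoviruses, in particular dependoviruses such as infectious human or simian AAV, and components thereof (e.g. an animal parvovirus genome) as vectors for the introduction and/or expression of nucleic acids in mammalian cells. More specifically, the invention relates to improving the infectivity of such parvoviral vectors when produced in insect cells.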
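Viruses of the Parvoviridae family are small DNA animal viruses. The family Parvoviridae may be divided into two subfamilies: the Parvovirinae, which infect vertebrates, and the Densovirinae, which infect insects. Members of the subfamily Parvovirinae are herein referred to as parvoviruses and include the genus Dependovirus. As may be deduced from the name of their genus, members of the Dependovirus genus are unique in that they usually require coinfection with a helper virus such as adenovirus or herpes virus for productive infection in cell culture. The genus Dependovirus includes AAV, which normally infects humans (e.g. serotypes 1, 2, 3A, 3B, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13) or primates (e.g. serotypes 1 and 4), and related viruses that infect other warm-blooded animals (e.g. bovine, canine, equine and ovine adeno-associated viruses). Further information on parvoviruses and other members of the Parvoviridae is given in Kenneth I. Berns, "Parvoviridae: The Viruses and Their Replication," Chapter 69 in Fields Virology (3d Ed. 1996). For convenience, the present invention is further exemplified and described herein by reference to AAV. It is understood, however, that the invention is not limited to AAV and can equally be applied to other parvoviruses.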
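The genomic organisation of all known AAV serotypes is very similar. The genome of AAV is a linear, single-stranded DNA molecule that is less than about 5,000 nucleotides (nt) in length. Inverted terminal repeats (ITRs) flank the unique coding nucleotide sequences for the non-structural replication (Rep) proteins and the structural (VP) proteins. The VP proteins (VP1, -2 and -3) form the capsid. The terminal 145 nt are self-complementary and are organised so that an energetically stable intramolecular duplex forming a T-shaped hairpin may be formed. These hairpin structures function as an origin for viral DNA replication, serving as primers for the cellular DNA polymerase complex. Following wtAAV infection in mammalian cells, the Rep genes (i.e. Rep78 and Rep52) are expressed from the P5 promoter and the P19 promoter, respectively, and both Rep proteins have a function in the replication of the viral genome. A splicing event in the Rep ORF results in the expression of actually four Rep proteins (i.e. Rep78, Rep68, Rep52 and Rep40). However, it has been shown that the unspliced mRNA, encoding the Rep78 and Rep52 proteins, is sufficient for AAV vector production in mammalian cells. In insect cells, too, the Rep78 and Rep52 proteins suffice for AAV production. The three capsid proteins, VP1, VP2 and VP3, are expressed from a single VP reading frame from the p40 promoter. wtAAV infection in mammalian cells relies, for capsid protein production, on a combination of alternate usage of two splice acceptor sites and suboptimal utilisation of an ACG initiation codon for VP2. This is not accurately reproduced in insect cells, however, so that additional features are required to obtain the correct stoichiometry of the AAV capsid proteins.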
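In a first aspect, the invention relates to a nucleic acid molecule having a nucleotide sequence comprising an open reading frame encoding adeno-associated virus (AAV) capsid proteins. Preferably, the reading frame encoding the capsid proteins is modified, as compared with the wild-type open reading frame encoding the AAV capsid proteins, by at least: (i) replacement of the ATG initiation codon by a suboptimal translation initiation codon selected from the group consisting of CTG, ACG, TTG and GTG; and (ii) insertion of codons for one or more amino acid residues between the suboptimal translation initiation codon and the codon encoding the amino acid residue at position 2 of the capsid protein amino acid sequence, preferably the codon encoding the amino acid residue corresponding to the amino acid residue at position 2 of the wild-type capsid protein amino acid sequence. Position 2 of the (wild-type) capsid protein amino acid sequence is preferably understood to refer to position 2 of the amino acid sequence of the (wild-type) AAV VP1 capsid protein. Preferably, the suboptimal translation initiation codon is immediately followed at its 3' end by a codon for an amino acid residue selected from the group consisting of alanine, glycine, valine, aspartic acid and glutamic acid.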
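Alternatively, in this aspect the invention relates to a nucleic acid molecule having a nucleotide sequence comprising an open reading frame, wherein the reading frame comprises or consists of, in 5' to 3' order:
(i) a first codon, which is a suboptimal translation initiation codon selected from the group consisting of CTG, ACG, TTG and GTG;
(ii) a second codon encoding an amino acid residue selected from the group consisting of alanine, glycine, valine, aspartic acid and glutamic acid;
(iii) optionally, one or more codons for additional amino acid residues following the second codon; and
(iv) a sequence encoding AAV capsid proteins, whereby the sequence lacks a VP1 translation initiation codon, and preferably lacks only the VP1 translation initiation codon.
Thus, in (iv), the sequence preferably comprises or consists of the remainder of the open reading frame encoding the AAV capsid proteins, whereby the remainder starts at the position corresponding to the second amino acid position in the wild-type open reading frame encoding the capsid proteins.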
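A nucleic acid molecule having a nucleotide sequence comprising an open reading frame encoding adeno-associated virus (AAV) capsid proteins is herein understood to comprise a nucleotide sequence encoding the VP1, VP2 and VP3 capsid proteins of an animal parvovirus, preferably encoding all three capsid proteins.
The phrases "starting with a suboptimal translation initiation codon selected from the group consisting of CTG, ACG, TTG and GTG" and "a first codon, which is a suboptimal translation initiation codon selected from the group consisting of CTG, ACG, TTG and GTG" are herein understood to mean that the initiation codon of the open reading frame encoding the adeno-associated virus (AAV) capsid proteins, at the position encoding the amino terminus of the VP1 capsid protein, is a suboptimal translation initiation codon selected from the group consisting of CTG, ACG, TTG and GTG.
Suboptimal is herein understood to mean that the codon is less efficient in the initiation of translation, in an otherwise identical context, than the normal ATG codon. Preferably the initiation codon for translation of the AAV VP1 capsid protein is selected from ACG, TTG, GTG and CTG, more preferably it is selected from CTG and ACG, and most preferably it is CTG. The animal parvovirus is preferably a dependovirus, more preferably a human or simian adeno-associated virus (AAV).
In a particularly preferred embodiment, the suboptimal initiation codon of VP1 is CTG and one additional codon, coding for alanine, is introduced immediately adjacent to the suboptimal initiation codon at its 3' end. Preferably the capsid proteins are AAV5 capsid proteins. This results in improved potency of the AAV5 virions. The term "potency" is herein used to mean the ability of a vector to drive expression of its genetic material.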
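The open reading frame further comprises a second codon encoding an amino acid residue selected from the group consisting of alanine, glycine, valine, aspartic acid and glutamic acid, preferably encoding alanine. More preferably, the second codon is selected from the group consisting of GCT, GCC, GCA, GCG and GGU, preferably wherein the codon is GCT. The open reading frame optionally further comprises, following the second codon, one or more codons encoding additional amino acid residues, for example codons for 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 additional amino acid residues, preferably for fewer than 60, 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15 or 14 additional amino acid residues. As will be readily understood, the codons encoding the additional amino acid residues are in frame with the open reading frame of the capsid proteins.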
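In one embodiment, the open reading frame encoding the capsid proteins further comprises, when compared with the wild-type capsid protein open reading frame, codons encoding one or more amino acid residues inserted between the suboptimal translation initiation codon of VP1 and the codon that, in the corresponding wild-type capsid protein, encodes the amino acid residue immediately adjacent to the initiation codon at its 3' end. For example, the open reading frame comprises codons for 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 additional amino acid residues compared with the corresponding wild-type capsid protein. Preferably, the open reading frame comprises codons for fewer than 60, 50, 40, 35, 30, 25, 20, 19, 18, 17, 16, 15 or 14 additional amino acid residues compared with the corresponding wild-type capsid protein. As will be readily understood, the codons encoding the additional amino acid residues are in frame with the open reading frame of the capsid proteins. Of these codons encoding additional amino acid residues, the first codon, i.e. the codon immediately adjacent to the suboptimal translation initiation codon at its 3' end, encodes an amino acid residue selected from the group consisting of alanine, glycine, valine, aspartic acid and glutamic acid. Thus, where there is only one additional codon between the translation initiation codon and the codon encoding the amino acid residue corresponding to residue 2 of the wild-type sequence, that additional codon encodes an amino acid residue selected from the group consisting of alanine, glycine, valine, aspartic acid and glutamic acid. Where there is more than one additional codon between the translation initiation codon and the codon encoding the amino acid residue corresponding to residue 2 of the wild-type sequence, the codon immediately following the translation initiation codon encodes an amino acid residue selected from the group consisting of alanine, glycine, valine, aspartic acid and glutamic acid. Preferably, the additional amino acid residue immediately following the suboptimal translation initiation codon (i.e. at its 3' end) is alanine, glycine or valine, more preferably alanine. That is, in a preferred embodiment of the invention, the codon immediately following the suboptimal translation initiation codon encodes alanine.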
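In a preferred embodiment of the invention, the codon immediately following the suboptimal translation initiation codon, i.e. the second codon, is selected from the group consisting of GCT, GCC, GCA, GCG, GGU, GGC, GGA, GGG, GUU, GUC, GUA, GUG, GAU, GAC, GAA and GAG, preferably from the group consisting of GCT, GCC, GCA, GCG and GGU, and more preferably the codon is GCT.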
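The relationship between the sixteen listed codons and the five amino acid residues can be made explicit with a lookup based on the standard genetic code. The sketch below is illustrative only: the helper name `encoded_residue` is hypothetical, and the codons are copied as written in the text, which mixes DNA (T) and RNA (U) notation, so U is normalised to T before lookup.

```python
# Standard genetic code entries for the GNN codons (DNA notation).
GNN_CODON_TO_RESIDUE = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
    "GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu",
}

# Second-codon candidates exactly as listed in the text.
LISTED_CODONS = ["GCT", "GCC", "GCA", "GCG", "GGU", "GGC", "GGA", "GGG",
                 "GUU", "GUC", "GUA", "GUG", "GAU", "GAC", "GAA", "GAG"]


def encoded_residue(codon):
    """Return the residue encoded by a GNN codon, accepting T or U."""
    return GNN_CODON_TO_RESIDUE[codon.upper().replace("U", "T")]


# Group the listed codons by the residue they encode.
by_residue = {}
for codon in LISTED_CODONS:
    by_residue.setdefault(encoded_residue(codon), []).append(codon)

for residue, codons in by_residue.items():
    print(residue, codons)
```

Grouping the codons this way makes it easy to see which residue each listed codon corresponds to under the standard code, including the preferred GCT codon for alanine.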
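The sequence encoding AAV capsid proteins in (iv) can be a capsid sequence as found in nature, such as that of AAV1 to AAV13, the nucleotide and amino acid sequences of which are shown in SEQ ID NO: 13 - 38 and SEQ ID NO: 70 - 73. Hence, the sequence encoding AAV capsid proteins in (iv) can be, for example, a capsid sequence selected from the group consisting of AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12 and AAV13. Alternatively, the sequence can be artificial; for example, the sequence may be a hybrid form or may be codon optimised, such as by the codon usage of AcMNPV or Spodoptera frugiperda. For example, the capsid sequence may be composed of the VP2 and VP3 sequences of AAV1, whereas the remainder of the VP1 sequence is of AAV5. Preferred capsid proteins are AAV5, preferably as provided in SEQ ID NO: 22, AAV8, preferably as provided in SEQ ID NO: 28, or AAV9, preferably as provided in SEQ ID NO: 30, SEQ ID NO: 71 or SEQ ID NO: 73. Thus, in a preferred embodiment, the AAV capsid proteins are AAV serotype 5, AAV serotype 8 or AAV serotype 9 capsid proteins that have been modified according to the invention. Where the capsid protein is AAV9, it preferably has a sequence as disclosed in, for example, WO 03/052052 or WO 05/033321, or a sequence as provided in SEQ ID NO: 29, 30, 70, 71, 72, 73 or 74. More preferably, where the capsid protein is AAV9, it has the sequence provided in SEQ ID NO: 72 and 73. More preferably, the AAV capsid proteins are AAV serotype 5 capsid proteins that have been modified according to the invention. It is understood that the exact molecular weights of the capsid proteins, as well as the exact positions of the translation initiation codons, may differ between different parvoviruses. However, the skilled person will know how to identify the corresponding position in the nucleotide sequence of parvoviruses other than AAV-5. Alternatively, the sequence encoding AAV capsid proteins is an artificial sequence, such as one obtained as a result of directed evolution experiments. These include the generation of capsid libraries via DNA shuffling, error-prone PCR, bioinformatic rational design and site-saturated mutagenesis. The resulting capsids are based on existing serotypes but contain various amino acid or nucleotide changes that improve the features of such capsids. The resulting capsids can be a combination of various parts of existing serotypes, "shuffled capsids", or can contain completely novel changes, i.e. additions, deletions or substitutions of one or more amino acids or nucleotides, organised in groups or spread over the whole length of the gene or protein. See, for example, Schaffer and Maheshri; Proceedings of the 26th Annual International Conference of the IEEE EMBS San Francisco, CA, USA; September 1-5, 2004, pages 3520-3523; Asuri et al. (2012) Molecular Therapy 20(2):329-338; Lisowski et al. (2014) Nature 506(7488):382-386 (incorporated herein by reference).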
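In a preferred embodiment of the invention, the open reading frame encoding the VP3 capsid protein starts with a non-canonical translation initiation codon selected from the group consisting of ACG, ATT, ATA, AGA, AGG, AAA, CTG, CTT, CTC, CTA, CGA, CGC, TTG, TAG and GTG. Preferably the non-canonical translation initiation codon is selected from the group consisting of GTG, CTG, ACG and TTG, and more preferably the non-canonical translation initiation codon is CTG.
A preferred nucleotide sequence of the invention for the expression of the AAV capsid proteins is a nucleotide sequence comprising an expression control sequence that comprises a VP2 initiator context. A VP2 initiator context is herein understood to mean a number of nucleotides preceding the non-canonical translation initiation start of VP2. In a preferred embodiment, the VP2 initiator context is the nine-nucleotide sequence of SEQ ID NO: 3, or a nucleotide sequence substantially homologous to SEQ ID NO: 3, located upstream of the suboptimal translation initiation codon of the nucleotide sequence encoding the AAV VP1 capsid protein, preferably immediately upstream of it, i.e. directly adjacent to the suboptimal translation initiation codon at its 5' end. A sequence with substantial identity to the nucleotide sequence of SEQ ID NO: 3 that helps to increase expression of VP1 is, for example, a sequence having at least 60%, 70%, 80% or 90% identity, preferably 100% identity, to the nine-nucleotide sequence of SEQ ID NO: 3.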
A further preferred nucleotide sequence of the invention for the expression of the AAV capsid proteins is a nucleotide sequence comprising an expression control sequence that comprises a Kozak consensus sequence around the initiation codon of the nucleotide sequence encoding the AAV VP1 capsid protein. The Kozak consensus sequence is herein defined as GCCRCC(NNN)G (SEQ ID NO: 4), wherein R is a purine (i.e. A or G) and wherein (NNN) stands for any of the suboptimal initiation codons as defined herein above. Preferably, in the Kozak consensus sequence in the nucleotide sequence of the invention, R is a G. The nucleotide sequence of the invention for the expression of the AAV capsid proteins comprising a Kozak consensus sequence is thus preferably selected from GCCACC(ACG)G (SEQ ID NO: 5), GCCGCC(ACG)G (SEQ ID NO: 6), GCCACC(TTG)G (SEQ ID NO: 7), GCCGCC(TTG)G (SEQ ID NO: 8), GCCACC(GTG)G (SEQ ID NO: 9), GCCGCC(GTG)G (SEQ ID NO: 10), GCCACC(CTG)G (SEQ ID NO: 11) and GCCGCC(CTG)G (SEQ ID NO: 12); more preferably, the nucleotide sequence comprising the Kozak consensus sequence is selected from GCCACC(CTG)G (SEQ ID NO: 11) and GCCGCC(CTG)G (SEQ ID NO: 12); most preferably, it is GCCGCC(CTG)G (SEQ ID NO: 12). The nucleotides in brackets indicate the position of the initiation codon of the VP1 protein.
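The eight Kozak variants follow mechanically from the consensus pattern GCCRCC(NNN)G by combining the two purines with the four suboptimal initiation codons. The short sketch below enumerates them in the order used in the text; the function name and the pairing with SEQ ID numbers by simple counting from 5 are conveniences of this illustration.

```python
# Suboptimal initiation codons in the order they appear in the listing.
START_CODONS = ["ACG", "TTG", "GTG", "CTG"]
PURINES = "AG"  # R in the consensus GCCRCC(NNN)G is A or G


def kozak_variants():
    """Enumerate GCCRCC(NNN)G for every purine and suboptimal start codon."""
    return [f"GCC{r}CC({codon})G" for codon in START_CODONS for r in PURINES]


# Pair each variant with its SEQ ID NO as numbered in the text (5 to 12).
for seq_id, variant in enumerate(kozak_variants(), start=5):
    print(f"SEQ ID NO: {seq_id}  {variant}")
```

The brackets are kept only to mark the position of the initiation codon, as in the text; they are not part of the nucleotide sequence.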
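The nucleotide sequence of the invention for the expression of the AAV capsid proteins further preferably comprises at least one modification of the nucleotide sequence encoding the AAV VP1 capsid protein selected from a G at nucleotide position 12, an A at nucleotide position 21 and a C at nucleotide position 24, wherein the nucleotide positions correspond to the nucleotide positions of the wild-type nucleotide sequence, for example as shown in SEQ ID NO: 21. A "potential/possible false start site" or "potential/possible false translation initiation codon" is herein understood to mean an in-frame ATG codon located in the coding sequence of the capsid protein(s). Elimination of possible false start sites for translation of VP1 of other serotypes will be well understood by the person skilled in the art, as will elimination of putative splice sites that may be recognised in insect cells. For example, modification of the nucleotide at position 12 is not required for recombinant AAV5, since the nucleotide T does not give rise to a false ATG codon. For example, further modifications of the nucleotide sequence for AAV5 may be as shown in SEQ ID NO: 39. The various modifications of the wild-type AAV sequences for proper expression in insect cells are achieved by application of well-known genetic engineering techniques, as described, for example, in Sambrook and Russell (2001) "Molecular Cloning: A Laboratory Manual (3rd edition)" (Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, New York). Various further modifications of the VP coding regions are known to the skilled person, which may increase the yield of VP and virion or have other desired effects, such as altered tropism or reduced antigenicity of the virion. Such modifications are within the scope of the present invention.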
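Under the definition given above, a potential false start site is an in-frame ATG codon inside the capsid coding sequence, so candidates can be located with a simple codon-by-codon scan. The sketch below is an illustration only: the function name is hypothetical, the example sequences are toy strings rather than AAV sequences, and positions are reported 1-based, counting from the first nucleotide of the reading frame.

```python
def in_frame_atg_positions(coding_sequence, skip_first_codon=True):
    """Return 1-based positions of in-frame ATG codons in a coding sequence.

    The first codon is skipped by default because it is the intended
    (suboptimal) initiation codon rather than a false start site.
    """
    seq = coding_sequence.upper()
    first = 3 if skip_first_codon else 0
    return [i + 1
            for i in range(first, len(seq) - 2, 3)
            if seq[i:i + 3] == "ATG"]


# Toy examples (not AAV sequences).
print(in_frame_atg_positions("CTGGCTATGTTTATGA"))  # ATG at codons 3 and 5
print(in_frame_atg_positions("CTGGATGTT"))         # ATG present but out of frame
```

A scan of this kind only flags candidates; whether a flagged codon matters in practice, and how to remove it without changing the encoded protein, is a design decision outside the scope of the sketch.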
In a preferred embodiment, the nucleic acid molecule according to the invention comprises or consists of an open reading frame selected from the group consisting of SEQ ID NO: 51, 69, 41, 42, 43, 44, 45, 46, 47, 48, 50 and 52; more preferably it comprises or consists of an open reading frame selected from the group consisting of SEQ ID NO: 51, 69, 42, 43, 47, 48 and 50; more preferably still it comprises or consists of SEQ ID NO: 69 or 51; and most preferably it comprises or consists of SEQ ID NO: 51.
Preferably the nucleotide sequence of the invention encoding the AAV capsid proteins is operably linked to expression control sequences for expression in an insect cell. Thus, in a second aspect, the invention relates to a nucleic acid construct comprising a nucleic acid molecule according to the invention, wherein the nucleotide sequence of the open reading frame encoding the adeno-associated virus (AAV) capsid proteins is operably linked to expression control sequences for expression in an insect cell. These expression control sequences will at least include a promoter that is active in insect cells. Techniques known to the person skilled in the art for expressing foreign genes in insect host cells can be used to practise the invention. Methods for molecular engineering and expression of polypeptides in insect cells are described, for example, in Summers and Smith. 1986. A Manual of Methods for Baculovirus Vectors and Insect Culture Procedures, Texas Agricultural Experimental Station Bull. No. 7555, College Station, Tex.; Luckow. 1991. In Prokop et al., Cloning and Expression of Heterologous Genes in Insect Cells with Baculovirus Vectors' Recombinant DNA Technology and Applications, 97-152; King, L. A. and R. D. Possee, 1992, The baculovirus expression system, Chapman and Hall, United Kingdom; O'Reilly, D. R., L. K. Miller, V. A. Luckow, 1992, Baculovirus Expression Vectors: A Laboratory Manual, New York; W. H. Freeman and Richardson, C. D., 1995, Baculovirus Expression Protocols, Methods in Molecular Biology, volume 39; US 4,745,051; US2003148506; and WO 03/074714. A particularly suitable promoter for transcription of the nucleotide sequence of the invention encoding the AAV capsid proteins is, for example, the polyhedron (polH) promoter, such as the polH promoter provided in SEQ ID NO: 53 and the short polH promoter provided in SEQ ID NO: 54. However, other promoters that are active in insect cells are known in the art, for example a polyhedrin (polH) promoter, p10 promoter, p35 promoter, 4xHsp27 EcRE+minimal Hsp70 promoter, deltaE1 promoter, E1 promoter or IE-1 promoter, and further promoters described in the above references.
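Preferably the nucleic acid construct for expression of the AAV capsid proteins in insect cells is an insect cell-compatible vector. An "insect cell-compatible vector" or "vector" is understood to be a nucleic acid molecule capable of productive transformation or transfection of an insect or insect cell. Exemplary biological vectors include plasmids, linear nucleic acid molecules and recombinant viruses. Any vector can be employed as long as it is insect cell-compatible. The vector may integrate into the genome of the insect cell, but the presence of the vector in the insect cell need not be permanent, and transient episomal vectors are also included. The vector can be introduced by any known means, for example by chemical treatment of the cells, electroporation or infection. In a preferred embodiment, the vector is a baculovirus, a viral vector or a plasmid. In a more preferred embodiment, the vector is a baculovirus, i.e. the construct is a baculoviral vector. Baculoviral vectors and methods for their use are described in the references cited above on molecular engineering of insect cells.
In a preferred embodiment, the nucleic acid molecule comprised in the nucleic acid construct according to the invention comprises or consists of an open reading frame selected from the group consisting of SEQ ID NO: 51, 69, 42, 43, 47, 48 and 50; more preferably it comprises or consists of SEQ ID NO: 51 or SEQ ID NO: 69; most preferably it comprises or consists of SEQ ID NO: 51.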
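In a third aspect, the invention relates to an insect cell comprising a nucleic acid construct of the invention as defined above. Any insect cell that allows replication of AAV and that can be maintained in culture can be used in accordance with the present invention. For example, the cell line used can be from Spodoptera frugiperda, a Drosophila cell line, or a mosquito cell line, e.g. a cell line derived from Aedes albopictus. Preferred insect cells or cell lines are cells from insect species that are susceptible to baculovirus infection, including for example expresSF+®, Drosophila Schneider 2 (S2) Cells, Se301, SeIZD2109, SeUCR1, Sf9, Sf900+, Sf21, BTI-TN-5B1-4, MG-1, Tn368, HzAm1, Ha2302, Hz2E5 and High Five (Invitrogen).
A preferred insect cell according to the invention further comprises (a) a second nucleotide sequence comprising at least one AAV inverted terminal repeat (ITR) nucleotide sequence; (b) a third nucleotide sequence comprising a Rep52 or Rep40 coding sequence operably linked to expression control sequences for expression in an insect cell; and (c) a fourth nucleotide sequence comprising a Rep78 or Rep68 coding sequence operably linked to expression control sequences for expression in an insect cell.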
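In the context of the invention, "at least one AAV ITR nucleotide sequence" is understood to mean a palindromic sequence comprising mostly complementary, symmetrically arranged sequences also referred to as "A", "B" and "C" regions. The ITR functions as an origin of replication, a site having a "cis" role in replication, i.e. being a recognition site for trans-acting replication proteins (e.g. Rep78 or Rep68) that recognize the palindrome and specific sequences internal to the palindrome. One exception to the symmetry of the ITR sequence is the "D" region of the ITR. It is unique (it has no complement within one ITR). Nicking of single-stranded DNA occurs at the junction between the A and D regions. It is the region where new DNA synthesis initiates. The D region normally sits to one side of the palindrome and provides directionality to the nucleic acid replication step. An AAV replicating in a mammalian cell typically has two ITR sequences. It is, however, possible to engineer an ITR so that binding sites are present on both strands of the A regions and D regions are located symmetrically, one on each side of the palindrome. On a double-stranded circular DNA template (e.g. a plasmid), the Rep78- or Rep68-assisted nucleic acid replication then proceeds in both directions, and a single ITR suffices for AAV replication of a circular vector. Thus, one ITR nucleotide sequence can be used in the context of the invention. Preferably, however, two or more regular ITRs are used. Most preferably, two ITR sequences are used. For the safety of viral vectors, it may be desirable to construct a viral vector that is unable to propagate further after initial introduction into a cell. Such a safety mechanism for limiting undesirable vector propagation in a recipient may be provided by using rAAV with a chimeric ITR as described in US2003148506. In a preferred embodiment, the nucleotide sequence encoding the parvoviral VP1, VP2 and VP3 capsid proteins comprises at least one in-frame insertion of a sequence coding for an immune evasion repeat, as described in WO 2009/154452. This results in the formation of so-called self-complementary or monomeric duplex parvoviral virions, which have the advantage of showing a reduced immune response. In a preferred embodiment, the sequence encoding the parvoviral VP1, VP2 and VP3 capsid proteins comprises a monomeric duplex or self-complementary genome. For the preparation of monomeric duplex AAV vectors, the AAV Rep proteins and AAV capsid proteins are expressed in an insect cell according to the invention in the presence of a vector genome comprising at least one AAV ITR, wherein Rep52 and/or Rep40 protein expression is increased relative to Rep78 and/or Rep68 protein expression. Monomeric duplex AAV vectors can also be prepared by expressing the AAV Rep proteins and AAV Cap proteins in insect cells in the presence of a vector genome construct flanked by at least one AAV ITR, wherein the nicking activity of Rep78 and/or Rep68 is reduced relative to the helicase/encapsidation activity of Rep52 and/or Rep40, as described for example in WO 2011/122950.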
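The number of vectors or nucleic acid constructs employed is not limiting in the invention. For example, one, two, three, four, five, six or more vectors can be used to produce AAV in insect cells in accordance with the invention. If six vectors are employed, one vector encodes AAV VP1, another AAV VP2, yet another AAV VP3, still another Rep52 or Rep40, while Rep78 or Rep68 is encoded by a further vector, and a final vector comprises at least one AAV ITR. Additional vectors may be employed to express, for example, Rep52 and Rep40, and Rep78 and Rep68. If fewer than six vectors are used, the vectors can comprise various combinations of the at least one AAV ITR and the VP1, VP2, VP3, Rep52/Rep40, and Rep78/Rep68 coding sequences. Preferably, two or three vectors are used, with two vectors being more preferred, as described above. If two vectors are used, preferably the insect cell comprises: (a) a first nucleic acid construct for expression of the AAV capsid proteins as defined above, which construct further comprises the third and fourth nucleotide sequences as defined in (b) and (c) above, the third nucleotide sequence comprising a Rep52 or Rep40 coding sequence operably linked to at least one expression control sequence for expression in an insect cell, and the fourth nucleotide sequence comprising a Rep78 or Rep68 coding sequence operably linked to at least one expression control sequence for expression in an insect cell; and (b) a second nucleic acid construct comprising the second nucleotide sequence as defined in (a) above, comprising at least one AAV ITR nucleotide sequence. If three vectors are used, preferably the same configuration as for two vectors is used, except that separate vectors are used for expression of the capsid proteins and for expression of the Rep52, Rep40, Rep78 and Rep68 proteins. The sequences on each vector can be in any order relative to each other. For example, if one vector comprises ITRs and an ORF comprising nucleotide sequences encoding VP capsid proteins, the VP ORF can be located on the vector such that, upon replication of the DNA between the ITR sequences, the VP ORF is replicated or not replicated. As another example, the Rep coding sequences and/or the ORF comprising nucleotide sequences encoding VP capsid proteins can be in any order on a vector. The second, third and further nucleic acid construct(s) are also preferably insect cell-compatible vectors, preferably baculoviral vectors as defined above. Alternatively, in the insect cell of the invention, one or more of the first nucleotide sequence, second nucleotide sequence, third nucleotide sequence, and fourth nucleotide sequence, and optional further nucleotide sequences, may be stably integrated in the genome of the insect cell. One of ordinary skill in the art knows how to stably introduce a nucleotide sequence into the insect genome and how to identify a cell having such a nucleotide sequence in its genome. Incorporation into the genome may be aided by, for example, the use of a vector comprising nucleotide sequences highly homologous to regions of the insect genome. The use of specific sequences, such as transposons, is another way to introduce a nucleotide sequence into a genome.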
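Thus, in a preferred embodiment, an insect cell according to the invention comprises: (a) a first nucleic acid construct according to the invention, whereby the first nucleic acid construct further comprises the third and fourth nucleotide sequences as defined above; and (b) a second nucleic acid construct comprising the second nucleotide sequence as defined above, wherein the second nucleic acid construct preferably is an insect cell-compatible vector, more preferably a baculoviral vector.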
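In a preferred embodiment of the invention, the second nucleotide sequence present in the insect cells of the invention, i.e. the sequence comprising at least one AAV ITR, further comprises at least one nucleotide sequence encoding a gene product of interest (preferably for expression in a mammalian cell), whereby preferably the at least one nucleotide sequence encoding the gene product of interest becomes incorporated into the genome of an AAV produced in the insect cell. Preferably, the at least one nucleotide sequence encoding a gene product of interest is a sequence for expression in a mammalian cell. Preferably, the second nucleotide sequence comprises two AAV ITR nucleotide sequences, and the at least one nucleotide sequence encoding a gene product of interest is located between the two AAV ITR nucleotide sequences. Preferably, the nucleotide sequence encoding a gene product of interest (for expression in the mammalian cell) will be incorporated into the AAV genome produced in the insect cell if it is located between two regular ITRs, or on either side of an ITR engineered with two D regions. Thus, in a preferred embodiment, the invention provides an insect cell according to the invention, wherein the second nucleotide sequence comprises two AAV ITR nucleotide sequences and wherein the at least one nucleotide sequence encoding a gene product of interest is located between the two AAV ITR nucleotide sequences.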
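Typically, the gene product of interest, including ITRs, is 5,000 nucleotides (nt) or less in length. In another embodiment, an oversized DNA, i.e. a DNA more than 5,000 nt in length, can be expressed in vitro or in vivo by using the AAV vector described by the present invention. An oversized DNA is here understood to be a DNA exceeding the maximum AAV packaging limit of 5 kbp. Therefore, the generation of AAV vectors that are able to produce recombinant proteins that are usually encoded by genomes larger than 5.0 kb is also feasible. For instance, the present inventors have generated rAAV5 vectors containing fragments of hFVIII that are partially, unidirectionally packaged in insect cells. A vector genome with a total size of at least 5.6 kb was packaged into two populations of FVIII fragment-containing AAV5 particles. These variant AAV5-FVIII vectors were shown to lead to secretion of active FVIII. This was confirmed in vitro, where an AAV vector comprising a gene product of interest encoding Factor VIII led to production of active FVIII protein after infection of Huh7 cells. Similarly, tail vein delivery of rAAV FVIII in mice led to production of active FVIII protein. Molecular analysis of the encapsidated products showed that the 5.6 kbp FVIII expression cassette is not entirely encapsidated in the AAV particle. Without wishing to be bound by any theory, the inventors put forward that the + and - DNA strands of the encapsidated molecules were found to be missing their 5' ends. This is consistent with the previously reported unidirectional (starting from the 3' end) packaging mechanism operating according to a "head-full principle" with a 4.7-4.9 kbp limit (see e.g. Wu et al. [2010] Molecular Therapy 18(1):80-86; Dong et al. [2010] Molecular Therapy 18(1):87-92; Kapranov et al. [2012] Human Gene Therapy 23:46-55; and in particular Lai et al. [2010] Molecular Therapy 18(1):75-79). Even though only about 5 kb of the whole 5.6 kb vector genome was encapsidated, the vector was potent and led to expression of active FVIII. The inventors have shown that the correct template for production of FVIII was assembled in the target cell on the basis of partial complementarity of the + and - DNA strands, followed by second-strand synthesis.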
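The second nucleotide sequence defined herein may thus comprise a nucleotide sequence encoding at least one "gene product of interest" for expression in a mammalian cell, located such that it will be incorporated into an AAV genome replicated in the insect cell. Any nucleotide sequence can be incorporated for later expression in a mammalian cell transfected with the AAV produced in accordance with the invention, as long as the construct remains within the packaging capacity of the AAV virion. The nucleotide sequence may, for example, encode a protein, or it may express an RNAi agent, i.e. an RNA molecule that is capable of RNA interference, such as an shRNA (short hairpin RNA) or an siRNA (short interfering RNA). "siRNA" means a small interfering RNA that is a short-length double-stranded RNA that is not toxic in mammalian cells (Elbashir et al., 2001, Nature 411: 494-98; Caplen et al., 2001, Proc. Natl. Acad. Sci. USA 98: 9742-47). In a preferred embodiment, the second nucleotide sequence may comprise two nucleotide sequences, each encoding one gene product of interest for expression in a mammalian cell. Each of the two nucleotide sequences encoding a product of interest is located such that it will be incorporated into an rAAV genome replicated in the insect cell.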
The product of interest for expression in a mammalian cell may be a therapeutic gene product. A therapeutic gene product can be a polypeptide, or an RNA molecule (siRNA), or another gene product that, when expressed in a target cell, provides a desired therapeutic effect, such as ablation of an undesired activity, e.g. the ablation of an infected cell, or the complementation of a genetic defect, e.g. one causing a deficiency in an enzymatic activity. Examples of therapeutic polypeptide gene products include CFTR, Factor IX, lipoprotein lipase (LPL, preferably LPL S447X; see WO 01/00220), apolipoprotein A1, uridine diphosphate glucuronosyltransferase (UGT), Retinitis Pigmentosa GTPase Regulator Interacting Protein (RP-GRIP), cytokines or interleukins such as IL-10, dystrophin, PBGD, NaGLU, Treg167, Treg289, EPO, IGF, IFN, GDNF, FOXP3, Factor VIII, VEGF, AGXT, and insulin. Alternatively, or in addition as a second gene product, the second nucleotide sequence defined herein above may comprise a nucleotide sequence encoding a polypeptide that serves as a marker protein to assess cell transformation and expression. Suitable marker proteins for this purpose are, for example, the fluorescent protein GFP, and the selectable marker genes HSV thymidine kinase (for selection on HAT medium), bacterial hygromycin B phosphotransferase (for selection on hygromycin B), Tn5 aminoglycoside phosphotransferase (for selection on G418), and dihydrofolate reductase (DHFR) (for selection on methotrexate), CD20, and the low affinity nerve growth factor gene. Sources for obtaining these marker genes and methods for their use are provided in Sambrook and Russell (2001) "Molecular Cloning: A Laboratory Manual" (3rd edition), Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, New York. Furthermore, the second nucleotide sequence defined herein above may comprise a nucleotide sequence encoding a polypeptide that may serve as a fail-safe mechanism that allows a subject to be cured of cells transduced with the rAAV of the invention. Such a nucleotide sequence, often referred to as a suicide gene, encodes a protein that is capable of converting a prodrug into a toxic substance that is capable of killing the transgenic cells in which the protein is expressed. Suitable examples of such suicide genes include e.g. the E. coli cytosine deaminase gene or one of the thymidine kinase genes from Herpes Simplex Virus, Cytomegalovirus and Varicella-Zoster virus, in which case ganciclovir may be used as prodrug to kill the transgenic cells in the subject (see e.g. Clair et al., 1987, Antimicrob. Agents Chemother. 31: 844-849).
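In another embodiment, the gene product of interest can be an AAV protein, in particular a Rep protein such as Rep78 or Rep68, or a functional fragment thereof. A nucleotide sequence encoding Rep78 and/or Rep68, if present on the rAAV genome of the invention and expressed in a mammalian cell transduced with the rAAV of the invention, allows for integration of the rAAV into the genome of the transduced mammalian cell. Expression of Rep78 and/or Rep68 in an rAAV-transduced or infected mammalian cell can provide an advantage for certain uses of the rAAV by allowing long-term or permanent expression of any other gene product of interest introduced into the cell by the rAAV.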
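In the rAAV vectors of the invention, the at least one nucleotide sequence encoding a gene product of interest for expression in a mammalian cell is preferably operably linked to at least one mammalian cell-compatible expression control sequence, e.g. a promoter. Many such promoters are known in the art (see Sambrook and Russell, 2001, supra). Constitutive promoters that are broadly expressed in many cell types, such as the CMV promoter, may be used. However, promoters that are inducible, tissue-specific, cell-type-specific, or cell cycle-specific will be more preferred. For example, for liver-specific expression a promoter may be selected from an α1-anti-trypsin promoter, a thyroid hormone-binding globulin promoter, an albumin promoter, an LPS (thyroxine-binding globulin) promoter, an HCR-ApoCII hybrid promoter, an HCR-hAAT hybrid promoter and an apolipoprotein E promoter, LP1, HLP, minimal TTR promoter, FVIII promoter, hyperon enhancer, and ealb-hAAT. Other examples include the E2F promoter for tumor-selective, and in particular neurological cell tumor-selective, expression (Parr et al., 1997, Nat. Med. 3:1145-9) or the IL-2 promoter for use in mononuclear blood cells (Hagenbaugh et al., 1997, J Exp Med; 185: 2101-10).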
AAV is able to infect a number of mammalian cells. See, e.g., Tratschin et al., Mol. Cell Biol., 5(11):3251-3260 (1985) and Grimm et al., Hum. Gene Ther., 10(15):2445-2450 (1999). However, AAV transduction of human synovial fibroblasts is significantly more efficient than in similar murine cells (Jennings et al., Arthritis Res, 3:1 (2001)), and the cellular tropism of AAV differs among serotypes. See, e.g., Davidson et al., Proc. Natl. Acad. Sci. USA, 97(7):3428-3432 (2000), which discusses differences among AAV2, AAV4, and AAV5 with respect to mammalian CNS cell tropism and transduction efficiency.
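AAV sequences that may be used in the present invention for the production of AAV in insect cells can be derived from the genome of any AAV serotype. Generally, the AAV serotypes have genomic sequences of significant homology at the amino acid and nucleic acid levels, provide an identical set of genetic functions, produce virions that are essentially physically and functionally equivalent, and replicate and assemble by practically identical mechanisms. For the genomic sequences of the various AAV serotypes and an overview of the genomic similarities, see: GenBank Accession number U89790; GenBank Accession number J01901; GenBank Accession number AF043303; GenBank Accession number AF085716; Chiorini et al. (1997, J. Vir. 71: 6823-33); Srivastava et al. (1983, J. Vir. 45:555-64); Chiorini et al. (1999, J. Vir. 73:1309-1319); Rutledge et al. (1998, J. Vir. 72:309-319); and Wu et al. (2000, J. Vir. 74: 8635-47). Human or simian adeno-associated virus (AAV) serotypes are preferred sources of AAV nucleotide sequences for use in the context of the present invention, more preferably AAV serotypes that normally infect humans (e.g. serotypes 1, 2, 3A, 3B, 4, 5, 6, 7, 8, 9, 10, 11, 12 and 13) or primates (e.g. serotypes 1 and 4).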
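Preferably, the AAV ITR sequences for use in the context of the present invention are derived from AAV1, AAV2, AAV5 and/or AAV4. Likewise, the Rep52, Rep40, Rep78 and/or Rep68 coding sequences are preferably derived from AAV1, AAV2, and/or AAV4. The coding sequences for the VP1, VP2, and VP3 capsid proteins for use in the context of the present invention may be taken from any of the known 42 serotypes, more preferably from AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8 or AAV9, or from newly developed AAV-like particles obtained by, for example, capsid shuffling techniques and AAV capsid libraries. In a preferred embodiment, the coding sequences for the VP1, VP2, and VP3 capsid proteins are taken from AAV5 or AAV8, more preferably from AAV5.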
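AAV Rep and ITR sequences are particularly conserved among most serotypes. The Rep78 proteins of various AAV serotypes are, for example, more than 89% identical, and the total nucleotide sequence identity at the genome level between AAV2, AAV3A, AAV3B, and AAV6 is around 82% (Bantel-Schaal et al., 1999, J. Virol., 73(2):939-947). Moreover, the Rep sequences and ITRs of many AAV serotypes are known to efficiently cross-complement (i.e. functionally substitute) corresponding sequences from other serotypes in the production of AAV particles in mammalian cells. US2003148506 reports that AAV Rep and ITR sequences also efficiently cross-complement other AAV Rep and ITR sequences in insect cells.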
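The AAV VP proteins are known to determine the cellular tropism of the AAV virion. The VP protein-encoding sequences are less conserved among the different AAV serotypes than the Rep proteins and genes. The ability of Rep and ITR sequences to cross-complement corresponding sequences of other serotypes allows for the production of pseudotyped AAV particles comprising the capsid proteins of one serotype (e.g. AAV3) and the Rep and/or ITR sequences of another AAV serotype (e.g. AAV2). Such pseudotyped AAV particles are a part of the present invention.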
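Modified "AAV" sequences can also be used in the context of the present invention, e.g. for the production of rAAV vectors in insect cells. Such modified sequences include, for example, sequences having at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, or more nucleotide and/or amino acid sequence identity (e.g. a sequence having about 75-99% nucleotide sequence identity) to an AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8 or AAV9 ITR, Rep, or VP, and can be used in place of wild-type AAV ITR, Rep, or VP sequences.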
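The identity thresholds above can be made concrete with a short sketch. This passage does not state how sequence identity is computed, so the function below is only one plausible convention, assumed for illustration: it takes two sequences that have already been aligned (gaps written as "-") and counts identical residues over the columns where neither sequence has a gap. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are pre-aligned and of equal length; gaps are '-'.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the pair reaches the given percent identity (default: 70%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences, not real AAV sequences.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Other conventions (for example, counting gap columns in the denominator) give different percentages, so the convention should be fixed before comparing a sequence against a threshold.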
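Although similar to other AAV serotypes in many respects, AAV5 differs from other human and simian AAV serotypes more than the other known human and simian serotypes differ from one another. In view thereof, the production of AAV5 in insect cells can differ from the production of other serotypes. Where the methods of the invention are used to produce rAAV5, it is preferred that the one or more vectors comprise, collectively in the case of more than one vector, a nucleotide sequence comprising an AAV5 ITR, a nucleotide sequence comprising an AAV5 Rep52 and/or Rep40 coding sequence, and a nucleotide sequence comprising an AAV5 Rep78 and/or Rep68 coding sequence. Such ITR and Rep sequences can be modified as needed to obtain efficient production of rAAV5 or pseudotyped rAAV5 vectors in insect cells.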
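In a preferred embodiment, the first nucleotide sequence, second nucleotide sequence, third nucleotide sequence and optionally fourth nucleotide sequence are stably integrated in the genome of the insect cell.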
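In another aspect, the invention relates to an AAV virion. Preferably, the AAV virion comprises in its genome at least one nucleotide sequence encoding a gene product of interest, whereby the at least one nucleotide sequence preferably is not a native AAV nucleotide sequence, and whereby the AAV VP1 capsid protein comprises or consists of, from N-terminal end to C-terminal end:
(i) a first amino acid residue encoded by a translation initiation codon, preferably a suboptimal translation initiation codon selected from the group consisting of CTG, ACG, TTG and GTG;
(ii) a second amino acid residue selected from the group consisting of alanine, glycine, valine, aspartic acid and glutamic acid;
(iii) optionally, one or more additional amino acid residues following the second amino acid residue; and,
(iv) an amino acid sequence of an AAV VP1 capsid protein, whereby the sequence lacks the amino acid residue encoded by the VP1 translation initiation codon.
Preferably, the sequence lacks only the amino acid residue encoded by the VP1 translation initiation codon or, phrased alternatively, the sequence lacks only the VP1 translation initiation codon.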
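Preferably, the amino acid sequence of an AAV VP1 capsid protein lacking only the amino acid residue encoded by the VP1 translation initiation codon is a naturally occurring amino acid sequence of an AAV VP1 capsid protein that lacks only the amino acid residue encoded by the naturally occurring VP1 initiation codon. The first amino acid residue encoded by the suboptimal translation initiation codon is typically a methionine residue.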
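The N-terminal arrangement (i) to (iv) can be sketched at the nucleotide level as follows. This is a minimal illustration under stated assumptions, not the constructs of the examples: the function name `modify_vp1_orf` is hypothetical, the input is assumed to be a wild-type VP1 open reading frame starting with ATG, optional item (iii) is omitted, and the sequence in the usage line is a toy ORF rather than a real cap sequence. The codon table for the second residue uses the standard genetic code.

```python
# Suboptimal initiation codons named in item (i).
SUBOPTIMAL_STARTS = ("CTG", "ACG", "TTG", "GTG")

# Standard-code codons for the residues allowed at the second position, item (ii).
SECOND_RESIDUE_CODONS = {
    "Ala": ("GCT", "GCC", "GCA", "GCG"),
    "Gly": ("GGT", "GGC", "GGA", "GGG"),
    "Val": ("GTT", "GTC", "GTA", "GTG"),
    "Asp": ("GAT", "GAC"),
    "Glu": ("GAA", "GAG"),
}


def modify_vp1_orf(wild_type_orf: str, start_codon: str = "CTG",
                   second_codon: str = "GCT") -> str:
    """Return start codon + inserted second codon + wild-type ORF minus its ATG."""
    orf = wild_type_orf.upper()
    if len(orf) % 3 != 0 or not orf.startswith("ATG"):
        raise ValueError("expected an in-frame ORF starting with ATG")
    if start_codon not in SUBOPTIMAL_STARTS:
        raise ValueError("start codon must be one of CTG, ACG, TTG, GTG")
    if not any(second_codon in codons for codons in SECOND_RESIDUE_CODONS.values()):
        raise ValueError("second codon must encode Ala, Gly, Val, Asp or Glu")
    # Item (iv): the VP1 sequence lacking only its own initiation codon.
    return start_codon + second_codon + orf[3:]


if __name__ == "__main__":
    # Toy ORF (hypothetical), used only to show the rearrangement.
    print(modify_vp1_orf("ATGTCTTTTTAA"))
```

The defaults (CTG followed by GCT) mirror the combination that the examples report for construct 765; any other allowed pair can be passed explicitly.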
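Alternatively, in this aspect the invention relates to an AAV virion, wherein the AAV virion comprises in its genome at least one nucleotide sequence encoding a gene product of interest, whereby the at least one nucleotide sequence preferably is not a native AAV nucleotide sequence, and wherein the AAV VP1 capsid protein has one or more additional amino acid residues inserted between the initiation codon and the amino acid residue corresponding to the amino acid residue at position 2 of the wild-type capsid protein, wherein the additional amino acid residue immediately following the initiation codon is selected from the group consisting of alanine, glycine, valine, aspartic acid and glutamic acid.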
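Preferably, in a virion according to the invention, the stoichiometry of the AAV VP1, VP2, and VP3 capsid proteins is as follows. The amount of VP1: (a) is at least 100, 105, 110, 120, 150, 200 or 400% of the amount of VP2; or (b) is at least 8, 10, 10.5, 11, 12, 15, 20 or 40% of the amount of VP3; or (c) is at least as defined in both (a) and (b). Preferably, the amounts of VP1, VP2 and VP3 are determined using an antibody recognizing an epitope that is common to each of VP1, VP2 and VP3. Various immunoassays that allow the relative amounts of VP1, VP2 and/or VP3 to be quantified are available in the art (see e.g. Using Antibodies, E. Harlow and D. Lane, 1999, Cold Spring Harbor Laboratory Press, New York). A suitable antibody recognizing an epitope that is common to each of the three capsid proteins is e.g. the mouse anti-Cap B1 antibody (commercially available from Progen, Germany).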
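Criteria (a) to (c) amount to two percentage comparisons, which can be sketched as below. The inputs are assumed to be relative band intensities or other quantities measured on a common scale; the function name, the default thresholds (the lowest values listed, 100% and 8%) and the numbers in the usage line are illustrative assumptions, not measured data.

```python
def vp_stoichiometry(vp1: float, vp2: float, vp3: float,
                     min_vp1_pct_of_vp2: float = 100.0,
                     min_vp1_pct_of_vp3: float = 8.0) -> dict:
    """Express VP1 as a percentage of VP2 and of VP3 and test criteria (a), (b), (c)."""
    if vp1 < 0 or vp2 <= 0 or vp3 <= 0:
        raise ValueError("VP2 and VP3 must be positive and VP1 non-negative")
    pct_of_vp2 = 100.0 * vp1 / vp2
    pct_of_vp3 = 100.0 * vp1 / vp3
    meets_a = pct_of_vp2 >= min_vp1_pct_of_vp2
    meets_b = pct_of_vp3 >= min_vp1_pct_of_vp3
    return {
        "vp1_pct_of_vp2": pct_of_vp2,
        "vp1_pct_of_vp3": pct_of_vp3,
        "meets_a": meets_a,
        "meets_b": meets_b,
        "meets_c": meets_a and meets_b,  # (c) requires both (a) and (b)
    }


if __name__ == "__main__":
    # Hypothetical relative amounts in a 1:1:10 ratio.
    print(vp_stoichiometry(1.0, 1.0, 10.0))
```

Stricter embodiments can be checked by passing one of the higher listed thresholds, for example `min_vp1_pct_of_vp2=120.0`.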
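A preferred AAV according to the invention is a virion comprising in its genome at least one nucleotide sequence encoding a gene product of interest, whereby the at least one nucleotide sequence preferably is not a native AAV nucleotide sequence, and whereby the AAV virion comprises a VP1 capsid protein that comprises a methionine, threonine, leucine or valine at amino acid position 1. A more preferred AAV virion according to the invention has the ratio of capsid proteins as defined above and comprises a VP1 capsid protein comprising a leucine or a valine at amino acid position 1. More preferred is, for example, an AAV virion that is obtainable from an insect cell as defined above by a method as defined herein below. More preferred is an AAV virion comprising a threonine or a leucine at position 1 of the VP1 capsid protein, even more preferably an AAV virion comprising a threonine residue.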
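An advantage of the AAV virions of the invention is their improved infectivity. Without wishing to be bound by any theory, it appears in particular that infectivity increases with an increase in the amount of VP1 protein in the capsid relative to the amounts of VP2 and/or VP3 in the capsid. The infectivity of an AAV virion is herein understood to mean the efficiency of transduction of the transgene comprised in the virion, and may be deduced from the expression rate of the transgene and the amount or activity of the product expressed from the transgene.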
Preferably, an AAV virion of the invention comprises a gene product of interest that encodes a polypeptide gene product selected from the group consisting of: CFTR, Factor IX, lipoprotein lipase (LPL, preferably LPL S447X; see WO 01/00220), apolipoprotein A1, uridine diphosphate glucuronosyltransferase (UGT), Retinitis Pigmentosa GTPase Regulator Interacting Protein (RP-GRIP), cytokines or interleukins such as IL-10, dystrophin, PBGD, NaGLU, Treg167, Treg289, EPO, IGF, IFN, GDNF, FOXP3, Factor VIII, VEGF, AGXT, and insulin. More preferably, the gene product of interest encodes a Factor IX or a Factor VIII protein.
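In another aspect, the invention thus relates to a method for producing an AAV in an insect cell. Preferably, the method comprises the steps of: (a) culturing an insect cell as defined herein under conditions such that AAV is produced; and, optionally, (b) recovering the AAV. Growing conditions for insect cells in culture, and production of heterologous products in insect cells in culture, are well known in the art and are described, for example, in the above-cited references on molecular engineering of insect cells.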
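Preferably, the method further comprises the step of affinity purification of the AAV using an anti-AAV antibody, preferably an immobilized antibody. The anti-AAV antibody preferably is a monoclonal antibody. A particularly suitable antibody is a single-chain camelid antibody or a fragment thereof, as obtainable e.g. from camels or llamas (see e.g. Muyldermans, 2001, Biotechnol. 74: 277-302). The antibody for affinity purification of AAV preferably is an antibody that specifically binds an epitope on an AAV capsid protein, whereby preferably the epitope is an epitope that is present on capsid proteins of more than one AAV serotype. For example, the antibody may be raised or selected on the basis of specific binding to an AAV capsid, while at the same time it may also specifically bind to AAV1, AAV3 and AAV5 capsids.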
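In this document and in its claims, the verb "to comprise" and its conjugations are used in their non-limiting sense to mean that items following the word are included, but items not specifically mentioned are not excluded. In addition, reference to an element by the indefinite article "a" or "an" does not exclude the possibility that more than one of the element is present, unless the context clearly requires that there be one and only one of the elements. The indefinite article "a" or "an" thus usually means "at least one".
All patent and literature references cited in the present specification are hereby incorporated by reference.
The following examples are offered for illustrative purposes only, and are not intended to limit the scope of the present invention in any way.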
Examples
1. Introduction
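The initial baculovirus system for production of rAAV was described by Urabe et al. (Urabe et al. [2002] Human Gene Therapy 13(16):1935-1943) and consists of three baculoviruses, namely Bac-Rep, Bac-cap and Bac-vec, co-infection of which into insect cells such as Sf9 resulted in generation of rAAV. The properties of rAAV produced in this way, i.e. physical and molecular characteristics including potency, did not differ significantly from rAAV generated in mammalian cells (Urabe [2002], supra). To accomplish efficient generation of rAAV vectors in insect cells, the AAV proteins needed in the process had to be expressed at appropriate levels. This required a number of adaptations of the operons encoding the Rep and Cap proteins. Wild-type AAV expresses large Rep78 and small Rep52 from two distinct promoters, p5 and p19 respectively, and splicing of the two messengers results in the generation of the Rep68 and Rep40 variants. This operon organization results in limited expression of Rep78 and relatively higher expression of Rep52. To mimic the low 78 to 52 ratio, Urabe and colleagues constructed a DNA cassette in which expression of Rep78 was driven by a partially deleted promoter of the immediate early 1 gene (ΔIE-1), whereas Rep52 expression was controlled by the strong polyhedrin promoter (polh). The spliced variants of the large and small Reps were not observed in insect cells, which is likely related to differences in splicing processes between mammalian and insect cells. Another technical challenge to overcome related to the expression of the three major viral proteins (VPs). Wild-type AAV expresses VP1, 2 and 3 from the p40 promoter. The arising messenger RNA is spliced into two species: one responsible for VP1 expression, and a second that expresses both VP2 and VP3 via a "leaky ribosomal scanning mechanism". Here the protein is initiated from a non-canonical start, i.e. ACG is often missed by the ribosomal complex, which then proceeds further until it finds the canonical start of VP3. Due to differences in the splicing machinery between vertebrate and insect cells, the mechanism described above did not result in generation of proper capsids in insect cells. Urabe et al. decided to introduce a modification of the translational start of VP1 so that it resembles the one found in VP2, in such a way that the translational start of VP1 was changed to ACG and the initiation context consisting of the 9 nucleotides preceding VP1 was changed to those preceding VP2. These genetic alterations resulted in expression of the three VPs from a single polycistronic mRNA in the correct stoichiometry, allowing proper assembly into capsids. The transgene cassette, on the other hand, was similar to what was previously described for mammalian-based systems, being flanked by ITRs as the only in trans required elements for replication and packaging.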
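With an increasing number of newly discovered AAV serotypes with different desirable properties, there is a need for generation of these capsids in the BEV system. Although successful production of AAV2 in insect cells has been shown, not all serotypes perform equally well in the system adapted to AAV2. Adapting new serotypes for optimal production and potency is not a trivial task and will require a customized approach. Previous attempts to adapt rAAV5 sequences for production by BEVS in insect cells met with limited success, resulting in low incorporation of VP1 into the capsid (Kohlbrenner et al. (2005) Molecular Therapy 12(6):1217-1225; Urabe et al. (2006) Journal of Virology 80(4):1874-1885). To circumvent this problem, Urabe et al. generated a chimeric type 2/5 virus containing the N-terminal 136 amino acid residues of AAV type 2 and the remaining sequence of AAV type 5. Such viruses were reported to be produced well and to show potency similar to that of wild-type AAV5 (Urabe et al. (2006), supra). However, the resulting virions are chimeric and do not represent the "true" rAAV5 serotype.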
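In order to generate true rAAV5 in insect cells with improved infectivity and/or potency, the inventors designed a number of capsid protein 5 mutants. Balancing the stoichiometry of the three viral proteins appears to be crucial for infectivity. For example, as previously reported, the inventors found that a lack of VP1 synthesis drastically affects the potency of the vector. Moreover, the inventors observed that the potency of the vector is negatively correlated with high incorporation of VP3 relative to VP1 and VP2. Viral preparations with an excessive amount of VP3 performed poorly in cell transduction in vitro and in vivo. Finally, the inventors constructed a true rAAV5 capsid with a balanced VP stoichiometry, which was found to have similar or better potency than the chimeric AAV2/5 (rAAV5) generated by Urabe et al. (2006, supra).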
2. Methods
2.1. Generation of rAAV5 vectors
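rAAV5 batches were generated by co-infecting the expresSF+® insect cell line (Protein Sciences Corporation) with three different baculoviruses, harboring expression cassettes for the capsid (rAAV5 variant library), the replicase, and the transgene (Seap or Factor IX, under control of the CMV and LP1 promoter, respectively). The capsid expression cassette was under control of the polyhedrin promoter. The Rep expression cassette was as described in WO 2009/14445 (BAC.VD183) and was under control of the deltaIE1 and polyhedrin promoters, driving expression of Rep78 and Rep52, respectively. expresSF+® cells were infected at a 5:1:1 volumetric ratio (Rep:Cap:transgene) using freshly amplified baculovirus stocks. After 72 hours of incubation at 28℃, cells were lysed with 10x lysis buffer (1.5M NaCl, 0.5M Tris-HCl, 1mM MgCl2, 1% Triton X-100, pH=8.5) for 1 hour at 28℃. Genomic DNA was digested by benzonase treatment for 1 hour at 37℃. Cell debris was removed by centrifugation for 15 minutes at 1900xg, after which the supernatant containing the rAAV5 particles was stored at 4℃. Vector titers were determined in this so-called crude cell lysate using a Q-PCR specific for the promoter region of the transgene. Briefly, affinity-purified vectors were analyzed by Q-PCR. AAVs were treated with DNAse at 37℃ to degrade extraneous DNA. AAV DNA was then released from the particles by treatment with 1M NaOH. After a short heat treatment (30 minutes at 37℃), the alkaline environment was neutralized with an equal volume of 1M HCl. The neutralized samples contained the AAV DNA used for Taqman Q-PCR. Q-PCR was performed according to standard methods using the primers and probes listed in Table 1 below.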
2.2. Purification of rAAV5 vectors
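rAAV5 particles were purified from crude lysates by a batch binding protocol using AVB Sepharose (affinity resin, GE Healthcare). rAAV5 crude cell lysate was added to resin that had been washed (with 0.2M HPO4 pH=7.5 buffer). Subsequently, samples were incubated for 2 hours at room temperature under gentle mixing. After incubation, the resin was washed with 0.2M HPO4 pH=7.5 buffer, and bound vectors were eluted by addition of 0.2M glycine pH=2.5. The pH of the eluted vectors was immediately neutralized by addition of 0.5M Tris-HCl pH=8.5. Purified rAAV5 batches were stored at -20℃. Purified vectors were titered by specific Q-PCR.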
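To generate larger amounts of vector for in vivo studies, a modified purification protocol was used. Briefly, after harvest, the clarified lysate was passed over a 0.22㎛ filter (Millipak 60, 0.22㎛). Next, vector particles were affinity purified on an 8ml AVB Sepharose column on an AKTA explorer (FPLC chromatography system, GE Healthcare). Bound rAAV5 particles were eluted from the column with 0.2M glycine pH=2.5. The eluate was immediately neutralized with 60mM Tris-HCl pH=7.5. The buffer of the neutralized eluate was exchanged for PBS 5% sucrose with the aid of 100 kDa ultrafiltration (Millipore) filters. The final product was then filtered over a 0.22㎛ filter (Millex GP), aliquoted and stored at -20℃ until further use. After purification, viral titers were determined using specific Q-PCR.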
2.3. VP protein composition of rAAV5 variants
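The VP protein composition of purified rAAV5 variants was determined on Bis-Tris polyacrylamide gels (Nupage, Life Technologies) stained with Sypro Ruby. Briefly, 15㎕ of purified rAAV5 was mixed with 5㎕ of 4x LDS loading buffer (Life Technologies) and loaded on the Bis-Tris polyacrylamide gel. Samples were separated by electrophoresis for 2 hours at 100 volts. After electrophoresis, proteins were fixed for 30 minutes with 10% NaAc/7% EtOH and stained for 2 hours with Sypro Ruby (Life Technologies). VP proteins were then visualized under UV light on an ImageQuant system (GE Healthcare).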
2.4. In vitro potency
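To examine the in vitro potency of the various serotype 5 capsid variants, two continuous cell lines were used. Here, 1x10^5 Hela and Huh7 cells were infected with the rAAV5 variants at various multiplicities of infection. Experiments were performed in 24-well plates at 1x10^5 cells/well, at about 80% confluency. In both experiments, wild-type adenovirus was used at a multiplicity of infection of 30. This addition of wild-type adenovirus is applied only in the in vitro potency test, to accelerate the process of second-strand synthesis to within about 24 hours, so that the assay can be performed in a relatively short period and the need for cell passaging is avoided. At 48 hours after the start of infection, Seap expression was measured in the supernatant using the Seap reporter assay kit (Roche). Luminescence was measured on a Spectramax L luminometer (Molecular Devices) at 470nm with an integration time of 1 second.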
2.5. In vivo potency
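To examine the in vivo potency of the various serotype 5 capsid variants, two different experiments were performed. Briefly, the potency of rAAV5 vector constructs 159-164 harboring the Seap reporter gene was examined in C57BL/6 mice. The various vectors were injected intramuscularly into mice at a dose of 5x10^12 gc/kg. Groups consisted of 5 mice each, and there were 7 groups in total, including a PBS group. Mouse plasma was obtained at 2, 4 and 6 weeks after injection, after which the mice were sacrificed. Seap activity was measured in plasma using the Seap reporter assay kit from Roche. Luminescence was measured on a Spectramax L luminometer (Molecular Devices) at 470nm with an integration time of 1 second.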
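Next, the in vivo potency of the variant AAV5(765) was compared with the in vivo potency of AAV5(160) and AAV5(92). AAV5(92) was kindly provided by the laboratory of Dr. Kotin (Urabe et al., 2006). C57BL/6 mice were injected intravenously with 765 or 160, all harboring FIX as reporter gene, at doses of 2x10^12 gc/kg and 2x10^13 gc/kg. A total of 7 groups of 5 mice each, including a PBS group, were injected. Plasma was collected at 1, 2 and 4 weeks after injection, after which the mice were sacrificed. Factor IX protein present in the plasma was measured by a Factor IX-specific ELISA (VisuLize FIX antigen kit, Kordia). Optical density was measured at 450nm on a Versamax ELISA plate reader (Molecular Devices).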
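The doses in this section are given in genome copies per kilogram (gc/kg), so the amount injected per animal depends on body mass and the injected volume depends on the vector titer. The sketch below shows that arithmetic. The 25 g body mass and the 1x10^12 gc/ml titer are assumed values chosen only for illustration; neither is stated in the text, and the function names are hypothetical.

```python
def genome_copies_per_animal(dose_gc_per_kg: float, body_mass_g: float) -> float:
    """Genome copies to inject into one animal for a dose given in gc/kg."""
    if dose_gc_per_kg < 0 or body_mass_g <= 0:
        raise ValueError("dose must be non-negative and body mass positive")
    return dose_gc_per_kg * body_mass_g / 1000.0  # grams -> kilograms


def injection_volume_ul(total_gc: float, titer_gc_per_ml: float) -> float:
    """Volume in microliters that contains total_gc at the given titer."""
    if titer_gc_per_ml <= 0:
        raise ValueError("titer must be positive")
    return 1000.0 * total_gc / titer_gc_per_ml  # milliliters -> microliters


if __name__ == "__main__":
    # Assumed 25 g mouse and assumed titer of 1e12 gc/ml (illustrative values).
    for dose in (5e12, 2e12, 2e13):
        gc = genome_copies_per_animal(dose, body_mass_g=25.0)
        print(dose, gc, injection_volume_ul(gc, titer_gc_per_ml=1e12))
```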
3. Results
3.1. Generation of rAAV5 in BEVS
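AAV is a mammalian virus that uses the machinery of its host to express its genes, in particular the cap gene. The mechanisms by which the correct stoichiometry of VP1:VP2:VP3 is achieved in the mammalian host are absent or not optimal in insect cells. Urabe et al. therefore developed a strategy of genetic adjustments to the organization of the cap polycistronic mRNA that resulted in production of the three VP proteins of AAV2 in insect cells in the correct stoichiometry (Urabe et al. (2002), supra). Attempts to establish a similar approach for production of rAAV5 in BEVS proved unsuccessful in achieving sufficiently infectious particles. Without wishing to be bound by any theory, this appears to be caused by low incorporation of VP1 into the capsid (Urabe et al. (2002), supra). For this reason, and building on their previous success with the type 2 serotype, Urabe et al. replaced the N-terminal portion of type 5 VP1 with that of type 2 in order to produce infectious AAV5 particles (Urabe et al. (2002), supra). Although successful, the chimeric AAV2/5 capsid does not constitute a bona fide type 5 particle and may therefore have altered properties compared with AAV5, representing a combination of the two capsids that differs from type 5.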
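To enable production of AAV5 virions in insect cells with improved infectivity and potency, a series of genetic changes was made here to the cap5 expression cassette of AAV5 (Table 2). As mentioned previously (Urabe et al. (2002), supra), the wild-type cap5 gene (here clone number 763) did not support generation of rAAV. The lack of recognition of native AAV splicing signals in insect cells probably resulted in low expression of the individual VP proteins and a lack of vector production. Because eukaryotic ribosomes read mRNA unidirectionally from 5' to 3', the first translation initiation start of the polycistronic cap5 mRNA (here VP1) is decisive for the expression of all three proteins. The wild-type initiation start consists of ATG, a so-called strong translation initiation codon, which does not allow ribosomal read-through and thereby blocks the expression of the other two VPs, leading to a lack of rAAV production. Because wild-type AAV uses ribosomal read-through to express VP2 (non-canonical translation initiation start, ACG) and VP3 (ATG), the translational start of VP1 and its immediate surroundings were investigated in order to alter the expression and/or assembly of the three VP proteins.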
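It has been reported previously that the nucleotide context of the translational start influences the strength of translation initiation (Kozak (1987) Nucleic Acids Research 15(20):8125-8148; WO2007/046703). The preferred nucleotides appear to be an A at position -3 and a G at position +4, with the AUG counting as +1, +2 and +3 (Kozak, supra; WO2007/046703). Table 2 details the specific changes introduced to the translation initiation start and to its upstream and downstream context in order to modulate the expression of the three VPs. The inventors investigated the upstream initiation context that natively surrounds the VP2 translation start; various non-canonical start codons (ACG, CTG, TTG, GTG); various mutagenic changes to the +2 wild-type triplet; and insertions between the +1 initiation triplet and the +2 wild-type triplet. Expression cassettes comprising combinations of these features were used for generation of rAAV.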
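The numbering convention above (first nucleotide of the start codon is +1, there is no position 0, so -3 is three nucleotides upstream and +4 is the first nucleotide of the second codon) can be checked mechanically. The following sketch reports the nucleotides at -3 and +4 for a given start position. The function name is hypothetical, and the test sequence is only a short illustration assembled from elements named in the text (the upstream context CCTGTTAAG, an ACG start and a GCT second triplet), not a full construct sequence.

```python
def initiation_context(sequence: str, start_index: int) -> dict:
    """Report the -3 and +4 nucleotides around a start codon.

    start_index is the 0-based index of the first nucleotide (+1) of the codon.
    """
    seq = sequence.upper().replace("U", "T")
    if start_index < 3 or start_index + 4 > len(seq):
        raise ValueError("need 3 nt upstream and 1 nt downstream of the codon")
    minus3 = seq[start_index - 3]  # position -3 (no position 0 exists)
    plus4 = seq[start_index + 3]   # position +4, first nt of the second codon
    return {
        "codon": seq[start_index:start_index + 3],
        "minus3": minus3,
        "plus4": plus4,
        "a_at_minus3": minus3 == "A",
        "g_at_plus4": plus4 == "G",
    }


if __name__ == "__main__":
    # Illustrative fragment: upstream context + ACG start + GCT second triplet.
    print(initiation_context("CCTGTTAAGACGGCTTCT", start_index=9))
```

With a second triplet beginning with G (such as GCT) the +4 position satisfies the preferred context, whereas a second triplet beginning with T (such as TCT) does not.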
3.2. Small nucleotide changes around the translation initiation start of VP1 have a profound effect on vector potency.
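Baculovirus constructs harboring all variants of the cap5 expression cassettes listed in Table 2 were successfully generated. Subsequently, these baculoviruses were used for generation of rAAV together with baculoviruses harboring Rep(s) and a transgene (reporter gene, e.g. SEAP or FIX). Despite several attempts, some of the tested constructs did not support generation of rAAV. These included the wild-type AAV5 (construct 763) and some of the constructs harboring the non-canonical starts TTG (construct 764) and GTG (construct 766). All other constructs listed in Table 2 led to successful generation of rAAV.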
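The three viral proteins (VPs) of the successfully produced rAAV type 5 variants were resolved. The stoichiometry of the three VPs was examined by electrophoretic separation (SDS-PAGE) of purified vectors (Figures 1 and 6). Small modifications introduced into the expression cassette of the cap5 gene appear to have a profound effect on the expression and/or assembly of the three VP proteins as reflected in the composition of the capsid. The inventors found that adaptation of the serotype 5 capsid to insect cells by introducing the non-canonical start codon (ACG) and the 9-nucleotide upstream context CCTGTTAAG, reported by Urabe et al. as the modifications enabling insect cell production of serotype 2, results in low incorporation of VP1 (low VP1/VP2 ratio) and excessive incorporation of VP3 into the capsid (high VP3/VP1 ratio), giving an aberrant stoichiometry of the three VPs (Figure 1, construct 159). Similarly, modification of nucleotide +4 to G, to more closely resemble a canonical Kozak sequence, which resulted in the exchange of serine for alanine at position +2 (construct 161), gave low VP1 and high VP3 incorporation (low VP1/VP2, high VP3/VP1). Use of a different non-canonical codon, CTG, together with the upstream CCTGTTAAG and downstream modifications, i.e. a change of the +4 nucleotide to A (construct 162), a change of +4-5 to AG (construct 164), or insertion of ACT as the second triplet together with modification of the original +2 triplet to AGC, did not improve VP1 incorporation into the capsid and resulted in low VP1/VP2 and high VP3/VP1. One of the constructs showing a VP1/VP2 ratio close to 1 was construct 160, which, compared with the wild-type sequence, comprises a direct upstream insertion of CCTGTTAAG, the non-canonical ACG, and insertion of an additional alanine, encoded by GCT, at position +2; nevertheless, VP3 incorporation was still excessive (equal VP1/VP2, high VP3/VP1). Subsequently, the promoter sequence in construct 160 was mutated to more precisely resemble the wild-type polyhedrin promoter, generating mutant 761. The VP2 initiation context was removed, generating mutant 762. In both cases (761 and 762) there was a slight negative effect on the stoichiometry of the virus compared with construct 160 (lower VP1 incorporation) (Figure 1). Next, the translation initiation start site of VP1 in construct 160 was changed (while preserving the beneficial GCT directly downstream of the translation start codon) to the wild-type ATG (mutant 763), TTG (mutant 764), CTG (mutant 765), or GTG (mutant 766). All except mutant 765 resulted in a lack of detectable rAAV production. Interestingly, the combination of CTG as non-canonical VP1 initiation start and the addition of the GCT triplet (encoding an extra alanine) directly after the translation start (765) resulted in higher incorporation of VP1 than of VP2 and strong attenuation of VP3, ultimately giving a balanced, wild-type AAV-like VP stoichiometry (high VP1/VP2, intermediate VP3/VP1). Finally, construct 43, which resembles construct 160 but has CTG instead of ACG as the VP1 initiation codon, led to VP1 production at nearly native VP ratios (Figure 6).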
3.3. Excess expression of VP3 contributes to the low potency of true type 5 AAV mutants in BEVS.
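To test the potency of the library of serotype 5 capsids with different VP stoichiometries, i.e. the ability of the vector to drive expression of its genetic material, in vitro and in vivo tests were performed. Two different continuous cell lines were used, namely Hela (Figures 2 and 7a) and Huh7 (Figures 3 and 7b). In both cases, the set of mutants showing lower incorporation of VP1 than of VP2 and excessive incorporation of VP3 (constructs 159, 161-164) showed strongly reduced potency (Figures 2 - 3). The potency of the vector was considerably improved by balancing VP1 and VP2 incorporation (construct 160). Shortening of the promoter (construct 761) and removal of the initiator context (construct 762) had a negative effect on vector potency. The most potent vector, construct 765 (Figures 2 - 3), showed a VP1 to VP2 ratio in favor of the former and markedly reduced VP3 incorporation. Finally, the (non-shortened) polH promoter combined with the initiator context, the CTG initiation codon and the additional GCT triplet (encoding an extra alanine) (construct 43) showed good potency, although somewhat lower than that of construct 765 (Figures 7a and b).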
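A subset of the mutants (constructs 159-164) was tested for potency in vivo (C57BL/6 mice). The vectors carried the reporter gene SEAP. Mice were injected with the capsid 5 variants at a dose of 5x10^12 gc/kg and monitored over time. In line with the in vitro observations, the variant showing the best potency in the tested set (160) was also the one with equimolar amounts of VP1 and VP2 (Figure 4).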
3.4. Insect cell-produced true AAV5 (765) performs better in vivo than the chimeric type 2/5 mutant.
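To examine the potency of AAV5(765) in vivo, three vector batches were prepared. These comprised the chimeric type 2/5 (92) (Urabe et al. (2006), supra), the true type AAV5 (160) containing an excessive amount of VP3, and the true type 5 AAV (765) that performed best in vitro and has a wild-type stoichiometry of VP proteins. All batches were produced under the same conditions using the same baculovirus constructs harboring the Rep proteins and the FIX expression cassette (as described in WO 2006/36502). To compare the potency of the three vector preparations, C57BL/6 ("black 6") mice were injected with two different doses of the vectors, i.e. a low dose of 2x10^12 gc/kg and a high dose of 2x10^13 gc/kg. A total of 7 groups of 5 animals each, including a vehicle group, were included in the experiment. Blood was collected at 1, 2 and 4 weeks after the start of the experiment. Expression of FIX was monitored in the blood by specific ELISA. The results confirmed the earlier in vitro findings that the newly generated 765 mutant shows markedly improved potency over construct 160. Interestingly, construct 765 was also significantly better than the type 2/5 chimera (construct 92) published by Urabe et al. (2006) (supra) (Figure 5). An unpaired t test was used to examine the differences between 765 and 160 and between 765 and 92. In all cases, i.e. at 1, 2 and 4 weeks, the differences were statistically significant with p values <0.05.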
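For readers unfamiliar with the unpaired t test mentioned above, the sketch below computes the test statistic and degrees of freedom for two independent groups. It assumes the equal-variance (Student) form of the test; the text does not say which variant was used. The plasma values in the usage line are made-up numbers, not data from the experiment, and the function name is hypothetical. Turning the statistic into a p value additionally needs the t distribution, for example via `scipy.stats.ttest_ind`.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(sample_a, sample_b):
    """Student's unpaired t statistic and degrees of freedom (equal variances assumed)."""
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    # Pooled variance from the two sample variances (n - 1 denominators).
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    if pooled == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1.0 / na + 1.0 / nb))
    return t, df


if __name__ == "__main__":
    # Hypothetical values for two groups of 5 animals each.
    group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    group_b = [3.0, 4.0, 5.0, 6.0, 7.0]
    print(unpaired_t(group_a, group_b))
```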
4. Discussion
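Generation of rAAV in insect cells requires a number of adjustments in the genetic organization of the cap gene. In mammalian cells, AAV expresses its VP proteins making use of an ACG initiator start for VP2. This leads to a VP1:VP2:VP3 stoichiometry of 1:1:10. In insect cells, this mechanism failed to produce AAV vectors with the correct VP stoichiometry (Urabe et al. (2002), supra). This is a known problem that was previously circumvented by Urabe et al. in generating the rAAV2 serotype by changing the VP1 initiator triplet to ACG and mutating the 9 nucleotides upstream of the translation initiation start site. These changes led to production of all three rAAV2 VP proteins in the correct stoichiometry. Similar genetic changes in the rAAV5 expression cassette resulted in low VP1 production and low potency of the produced virus. Building on the success of the genetic adaptations for rAAV2, Urabe et al. decided to make a series of 6 domain swap mutants, in which rAAV5 received N-terminal portions of VP1 from AAV2 of various lengths (ranging from 7 amino acids to 136 amino acids). This approach led to production of chimeric rAAV5 that showed the correct stoichiometry of VP proteins. Moreover, the domain swap mutants showed potency similar to or better than that of rAAV5 produced in 293T cells (Urabe et al. (2002), supra). Although Urabe et al. demonstrated that chimeric rAAV5 can be generated in insect cells, the vector obtained does not constitute a bona fide AAV5 particle and may therefore differ from the true AAV5 serotype in various respects, such as sensitivity to pre-existing neutralizing antibodies, intracellular trafficking, bio-distribution and/or targeting. At the same time, Urabe et al. reported that attempts to produce infectious true rAAV5 failed due to low synthesis of the VP1 polypeptide (Urabe et al. (2002), supra).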
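Here the inventors constructed a library of cap5 mutants with the aim of understanding the underlying determinants of the low potency of true rAAV5 produced in insect cells. First, the inventors examined a mutant (159) incorporating a number of adaptations previously used for successful generation of rAAV2 in insect cells (Urabe et al. (2002), supra). This mutant contains the 9-nucleotide upstream VP2 initiator context, placed upstream of the VP1 translational start, and the non-canonical translation initiation start ACG. These 9 nucleotides were previously used by Urabe et al. to express the serotype 2 gene in insect cells (Urabe et al. (2002), supra). This particular sequence naturally flanks the non-canonical start codon (ACG) of VP2. Next, the wild-type ATG was changed to ACG or CTG, and various mutations were introduced to provide an optimal context downstream of the start codon. Most of the mutants showed an aberrant VP stoichiometry, with low incorporation of VP1 and an excess of VP3 (low VP1/VP2 and high VP3/VP1 ratios). The VP1/VP2 ratio was considerably improved in genetic design 160, which nevertheless still showed excessive incorporation of VP3 into the vector particle. Finally, one of the genetic designs, 765, showed high incorporation of VP1 (high VP1/VP2 ratio) and reduced incorporation of VP3 (balanced VP3/VP2 ratio) compared with the other tested variants.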
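A low ratio of VP1/VP2 proteins has previously been postulated to be responsible for low vector potency (Hermonat et al. (1984) Journal of Virology 51(2):329-339; Tratschin et al. (1984) Journal of Virology 51(3):611-619). The unique portion of VP1 of AAV is buried inside the capsid and becomes exposed during trafficking of the virus to the nucleus. It is first exposed in response to the drop in pH in the lumen of the endosome. The free N-terminal portion of VP1 contains a phospholipase domain that, once exposed on the outside of the capsid, becomes available to specifically hydrolyze the 2-acyl ester (sn-2) bond of phospholipid substrates, resulting in release of lysophospholipids and free fatty acids, which in turn allows endosomal escape of AAV. The unique portion of VP1 contains nuclear localization signals (basic amino acid clusters) and has been implicated in nuclear targeting of AAV. Finally, some authors suggest that the unique portion of VP1 may play a role in viral uncoating in the nucleus. A low VP1/VP2 ratio and excessive incorporation of VP3 into the viral particle (high VP3/VP1 ratio) may result in 1) on average, reduced incorporation of VP1 into the assembled particle, or 2) generation of two particle populations: A) correctly assembled particles (with close to the wild-type stoichiometry of 1:1:10, i.e. 5 VP1 molecules per vector particle) and B) VP3/VP2-only particles. In both situations (1 and 2), such vector preparations may have altered potency. An excessive amount of VP3 proteins (relative to VP1 or VP2) present in the vector preparation probably results in impaired trafficking of the vector to the nucleus due to hampered endosomal escape. To test the assumption that VP stoichiometry is a key determinant of vector potency, and to generate a more potent vector, the library of serotype 5 capsid mutants was tested in vitro and in vivo.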
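The statement that a 1:1:10 stoichiometry corresponds to about 5 VP1 molecules per particle follows from the subunit count of the capsid. The sketch below makes the arithmetic explicit, assuming the generally accepted figure of 60 VP subunits per AAV capsid (this number is not stated in the text itself); the function name is hypothetical.

```python
def copies_per_capsid(ratio, subunits: int = 60):
    """Split the capsid subunits among VP1, VP2 and VP3 according to a molar ratio.

    Assumption: 60 VP subunits per capsid (not stated in the source text).
    """
    total = sum(ratio)
    if total <= 0 or any(r < 0 for r in ratio):
        raise ValueError("ratio entries must be non-negative and not all zero")
    return tuple(subunits * r / total for r in ratio)


if __name__ == "__main__":
    # Wild-type-like 1:1:10 ratio: 60 * 1/12 = 5 copies each of VP1 and VP2.
    print(copies_per_capsid((1, 1, 10)))
```

The same function shows how quickly VP1 is diluted when VP3 is over-represented: a hypothetical 1:1:28 ratio leaves only 2 VP1 copies per 60-subunit particle on average.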
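VP stoichiometry appeared to correlate strongly with the potency of the vector. As shown previously (Hermonat et al. (1984), supra; Tratschin et al. (1984), supra; WO2007046703A2), a low VP1/VP2 ratio has a strong effect on the potency of the virus. Mutants 159 and 161-164 all showed a low VP1/VP2 ratio and a drastically reduced potency. An improved VP1/VP2 ratio had a marked effect on the potency of the vector (160). Interestingly, further improvement of the VP1/VP2 ratio and reduced incorporation of VP3 into the vector particle (reduced VP3/VP1 ratio) led to the improved vector 43 and to the most potent vector in the tested set (construct 765). These data clearly show that the molecular make-up of the vector particle is a key determinant of its potency. Improved incorporation of VP1 together with reduced incorporation of VP3 appears to give the best result in terms of vector potency. A low VP1/VP2 ratio of particles generated in BEVS has previously been reported to affect vector potency negatively. The VP2/VP3 ratio has so far not been considered, mainly because its genetic design for production in BEVS is identical to that in the wild-type AAV virus. For this reason, the design is not expected to lead to an altered VP2/VP3 ratio. However, in all but one of the mutants presented here the inventors observed excessive incorporation of VP3 into the vector particle (high VP3/VP1 ratio), indicating that changes around the VP1 translational start have a strong effect on the expression of VP2 and VP3. Only mutant 765 showed a balanced stoichiometry with a high VP1/VP2 ratio and reduced incorporation of VP3, which led to increased potency compared with the other tested variants. In addition, the potency of the 765 variant was compared in vivo (in mice) with an AAV5-like vector produced in BEVS (construct 92). The 92 construct is a chimera of AAV serotype 5 carrying the N-terminal 136 amino acid portion of serotype 2 (Urabe et al. (2006), supra). Although construct 92 does not constitute a true AAV5, it is the only alternative currently available for generation of AAV5-like particles in BEVS. The 765 construct showed statistically significant superiority over the 92 construct.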
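The inventors hypothesize that the strong effect on expression of the downstream VP2 and VP3 caused by mutagenic changes in the VP1 translational region is related to the translation process itself. In eukaryotes, translation is unidirectional and starts from the 5' end of the mRNA. Once associated with the mRNA, ribosomes proceed until they find a translational ATG start in a suitable context to initiate protein synthesis. Occasionally a weak initiation codon such as ACG or CTG can initiate protein synthesis in a non-canonical manner if it is surrounded by a suitable nucleotide context. This mechanism is called leaky ribosomal scanning. The strength of leaky ribosomal scanning at VP1 will determine the portion of ribosomes that "leak" through to VP2 and VP3, and thus the strength of protein expression from the latter two. Consequently, the expression of all three components will determine their presence in the final assembled capsid.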
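The leaky-scanning hypothesis above can be illustrated with a toy calculation. This is only a sketch under simplifying assumptions that the description does not make: every ribosome scans from the 5' end, initiates at the VP1 start with probability `p_vp1`, otherwise reaches the VP2 start and initiates there with probability `p_vp2`, and all remaining ribosomes initiate at the VP3 start. The probabilities used below are hypothetical and are not measured values from the patent.

```python
def leaky_scanning(p_vp1, p_vp2):
    """Return the fractions of ribosomes initiating at VP1, VP2 and VP3.

    p_vp1: probability of initiating at the VP1 start codon.
    p_vp2: probability that a ribosome which skipped VP1 initiates at VP2.
    Ribosomes that skip both starts are assumed to initiate at VP3.
    """
    for p in (p_vp1, p_vp2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    vp1 = p_vp1
    vp2 = (1.0 - p_vp1) * p_vp2
    vp3 = (1.0 - p_vp1) * (1.0 - p_vp2)
    return vp1, vp2, vp3


# Hypothetical values: a weak versus a stronger VP1 initiation context.
weak = leaky_scanning(p_vp1=0.05, p_vp2=0.10)
strong = leaky_scanning(p_vp1=0.20, p_vp2=0.10)
print("weak VP1 context:  ", weak)
print("strong VP1 context:", strong)
```

In this simplified model, strengthening initiation at VP1 lowers the share of ribosomes that reach VP2 and VP3, which matches the qualitative statement that changes around the VP1 start affect expression of the two downstream proteins.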
SEQUENCE LISTING
<110> uniQure IP B.V.
<120> Further improved AAV vectors produced in insect cells
<130> P6049115PCT
<150> EP 14158610.7
<151> 2014-03-10
<160> 73
<170> PatentIn version 3.3
<210> 1
<211> 1876
<212> DNA
<213> adeno-associated virus 2
<220>
<221> CDS
<222> (11)..(1876)
<223> Rep78 coding sequence
<220>
<221> misc_feature
<222> (683)..(1876)
<400> 1
cgcagccgcc atg ccg ggg ttt tac gag att gtg att aag gtc ccc agc 49
Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser
1 5 10
gac ctt gac gag cat ctg ccc ggc att tct gac agc ttt gtg aac tgg 97
Asp Leu Asp Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp
15 20 25
gtg gcc gag aag gaa tgg gag ttg ccg cca gat tct gac atg gat ctg 145
Val Ala Glu Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu
30 35 40 45
aat ctg att gag cag gca ccc ctg acc gtg gcc gag aag ctg cag cgc 193
Asn Leu Ile Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg
50 55 60
gac ttt ctg acg gaa tgg cgc cgt gtg agt aag gcc ccg gag gcc ctt 241
Asp Phe Leu Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu
65 70 75
ttc ttt gtg caa ttt gag aag gga gag agc tac ttc cac atg cac gtg 289
Phe Phe Val Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val
80 85 90
ctc gtg gaa acc acc ggg gtg aaa tcc atg gtt ttg gga cgt ttc ctg 337
Leu Val Glu Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu
95 100 105
agt cag att cgc gaa aaa ctg att cag aga att tac cgc ggg atc gag 385
Ser Gln Ile Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu
110 115 120 125
ccg act ttg cca aac tgg ttc gcg gtc aca aag acc aga aat ggc gcc 433
Pro Thr Leu Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala
130 135 140
gga ggc ggg aac aag gtg gtg gat gag tgc tac atc ccc aat tac ttg 481
Gly Gly Gly Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu
145 150 155
ctc ccc aaa acc cag cct gag ctc cag tgg gcg tgg act aat atg gaa 529
Leu Pro Lys Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu
160 165 170
cag tat tta agc gcc tgt ttg aat ctc acg gag cgt aaa cgg ttg gtg 577
Gln Tyr Leu Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val
175 180 185
gcg cag cat ctg acg cac gtg tcg cag acg cag gag cag aac aaa gag 625
Ala Gln His Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu
190 195 200 205
aat cag aat ccc aat tct gat gcg ccg gtg atc aga tca aaa act tca 673
Asn Gln Asn Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser
210 215 220
gcc agg tac atg gag ctg gtc ggg tgg ctc gtg gac aag ggg att acc 721
Ala Arg Tyr Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr
225 230 235
tcg gag aag cag tgg atc cag gag gac cag gcc tca tac atc tcc ttc 769
Ser Glu Lys Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe
240 245 250
aat gcg gcc tcc aac tcg cgg tcc caa atc aag gct gcc ttg gac aat 817
Asn Ala Ala Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn
255 260 265
gcg gga aag att atg agc ctg act aaa acc gcc ccc gac tac ctg gtg 865
Ala Gly Lys Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val
270 275 280 285
ggc cag cag ccc gtg gag gac att tcc agc aat cgg att tat aaa att 913
Gly Gln Gln Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile
290 295 300
ttg gaa cta aac ggg tac gat ccc caa tat gcg gct tcc gtc ttt ctg 961
Leu Glu Leu Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu
305 310 315
gga tgg gcc acg aaa aag ttc ggc aag agg aac acc atc tgg ctg ttt 1009
Gly Trp Ala Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe
320 325 330
ggg cct gca act acc ggg aag acc aac atc gcg gag gcc ata gcc cac 1057
Gly Pro Ala Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His
335 340 345
act gtg ccc ttc tac ggg tgc gta aac tgg acc aat gag aac ttt ccc 1105
Thr Val Pro Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro
350 355 360 365
ttc aac gac tgt gtc gac aag atg gtg atc tgg tgg gag gag ggg aag 1153
Phe Asn Asp Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys
370 375 380
atg acc gcc aag gtc gtg gag tcg gcc aaa gcc att ctc gga gga agc 1201
Met Thr Ala Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser
385 390 395
aag gtg cgc gtg gac cag aaa tgc aag tcc tcg gcc cag ata gac ccg 1249
Lys Val Arg Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro
400 405 410
act ccc gtg atc gtc acc tcc aac acc aac atg tgc gcc gtg att gac 1297
Thr Pro Val Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp
415 420 425
ggg aac tca acg acc ttc gaa cac cag cag ccg ttg caa gac cgg atg 1345
Gly Asn Ser Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met
430 435 440 445
ttc aaa ttt gaa ctc acc cgc cgt ctg gat cat gac ttt ggg aag gtc 1393
Phe Lys Phe Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val
450 455 460
acc aag cag gaa gtc aaa gac ttt ttc cgg tgg gca aag gat cac gtg 1441
Thr Lys Gln Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val
465 470 475
gtt gag gtg gag cat gaa ttc tac gtc aaa aag ggt gga gcc aag aaa 1489
Val Glu Val Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys
480 485 490
aga ccc gcc ccc agt gac gca gat ata agt gag ccc aaa cgg gtg cgc 1537
Arg Pro Ala Pro Ser Asp Ala Asp Ile Ser Glu Pro Lys Arg Val Arg
495 500 505
gag tca gtt gcg cag cca tcg acg tca gac gcg gaa gct tcg atc aac 1585
Glu Ser Val Ala Gln Pro Ser Thr Ser Asp Ala Glu Ala Ser Ile Asn
510 515 520 525
tac gca gac agg tac caa aac aaa tgt tct cgt cac gtg ggc atg aat 1633
Tyr Ala Asp Arg Tyr Gln Asn Lys Cys Ser Arg His Val Gly Met Asn
530 535 540
ctg atg ctg ttt ccc tgc aga caa tgc gag aga atg aat cag aat tca 1681
Leu Met Leu Phe Pro Cys Arg Gln Cys Glu Arg Met Asn Gln Asn Ser
545 550 555
aat atc tgc ttc act cac gga cag aaa gac tgt tta gag tgc ttt ccc 1729
Asn Ile Cys Phe Thr His Gly Gln Lys Asp Cys Leu Glu Cys Phe Pro
560 565 570
gtg tca gaa tct caa ccc gtt tct gtc gtc aaa aag gcg tat cag aaa 1777
Val Ser Glu Ser Gln Pro Val Ser Val Val Lys Lys Ala Tyr Gln Lys
575 580 585
ctg tgc tac att cat cat atc atg gga aag gtg cca gac gct tgc act 1825
Leu Cys Tyr Ile His His Ile Met Gly Lys Val Pro Asp Ala Cys Thr
590 595 600 605
gcc tgc gat ctg gtc aat gtg gat ttg gat gac tgc atc ttt gaa caa 1873
Ala Cys Asp Leu Val Asn Val Asp Leu Asp Asp Cys Ile Phe Glu Gln
610 615 620
taa 1876
<210> 2
<211> 621
<212> PRT
<213> adeno-associated virus 2
<400> 2
Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser Asp Leu Asp
1 5 10 15
Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp Val Ala Glu
20 25 30
Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu Asn Leu Ile
35 40 45
Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg Asp Phe Leu
50 55 60
Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu Phe Phe Val
65 70 75 80
Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val Leu Val Glu
85 90 95
Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu Ser Gln Ile
100 105 110
Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu
115 120 125
Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly
130 135 140
Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys
145 150 155 160
Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu
165 170 175
Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val Ala Gln His
180 185 190
Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn
195 200 205
Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr
210 215 220
Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr Ser Glu Lys
225 230 235 240
Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala
245 250 255
Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys
260 265 270
Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln
275 280 285
Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu
290 295 300
Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala
305 310 315 320
Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala
325 330 335
Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His Thr Val Pro
340 345 350
Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp
355 360 365
Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys Met Thr Ala
370 375 380
Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser Lys Val Arg
385 390 395 400
Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro Thr Pro Val
405 410 415
Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp Gly Asn Ser
420 425 430
Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met Phe Lys Phe
435 440 445
Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val Thr Lys Gln
450 455 460
Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val Val Glu Val
465 470 475 480
Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala
485 490 495
Pro Ser Asp Ala Asp Ile Ser Glu Pro Lys Arg Val Arg Glu Ser Val
500 505 510
Ala Gln Pro Ser Thr Ser Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp
515 520 525
Arg Tyr Gln Asn Lys Cys Ser Arg His Val Gly Met Asn Leu Met Leu
530 535 540
Phe Pro Cys Arg Gln Cys Glu Arg Met Asn Gln Asn Ser Asn Ile Cys
545 550 555 560
Phe Thr His Gly Gln Lys Asp Cys Leu Glu Cys Phe Pro Val Ser Glu
565 570 575
Ser Gln Pro Val Ser Val Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr
580 585 590
Ile His His Ile Met Gly Lys Val Pro Asp Ala Cys Thr Ala Cys Asp
595 600 605
Leu Val Asn Val Asp Leu Asp Asp Cys Ile Phe Glu Gln
610 615 620
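As a consistency check on SEQ ID NOs: 1 and 2, the following sketch translates the first codons of the Rep78 coding sequence with the standard genetic code and derives the protein length from the CDS coordinates (11)..(1876). The helper names `translate` and `cds_protein_length` are illustrative and are not part of the listing.

```python
# Standard genetic code, codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) up to the first stop codon."""
    dna = dna.replace(" ", "").lower()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def cds_protein_length(start, end):
    """Residues encoded by a 1-based inclusive CDS that ends in a stop codon."""
    length = end - start + 1
    if length % 3:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3 - 1


# First line of the coding sequence in SEQ ID NO: 1.
first_line = "atg ccg ggg ttt tac gag att gtg att aag gtc ccc agc"
print(translate(first_line))          # MPGFYEIVIKVPS
print(cds_protein_length(11, 1876))   # 621
```

The 621 residues agree with the length given in field <211> of SEQ ID NO: 2.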
<210> 3
<211> 9
<212> DNA
<213> adeno-associated virus 2 fragment
<400> 3
cctgttaag 9
<210> 4
<211> 10
<212> DNA
<213> artificial synthetic sequence
<220>
<223> kozak
<220>
<221> misc_feature
<222> (4)..(4)
<223> r=purine = A or G
<220>
<221> misc_feature
<222> (7)..(9)
<223> nnn stands for suboptimal translation initiation codon
<400> 4
gccrccnnng 10
<210> 5
<211> 10
<212> DNA
<213> Artificial
<220>
<223> kozak sequence
<400> 5
gccaccacgg 10
<210> 6
<211> 10
<212> DNA
<213> Artificial
<220>
<223> kozak sequence
<400> 6
gccgccacgg 10
<210> 7
<211> 10
<212> DNA
<213> Artificial
<220>
<223> kozak sequence
<400> 7
gccaccttgg 10
<210> 8
<211> 10
<212> DNA
<213> Artificial
<220>
<223> kozak sequence
<400> 8
gccgccttgg 10
<210> 9
<211> 10
<212> DNA
<213> Artificial
<220>
<223> kozak sequence
<400> 9
gccaccgtgg 10
<210> 10
<211> 10
<212> DNA
<213> Artificial
<220>
<223> kozak sequence
<400> 10
gccgccgtgg 10
<210> 11
<211> 10
<212> DNA
<213> Artificial
<220>
<223> kozak sequence
<400> 11
gccaccctgg 10
<210> 12
<211> 10
<212> DNA
<213> Artificial
<220>
<223> kozak sequence
<400> 12
gccgccctgg 10
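SEQ ID NOs: 5 to 12 can be read as expansions of the degenerate pattern in SEQ ID NO: 4 (gccrccnnng, with r = A or G and nnn a suboptimal initiation codon). The sketch below enumerates that pattern for the four codons that appear in SEQ ID NOs: 5 to 12 (acg, ttg, gtg, ctg); restricting nnn to these four is an assumption drawn from the listed sequences, and the function name is illustrative.

```python
PURINES = ("a", "g")  # r = purine = A or G, per SEQ ID NO: 4
# Codons observed at positions 7..9 of SEQ ID NOs: 5 to 12.
SUBOPTIMAL_CODONS = ("acg", "ttg", "gtg", "ctg")


def expand_kozak(codons=SUBOPTIMAL_CODONS, purines=PURINES):
    """Enumerate concrete sequences matching the pattern gcc r cc nnn g."""
    return [f"gcc{r}cc{codon}g" for codon in codons for r in purines]


variants = expand_kozak()
for seq_id, seq in enumerate(variants, start=5):
    print(f"SEQ ID NO: {seq_id}  {seq}")
```

With this ordering the eight generated strings line up with SEQ ID NOs: 5 to 12 as listed.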
<210> 13
<211> 4718
<212> DNA
<213> adeno-associated virus 1
<220>
<221> CDS
<222> (2223)..(4433)
<223> VP1
<220>
<221> misc_feature
<222> (2634)..(4433)
<223> AAV1 VP2
<220>
<221> misc_feature
<222> (2829)..(4433)
<223> AAV1 VP3
<400> 13
ttgcccactc cctctctgcg cgctcgctcg ctcggtgggg cctgcggacc aaaggtccgc 60
agacggcaga gctctgctct gccggcccca ccgagcgagc gagcgcgcag agagggagtg 120
ggcaactcca tcactagggg taatcgcgaa gcgcctccca cgctgccgcg tcagcgctga 180
cgtaaattac gtcatagggg agtggtcctg tattagctgt cacgtgagtg cttttgcgac 240
attttgcgac accacgtggc catttagggt atatatggcc gagtgagcga gcaggatctc 300
cattttgacc gcgaaatttg aacgagcagc agccatgccg ggcttctacg agatcgtgat 360
caaggtgccg agcgacctgg acgagcacct gccgggcatt tctgactcgt ttgtgagctg 420
ggtggccgag aaggaatggg agctgccccc ggattctgac atggatctga atctgattga 480
gcaggcaccc ctgaccgtgg ccgagaagct gcagcgcgac ttcctggtcc aatggcgccg 540
cgtgagtaag gccccggagg ccctcttctt tgttcagttc gagaagggcg agtcctactt 600
ccacctccat attctggtgg agaccacggg ggtcaaatcc atggtgctgg gccgcttcct 660
gagtcagatt agggacaagc tggtgcagac catctaccgc gggatcgagc cgaccctgcc 720
caactggttc gcggtgacca agacgcgtaa tggcgccgga ggggggaaca aggtggtgga 780
cgagtgctac atccccaact acctcctgcc caagactcag cccgagctgc agtgggcgtg 840
gactaacatg gaggagtata taagcgcctg tttgaacctg gccgagcgca aacggctcgt 900
ggcgcagcac ctgacccacg tcagccagac ccaggagcag aacaaggaga atctgaaccc 960
caattctgac gcgcctgtca tccggtcaaa aacctccgcg cgctacatgg agctggtcgg 1020
gtggctggtg gaccggggca tcacctccga gaagcagtgg atccaggagg accaggcctc 1080
gtacatctcc ttcaacgccg cttccaactc gcggtcccag atcaaggccg ctctggacaa 1140
tgccggcaag atcatggcgc tgaccaaatc cgcgcccgac tacctggtag gccccgctcc 1200
gcccgcggac attaaaacca accgcatcta ccgcatcctg gagctgaacg gctacgaacc 1260
tgcctacgcc ggctccgtct ttctcggctg ggcccagaaa aggttcggga agcgcaacac 1320
catctggctg tttgggccgg ccaccacggg caagaccaac atcgcggaag ccatcgccca 1380
cgccgtgccc ttctacggct gcgtcaactg gaccaatgag aactttccct tcaatgattg 1440
cgtcgacaag atggtgatct ggtgggagga gggcaagatg acggccaagg tcgtggagtc 1500
cgccaaggcc attctcggcg gcagcaaggt gcgcgtggac caaaagtgca agtcgtccgc 1560
ccagatcgac cccacccccg tgatcgtcac ctccaacacc aacatgtgcg ccgtgattga 1620
cgggaacagc accaccttcg agcaccagca gccgttgcag gaccggatgt tcaaatttga 1680
actcacccgc cgtctggagc atgactttgg caaggtgaca aagcaggaag tcaaagagtt 1740
cttccgctgg gcgcaggatc acgtgaccga ggtggcgcat gagttctacg tcagaaaggg 1800
tggagccaac aaaagacccg cccccgatga cgcggataaa agcgagccca agcgggcctg 1860
cccctcagtc gcggatccat cgacgtcaga cgcggaagga gctccggtgg actttgccga 1920
caggtaccaa aacaaatgtt ctcgtcacgc gggcatgctt cagatgctgt ttccctgcaa 1980
gacatgcgag agaatgaatc agaatttcaa catttgcttc acgcacggga cgagagactg 2040
ttcagagtgc ttccccggcg tgtcagaatc tcaaccggtc gtcagaaaga ggacgtatcg 2100
gaaactctgt gccattcatc atctgctggg gcgggctccc gagattgctt gctcggcctg 2160
cgatctggtc aacgtggacc tggatgactg tgtttctgag caataaatga cttaaaccag 2220
gt atg gct gcc gat ggt tat ctt cca gat tgg ctc gag gac aac ctc 2267
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu
1 5 10 15
tct gag ggc att cgc gag tgg tgg gac ttg aaa cct gga gcc ccg aag 2315
Ser Glu Gly Ile Arg Glu Trp Trp Asp Leu Lys Pro Gly Ala Pro Lys
20 25 30
ccc aaa gcc aac cag caa aag cag gac gac ggc cgg ggt ctg gtg ctt 2363
Pro Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly Leu Val Leu
35 40 45
cct ggc tac aag tac ctc gga ccc ttc aac gga ctc gac aag ggg gag 2411
Pro Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu
50 55 60
ccc gtc aac gcg gcg gac gca gcg gcc ctc gag cac gac aag gcc tac 2459
Pro Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr
65 70 75
gac cag cag ctc aaa gcg ggt gac aat ccg tac ctg cgg tat aac cac 2507
Asp Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His
80 85 90 95
gcc gac gcc gag ttt cag gag cgt ctg caa gaa gat acg tct ttt ggg 2555
Ala Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly
100 105 110
ggc aac ctc ggg cga gca gtc ttc cag gcc aag aag cgg gtt ctc gaa 2603
Gly Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu
115 120 125
cct ctc ggt ctg gtt gag gaa ggc gct aag acg gct cct gga aag aaa 2651
Pro Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Gly Lys Lys
130 135 140
cgt ccg gta gag cag tcg cca caa gag cca gac tcc tcc tcg ggc atc 2699
Arg Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ser Gly Ile
145 150 155
ggc aag aca ggc cag cag ccc gct aaa aag aga ctc aat ttt ggt cag 2747
Gly Lys Thr Gly Gln Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln
160 165 170 175
act ggc gac tca gag tca gtc ccc gat cca caa cct ctc gga gaa cct 2795
Thr Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro
180 185 190
cca gca acc ccc gct gct gtg gga cct act aca atg gct tca ggc ggt 2843
Pro Ala Thr Pro Ala Ala Val Gly Pro Thr Thr Met Ala Ser Gly Gly
195 200 205
ggc gca cca atg gca gac aat aac gaa ggc gcc gac gga gtg ggt aat 2891
Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn
210 215 220
gcc tca gga aat tgg cat tgc gat tcc aca tgg ctg ggc gac aga gtc 2939
Ala Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val
225 230 235
atc acc acc agc acc cgc acc tgg gcc ttg ccc acc tac aat aac cac 2987
Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His
240 245 250 255
ctc tac aag caa atc tcc agt gct tca acg ggg gcc agc aac gac aac 3035
Leu Tyr Lys Gln Ile Ser Ser Ala Ser Thr Gly Ala Ser Asn Asp Asn
260 265 270
cac tac ttc ggc tac agc acc ccc tgg ggg tat ttt gat ttc aac aga 3083
His Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg
275 280 285
ttc cac tgc cac ttt tca cca cgt gac tgg cag cga ctc atc aac aac 3131
Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn
290 295 300
aat tgg gga ttc cgg ccc aag aga ctc aac ttc aaa ctc ttc aac atc 3179
Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile
305 310 315
caa gtc aag gag gtc acg acg aat gat ggc gtc aca acc atc gct aat 3227
Gln Val Lys Glu Val Thr Thr Asn Asp Gly Val Thr Thr Ile Ala Asn
320 325 330 335
aac ctt acc agc acg gtt caa gtc ttc tcg gac tcg gag tac cag ctt 3275
Asn Leu Thr Ser Thr Val Gln Val Phe Ser Asp Ser Glu Tyr Gln Leu
340 345 350
ccg tac gtc ctc ggc tct gcg cac cag ggc tgc ctc cct ccg ttc ccg 3323
Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro
355 360 365
gcg gac gtg ttc atg att ccg caa tac ggc tac ctg acg ctc aac aat 3371
Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn
370 375 380
ggc agc caa gcc gtg gga cgt tca tcc ttt tac tgc ctg gaa tat ttc 3419
Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe
385 390 395
cct tct cag atg ctg aga acg ggc aac aac ttt acc ttc agc tac acc 3467
Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Thr Phe Ser Tyr Thr
400 405 410 415
ttt gag gaa gtg cct ttc cac agc agc tac gcg cac agc cag agc ctg 3515
Phe Glu Glu Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu
420 425 430
gac cgg ctg atg aat cct ctc atc gac caa tac ctg tat tac ctg aac 3563
Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Asn
435 440 445
aga act caa aat cag tcc gga agt gcc caa aac aag gac ttg ctg ttt 3611
Arg Thr Gln Asn Gln Ser Gly Ser Ala Gln Asn Lys Asp Leu Leu Phe
450 455 460
agc cgt ggg tct cca gct ggc atg tct gtt cag ccc aaa aac tgg cta 3659
Ser Arg Gly Ser Pro Ala Gly Met Ser Val Gln Pro Lys Asn Trp Leu
465 470 475
cct gga ccc tgt tat cgg cag cag cgc gtt tct aaa aca aaa aca gac 3707
Pro Gly Pro Cys Tyr Arg Gln Gln Arg Val Ser Lys Thr Lys Thr Asp
480 485 490 495
aac aac aac agc aat ttt acc tgg act ggt gct tca aaa tat aac ctc 3755
Asn Asn Asn Ser Asn Phe Thr Trp Thr Gly Ala Ser Lys Tyr Asn Leu
500 505 510
aat ggg cgt gaa tcc atc atc aac cct ggc act gct atg gcc tca cac 3803
Asn Gly Arg Glu Ser Ile Ile Asn Pro Gly Thr Ala Met Ala Ser His
515 520 525
aaa gac gac gaa gac aag ttc ttt ccc atg agc ggt gtc atg att ttt 3851
Lys Asp Asp Glu Asp Lys Phe Phe Pro Met Ser Gly Val Met Ile Phe
530 535 540
gga aaa gag agc gcc gga gct tca aac act gca ttg gac aat gtc atg 3899
Gly Lys Glu Ser Ala Gly Ala Ser Asn Thr Ala Leu Asp Asn Val Met
545 550 555
att aca gac gaa gag gaa att aaa gcc act aac cct gtg gcc acc gaa 3947
Ile Thr Asp Glu Glu Glu Ile Lys Ala Thr Asn Pro Val Ala Thr Glu
560 565 570 575
aga ttt ggg acc gtg gca gtc aat ttc cag agc agc agc aca gac cct 3995
Arg Phe Gly Thr Val Ala Val Asn Phe Gln Ser Ser Ser Thr Asp Pro
580 585 590
gcg acc gga gat gtg cat gct atg gga gca tta cct ggc atg gtg tgg 4043
Ala Thr Gly Asp Val His Ala Met Gly Ala Leu Pro Gly Met Val Trp
595 600 605
caa gat aga gac gtg tac ctg cag ggt ccc att tgg gcc aaa att cct 4091
Gln Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro
610 615 620
cac aca gat gga cac ttt cac ccg tct cct ctt atg ggc ggc ttt gga 4139
His Thr Asp Gly His Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly
625 630 635
ctc aag aac ccg cct cct cag atc ctc atc aaa aac acg cct gtt cct 4187
Leu Lys Asn Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro
640 645 650 655
gcg aat cct ccg gcg gag ttt tca gct aca aag ttt gct tca ttc atc 4235
Ala Asn Pro Pro Ala Glu Phe Ser Ala Thr Lys Phe Ala Ser Phe Ile
660 665 670
acc caa tac tcc aca gga caa gtg agt gtg gaa att gaa tgg gag ctg 4283
Thr Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu
675 680 685
cag aaa gaa aac agc aag cgc tgg aat ccc gaa gtg cag tac aca tcc 4331
Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Val Gln Tyr Thr Ser
690 695 700
aat tat gca aaa tct gcc aac gtt gat ttt act gtg gac aac aat gga 4379
Asn Tyr Ala Lys Ser Ala Asn Val Asp Phe Thr Val Asp Asn Asn Gly
705 710 715
ctt tat act gag cct cgc ccc att ggc acc cgt tac ctt acc cgt ccc 4427
Leu Tyr Thr Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Pro
720 725 730 735
ctg taa ttacgtgtta atcaataaac cggttgattc gtttcagttg aactttggtc 4483
Leu
tcctgtcctt cttatcttat cggttaccat ggttatagct tacacattaa ctgcttggtt 4543
gcgcttcgcg ataaaagact tacgtcatcg ggttacccct agtgatggag ttgcccactc 4603
cctctctgcg cgctcgctcg ctcggtgggg cctgcggacc aaaggtccgc agacggcaga 4663
gctctgctct gccggcccca ccgagcgagc gagcgcgcag agagggagtg ggcaa 4718
<210> 14
<211> 736
<212> PRT
<213> adeno-associated virus 1
<400> 14
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Asp Leu Lys Pro Gly Ala Pro Lys Pro
20 25 30
Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ser Gly Ile Gly
145 150 155 160
Lys Thr Gly Gln Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr
165 170 175
Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro Pro
180 185 190
Ala Thr Pro Ala Ala Val Gly Pro Thr Thr Met Ala Ser Gly Gly Gly
195 200 205
Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn Ala
210 215 220
Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val Ile
225 230 235 240
Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu
245 250 255
Tyr Lys Gln Ile Ser Ser Ala Ser Thr Gly Ala Ser Asn Asp Asn His
260 265 270
Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe
275 280 285
His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn
290 295 300
Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln
305 310 315 320
Val Lys Glu Val Thr Thr Asn Asp Gly Val Thr Thr Ile Ala Asn Asn
325 330 335
Leu Thr Ser Thr Val Gln Val Phe Ser Asp Ser Glu Tyr Gln Leu Pro
340 345 350
Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro Ala
355 360 365
Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn Gly
370 375 380
Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro
385 390 395 400
Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Thr Phe Ser Tyr Thr Phe
405 410 415
Glu Glu Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp
420 425 430
Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Asn Arg
435 440 445
Thr Gln Asn Gln Ser Gly Ser Ala Gln Asn Lys Asp Leu Leu Phe Ser
450 455 460
Arg Gly Ser Pro Ala Gly Met Ser Val Gln Pro Lys Asn Trp Leu Pro
465 470 475 480
Gly Pro Cys Tyr Arg Gln Gln Arg Val Ser Lys Thr Lys Thr Asp Asn
485 490 495
Asn Asn Ser Asn Phe Thr Trp Thr Gly Ala Ser Lys Tyr Asn Leu Asn
500 505 510
Gly Arg Glu Ser Ile Ile Asn Pro Gly Thr Ala Met Ala Ser His Lys
515 520 525
Asp Asp Glu Asp Lys Phe Phe Pro Met Ser Gly Val Met Ile Phe Gly
530 535 540
Lys Glu Ser Ala Gly Ala Ser Asn Thr Ala Leu Asp Asn Val Met Ile
545 550 555 560
Thr Asp Glu Glu Glu Ile Lys Ala Thr Asn Pro Val Ala Thr Glu Arg
565 570 575
Phe Gly Thr Val Ala Val Asn Phe Gln Ser Ser Ser Thr Asp Pro Ala
580 585 590
Thr Gly Asp Val His Ala Met Gly Ala Leu Pro Gly Met Val Trp Gln
595 600 605
Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His
610 615 620
Thr Asp Gly His Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Leu
625 630 635 640
Lys Asn Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala
645 650 655
Asn Pro Pro Ala Glu Phe Ser Ala Thr Lys Phe Ala Ser Phe Ile Thr
660 665 670
Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln
675 680 685
Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Val Gln Tyr Thr Ser Asn
690 695 700
Tyr Ala Lys Ser Ala Asn Val Asp Phe Thr Val Asp Asn Asn Gly Leu
705 710 715 720
Tyr Thr Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Pro Leu
725 730 735
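The feature coordinates of SEQ ID NOs: 13 and 15 place the VP2 and VP3 starts inside the VP1 reading frame. The sketch below converts those nucleotide positions into VP1 codon numbers; the coordinates are copied from the <222> fields of the listing, while the function and dictionary names are illustrative.

```python
def codon_number(cds_start, feature_start):
    """1-based codon index of `feature_start` within a CDS starting at `cds_start`."""
    offset = feature_start - cds_start
    if offset < 0 or offset % 3:
        raise ValueError("feature start is not in frame with the CDS")
    return offset // 3 + 1


# Start positions taken from the feature tables of the listing.
STARTS = {
    "AAV1 (SEQ ID NO: 13)": {"VP1": 2223, "VP2": 2634, "VP3": 2829},
    "AAV2 (SEQ ID NO: 15)": {"VP1": 2203, "VP2": 2614, "VP3": 2809},
}

for name, pos in STARTS.items():
    vp2_codon = codon_number(pos["VP1"], pos["VP2"])
    vp3_codon = codon_number(pos["VP1"], pos["VP3"])
    print(f"{name}: VP2 starts at VP1 codon {vp2_codon}, VP3 at codon {vp3_codon}")
```

For both serotypes the VP2 and VP3 starts are in frame with VP1 and fall at VP1 codons 138 and 203, so all three capsid proteins share the same C-terminal sequence.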
<210> 15
<211> 4679
<212> DNA
<213> adeno-associated virus 2
<220>
<221> CDS
<222> (2203)..(4410)
<223> AAV2 VP1
<220>
<221> misc_feature
<222> (2614)..(4410)
<223> AAV2 VP2
<220>
<221> misc_feature
<222> (2809)..(4410)
<223> AAV2 VP3
<400> 15
ttggccactc cctctctgcg cgctcgctcg ctcactgagg ccgggcgacc aaaggtcgcc 60
cgacgcccgg gctttgcccg ggcggcctca gtgagcgagc gagcgcgcag agagggagtg 120
gccaactcca tcactagggg ttcctggagg ggtggagtcg tgacgtgaat tacgtcatag 180
ggttagggag gtcctgtatt agaggtcacg tgagtgtttt gcgacatttt gcgacaccat 240
gtggtcacgc tgggtattta agcccgagtg agcacgcagg gtctccattt tgaagcggga 300
ggtttgaacg cgcagccgcc atgccggggt tttacgagat tgtgattaag gtccccagcg 360
accttgacga gcatctgccc ggcatttctg acagctttgt gaactgggtg gccgagaagg 420
aatgggagtt gccgccagat tctgacatgg atctgaatct gattgagcag gcacccctga 480
ccgtggccga gaagctgcag cgcgactttc tgacggaatg gcgccgtgtg agtaaggccc 540
cggaggccct tttctttgtg caatttgaga agggagagag ctacttccac atgcacgtgc 600
tcgtggaaac caccggggtg aaatccatgg ttttgggacg tttcctgagt cagattcgcg 660
aaaaactgat tcagagaatt taccgcggga tcgagccgac tttgccaaac tggttcgcgg 720
tcacaaagac cagaaatggc gccggaggcg ggaacaaggt ggtggatgag tgctacatcc 780
ccaattactt gctccccaaa acccagcctg agctccagtg ggcgtggact aatatggaac 840
agtatttaag cgcctgtttg aatctcacgg agcgtaaacg gttggtggcg cagcatctga 900
cgcacgtgtc gcagacgcag gagcagaaca aagagaatca gaatcccaat tctgatgcgc 960
cggtgatcag atcaaaaact tcagccaggt acatggagct ggtcgggtgg ctcgtggaca 1020
aggggattac ctcggagaag cagtggatcc aggaggacca ggcctcatac atctccttca 1080
atgcggcctc caactcgcgg tcccaaatca aggctgcctt ggacaatgcg ggaaagatta 1140
tgagcctgac taaaaccgcc cccgactacc tggtgggcca gcagcccgtg gaggacattt 1200
ccagcaatcg gatttataaa attttggaac taaacgggta cgatccccaa tatgcggctt 1260
ccgtctttct gggatgggcc acgaaaaagt tcggcaagag gaacaccatc tggctgtttg 1320
ggcctgcaac taccgggaag accaacatcg cggaggccat agcccacact gtgcccttct 1380
acgggtgcgt aaactggacc aatgagaact ttcccttcaa cgactgtgtc gacaagatgg 1440
tgatctggtg ggaggagggg aagatgaccg ccaaggtcgt ggagtcggcc aaagccattc 1500
tcggaggaag caaggtgcgc gtggaccaga aatgcaagtc ctcggcccag atagacccga 1560
ctcccgtgat cgtcacctcc aacaccaaca tgtgcgccgt gattgacggg aactcaacga 1620
ccttcgaaca ccagcagccg ttgcaagacc ggatgttcaa atttgaactc acccgccgtc 1680
tggatcatga ctttgggaag gtcaccaagc aggaagtcaa agactttttc cggtgggcaa 1740
aggatcacgt ggttgaggtg gagcatgaat tctacgtcaa aaagggtgga gccaagaaaa 1800
gacccgcccc cagtgacgca gatataagtg agcccaaacg ggtgcgcgag tcagttgcgc 1860
agccatcgac gtcagacgcg gaagcttcga tcaactacgc agacaggtac caaaacaaat 1920
gttctcgtca cgtgggcatg aatctgatgc tgtttccctg cagacaatgc gagagaatga 1980
atcagaattc aaatatctgc ttcactcacg gacagaaaga ctgtttagag tgctttcccg 2040
tgtcagaatc tcaacccgtt tctgtcgtca aaaaggcgta tcagaaactg tgctacattc 2100
atcatatcat gggaaaggtg ccagacgctt gcactgcctg cgatctggtc aatgtggatt 2160
tggatgactg catctttgaa caataaatga tttaaatcag gt atg gct gcc gat 2214
Met Ala Ala Asp
1
ggt tat ctt cca gat tgg ctc gag gac act ctc tct gaa gga ata aga 2262
Gly Tyr Leu Pro Asp Trp Leu Glu Asp Thr Leu Ser Glu Gly Ile Arg
5 10 15 20
cag tgg tgg aag ctc aaa cct ggc cca cca cca cca aag ccc gca gag 2310
Gln Trp Trp Lys Leu Lys Pro Gly Pro Pro Pro Pro Lys Pro Ala Glu
25 30 35
cgg cat aag gac gac agc agg ggt ctt gtg ctt cct ggg tac aag tac 2358
Arg His Lys Asp Asp Ser Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr
40 45 50
ctc gga ccc ttc aac gga ctc gac aag gga gag ccg gtc aac gag gca 2406
Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Glu Ala
55 60 65
gac gcc gcg gcc ctc gag cac gac aaa gcc tac gac cgg cag ctc gac 2454
Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Arg Gln Leu Asp
70 75 80
agc gga gac aac ccg tac ctc aag tac aac cac gcc gac gcg gag ttt 2502
Ser Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu Phe
85 90 95 100
cag gag cgc ctt aaa gaa gat acg tct ttt ggg ggc aac ctc gga cga 2550
Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg
105 110 115
gca gtc ttc cag gcg aaa aag agg gtt ctt gaa cct ctg ggc ctg gtt 2598
Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro Leu Gly Leu Val
120 125 130
gag gaa cct gtt aag acg gct ccg gga aaa aag agg ccg gta gag cac 2646
Glu Glu Pro Val Lys Thr Ala Pro Gly Lys Lys Arg Pro Val Glu His
135 140 145
tct cct gtg gag cca gac tcc tcc tcg gga acc gga aag gcg ggc cag 2694
Ser Pro Val Glu Pro Asp Ser Ser Ser Gly Thr Gly Lys Ala Gly Gln
150 155 160
cag cct gca aga aaa aga ttg aat ttt ggt cag act gga gac gca gac 2742
Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln Thr Gly Asp Ala Asp
165 170 175 180
tca gta cct gac ccc cag cct ctc gga cag cca cca gca gcc ccc tct 2790
Ser Val Pro Asp Pro Gln Pro Leu Gly Gln Pro Pro Ala Ala Pro Ser
185 190 195
ggt ctg gga act aat acg atg gct aca ggc agt ggc gca cca atg gca 2838
Gly Leu Gly Thr Asn Thr Met Ala Thr Gly Ser Gly Ala Pro Met Ala
200 205 210
gac aat aac gag ggc gcc gac gga gtg ggt aat tcc tcg gga aat tgg 2886
Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn Ser Ser Gly Asn Trp
215 220 225
cat tgc gat tcc aca tgg atg ggc gac aga gtc atc acc acc agc acc 2934
His Cys Asp Ser Thr Trp Met Gly Asp Arg Val Ile Thr Thr Ser Thr
230 235 240
cga acc tgg gcc ctg ccc acc tac aac aac cac ctc tac aaa caa att 2982
Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile
245 250 255 260
tcc agc caa tca gga gcc tcg aac gac aat cac tac ttt ggc tac agc 3030
Ser Ser Gln Ser Gly Ala Ser Asn Asp Asn His Tyr Phe Gly Tyr Ser
265 270 275
acc cct tgg ggg tat ttt gac ttc aac aga ttc cac tgc cac ttt tca 3078
Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His Phe Ser
280 285 290
cca cgt gac tgg caa aga ctc atc aac aac aac tgg gga ttc cga ccc 3126
Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe Arg Pro
295 300 305
aag aga ctc aac ttc aag ctc ttt aac att caa gtc aaa gag gtc acg 3174
Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val Lys Glu Val Thr
310 315 320
cag aat gac ggt acg acg acg att gcc aat aac ctt acc agc acg gtt 3222
Gln Asn Asp Gly Thr Thr Thr Ile Ala Asn Asn Leu Thr Ser Thr Val
325 330 335 340
cag gtg ttt act gac tcg gag tac cag ctc ccg tac gtc ctc ggc tcg 3270
Gln Val Phe Thr Asp Ser Glu Tyr Gln Leu Pro Tyr Val Leu Gly Ser
345 350 355
gcg cat caa gga tgc ctc ccg ccg ttc cca gca gac gtc ttc atg gtg 3318
Ala His Gln Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe Met Val
360 365 370
cca cag tat gga tac ctc acc ctg aac aac ggg agt cag gca gta gga 3366
Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn Gly Ser Gln Ala Val Gly
375 380 385
cgc tct tca ttt tac tgc ctg gag tac ttt cct tct cag atg ctg cgt 3414
Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln Met Leu Arg
390 395 400
acc gga aac aac ttt acc ttc agc tac act ttt gag gac gtt cct ttc 3462
Thr Gly Asn Asn Phe Thr Phe Ser Tyr Thr Phe Glu Asp Val Pro Phe
405 410 415 420
cac agc agc tac gct cac agc cag agt ctg gac cgt ctc atg aat cct 3510
His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp Arg Leu Met Asn Pro
425 430 435
ctc atc gac cag tac ctg tat tac ttg agc aga aca aac act cca agt 3558
Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser Arg Thr Asn Thr Pro Ser
440 445 450
gga acc acc acg cag tca agg ctt cag ttt tct cag gcc gga gcg agt 3606
Gly Thr Thr Thr Gln Ser Arg Leu Gln Phe Ser Gln Ala Gly Ala Ser
455 460 465
gac att cgg gac cag tct agg aac tgg ctt cct gga ccc tgt tac cgc 3654
Asp Ile Arg Asp Gln Ser Arg Asn Trp Leu Pro Gly Pro Cys Tyr Arg
470 475 480
cag cag cga gta tca aag aca tct gcg gat aac aac aac agt gaa tac 3702
Gln Gln Arg Val Ser Lys Thr Ser Ala Asp Asn Asn Asn Ser Glu Tyr
485 490 495 500
tcg tgg act gga gct acc aag tac cac ctc aat ggc aga gac tct ctg 3750
Ser Trp Thr Gly Ala Thr Lys Tyr His Leu Asn Gly Arg Asp Ser Leu
505 510 515
gtg aat ccg ggc ccg gcc atg gca agc cac aag gac gat gaa gaa aag 3798
Val Asn Pro Gly Pro Ala Met Ala Ser His Lys Asp Asp Glu Glu Lys
520 525 530
ttt ttt cct cag agc ggg gtt ctc atc ttt ggg aag caa ggc tca gag 3846
Phe Phe Pro Gln Ser Gly Val Leu Ile Phe Gly Lys Gln Gly Ser Glu
535 540 545
aaa aca aat gtg gac att gaa aag gtc atg att aca gac gaa gag gaa 3894
Lys Thr Asn Val Asp Ile Glu Lys Val Met Ile Thr Asp Glu Glu Glu
550 555 560
atc agg aca acc aat ccc gtg gct acg gag cag tat ggt tct gta tct 3942
Ile Arg Thr Thr Asn Pro Val Ala Thr Glu Gln Tyr Gly Ser Val Ser
565 570 575 580
acc aac ctc cag aga ggc aac aga caa gca gct acc gca gat gtc aac 3990
Thr Asn Leu Gln Arg Gly Asn Arg Gln Ala Ala Thr Ala Asp Val Asn
585 590 595
aca caa ggc gtt ctt cca ggc atg gtc tgg cag gac aga gat gtg tac 4038
Thr Gln Gly Val Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr
600 605 610
ctt cag ggg ccc atc tgg gca aag att cca cac acg gac gga cat ttt 4086
Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly His Phe
615 620 625
cac ccc tct ccc ctc atg ggt gga ttc gga ctt aaa cac cct cct cca 4134
His Pro Ser Pro Leu Met Gly Gly Phe Gly Leu Lys His Pro Pro Pro
630 635 640
cag att ctc atc aag aac acc ccg gta cct gcg aat cct tcg acc acc 4182
Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asn Pro Ser Thr Thr
645 650 655 660
ttc agt gcg gca aag ttt gct tcc ttc atc aca cag tac tcc acg gga 4230
Phe Ser Ala Ala Lys Phe Ala Ser Phe Ile Thr Gln Tyr Ser Thr Gly
665 670 675
cag gtc agc gtg gag atc gag tgg gag ctg cag aag gaa aac agc aaa 4278
Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys
680 685 690
cgc tgg aat ccc gaa att cag tac act tcc aac tac aac aag tct gtt 4326
Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Asn Lys Ser Val
695 700 705
aat gtg gac ttt act gtg gac act aat ggc gtg tat tca gag cct cgc 4374
Asn Val Asp Phe Thr Val Asp Thr Asn Gly Val Tyr Ser Glu Pro Arg
710 715 720
ccc att ggc acc aga tac ctg act cgt aat ctg taa ttgcttgtta 4420
Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu
725 730 735
atcaataaac cgtttaattc gtttcagttg aactttggtc tctgcgtatt tctttcttat 4480
ctagtttcca tggctacgta gataagtagc atggcgggtt aatcattaac tacaaggaac 4540
ccctagtgat ggagttggcc actccctctc tgcgcgctcg ctcgctcact gaggccgggc 4600
gaccaaaggt cgcccgacgc ccgggctttg cccgggcggc ctcagtgagc gagcgagcgc 4660
gcagagaggg agtggccaa 4679
<210> 16
<211> 735
<212> PRT
<213> adeno-associated virus 2
<400> 16
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Thr Leu Ser
1 5 10 15
Glu Gly Ile Arg Gln Trp Trp Lys Leu Lys Pro Gly Pro Pro Pro Pro
20 25 30
Lys Pro Ala Glu Arg His Lys Asp Asp Ser Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Glu Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Arg Gln Leu Asp Ser Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Pro Val Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
Pro Val Glu His Ser Pro Val Glu Pro Asp Ser Ser Ser Gly Thr Gly
145 150 155 160
Lys Ala Gly Gln Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln Thr
165 170 175
Gly Asp Ala Asp Ser Val Pro Asp Pro Gln Pro Leu Gly Gln Pro Pro
180 185 190
Ala Ala Pro Ser Gly Leu Gly Thr Asn Thr Met Ala Thr Gly Ser Gly
195 200 205
Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn Ser
210 215 220
Ser Gly Asn Trp His Cys Asp Ser Thr Trp Met Gly Asp Arg Val Ile
225 230 235 240
Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu
245 250 255
Tyr Lys Gln Ile Ser Ser Gln Ser Gly Ala Ser Asn Asp Asn His Tyr
260 265 270
Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His
275 280 285
Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp
290 295 300
Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val
305 310 315 320
Lys Glu Val Thr Gln Asn Asp Gly Thr Thr Thr Ile Ala Asn Asn Leu
325 330 335
Thr Ser Thr Val Gln Val Phe Thr Asp Ser Glu Tyr Gln Leu Pro Tyr
340 345 350
Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro Ala Asp
355 360 365
Val Phe Met Val Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn Gly Ser
370 375 380
Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser
385 390 395 400
Gln Met Leu Arg Thr Gly Asn Asn Phe Thr Phe Ser Tyr Thr Phe Glu
405 410 415
Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp Arg
420 425 430
Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser Arg Thr
435 440 445
Asn Thr Pro Ser Gly Thr Thr Thr Gln Ser Arg Leu Gln Phe Ser Gln
450 455 460
Ala Gly Ala Ser Asp Ile Arg Asp Gln Ser Arg Asn Trp Leu Pro Gly
465 470 475 480
Pro Cys Tyr Arg Gln Gln Arg Val Ser Lys Thr Ser Ala Asp Asn Asn
485 490 495
Asn Ser Glu Tyr Ser Trp Thr Gly Ala Thr Lys Tyr His Leu Asn Gly
500 505 510
Arg Asp Ser Leu Val Asn Pro Gly Pro Ala Met Ala Ser His Lys Asp
515 520 525
Asp Glu Glu Lys Phe Phe Pro Gln Ser Gly Val Leu Ile Phe Gly Lys
530 535 540
Gln Gly Ser Glu Lys Thr Asn Val Asp Ile Glu Lys Val Met Ile Thr
545 550 555 560
Asp Glu Glu Glu Ile Arg Thr Thr Asn Pro Val Ala Thr Glu Gln Tyr
565 570 575
Gly Ser Val Ser Thr Asn Leu Gln Arg Gly Asn Arg Gln Ala Ala Thr
580 585 590
Ala Asp Val Asn Thr Gln Gly Val Leu Pro Gly Met Val Trp Gln Asp
595 600 605
Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr
610 615 620
Asp Gly His Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Leu Lys
625 630 635 640
His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asn
645 650 655
Pro Ser Thr Thr Phe Ser Ala Ala Lys Phe Ala Ser Phe Ile Thr Gln
660 665 670
Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys
675 680 685
Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr
690 695 700
Asn Lys Ser Val Asn Val Asp Phe Thr Val Asp Thr Asn Gly Val Tyr
705 710 715 720
Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu
725 730 735
<210> 17
<211> 4726
<212> DNA
<213> adeno-associated virus 3
<220>
<221> CDS
<222> (2209)..(4419)
<223> AAV3 VP1
<220>
<221> misc_feature
<222> (2620)..(4419)
<223> AAV3 VP2
<220>
<221> misc_feature
<222> (2815)..(4419)
<223> AAV3 VP3
<400> 17
ttggccactc cctctatgcg cactcgctcg ctcggtgggg cctggcgacc aaaggtcgcc 60
agacggacgt gctttgcacg tccggcccca ccgagcgagc gagtgcgcat agagggagtg 120
gccaactcca tcactagagg tatggcagtg acgtaacgcg aagcgcgcga agcgagacca 180
cgcctaccag ctgcgtcagc agtcaggtga cccttttgcg acagtttgcg acaccacgtg 240
gccgctgagg gtatatattc tcgagtgagc gaaccaggag ctccattttg accgcgaaat 300
ttgaacgagc agcagccatg ccggggttct acgagattgt cctgaaggtc ccgagtgacc 360
tggacgagcg cctgccgggc atttctaact cgtttgttaa ctgggtggcc gagaaggaat 420
gggacgtgcc gccggattct gacatggatc cgaatctgat tgagcaggca cccctgaccg 480
tggccgaaaa gcttcagcgc gagttcctgg tggagtggcg ccgcgtgagt aaggccccgg 540
aggccctctt ttttgtccag ttcgaaaagg gggagaccta cttccacctg cacgtgctga 600
ttgagaccat cggggtcaaa tccatggtgg tcggccgcta cgtgagccag attaaagaga 660
agctggtgac ccgcatctac cgcggggtcg agccgcagct tccgaactgg ttcgcggtga 720
ccaaaacgcg aaatggcgcc gggggcggga acaaggtggt ggacgactgc tacatcccca 780
actacctgct ccccaagacc cagcccgagc tccagtgggc gtggactaac atggaccagt 840
atttaagcgc ctgtttgaat ctcgcggagc gtaaacggct ggtggcgcag catctgacgc 900
acgtgtcgca gacgcaggag cagaacaaag agaatcagaa ccccaattct gacgcgccgg 960
tcatcaggtc aaaaacctca gccaggtaca tggagctggt cgggtggctg gtggaccgcg 1020
ggatcacgtc agaaaagcaa tggattcagg aggaccaggc ctcgtacatc tccttcaacg 1080
ccgcctccaa ctcgcggtcc cagatcaagg ccgcgctgga caatgcctcc aagatcatga 1140
gcctgacaaa gacggctccg gactacctgg tgggcagcaa cccgccggag gacattacca 1200
aaaatcggat ctaccaaatc ctggagctga acgggtacga tccgcagtac gcggcctccg 1260
tcttcctggg ctgggcgcaa aagaagttcg ggaagaggaa caccatctgg ctctttgggc 1320
cggccacgac gggtaaaacc aacatcgcgg aagccatcgc ccacgccgtg cccttctacg 1380
gctgcgtaaa ctggaccaat gagaactttc ccttcaacga ttgcgtcgac aagatggtga 1440
tctggtggga ggagggcaag atgacggcca aggtcgtgga gagcgccaag gccattctgg 1500
gcggaagcaa ggtgcgcgtg gaccaaaagt gcaagtcatc ggcccagatc gaacccactc 1560
ccgtgatcgt cacctccaac accaacatgt gcgccgtgat tgacgggaac agcaccacct 1620
tcgagcatca gcagccgctg caggaccgga tgtttgaatt tgaacttacc cgccgtttgg 1680
accatgactt tgggaaggtc accaaacagg aagtaaagga ctttttccgg tgggcttccg 1740
atcacgtgac tgacgtggct catgagttct acgtcagaaa gggtggagct aagaaacgcc 1800
ccgcctccaa tgacgcggat gtaagcgagc caaaacggga gtgcacgtca cttgcgcagc 1860
cgacaacgtc agacgcggaa gcaccggcgg actacgcgga caggtaccaa aacaaatgtt 1920
ctcgtcacgt gggcatgaat ctgatgcttt ttccctgtaa aacatgcgag agaatgaatc 1980
aaatttccaa tgtctgtttt acgcatggtc aaagagactg tggggaatgc ttccctggaa 2040
tgtcagaatc tcaacccgtt tctgtcgtca aaaagaagac ttatcagaaa ctgtgtccaa 2100
ttcatcatat cctgggaagg gcacccgaga ttgcctgttc ggcctgcgat ttggccaatg 2160
tggacttgga tgactgtgtt tctgagcaat aaatgactta aaccaggt atg gct gct 2217
Met Ala Ala
1
gac ggt tat ctt cca gat tgg ctc gag gac aac ctt tct gaa ggc att 2265
Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile
5 10 15
cgt gag tgg tgg gct ctg aaa cct gga gtc cct caa ccc aaa gcg aac 2313
Arg Glu Trp Trp Ala Leu Lys Pro Gly Val Pro Gln Pro Lys Ala Asn
20 25 30 35
caa caa cac cag gac aac cgt cgg ggt ctt gtg ctt ccg ggt tac aaa 2361
Gln Gln His Gln Asp Asn Arg Arg Gly Leu Val Leu Pro Gly Tyr Lys
40 45 50
tac ctc gga ccc ggt aac gga ctc gac aaa gga gag ccg gtc aac gag 2409
Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Glu
55 60 65
gcg gac gcg gca gcc ctc gaa cac gac aaa gct tac gac cag cag ctc 2457
Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln Gln Leu
70 75 80
aag gcc ggt gac aac ccg tac ctc aag tac aac cac gcc gac gcc gag 2505
Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu
85 90 95
ttt cag gag cgt ctt caa gaa gat acg tct ttt ggg ggc aac ctt ggc 2553
Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly
100 105 110 115
aga gca gtc ttc cag gcc aaa aag agg atc ctt gag cct ctt ggt ctg 2601
Arg Ala Val Phe Gln Ala Lys Lys Arg Ile Leu Glu Pro Leu Gly Leu
120 125 130
gtt gag gaa gca gct aaa acg gct cct gga aag aag ggg gct gta gat 2649
Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Gly Ala Val Asp
135 140 145
cag tct cct cag gaa ccg gac tca tca tct ggt gtt ggc aaa tcg ggc 2697
Gln Ser Pro Gln Glu Pro Asp Ser Ser Ser Gly Val Gly Lys Ser Gly
150 155 160
aaa cag cct gcc aga aaa aga cta aat ttc ggt cag act gga gac tca 2745
Lys Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln Thr Gly Asp Ser
165 170 175
gag tca gtc cca gac cct caa cct ctc gga gaa cca cca gca gcc ccc 2793
Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro Pro Ala Ala Pro
180 185 190 195
aca agt ttg gga tct aat aca atg gct tca ggc ggt ggc gca cca atg 2841
Thr Ser Leu Gly Ser Asn Thr Met Ala Ser Gly Gly Gly Ala Pro Met
200 205 210
gca gac aat aac gag ggt gcc gat gga gtg ggt aat tcc tca gga aat 2889
Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn Ser Ser Gly Asn
215 220 225
tgg cat tgc gat tcc caa tgg ctg ggc gac aga gtc atc acc acc agc 2937
Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile Thr Thr Ser
230 235 240
acc aga acc tgg gcc ctg ccc act tac aac aac cat ctc tac aag caa 2985
Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln
245 250 255
atc tcc agc caa tca gga gct tca aac gac aac cac tac ttt ggc tac 3033
Ile Ser Ser Gln Ser Gly Ala Ser Asn Asp Asn His Tyr Phe Gly Tyr
260 265 270 275
agc acc cct tgg ggg tat ttt gac ttt aac aga ttc cac tgc cac ttc 3081
Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His Phe
280 285 290
tca cca cgt gac tgg cag cga ctc att aac aac aac tgg gga ttc cgg 3129
Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe Arg
295 300 305
ccc aag aaa ctc agc ttc aag ctc ttc aac atc caa gtt aga ggg gtc 3177
Pro Lys Lys Leu Ser Phe Lys Leu Phe Asn Ile Gln Val Arg Gly Val
310 315 320
acg cag aac gat ggc acg acg act att gcc aat aac ctt acc agc acg 3225
Thr Gln Asn Asp Gly Thr Thr Thr Ile Ala Asn Asn Leu Thr Ser Thr
325 330 335
gtt caa gtg ttt acg gac tcg gag tat cag ctc ccg tac gtg ctc ggg 3273
Val Gln Val Phe Thr Asp Ser Glu Tyr Gln Leu Pro Tyr Val Leu Gly
340 345 350 355
tcg gcg cac caa ggc tgt ctc ccg ccg ttt cca gcg gac gtc ttc atg 3321
Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe Met
360 365 370
gtc cct cag tat gga tac ctc acc ctg aac aac gga agt caa gcg gtg 3369
Val Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn Gly Ser Gln Ala Val
375 380 385
gga cgc tca tcc ttt tac tgc ctg gag tac ttc cct tcg cag atg cta 3417
Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln Met Leu
390 395 400
agg act gga aat aac ttc caa ttc agc tat acc ttc gag gat gta cct 3465
Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Thr Phe Glu Asp Val Pro
405 410 415
ttt cac agc agc tac gct cac agc cag agt ttg gat cgc ttg atg aat 3513
Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp Arg Leu Met Asn
420 425 430 435
cct ctt att gat cag tat ctg tac tac ctg aac aga acg caa gga aca 3561
Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Asn Arg Thr Gln Gly Thr
440 445 450
acc tct gga aca acc aac caa tca cgg ctg ctt ttt agc cag gct ggg 3609
Thr Ser Gly Thr Thr Asn Gln Ser Arg Leu Leu Phe Ser Gln Ala Gly
455 460 465
cct cag tct atg tct ttg cag gcc aga aat tgg cta cct ggg ccc tgc 3657
Pro Gln Ser Met Ser Leu Gln Ala Arg Asn Trp Leu Pro Gly Pro Cys
470 475 480
tac cgg caa cag aga ctt tca aag act gct aac gac aac aac aac agt 3705
Tyr Arg Gln Gln Arg Leu Ser Lys Thr Ala Asn Asp Asn Asn Asn Ser
485 490 495
aac ttt cct tgg aca gcg gcc agc aaa tat cat ctc aat ggc cgc gac 3753
Asn Phe Pro Trp Thr Ala Ala Ser Lys Tyr His Leu Asn Gly Arg Asp
500 505 510 515
tcg ctg gtg aat cca gga cca gct atg gcc agt cac aag gac gat gaa 3801
Ser Leu Val Asn Pro Gly Pro Ala Met Ala Ser His Lys Asp Asp Glu
520 525 530
gaa aaa ttt ttc cct atg cac ggc aat cta ata ttt ggc aaa gaa ggg 3849
Glu Lys Phe Phe Pro Met His Gly Asn Leu Ile Phe Gly Lys Glu Gly
535 540 545
aca acg gca agt aac gca gaa tta gat aat gta atg att acg gat gaa 3897
Thr Thr Ala Ser Asn Ala Glu Leu Asp Asn Val Met Ile Thr Asp Glu
550 555 560
gaa gag att cgt acc acc aat cct gtg gca aca gag cag tat gga act 3945
Glu Glu Ile Arg Thr Thr Asn Pro Val Ala Thr Glu Gln Tyr Gly Thr
565 570 575
gtg gca aat aac ttg cag agc tca aat aca gct ccc acg act gga act 3993
Val Ala Asn Asn Leu Gln Ser Ser Asn Thr Ala Pro Thr Thr Gly Thr
580 585 590 595
gtc aat cat cag ggg gcc tta cct ggc atg gtg tgg caa gat cgt gac 4041
Val Asn His Gln Gly Ala Leu Pro Gly Met Val Trp Gln Asp Arg Asp
600 605 610
gtg tac ctt caa gga cct atc tgg gca aag att cct cac acg gat gga 4089
Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly
615 620 625
cac ttt cat cct tct cct ctg atg gga ggc ttt gga ctg aaa cat ccg 4137
His Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Leu Lys His Pro
630 635 640
cct cct caa atc atg atc aaa aat act ccg gta ccg gca aat cct ccg 4185
Pro Pro Gln Ile Met Ile Lys Asn Thr Pro Val Pro Ala Asn Pro Pro
645 650 655
acg act ttc agc ccg gcc aag ttt gct tca ttt atc act cag tac tcc 4233
Thr Thr Phe Ser Pro Ala Lys Phe Ala Ser Phe Ile Thr Gln Tyr Ser
660 665 670 675
act gga cag gtc agc gtg gaa att gag tgg gag cta cag aaa gaa aac 4281
Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn
680 685 690
agc aaa cgt tgg aat cca gag att cag tac act tcc aac tac aac aag 4329
Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Asn Lys
695 700 705
tct gtt aat gtg gac ttt act gta gac act aat ggt gtt tat agt gaa 4377
Ser Val Asn Val Asp Phe Thr Val Asp Thr Asn Gly Val Tyr Ser Glu
710 715 720
cct cgc cct att gga acc cgg tat ctc aca cga aac ttg tga 4419
Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu
725 730 735
atcctggtta atcaataaac cgtttaattc gtttcagttg aactttggct cttgtgcact 4479
tctttatctt tatcttgttt ccatggctac tgcgtagata agcagcggcc tgcggcgctt 4539
gcgcttcgcg gtttacaact gctggttaat atttaactct cgccatacct ctagtgatgg 4599
agttggccac tccctctatg cgcactcgct cgctcggtgg ggcctggcga ccaaaggtcg 4659
ccagacggac gtgctttgca cgtccggccc caccgagcga gcgagtgcgc atagagggag 4719
tggccaa 4726
<210> 18
<211> 736
<212> PRT
<213> adeno-associated virus 3
<400> 18
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Val Pro Gln Pro
20 25 30
Lys Ala Asn Gln Gln His Gln Asp Asn Arg Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Glu Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Ile Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Gly
130 135 140
Ala Val Asp Gln Ser Pro Gln Glu Pro Asp Ser Ser Ser Gly Val Gly
145 150 155 160
Lys Ser Gly Lys Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln Thr
165 170 175
Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro Pro
180 185 190
Ala Ala Pro Thr Ser Leu Gly Ser Asn Thr Met Ala Ser Gly Gly Gly
195 200 205
Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn Ser
210 215 220
Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile
225 230 235 240
Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu
245 250 255
Tyr Lys Gln Ile Ser Ser Gln Ser Gly Ala Ser Asn Asp Asn His Tyr
260 265 270
Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His
275 280 285
Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp
290 295 300
Gly Phe Arg Pro Lys Lys Leu Ser Phe Lys Leu Phe Asn Ile Gln Val
305 310 315 320
Arg Gly Val Thr Gln Asn Asp Gly Thr Thr Thr Ile Ala Asn Asn Leu
325 330 335
Thr Ser Thr Val Gln Val Phe Thr Asp Ser Glu Tyr Gln Leu Pro Tyr
340 345 350
Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro Ala Asp
355 360 365
Val Phe Met Val Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn Gly Ser
370 375 380
Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser
385 390 395 400
Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Thr Phe Glu
405 410 415
Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp Arg
420 425 430
Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Asn Arg Thr
435 440 445
Gln Gly Thr Thr Ser Gly Thr Thr Asn Gln Ser Arg Leu Leu Phe Ser
450 455 460
Gln Ala Gly Pro Gln Ser Met Ser Leu Gln Ala Arg Asn Trp Leu Pro
465 470 475 480
Gly Pro Cys Tyr Arg Gln Gln Arg Leu Ser Lys Thr Ala Asn Asp Asn
485 490 495
Asn Asn Ser Asn Phe Pro Trp Thr Ala Ala Ser Lys Tyr His Leu Asn
500 505 510
Gly Arg Asp Ser Leu Val Asn Pro Gly Pro Ala Met Ala Ser His Lys
515 520 525
Asp Asp Glu Glu Lys Phe Phe Pro Met His Gly Asn Leu Ile Phe Gly
530 535 540
Lys Glu Gly Thr Thr Ala Ser Asn Ala Glu Leu Asp Asn Val Met Ile
545 550 555 560
Thr Asp Glu Glu Glu Ile Arg Thr Thr Asn Pro Val Ala Thr Glu Gln
565 570 575
Tyr Gly Thr Val Ala Asn Asn Leu Gln Ser Ser Asn Thr Ala Pro Thr
580 585 590
Thr Gly Thr Val Asn His Gln Gly Ala Leu Pro Gly Met Val Trp Gln
595 600 605
Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His
610 615 620
Thr Asp Gly His Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Leu
625 630 635 640
Lys His Pro Pro Pro Gln Ile Met Ile Lys Asn Thr Pro Val Pro Ala
645 650 655
Asn Pro Pro Thr Thr Phe Ser Pro Ala Lys Phe Ala Ser Phe Ile Thr
660 665 670
Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln
675 680 685
Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn
690 695 700
Tyr Asn Lys Ser Val Asn Val Asp Phe Thr Val Asp Thr Asn Gly Val
705 710 715 720
Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu
725 730 735
<210> 19
<211> 4767
<212> DNA
<213> adeno-associated virus 4
<220>
<221> CDS
<222> (2260)..(4464)
<223> AAV4 VP1
<220>
<221> misc_feature
<222> (2668)..(4464)
<223> AAV4 VP2
<220>
<221> misc_feature
<222> (2848)..(4464)
<223> AAV4 VP3
<400> 19
ttggccactc cctctatgcg cgctcgctca ctcactcggc cctggagacc aaaggtctcc 60
agactgccgg cctctggccg gcagggccga gtgagtgagc gagcgcgcat agagggagtg 120
gccaactcca tcatctaggt ttgcccactg acgtcaatgt gacgtcctag ggttagggag 180
gtccctgtat tagcagtcac gtgagtgtcg tatttcgcgg agcgtagcgg agcgcatacc 240
aagctgccac gtcacagcca cgtggtccgt ttgcgacagt ttgcgacacc atgtggtcag 300
gagggtatat aaccgcgagt gagccagcga ggagctccat tttgcccgcg aattttgaac 360
gagcagcagc catgccgggg ttctacgaga tcgtgctgaa ggtgcccagc gacctggacg 420
agcacctgcc cggcatttct gactcttttg tgagctgggt ggccgagaag gaatgggagc 480
tgccgccgga ttctgacatg gacttgaatc tgattgagca ggcacccctg accgtggccg 540
aaaagctgca acgcgagttc ctggtcgagt ggcgccgcgt gagtaaggcc ccggaggccc 600
tcttctttgt ccagttcgag aagggggaca gctacttcca cctgcacatc ctggtggaga 660
ccgtgggcgt caaatccatg gtggtgggcc gctacgtgag ccagattaaa gagaagctgg 720
tgacccgcat ctaccgcggg gtcgagccgc agcttccgaa ctggttcgcg gtgaccaaga 780
cgcgtaatgg cgccggaggc gggaacaagg tggtggacga ctgctacatc cccaactacc 840
tgctccccaa gacccagccc gagctccagt gggcgtggac taacatggac cagtatataa 900
gcgcctgttt gaatctcgcg gagcgtaaac ggctggtggc gcagcatctg acgcacgtgt 960
cgcagacgca ggagcagaac aaggaaaacc agaaccccaa ttctgacgcg ccggtcatca 1020
ggtcaaaaac ctccgccagg tacatggagc tggtcgggtg gctggtggac cgcgggatca 1080
cgtcagaaaa gcaatggatc caggaggacc aggcgtccta catctccttc aacgccgcct 1140
ccaactcgcg gtcacaaatc aaggccgcgc tggacaatgc ctccaaaatc atgagcctga 1200
caaagacggc tccggactac ctggtgggcc agaacccgcc ggaggacatt tccagcaacc 1260
gcatctaccg aatcctcgag atgaacgggt acgatccgca gtacgcggcc tccgtcttcc 1320
tgggctgggc gcaaaagaag ttcgggaaga ggaacaccat ctggctcttt gggccggcca 1380
cgacgggtaa aaccaacatc gcggaagcca tcgcccacgc cgtgcccttc tacggctgcg 1440
tgaactggac caatgagaac tttccgttca acgattgcgt cgacaagatg gtgatctggt 1500
gggaggaggg caagatgacg gccaaggtcg tagagagcgc caaggccatc ctgggcggaa 1560
gcaaggtgcg cgtggaccaa aagtgcaagt catcggccca gatcgaccca actcccgtga 1620
tcgtcacctc caacaccaac atgtgcgcgg tcatcgacgg aaactcgacc accttcgagc 1680
accaacaacc actccaggac cggatgttca agttcgagct caccaagcgc ctggagcacg 1740
actttggcaa ggtcaccaag caggaagtca aagacttttt ccggtgggcg tcagatcacg 1800
tgaccgaggt gactcacgag ttttacgtca gaaagggtgg agctagaaag aggcccgccc 1860
ccaatgacgc agatataagt gagcccaagc gggcctgtcc gtcagttgcg cagccatcga 1920
cgtcagacgc ggaagctccg gtggactacg cggacaggta ccaaaacaaa tgttctcgtc 1980
acgtgggtat gaatctgatg ctttttccct gccggcaatg cgagagaatg aatcagaatg 2040
tggacatttg cttcacgcac ggggtcatgg actgtgccga gtgcttcccc gtgtcagaat 2100
ctcaacccgt gtctgtcgtc agaaagcgga cgtatcagaa actgtgtccg attcatcaca 2160
tcatggggag ggcgcccgag gtggcctgct cggcctgcga actggccaat gtggacttgg 2220
atgactgtga catggaacaa taaatgactc aaaccagat atg act gac ggt tac 2274
Met Thr Asp Gly Tyr
1 5
ctt cca gat tgg cta gag gac aac ctc tct gaa ggc gtt cga gag tgg 2322
Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu Gly Val Arg Glu Trp
10 15 20
tgg gcg ctg caa cct gga gcc cct aaa ccc aag gca aat caa caa cat 2370
Trp Ala Leu Gln Pro Gly Ala Pro Lys Pro Lys Ala Asn Gln Gln His
25 30 35
cag gac aac gct cgg ggt ctt gtg ctt ccg ggt tac aaa tac ctc gga 2418
Gln Asp Asn Ala Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr Leu Gly
40 45 50
ccc ggc aac gga ctc gac aag ggg gaa ccc gtc aac gca gcg gac gcg 2466
Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Ala Ala Asp Ala
55 60 65
gca gcc ctc gag cac gac aag gcc tac gac cag cag ctc aag gcc ggt 2514
Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln Gln Leu Lys Ala Gly
70 75 80 85
gac aac ccc tac ctc aag tac aac cac gcc gac gcg gag ttc cag cag 2562
Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu Phe Gln Gln
90 95 100
cgg ctt cag ggc gac aca tcg ttt ggg ggc aac ctc ggc aga gca gtc 2610
Arg Leu Gln Gly Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg Ala Val
105 110 115
ttc cag gcc aaa aag agg gtt ctt gaa cct ctt ggt ctg gtt gag caa 2658
Phe Gln Ala Lys Lys Arg Val Leu Glu Pro Leu Gly Leu Val Glu Gln
120 125 130
gcg ggt gag acg gct cct gga aag aag aga ccg ttg att gaa tcc ccc 2706
Ala Gly Glu Thr Ala Pro Gly Lys Lys Arg Pro Leu Ile Glu Ser Pro
135 140 145
cag cag ccc gac tcc tcc acg ggt atc ggc aaa aaa ggc aag cag ccg 2754
Gln Gln Pro Asp Ser Ser Thr Gly Ile Gly Lys Lys Gly Lys Gln Pro
150 155 160 165
gct aaa aag aag ctc gtt ttc gaa gac gaa act gga gca ggc gac gga 2802
Ala Lys Lys Lys Leu Val Phe Glu Asp Glu Thr Gly Ala Gly Asp Gly
170 175 180
ccc cct gag gga tca act tcc gga gcc atg tct gat gac agt gag atg 2850
Pro Pro Glu Gly Ser Thr Ser Gly Ala Met Ser Asp Asp Ser Glu Met
185 190 195
cgt gca gca gct ggc gga gct gca gtc gag ggc gga caa ggt gcc gat 2898
Arg Ala Ala Ala Gly Gly Ala Ala Val Glu Gly Gly Gln Gly Ala Asp
200 205 210
gga gtg ggt aat gcc tcg ggt gat tgg cat tgc gat tcc acc tgg tct 2946
Gly Val Gly Asn Ala Ser Gly Asp Trp His Cys Asp Ser Thr Trp Ser
215 220 225
gag ggc cac gtc acg acc acc agc acc aga acc tgg gtc ttg ccc acc 2994
Glu Gly His Val Thr Thr Thr Ser Thr Arg Thr Trp Val Leu Pro Thr
230 235 240 245
tac aac aac cac ctc tac aag cga ctc gga gag agc ctg cag tcc aac 3042
Tyr Asn Asn His Leu Tyr Lys Arg Leu Gly Glu Ser Leu Gln Ser Asn
250 255 260
acc tac aac gga ttc tcc acc ccc tgg gga tac ttt gac ttc aac cgc 3090
Thr Tyr Asn Gly Phe Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg
265 270 275
ttc cac tgc cac ttc tca cca cgt gac tgg cag cga ctc atc aac aac 3138
Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn
280 285 290
aac tgg ggc atg cga ccc aaa gcc atg cgg gtc aaa atc ttc aac atc 3186
Asn Trp Gly Met Arg Pro Lys Ala Met Arg Val Lys Ile Phe Asn Ile
295 300 305
cag gtc aag gag gtc acg acg tcg aac ggc gag aca acg gtg gct aat 3234
Gln Val Lys Glu Val Thr Thr Ser Asn Gly Glu Thr Thr Val Ala Asn
310 315 320 325
aac ctt acc agc acg gtt cag atc ttt gcg gac tcg tcg tac gaa ctg 3282
Asn Leu Thr Ser Thr Val Gln Ile Phe Ala Asp Ser Ser Tyr Glu Leu
330 335 340
ccg tac gtg atg gat gcg ggt caa gag ggc agc ctg cct cct ttt ccc 3330
Pro Tyr Val Met Asp Ala Gly Gln Glu Gly Ser Leu Pro Pro Phe Pro
345 350 355
aac gac gtc ttt atg gtg ccc cag tac ggc tac tgt gga ctg gtg acc 3378
Asn Asp Val Phe Met Val Pro Gln Tyr Gly Tyr Cys Gly Leu Val Thr
360 365 370
ggc aac act tcg cag caa cag act gac aga aat gcc ttc tac tgc ctg 3426
Gly Asn Thr Ser Gln Gln Gln Thr Asp Arg Asn Ala Phe Tyr Cys Leu
375 380 385
gag tac ttt cct tcg cag atg ctg cgg act ggc aac aac ttt gaa att 3474
Glu Tyr Phe Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Glu Ile
390 395 400 405
acg tac agt ttt gag aag gtg cct ttc cac tcg atg tac gcg cac agc 3522
Thr Tyr Ser Phe Glu Lys Val Pro Phe His Ser Met Tyr Ala His Ser
410 415 420
cag agc ctg gac cgg ctg atg aac cct ctc atc gac cag tac ctg tgg 3570
Gln Ser Leu Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Trp
425 430 435
gga ctg caa tcg acc acc acc gga acc acc ctg aat gcc ggg act gcc 3618
Gly Leu Gln Ser Thr Thr Thr Gly Thr Thr Leu Asn Ala Gly Thr Ala
440 445 450
acc acc aac ttt acc aag ctg cgg cct acc aac ttt tcc aac ttt aaa 3666
Thr Thr Asn Phe Thr Lys Leu Arg Pro Thr Asn Phe Ser Asn Phe Lys
455 460 465
aag aac tgg ctg ccc ggg cct tca atc aag cag cag ggc ttc tca aag 3714
Lys Asn Trp Leu Pro Gly Pro Ser Ile Lys Gln Gln Gly Phe Ser Lys
470 475 480 485
act gcc aat caa aac tac aag atc cct gcc acc ggg tca gac agt ctc 3762
Thr Ala Asn Gln Asn Tyr Lys Ile Pro Ala Thr Gly Ser Asp Ser Leu
490 495 500
atc aaa tac gag acg cac agc act ctg gac gga aga tgg agt gcc ctg 3810
Ile Lys Tyr Glu Thr His Ser Thr Leu Asp Gly Arg Trp Ser Ala Leu
505 510 515
acc ccc gga cct cca atg gcc acg gct gga cct gcg gac agc aag ttc 3858
Thr Pro Gly Pro Pro Met Ala Thr Ala Gly Pro Ala Asp Ser Lys Phe
520 525 530
agc aac agc cag ctc atc ttt gcg ggg cct aaa cag aac ggc aac acg 3906
Ser Asn Ser Gln Leu Ile Phe Ala Gly Pro Lys Gln Asn Gly Asn Thr
535 540 545
gcc acc gta ccc ggg act ctg atc ttc acc tct gag gag gag ctg gca 3954
Ala Thr Val Pro Gly Thr Leu Ile Phe Thr Ser Glu Glu Glu Leu Ala
550 555 560 565
gcc acc aac gcc acc gat acg gac atg tgg ggc aac cta cct ggc ggt 4002
Ala Thr Asn Ala Thr Asp Thr Asp Met Trp Gly Asn Leu Pro Gly Gly
570 575 580
gac cag agc aac agc aac ctg ccg acc gtg gac aga ctg aca gcc ttg 4050
Asp Gln Ser Asn Ser Asn Leu Pro Thr Val Asp Arg Leu Thr Ala Leu
585 590 595
gga gcc gtg cct gga atg gtc tgg caa aac aga gac att tac tac cag 4098
Gly Ala Val Pro Gly Met Val Trp Gln Asn Arg Asp Ile Tyr Tyr Gln
600 605 610
ggt ccc att tgg gcc aag att cct cat acc gat gga cac ttt cac ccc 4146
Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly His Phe His Pro
615 620 625
tca ccg ctg att ggt ggg ttt ggg ctg aaa cac ccg cct cct caa att 4194
Ser Pro Leu Ile Gly Gly Phe Gly Leu Lys His Pro Pro Pro Gln Ile
630 635 640 645
ttt atc aag aac acc ccg gta cct gcg aat cct gca acg acc ttc agc 4242
Phe Ile Lys Asn Thr Pro Val Pro Ala Asn Pro Ala Thr Thr Phe Ser
650 655 660
tct act ccg gta aac tcc ttc att act cag tac agc act ggc cag gtg 4290
Ser Thr Pro Val Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val
665 670 675
tcg gtg cag att gac tgg gag atc cag aag gag cgg tcc aaa cgc tgg 4338
Ser Val Gln Ile Asp Trp Glu Ile Gln Lys Glu Arg Ser Lys Arg Trp
680 685 690
aac ccc gag gtc cag ttt acc tcc aac tac gga cag caa aac tct ctg 4386
Asn Pro Glu Val Gln Phe Thr Ser Asn Tyr Gly Gln Gln Asn Ser Leu
695 700 705
ttg tgg gct ccc gat gcg gct ggg aaa tac act gag cct agg gct atc 4434
Leu Trp Ala Pro Asp Ala Ala Gly Lys Tyr Thr Glu Pro Arg Ala Ile
710 715 720 725
ggt acc cgc tac ctc acc cac cac ctg taa taacctgtta atcaataaac 4484
Gly Thr Arg Tyr Leu Thr His His Leu
730
cggtttattc gtttcagttg aactttggtc tccgtgtcct tcttatctta tctcgtttcc 4544
atggctactg cgtacataag cagcggcctg cggcgcttgc gcttcgcggt ttacaactgc 4604
cggttaatca gtaacttctg gcaaaccaga tgatggagtt ggccacatta gctatgcgcg 4664
ctcgctcact cactcggccc tggagaccaa aggtctccag actgccggcc tctggccggc 4724
agggccgagt gagtgagcga gcgcgcatag agggagtggc caa 4767
<210> 20
<211> 734
<212> PRT
<213> adeno-associated virus 4
<400> 20
Met Thr Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu
1 5 10 15
Gly Val Arg Glu Trp Trp Ala Leu Gln Pro Gly Ala Pro Lys Pro Lys
20 25 30
Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro Gly
35 40 45
Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val
50 55 60
Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln
65 70 75 80
Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp
85 90 95
Ala Glu Phe Gln Gln Arg Leu Gln Gly Asp Thr Ser Phe Gly Gly Asn
100 105 110
Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro Leu
115 120 125
Gly Leu Val Glu Gln Ala Gly Glu Thr Ala Pro Gly Lys Lys Arg Pro
130 135 140
Leu Ile Glu Ser Pro Gln Gln Pro Asp Ser Ser Thr Gly Ile Gly Lys
145 150 155 160
Lys Gly Lys Gln Pro Ala Lys Lys Lys Leu Val Phe Glu Asp Glu Thr
165 170 175
Gly Ala Gly Asp Gly Pro Pro Glu Gly Ser Thr Ser Gly Ala Met Ser
180 185 190
Asp Asp Ser Glu Met Arg Ala Ala Ala Gly Gly Ala Ala Val Glu Gly
195 200 205
Gly Gln Gly Ala Asp Gly Val Gly Asn Ala Ser Gly Asp Trp His Cys
210 215 220
Asp Ser Thr Trp Ser Glu Gly His Val Thr Thr Thr Ser Thr Arg Thr
225 230 235 240
Trp Val Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Arg Leu Gly Glu
245 250 255
Ser Leu Gln Ser Asn Thr Tyr Asn Gly Phe Ser Thr Pro Trp Gly Tyr
260 265 270
Phe Asp Phe Asn Arg Phe His Cys His Phe Ser Pro Arg Asp Trp Gln
275 280 285
Arg Leu Ile Asn Asn Asn Trp Gly Met Arg Pro Lys Ala Met Arg Val
290 295 300
Lys Ile Phe Asn Ile Gln Val Lys Glu Val Thr Thr Ser Asn Gly Glu
305 310 315 320
Thr Thr Val Ala Asn Asn Leu Thr Ser Thr Val Gln Ile Phe Ala Asp
325 330 335
Ser Ser Tyr Glu Leu Pro Tyr Val Met Asp Ala Gly Gln Glu Gly Ser
340 345 350
Leu Pro Pro Phe Pro Asn Asp Val Phe Met Val Pro Gln Tyr Gly Tyr
355 360 365
Cys Gly Leu Val Thr Gly Asn Thr Ser Gln Gln Gln Thr Asp Arg Asn
370 375 380
Ala Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln Met Leu Arg Thr Gly
385 390 395 400
Asn Asn Phe Glu Ile Thr Tyr Ser Phe Glu Lys Val Pro Phe His Ser
405 410 415
Met Tyr Ala His Ser Gln Ser Leu Asp Arg Leu Met Asn Pro Leu Ile
420 425 430
Asp Gln Tyr Leu Trp Gly Leu Gln Ser Thr Thr Thr Gly Thr Thr Leu
435 440 445
Asn Ala Gly Thr Ala Thr Thr Asn Phe Thr Lys Leu Arg Pro Thr Asn
450 455 460
Phe Ser Asn Phe Lys Lys Asn Trp Leu Pro Gly Pro Ser Ile Lys Gln
465 470 475 480
Gln Gly Phe Ser Lys Thr Ala Asn Gln Asn Tyr Lys Ile Pro Ala Thr
485 490 495
Gly Ser Asp Ser Leu Ile Lys Tyr Glu Thr His Ser Thr Leu Asp Gly
500 505 510
Arg Trp Ser Ala Leu Thr Pro Gly Pro Pro Met Ala Thr Ala Gly Pro
515 520 525
Ala Asp Ser Lys Phe Ser Asn Ser Gln Leu Ile Phe Ala Gly Pro Lys
530 535 540
Gln Asn Gly Asn Thr Ala Thr Val Pro Gly Thr Leu Ile Phe Thr Ser
545 550 555 560
Glu Glu Glu Leu Ala Ala Thr Asn Ala Thr Asp Thr Asp Met Trp Gly
565 570 575
Asn Leu Pro Gly Gly Asp Gln Ser Asn Ser Asn Leu Pro Thr Val Asp
580 585 590
Arg Leu Thr Ala Leu Gly Ala Val Pro Gly Met Val Trp Gln Asn Arg
595 600 605
Asp Ile Tyr Tyr Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp
610 615 620
Gly His Phe His Pro Ser Pro Leu Ile Gly Gly Phe Gly Leu Lys His
625 630 635 640
Pro Pro Pro Gln Ile Phe Ile Lys Asn Thr Pro Val Pro Ala Asn Pro
645 650 655
Ala Thr Thr Phe Ser Ser Thr Pro Val Asn Ser Phe Ile Thr Gln Tyr
660 665 670
Ser Thr Gly Gln Val Ser Val Gln Ile Asp Trp Glu Ile Gln Lys Glu
675 680 685
Arg Ser Lys Arg Trp Asn Pro Glu Val Gln Phe Thr Ser Asn Tyr Gly
690 695 700
Gln Gln Asn Ser Leu Leu Trp Ala Pro Asp Ala Ala Gly Lys Tyr Thr
705 710 715 720
Glu Pro Arg Ala Ile Gly Thr Arg Tyr Leu Thr His His Leu
725 730
<210> 21
<211> 4642
<212> DNA
<213> adeno-associated virus 5
<220>
<221> CDS
<222> (2207)..(4381)
<223> AAV5 VP1
<220>
<221> misc_feature
<222> (2615)..(4381)
<223> AAV5 VP2
<220>
<221> misc_feature
<222> (2783)..(4381)
<223> AAV5 VP3
<400> 21
ctctcccccc tgtcgcgttc gctcgctcgc tggctcgttt gggggggtgg cagctcaaag 60
agctgccaga cgacggccct ctggccgtcg cccccccaaa cgagccagcg agcgagcgaa 120
cgcgacaggg gggagagtgc cacactctca agcaaggggg ttttgtaagc agtgatgtca 180
taatgatgta atgcttattg tcacgcgata gttaatgatt aacagtcatg tgatgtgttt 240
tatccaatag gaagaaagcg cgcgtatgag ttctcgcgag acttccgggg tataaaagac 300
cgagtgaacg agcccgccgc cattctttgc tctggactgc tagaggaccc tcgctgccat 360
ggctaccttc tatgaagtca ttgttcgcgt cccatttgac gtggaggaac atctgcctgg 420
aatttctgac agctttgtgg actgggtaac tggtcaaatt tgggagctgc ctccagagtc 480
agatttaaat ttgactctgg ttgaacagcc tcagttgacg gtggctgata gaattcgccg 540
cgtgttcctg tacgagtgga acaaattttc caagcaggag tccaaattct ttgtgcagtt 600
tgaaaaggga tctgaatatt ttcatctgca cacgcttgtg gagacctccg gcatctcttc 660
catggtcctc ggccgctacg tgagtcagat tcgcgcccag ctggtgaaag tggtcttcca 720
gggaattgaa ccccagatca acgactgggt cgccatcacc aaggtaaaga agggcggagc 780
caataaggtg gtggattctg ggtatattcc cgcctacctg ctgccgaagg tccaaccgga 840
gcttcagtgg gcgtggacaa acctggacga gtataaattg gccgccctga atctggagga 900
gcgcaaacgg ctcgtcgcgc agtttctggc agaatcctcg cagcgctcgc aggaggcggc 960
ttcgcagcgt gagttctcgg ctgacccggt catcaaaagc aagacttccc agaaatacat 1020
ggcgctcgtc aactggctcg tggagcacgg catcacttcc gagaagcagt ggatccagga 1080
aaatcaggag agctacctct ccttcaactc caccggcaac tctcggagcc agatcaaggc 1140
cgcgctcgac aacgcgacca aaattatgag tctgacaaaa agcgcggtgg actacctcgt 1200
ggggagctcc gttcccgagg acatttcaaa aaacagaatc tggcaaattt ttgagatgaa 1260
tggctacgac ccggcctacg cgggatccat cctctacggc tggtgtcagc gctccttcaa 1320
caagaggaac accgtctggc tctacggacc cgccacgacc ggcaagacca acatcgcgga 1380
ggccatcgcc cacactgtgc ccttttacgg ctgcgtgaac tggaccaatg aaaactttcc 1440
ctttaatgac tgtgtggaca aaatgctcat ttggtgggag gagggaaaga tgaccaacaa 1500
ggtggttgaa tccgccaagg ccatcctggg gggctcaaag gtgcgggtcg atcagaaatg 1560
taaatcctct gttcaaattg attctacccc tgtcattgta acttccaata caaacatgtg 1620
tgtggtggtg gatgggaatt ccacgacctt tgaacaccag cagccgctgg aggaccgcat 1680
gttcaaattt gaactgacta agcggctccc gccagatttt ggcaagatta ctaagcagga 1740
agtcaaggac ttttttgctt gggcaaaggt caatcaggtg ccggtgactc acgagtttaa 1800
agttcccagg gaattggcgg gaactaaagg ggcggagaaa tctctaaaac gcccactggg 1860
tgacgtcacc aatactagct ataaaagtct ggagaagcgg gccaggctct catttgttcc 1920
cgagacgcct cgcagttcag acgtgactgt tgatcccgct cctctgcgac cgctcaattg 1980
gaattcaagg tatgattgca aatgtgacta tcatgctcaa tttgacaaca tttctaacaa 2040
atgtgatgaa tgtgaatatt tgaatcgggg caaaaatgga tgtatctgtc acaatgtaac 2100
tcactgtcaa atttgtcatg ggattccccc ctgggaaaag gaaaacttgt cagattttgg 2160
ggattttgac gatgccaata aagaacagta aataaagcga gtagtc atg tct ttt 2215
Met Ser Phe
1
gtt gat cac cct cca gat tgg ttg gaa gaa gtt ggt gaa ggt ctt cgc 2263
Val Asp His Pro Pro Asp Trp Leu Glu Glu Val Gly Glu Gly Leu Arg
5 10 15
gag ttt ttg ggc ctt gaa gcg ggc cca ccg aaa cca aaa ccc aat cag 2311
Glu Phe Leu Gly Leu Glu Ala Gly Pro Pro Lys Pro Lys Pro Asn Gln
20 25 30 35
cag cat caa gat caa gcc cgt ggt ctt gtg ctg cct ggt tat aac tat 2359
Gln His Gln Asp Gln Ala Arg Gly Leu Val Leu Pro Gly Tyr Asn Tyr
40 45 50
ctc gga ccc gga aac ggt ctc gat cga gga gag cct gtc aac agg gca 2407
Leu Gly Pro Gly Asn Gly Leu Asp Arg Gly Glu Pro Val Asn Arg Ala
55 60 65
gac gag gtc gcg cga gag cac gac atc tcg tac aac gag cag ctt gag 2455
Asp Glu Val Ala Arg Glu His Asp Ile Ser Tyr Asn Glu Gln Leu Glu
70 75 80
gcg gga gac aac ccc tac ctc aag tac aac cac gcg gac gcc gag ttt 2503
Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu Phe
85 90 95
cag gag aag ctc gcc gac gac aca tcc ttc ggg gga aac ctc gga aag 2551
Gln Glu Lys Leu Ala Asp Asp Thr Ser Phe Gly Gly Asn Leu Gly Lys
100 105 110 115
gca gtc ttt cag gcc aag aaa agg gtt ctc gaa cct ttt ggc ctg gtt 2599
Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro Phe Gly Leu Val
120 125 130
gaa gag ggt gct aag acg gcc cct acc gga aag cgg ata gac gac cac 2647
Glu Glu Gly Ala Lys Thr Ala Pro Thr Gly Lys Arg Ile Asp Asp His
135 140 145
ttt cca aaa aga aag aag gct cgg acc gaa gag gac tcc aag cct tcc 2695
Phe Pro Lys Arg Lys Lys Ala Arg Thr Glu Glu Asp Ser Lys Pro Ser
150 155 160
acc tcg tca gac gcc gaa gct gga ccc agc gga tcc cag cag ctg caa 2743
Thr Ser Ser Asp Ala Glu Ala Gly Pro Ser Gly Ser Gln Gln Leu Gln
165 170 175
atc cca gcc caa cca gcc tca agt ttg gga gct gat aca atg tct gcg 2791
Ile Pro Ala Gln Pro Ala Ser Ser Leu Gly Ala Asp Thr Met Ser Ala
180 185 190 195
gga ggt ggc ggc cca ttg ggc gac aat aac caa ggt gcc gat gga gtg 2839
Gly Gly Gly Gly Pro Leu Gly Asp Asn Asn Gln Gly Ala Asp Gly Val
200 205 210
ggc aat gcc tcg gga gat tgg cat tgc gat tcc acg tgg atg ggg gac 2887
Gly Asn Ala Ser Gly Asp Trp His Cys Asp Ser Thr Trp Met Gly Asp
215 220 225
aga gtc gtc acc aag tcc acc cga acc tgg gtg ctg ccc agc tac aac 2935
Arg Val Val Thr Lys Ser Thr Arg Thr Trp Val Leu Pro Ser Tyr Asn
230 235 240
aac cac cag tac cga gag atc aaa agc ggc tcc gtc gac gga agc aac 2983
Asn His Gln Tyr Arg Glu Ile Lys Ser Gly Ser Val Asp Gly Ser Asn
245 250 255
gcc aac gcc tac ttt gga tac agc acc ccc tgg ggg tac ttt gac ttt 3031
Ala Asn Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe
260 265 270 275
aac cgc ttc cac agc cac tgg agc ccc cga gac tgg caa aga ctc atc 3079
Asn Arg Phe His Ser His Trp Ser Pro Arg Asp Trp Gln Arg Leu Ile
280 285 290
aac aac tac tgg ggc ttc aga ccc cgg tcc ctc aga gtc aaa atc ttc 3127
Asn Asn Tyr Trp Gly Phe Arg Pro Arg Ser Leu Arg Val Lys Ile Phe
295 300 305
aac att caa gtc aaa gag gtc acg gtg cag gac tcc acc acc acc atc 3175
Asn Ile Gln Val Lys Glu Val Thr Val Gln Asp Ser Thr Thr Thr Ile
310 315 320
gcc aac aac ctc acc tcc acc gtc caa gtg ttt acg gac gac gac tac 3223
Ala Asn Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Asp Asp Tyr
325 330 335
cag ctg ccc tac gtc gtc ggc aac ggg acc gag gga tgc ctg ccg gcc 3271
Gln Leu Pro Tyr Val Val Gly Asn Gly Thr Glu Gly Cys Leu Pro Ala
340 345 350 355
ttc cct ccg cag gtc ttt acg ctg ccg cag tac ggt tac gcg acg ctg 3319
Phe Pro Pro Gln Val Phe Thr Leu Pro Gln Tyr Gly Tyr Ala Thr Leu
360 365 370
aac cgc gac aac aca gaa aat ccc acc gag agg agc agc ttc ttc tgc 3367
Asn Arg Asp Asn Thr Glu Asn Pro Thr Glu Arg Ser Ser Phe Phe Cys
375 380 385
cta gag tac ttt ccc agc aag atg ctg aga acg ggc aac aac ttt gag 3415
Leu Glu Tyr Phe Pro Ser Lys Met Leu Arg Thr Gly Asn Asn Phe Glu
390 395 400
ttt acc tac aac ttt gag gag gtg ccc ttc cac tcc agc ttc gct ccc 3463
Phe Thr Tyr Asn Phe Glu Glu Val Pro Phe His Ser Ser Phe Ala Pro
405 410 415
agt cag aac ctg ttc aag ctg gcc aac ccg ctg gtg gac cag tac ttg 3511
Ser Gln Asn Leu Phe Lys Leu Ala Asn Pro Leu Val Asp Gln Tyr Leu
420 425 430 435
tac cgc ttc gtg agc aca aat aac act ggc gga gtc cag ttc aac aag 3559
Tyr Arg Phe Val Ser Thr Asn Asn Thr Gly Gly Val Gln Phe Asn Lys
440 445 450
aac ctg gcc ggg aga tac gcc aac acc tac aaa aac tgg ttc ccg ggg 3607
Asn Leu Ala Gly Arg Tyr Ala Asn Thr Tyr Lys Asn Trp Phe Pro Gly
455 460 465
ccc atg ggc cga acc cag ggc tgg aac ctg ggc tcc ggg gtc aac cgc 3655
Pro Met Gly Arg Thr Gln Gly Trp Asn Leu Gly Ser Gly Val Asn Arg
470 475 480
gcc agt gtc agc gcc ttc gcc acg acc aat agg atg gag ctc gag ggc 3703
Ala Ser Val Ser Ala Phe Ala Thr Thr Asn Arg Met Glu Leu Glu Gly
485 490 495
gcg agt tac cag gtg ccc ccg cag ccg aac ggc atg acc aac aac ctc 3751
Ala Ser Tyr Gln Val Pro Pro Gln Pro Asn Gly Met Thr Asn Asn Leu
500 505 510 515
cag ggc agc aac acc tat gcc ctg gag aac act atg atc ttc aac agc 3799
Gln Gly Ser Asn Thr Tyr Ala Leu Glu Asn Thr Met Ile Phe Asn Ser
520 525 530
cag ccg gcg aac ccg ggc acc acc gcc acg tac ctc gag ggc aac atg 3847
Gln Pro Ala Asn Pro Gly Thr Thr Ala Thr Tyr Leu Glu Gly Asn Met
535 540 545
ctc atc acc agc gag agc gag acg cag ccg gtg aac cgc gtg gcg tac 3895
Leu Ile Thr Ser Glu Ser Glu Thr Gln Pro Val Asn Arg Val Ala Tyr
550 555 560
aac gtc ggc ggg cag atg gcc acc aac aac cag agc tcc acc act gcc 3943
Asn Val Gly Gly Gln Met Ala Thr Asn Asn Gln Ser Ser Thr Thr Ala
565 570 575
ccc gcg acc ggc acg tac aac ctc cag gaa atc gtg ccc ggc agc gtg 3991
Pro Ala Thr Gly Thr Tyr Asn Leu Gln Glu Ile Val Pro Gly Ser Val
580 585 590 595
tgg atg gag agg gac gtg tac ctc caa gga ccc atc tgg gcc aag atc 4039
Trp Met Glu Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile
600 605 610
cca gag acg ggg gcg cac ttt cac ccc tct ccg gcc atg ggc gga ttc 4087
Pro Glu Thr Gly Ala His Phe His Pro Ser Pro Ala Met Gly Gly Phe
615 620 625
gga ctc aaa cac cca ccg ccc atg atg ctc atc aag aac acg cct gtg 4135
Gly Leu Lys His Pro Pro Pro Met Met Leu Ile Lys Asn Thr Pro Val
630 635 640
ccc gga aat atc acc agc ttc tcg gac gtg ccc gtc agc agc ttc atc 4183
Pro Gly Asn Ile Thr Ser Phe Ser Asp Val Pro Val Ser Ser Phe Ile
645 650 655
acc cag tac agc acc ggg cag gtc acc gtg gag atg gag tgg gag ctc 4231
Thr Gln Tyr Ser Thr Gly Gln Val Thr Val Glu Met Glu Trp Glu Leu
660 665 670 675
aag aag gaa aac tcc aag agg tgg aac cca gag atc cag tac aca aac 4279
Lys Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Asn
680 685 690
aac tac aac gac ccc cag ttt gtg gac ttt gcc ccg gac agc acc ggg 4327
Asn Tyr Asn Asp Pro Gln Phe Val Asp Phe Ala Pro Asp Ser Thr Gly
695 700 705
gaa tac aga acc acc aga cct atc gga acc cga tac ctt acc cga ccc 4375
Glu Tyr Arg Thr Thr Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Pro
710 715 720
ctt taa cccattcatg tcgcataccc tcaataaacc gtgtattcgt gtcagtaaaa 4431
Leu
tactgcctct tgtggtcatt caatgaataa cagcttacaa catctacaaa acctccttgc 4491
ttgagagtgt ggcactctcc cccctgtcgc gttcgctcgc tcgctggctc gtttgggggg 4551
gtggcagctc aaagagctgc cagacgacgg ccctctggcc gtcgcccccc caaacgagcc 4611
agcgagcgag cgaacgcgac aggggggaga g 4642
<210> 22
<211> 724
<212> PRT
<213> adeno-associated virus 5
<400> 22
Met Ser Phe Val Asp His Pro Pro Asp Trp Leu Glu Glu Val Gly Glu
1 5 10 15
Gly Leu Arg Glu Phe Leu Gly Leu Glu Ala Gly Pro Pro Lys Pro Lys
20 25 30
Pro Asn Gln Gln His Gln Asp Gln Ala Arg Gly Leu Val Leu Pro Gly
35 40 45
Tyr Asn Tyr Leu Gly Pro Gly Asn Gly Leu Asp Arg Gly Glu Pro Val
50 55 60
Asn Arg Ala Asp Glu Val Ala Arg Glu His Asp Ile Ser Tyr Asn Glu
65 70 75 80
Gln Leu Glu Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp
85 90 95
Ala Glu Phe Gln Glu Lys Leu Ala Asp Asp Thr Ser Phe Gly Gly Asn
100 105 110
Leu Gly Lys Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro Phe
115 120 125
Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Thr Gly Lys Arg Ile
130 135 140
Asp Asp His Phe Pro Lys Arg Lys Lys Ala Arg Thr Glu Glu Asp Ser
145 150 155 160
Lys Pro Ser Thr Ser Ser Asp Ala Glu Ala Gly Pro Ser Gly Ser Gln
165 170 175
Gln Leu Gln Ile Pro Ala Gln Pro Ala Ser Ser Leu Gly Ala Asp Thr
180 185 190
Met Ser Ala Gly Gly Gly Gly Pro Leu Gly Asp Asn Asn Gln Gly Ala
195 200 205
Asp Gly Val Gly Asn Ala Ser Gly Asp Trp His Cys Asp Ser Thr Trp
210 215 220
Met Gly Asp Arg Val Val Thr Lys Ser Thr Arg Thr Trp Val Leu Pro
225 230 235 240
Ser Tyr Asn Asn His Gln Tyr Arg Glu Ile Lys Ser Gly Ser Val Asp
245 250 255
Gly Ser Asn Ala Asn Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr
260 265 270
Phe Asp Phe Asn Arg Phe His Ser His Trp Ser Pro Arg Asp Trp Gln
275 280 285
Arg Leu Ile Asn Asn Tyr Trp Gly Phe Arg Pro Arg Ser Leu Arg Val
290 295 300
Lys Ile Phe Asn Ile Gln Val Lys Glu Val Thr Val Gln Asp Ser Thr
305 310 315 320
Thr Thr Ile Ala Asn Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp
325 330 335
Asp Asp Tyr Gln Leu Pro Tyr Val Val Gly Asn Gly Thr Glu Gly Cys
340 345 350
Leu Pro Ala Phe Pro Pro Gln Val Phe Thr Leu Pro Gln Tyr Gly Tyr
355 360 365
Ala Thr Leu Asn Arg Asp Asn Thr Glu Asn Pro Thr Glu Arg Ser Ser
370 375 380
Phe Phe Cys Leu Glu Tyr Phe Pro Ser Lys Met Leu Arg Thr Gly Asn
385 390 395 400
Asn Phe Glu Phe Thr Tyr Asn Phe Glu Glu Val Pro Phe His Ser Ser
405 410 415
Phe Ala Pro Ser Gln Asn Leu Phe Lys Leu Ala Asn Pro Leu Val Asp
420 425 430
Gln Tyr Leu Tyr Arg Phe Val Ser Thr Asn Asn Thr Gly Gly Val Gln
435 440 445
Phe Asn Lys Asn Leu Ala Gly Arg Tyr Ala Asn Thr Tyr Lys Asn Trp
450 455 460
Phe Pro Gly Pro Met Gly Arg Thr Gln Gly Trp Asn Leu Gly Ser Gly
465 470 475 480
Val Asn Arg Ala Ser Val Ser Ala Phe Ala Thr Thr Asn Arg Met Glu
485 490 495
Leu Glu Gly Ala Ser Tyr Gln Val Pro Pro Gln Pro Asn Gly Met Thr
500 505 510
Asn Asn Leu Gln Gly Ser Asn Thr Tyr Ala Leu Glu Asn Thr Met Ile
515 520 525
Phe Asn Ser Gln Pro Ala Asn Pro Gly Thr Thr Ala Thr Tyr Leu Glu
530 535 540
Gly Asn Met Leu Ile Thr Ser Glu Ser Glu Thr Gln Pro Val Asn Arg
545 550 555 560
Val Ala Tyr Asn Val Gly Gly Gln Met Ala Thr Asn Asn Gln Ser Ser
565 570 575
Thr Thr Ala Pro Ala Thr Gly Thr Tyr Asn Leu Gln Glu Ile Val Pro
580 585 590
Gly Ser Val Trp Met Glu Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp
595 600 605
Ala Lys Ile Pro Glu Thr Gly Ala His Phe His Pro Ser Pro Ala Met
610 615 620
Gly Gly Phe Gly Leu Lys His Pro Pro Pro Met Met Leu Ile Lys Asn
625 630 635 640
Thr Pro Val Pro Gly Asn Ile Thr Ser Phe Ser Asp Val Pro Val Ser
645 650 655
Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val Thr Val Glu Met Glu
660 665 670
Trp Glu Leu Lys Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln
675 680 685
Tyr Thr Asn Asn Tyr Asn Asp Pro Gln Phe Val Asp Phe Ala Pro Asp
690 695 700
Ser Thr Gly Glu Tyr Arg Thr Thr Arg Pro Ile Gly Thr Arg Tyr Leu
705 710 715 720
Thr Arg Pro Leu
<210> 23
<211> 4683
<212> DNA
<213> adeno-associated virus 6
<220>
<221> CDS
<222> (2208)..(4418)
<223> AAV6 VP1
<220>
<221> misc_feature
<222> (2619)..(4418)
<223> AAV6 VP2
<220>
<221> misc_feature
<222> (2814)..(4418)
<223> AAV6 VP3
<400> 23
ttggccactc cctctctgcg cgctcgctcg ctcactgagg ccgggcgacc aaaggtcgcc 60
cgacgcccgg gctttgcccg ggcggcctca gtgagcgagc gagcgcgcag agagggagtg 120
gccaactcca tcactagggg ttcctggagg ggtggagtcg tgacgtgaat tacgtcatag 180
ggttagggag gtcctgtatt agaggtcacg tgagtgtttt gcgacatttt gcgacaccat 240
gtggtcacgc tgggtattta agcccgagtg agcacgcagg gtctccattt tgaagcggga 300
ggtttgaacg cgcagcgcca tgccggggtt ttacgagatt gtgattaagg tccccagcga 360
ccttgacgag catctgcccg gcatttctga cagctttgtg aactgggtgg ccgagaagga 420
atgggagttg ccgccagatt ctgacatgga tctgaatctg attgagcagg cacccctgac 480
cgtggccgag aagctgcagc gcgacttcct ggtccagtgg cgccgcgtga gtaaggcccc 540
ggaggccctc ttctttgttc agttcgagaa gggcgagtcc tacttccacc tccatattct 600
ggtggagacc acgggggtca aatccatggt gctgggccgc ttcctgagtc agattaggga 660
caagctggtg cagaccatct accgcgggat cgagccgacc ctgcccaact ggttcgcggt 720
gaccaagacg cgtaatggcg ccggaggggg gaacaaggtg gtggacgagt gctacatccc 780
caactacctc ctgcccaaga ctcagcccga gctgcagtgg gcgtggacta acatggagga 840
gtatataagc gcgtgtttaa acctggccga gcgcaaacgg ctcgtggcgc acgacctgac 900
ccacgtcagc cagacccagg agcagaacaa ggagaatctg aaccccaatt ctgacgcgcc 960
tgtcatccgg tcaaaaacct ccgcacgcta catggagctg gtcgggtggc tggtggaccg 1020
gggcatcacc tccgagaagc agtggatcca ggaggaccag gcctcgtaca tctccttcaa 1080
cgccgcctcc aactcgcggt cccagatcaa ggccgctctg gacaatgccg gcaagatcat 1140
ggcgctgacc aaatccgcgc ccgactacct ggtaggcccc gctccgcccg ccgacattaa 1200
aaccaaccgc atttaccgca tcctggagct gaacggctac gaccctgcct acgccggctc 1260
cgtctttctc ggctgggccc agaaaaggtt cggaaaacgc aacaccatct ggctgtttgg 1320
gccggccacc acgggcaaga ccaacatcgc ggaagccatc gcccacgccg tgcccttcta 1380
cggctgcgtc aactggacca atgagaactt tcccttcaac gattgcgtcg acaagatggt 1440
gatctggtgg gaggagggca agatgacggc caaggtcgtg gagtccgcca aggccattct 1500
cggcggcagc aaggtgcgcg tggaccaaaa gtgcaagtcg tccgcccaga tcgatcccac 1560
ccccgtgatc gtcacctcca acaccaacat gtgcgccgtg attgacggga acagcaccac 1620
cttcgagcac cagcagccgt tgcaggaccg gatgttcaaa tttgaactca cccgccgtct 1680
ggagcatgac tttggcaagg tgacaaagca ggaagtcaaa gagttcttcc gctgggcgca 1740
ggatcacgtg accgaggtgg cgcatgagtt ctacgtcaga aagggtggag ccaacaagag 1800
acccgccccc gatgacgcgg ataaaagcga gcccaagcgg gcctgcccct cagtcgcgga 1860
tccatcgacg tcagacgcgg aaggagctcc ggtggacttt gccgacaggt accaaaacaa 1920
atgttctcgt cacgcgggca tgcttcagat gctgtttccc tgcaaaacat gcgagagaat 1980
gaatcagaat ttcaacattt gcttcacgca cgggaccaga gactgttcag aatgtttccc 2040
cggcgtgtca gaatctcaac cggtcgtcag aaagaggacg tatcggaaac tctgtgccat 2100
tcatcatctg ctggggcggg ctcccgagat tgcttgctcg gcctgcgatc tggtcaacgt 2160
ggatctggat gactgtgttt ctgagcaata aatgacttaa accaggt atg gct gcc 2216
Met Ala Ala
1
gat ggt tat ctt cca gat tgg ctc gag gac aac ctc tct gag ggc att 2264
Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile
5 10 15
cgc gag tgg tgg gac ttg aaa cct gga gcc ccg aaa ccc aaa gcc aac 2312
Arg Glu Trp Trp Asp Leu Lys Pro Gly Ala Pro Lys Pro Lys Ala Asn
20 25 30 35
cag caa aag cag gac gac ggc cgg ggt ctg gtg ctt cct ggc tac aag 2360
Gln Gln Lys Gln Asp Asp Gly Arg Gly Leu Val Leu Pro Gly Tyr Lys
40 45 50
tac ctc gga ccc ttc aac gga ctc gac aag ggg gag ccc gtc aac gcg 2408
Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Ala
55 60 65
gcg gat gca gcg gcc ctc gag cac gac aag gcc tac gac cag cag ctc 2456
Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln Gln Leu
70 75 80
aaa gcg ggt gac aat ccg tac ctg cgg tat aac cac gcc gac gcc gag 2504
Lys Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala Asp Ala Glu
85 90 95
ttt cag gag cgt ctg caa gaa gat acg tct ttt ggg ggc aac ctc ggg 2552
Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly
100 105 110 115
cga gca gtc ttc cag gcc aag aag agg gtt ctc gaa cct ttt ggt ctg 2600
Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro Phe Gly Leu
120 125 130
gtt gag gaa ggt gct aag acg gct cct gga aag aaa cgt ccg gta gag 2648
Val Glu Glu Gly Ala Lys Thr Ala Pro Gly Lys Lys Arg Pro Val Glu
135 140 145
cag tcg cca caa gag cca gac tcc tcc tcg ggc att ggc aag aca ggc 2696
Gln Ser Pro Gln Glu Pro Asp Ser Ser Ser Gly Ile Gly Lys Thr Gly
150 155 160
cag cag ccc gct aaa aag aga ctc aat ttt ggt cag act ggc gac tca 2744
Gln Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr Gly Asp Ser
165 170 175
gag tca gtc ccc gac cca caa cct ctc gga gaa cct cca gca acc ccc 2792
Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro Pro Ala Thr Pro
180 185 190 195
gct gct gtg gga cct act aca atg gct tca ggc ggt ggc gca cca atg 2840
Ala Ala Val Gly Pro Thr Thr Met Ala Ser Gly Gly Gly Ala Pro Met
200 205 210
gca gac aat aac gaa ggc gcc gac gga gtg ggt aat gcc tca gga aat 2888
Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn Ala Ser Gly Asn
215 220 225
tgg cat tgc gat tcc aca tgg ctg ggc gac aga gtc atc acc acc agc 2936
Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val Ile Thr Thr Ser
230 235 240
acc cga aca tgg gcc ttg ccc acc tat aac aac cac ctc tac aag caa 2984
Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln
245 250 255
atc tcc agt gct tca acg ggg gcc agc aac gac aac cac tac ttc ggc 3032
Ile Ser Ser Ala Ser Thr Gly Ala Ser Asn Asp Asn His Tyr Phe Gly
260 265 270 275
tac agc acc ccc tgg ggg tat ttt gat ttc aac aga ttc cac tgc cat 3080
Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His
280 285 290
ttc tca cca cgt gac tgg cag cga ctc atc aac aac aat tgg gga ttc 3128
Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe
295 300 305
cgg ccc aag aga ctc aac ttc aag ctc ttc aac atc caa gtc aag gag 3176
Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val Lys Glu
310 315 320
gtc acg acg aat gat ggc gtc acg acc atc gct aat aac ctt acc agc 3224
Val Thr Thr Asn Asp Gly Val Thr Thr Ile Ala Asn Asn Leu Thr Ser
325 330 335
acg gtt caa gtc ttc tcg gac tcg gag tac cag ttg ccg tac gtc ctc 3272
Thr Val Gln Val Phe Ser Asp Ser Glu Tyr Gln Leu Pro Tyr Val Leu
340 345 350 355
ggc tct gcg cac cag ggc tgc ctc cct ccg ttc ccg gcg gac gtg ttc 3320
Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe
360 365 370
atg att ccg cag tac ggc tac cta acg ctc aac aat ggc agc cag gca 3368
Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn Gly Ser Gln Ala
375 380 385
gtg gga cgg tca tcc ttt tac tgc ctg gaa tat ttc cca tcg cag atg 3416
Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln Met
390 395 400
ctg aga acg ggc aat aac ttt acc ttc agc tac acc ttc gag gac gtg 3464
Leu Arg Thr Gly Asn Asn Phe Thr Phe Ser Tyr Thr Phe Glu Asp Val
405 410 415
cct ttc cac agc agc tac gcg cac agc cag agc ctg gac cgg ctg atg 3512
Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp Arg Leu Met
420 425 430 435
aat cct ctc atc gac cag tac ctg tat tac ctg aac aga act cag aat 3560
Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Asn Arg Thr Gln Asn
440 445 450
cag tcc gga agt gcc caa aac aag gac ttg ctg ttt agc cgg ggg tct 3608
Gln Ser Gly Ser Ala Gln Asn Lys Asp Leu Leu Phe Ser Arg Gly Ser
455 460 465
cca gct ggc atg tct gtt cag ccc aaa aac tgg cta cct gga ccc tgt 3656
Pro Ala Gly Met Ser Val Gln Pro Lys Asn Trp Leu Pro Gly Pro Cys
470 475 480
tac cgg cag cag cgc gtt tct aaa aca aaa aca gac aac aac aac agc 3704
Tyr Arg Gln Gln Arg Val Ser Lys Thr Lys Thr Asp Asn Asn Asn Ser
485 490 495
aac ttt acc tgg act ggt gct tca aaa tat aac ctt aat ggg cgt gaa 3752
Asn Phe Thr Trp Thr Gly Ala Ser Lys Tyr Asn Leu Asn Gly Arg Glu
500 505 510 515
tct ata atc aac cct ggc act gct atg gcc tca cac aaa gac gac aaa 3800
Ser Ile Ile Asn Pro Gly Thr Ala Met Ala Ser His Lys Asp Asp Lys
520 525 530
gac aag ttc ttt ccc atg agc ggt gtc atg att ttt gga aag gag agc 3848
Asp Lys Phe Phe Pro Met Ser Gly Val Met Ile Phe Gly Lys Glu Ser
535 540 545
gcc gga gct tca aac act gca ttg gac aat gtc atg atc aca gac gaa 3896
Ala Gly Ala Ser Asn Thr Ala Leu Asp Asn Val Met Ile Thr Asp Glu
550 555 560
gag gaa atc aaa gcc act aac ccc gtg gcc acc gaa aga ttt ggg act 3944
Glu Glu Ile Lys Ala Thr Asn Pro Val Ala Thr Glu Arg Phe Gly Thr
565 570 575
gtg gca gtc aat ctc cag agc agc agc aca gac cct gcg acc gga gat 3992
Val Ala Val Asn Leu Gln Ser Ser Ser Thr Asp Pro Ala Thr Gly Asp
580 585 590 595
gtg cat gtt atg gga gcc tta cct gga atg gtg tgg caa gac aga gac 4040
Val His Val Met Gly Ala Leu Pro Gly Met Val Trp Gln Asp Arg Asp
600 605 610
gta tac ctg cag ggt cct att tgg gcc aaa att cct cac acg gat gga 4088
Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly
615 620 625
cac ttt cac ccg tct cct ctc atg ggc ggc ttt gga ctt aag cac ccg 4136
His Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Leu Lys His Pro
630 635 640
cct cct cag atc ctc atc aaa aac acg cct gtt cct gcg aat cct ccg 4184
Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asn Pro Pro
645 650 655
gca gag ttt tcg gct aca aag ttt gct tca ttc atc acc cag tat tcc 4232
Ala Glu Phe Ser Ala Thr Lys Phe Ala Ser Phe Ile Thr Gln Tyr Ser
660 665 670 675
aca gga caa gtg agc gtg gag att gaa tgg gag ctg cag aaa gaa aac 4280
Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn
680 685 690
agc aaa cgc tgg aat ccc gaa gtg cag tat aca tct aac tat gca aaa 4328
Ser Lys Arg Trp Asn Pro Glu Val Gln Tyr Thr Ser Asn Tyr Ala Lys
695 700 705
tct gcc aac gtt gat ttc act gtg gac aac aat gga ctt tat act gag 4376
Ser Ala Asn Val Asp Phe Thr Val Asp Asn Asn Gly Leu Tyr Thr Glu
710 715 720
cct cgc ccc att ggc acc cgt tac ctc acc cgt ccc ctg taa 4418
Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Pro Leu
725 730 735
ttgtgtgtta atcaataaac cggttaattc gtgtcagttg aactttggtc tcatgtcgtt 4478
attatcttat ctggtcacca tagcaaccgg ttacacatta actgcttagt tgcgcttcgc 4538
gaatacccct agtgatggag ttgcccactc cctctatgcg cgctcgctcg ctcggtgggg 4598
ccggcagagc agagctctgc cgtctgcgga cctttggtcc gcaggcccca ccgagcgagc 4658
gagcgcgcat agagggagtg ggcaa 4683
<210> 24
<211> 736
<212> PRT
<213> adeno-associated virus 6
<400> 24
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Asp Leu Lys Pro Gly Ala Pro Lys Pro
20 25 30
Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro
115 120 125
Phe Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ser Gly Ile Gly
145 150 155 160
Lys Thr Gly Gln Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr
165 170 175
Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro Pro
180 185 190
Ala Thr Pro Ala Ala Val Gly Pro Thr Thr Met Ala Ser Gly Gly Gly
195 200 205
Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn Ala
210 215 220
Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val Ile
225 230 235 240
Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu
245 250 255
Tyr Lys Gln Ile Ser Ser Ala Ser Thr Gly Ala Ser Asn Asp Asn His
260 265 270
Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe
275 280 285
His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn
290 295 300
Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln
305 310 315 320
Val Lys Glu Val Thr Thr Asn Asp Gly Val Thr Thr Ile Ala Asn Asn
325 330 335
Leu Thr Ser Thr Val Gln Val Phe Ser Asp Ser Glu Tyr Gln Leu Pro
340 345 350
Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro Ala
355 360 365
Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn Gly
370 375 380
Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro
385 390 395 400
Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Thr Phe Ser Tyr Thr Phe
405 410 415
Glu Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp
420 425 430
Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Asn Arg
435 440 445
Thr Gln Asn Gln Ser Gly Ser Ala Gln Asn Lys Asp Leu Leu Phe Ser
450 455 460
Arg Gly Ser Pro Ala Gly Met Ser Val Gln Pro Lys Asn Trp Leu Pro
465 470 475 480
Gly Pro Cys Tyr Arg Gln Gln Arg Val Ser Lys Thr Lys Thr Asp Asn
485 490 495
Asn Asn Ser Asn Phe Thr Trp Thr Gly Ala Ser Lys Tyr Asn Leu Asn
500 505 510
Gly Arg Glu Ser Ile Ile Asn Pro Gly Thr Ala Met Ala Ser His Lys
515 520 525
Asp Asp Lys Asp Lys Phe Phe Pro Met Ser Gly Val Met Ile Phe Gly
530 535 540
Lys Glu Ser Ala Gly Ala Ser Asn Thr Ala Leu Asp Asn Val Met Ile
545 550 555 560
Thr Asp Glu Glu Glu Ile Lys Ala Thr Asn Pro Val Ala Thr Glu Arg
565 570 575
Phe Gly Thr Val Ala Val Asn Leu Gln Ser Ser Ser Thr Asp Pro Ala
580 585 590
Thr Gly Asp Val His Val Met Gly Ala Leu Pro Gly Met Val Trp Gln
595 600 605
Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His
610 615 620
Thr Asp Gly His Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Leu
625 630 635 640
Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala
645 650 655
Asn Pro Pro Ala Glu Phe Ser Ala Thr Lys Phe Ala Ser Phe Ile Thr
660 665 670
Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln
675 680 685
Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Val Gln Tyr Thr Ser Asn
690 695 700
Tyr Ala Lys Ser Ala Asn Val Asp Phe Thr Val Asp Asn Asn Gly Leu
705 710 715 720
Tyr Thr Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Pro Leu
725 730 735
<210> 25
<211> 4721
<212> DNA
<213> adeno-associated virus 7
<220>
<221> CDS
<222> (2222)..(4435)
<223> AAV7 VP1
<220>
<221> misc_feature
<222> (2633)..(4435)
<223> AAV7 VP2
<220>
<221> misc_feature
<222> (2831)..(4435)
<223> AAV7 VP3
<400> 25
ttggccactc cctctatgcg cgctcgctcg ctcggtgggg cctgcggacc aaaggtccgc 60
agacggcaga gctctgctct gccggcccca ccgagcgagc gagcgcgcat agagggagtg 120
gccaactcca tcactagggg taccgcgaag cgcctcccac gctgccgcgt cagcgctgac 180
gtaaatcacg tcatagggga gtggtcctgt attagctgtc acgtgagtgc ttttgcgaca 240
ttttgcgaca ccacgtggcc atttgaggta tatatggccg agtgagcgag caggatctcc 300
attttgaccg cgaaatttga acgagcagca gccatgccgg gtttctacga gatcgtgatc 360
aaggtgccga gcgacctgga cgagcacctg ccgggcattt ctgactcgtt tgtgaactgg 420
gtggccgaga aggaatggga gctgcccccg gattctgaca tggatctgaa tctgatcgag 480
caggcacccc tgaccgtggc cgagaagctg cagcgcgact tcctggtcca atggcgccgc 540
gtgagtaagg ccccggaggc cctgttcttt gttcagttcg agaagggcga gagctacttc 600
caccttcacg ttctggtgga gaccacgggg gtcaagtcca tggtgctagg ccgcttcctg 660
agtcagattc gggagaagct ggtccagacc atctaccgcg gggtcgagcc cacgctgccc 720
aactggttcg cggtgaccaa gacgcgtaat ggcgccggcg gggggaacaa ggtggtggac 780
gagtgctaca tccccaacta cctcctgccc aagacccagc ccgagctgca gtgggcgtgg 840
actaacatgg aggagtatat aagcgcgtgt ttgaacctgg ccgaacgcaa acggctcgtg 900
gcgcagcacc tgacccacgt cagccagacg caggagcaga acaaggagaa tctgaacccc 960
aattctgacg cgcccgtgat caggtcaaaa acctccgcgc gctacatgga gctggtcggg 1020
tggctggtgg accggggcat cacctccgag aagcagtgga tccaggagga ccaggcctcg 1080
tacatctcct tcaacgccgc ctccaactcg cggtcccaga tcaaggccgc gctggacaat 1140
gccggcaaga tcatggcgct gaccaaatcc gcgcccgact acctggtggg gccctcgctg 1200
cccgcggaca ttaaaaccaa ccgcatctac cgcatcctgg agctgaacgg gtacgatcct 1260
gcctacgccg gctccgtctt tctcggctgg gcccagaaaa agttcgggaa gcgcaacacc 1320
atctggctgt ttgggcccgc caccaccggc aagaccaaca ttgcggaagc catcgcccac 1380
gccgtgccct tctacggctg cgtcaactgg accaatgaga actttccctt caacgattgc 1440
gtcgacaaga tggtgatctg gtgggaggag ggcaagatga cggccaaggt cgtggagtcc 1500
gccaaggcca ttctcggcgg cagcaaggtg cgcgtggacc aaaagtgcaa gtcgtccgcc 1560
cagatcgacc ccacccccgt gatcgtcacc tccaacacca acatgtgcgc cgtgattgac 1620
gggaacagca ccaccttcga gcaccagcag ccgttgcagg accggatgtt caaatttgaa 1680
ctcacccgcc gtctggagca cgactttggc aaggtgacga agcaggaagt caaagagttc 1740
ttccgctggg ccagtgatca cgtgaccgag gtggcgcatg agttctacgt cagaaagggc 1800
ggagccagca aaagacccgc ccccgatgac gcggatataa gcgagcccaa gcgggcctgc 1860
ccctcagtcg cggatccatc gacgtcagac gcggaaggag ctccggtgga ctttgccgac 1920
aggtaccaaa acaaatgttc tcgtcacgcg ggcatgattc agatgctgtt tccctgcaaa 1980
acgtgcgaga gaatgaatca gaatttcaac atttgcttca cacacggggt cagagactgt 2040
ttagagtgtt tccccggcgt gtcagaatct caaccggtcg tcagaaaaaa gacgtatcgg 2100
aaactctgcg cgattcatca tctgctgggg cgggcgcccg agattgcttg ctcggcctgc 2160
gacctggtca acgtggacct ggacgactgc gtttctgagc aataaatgac ttaaaccagg 2220
t atg gct gcc gat ggt tat ctt cca gat tgg ctc gag gac aac ctc tct 2269
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
gag ggc att cgc gag tgg tgg gac ctg aaa cct gga gcc ccg aaa ccc 2317
Glu Gly Ile Arg Glu Trp Trp Asp Leu Lys Pro Gly Ala Pro Lys Pro
20 25 30
aaa gcc aac cag caa aag cag gac aac ggc cgg ggt ctg gtg ctt cct 2365
Lys Ala Asn Gln Gln Lys Gln Asp Asn Gly Arg Gly Leu Val Leu Pro
35 40 45
ggc tac aag tac ctc gga ccc ttc aac gga ctc gac aag ggg gag ccc 2413
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
gtc aac gcg gcg gac gca gcg gcc ctc gag cac gac aag gcc tac gac 2461
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
cag cag ctc aaa gcg ggt gac aat ccg tac ctg cgg tat aac cac gcc 2509
Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala
85 90 95
gac gcc gag ttt cag gag cgt ctg caa gaa gat acg tca ttt ggg ggc 2557
Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly
100 105 110
aac ctc ggg cga gca gtc ttc cag gcc aag aag cgg gtt ctc gaa cct 2605
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro
115 120 125
ctc ggt ctg gtt gag gaa ggc gct aag acg gct cct gca aag aag aga 2653
Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Ala Lys Lys Arg
130 135 140
ccg gta gag ccg tca cct cag cgt tcc ccc gac tcc tcc acg ggc atc 2701
Pro Val Glu Pro Ser Pro Gln Arg Ser Pro Asp Ser Ser Thr Gly Ile
145 150 155 160
ggc aag aaa ggc cag cag ccc gcc aga aag aga ctc aat ttc ggt cag 2749
Gly Lys Lys Gly Gln Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln
165 170 175
act ggc gac tca gag tca gtc ccc gac cct caa cct ctc gga gaa cct 2797
Thr Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro
180 185 190
cca gca gcg ccc tct agt gtg gga tct ggt aca gtg gct gca ggc ggt 2845
Pro Ala Ala Pro Ser Ser Val Gly Ser Gly Thr Val Ala Ala Gly Gly
195 200 205
ggc gca cca atg gca gac aat aac gaa ggt gcc gac gga gtg ggt aat 2893
Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn
210 215 220
gcc tca gga aat tgg cat tgc gat tcc aca tgg ctg ggc gac aga gtc 2941
Ala Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val
225 230 235 240
att acc acc agc acc cga acc tgg gcc ctg ccc acc tac aac aac cac 2989
Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His
245 250 255
ctc tac aag caa atc tcc agt gaa act gca ggt agt acc aac gac aac 3037
Leu Tyr Lys Gln Ile Ser Ser Glu Thr Ala Gly Ser Thr Asn Asp Asn
260 265 270
acc tac ttc ggc tac agc acc ccc tgg ggg tat ttt gac ttt aac aga 3085
Thr Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg
275 280 285
ttc cac tgc cac ttc tca cca cgt gac tgg cag cga ctc atc aac aac 3133
Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn
290 295 300
aac tgg gga ttc cgg ccc aag aag ctg cgg ttc aag ctc ttc aac atc 3181
Asn Trp Gly Phe Arg Pro Lys Lys Leu Arg Phe Lys Leu Phe Asn Ile
305 310 315 320
cag gtc aag gag gtc acg acg aat gac ggc gtt acg acc atc gct aat 3229
Gln Val Lys Glu Val Thr Thr Asn Asp Gly Val Thr Thr Ile Ala Asn
325 330 335
aac ctt acc agc acg att cag gta ttc tcg gac tcg gaa tac cag ctg 3277
Asn Leu Thr Ser Thr Ile Gln Val Phe Ser Asp Ser Glu Tyr Gln Leu
340 345 350
ccg tac gtc ctc ggc tct gcg cac cag ggc tgc ctg cct ccg ttc ccg 3325
Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro
355 360 365
gcg gac gtc ttc atg att cct cag tac ggc tac ctg act ctc aac aat 3373
Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn
370 375 380
ggc agt cag tct gtg gga cgt tcc tcc ttc tac tgc ctg gag tac ttc 3421
Gly Ser Gln Ser Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe
385 390 395 400
ccc tct cag atg ctg aga acg ggc aac aac ttt gag ttc agc tac agc 3469
Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Glu Phe Ser Tyr Ser
405 410 415
ttc gag gac gtg cct ttc cac agc agc tac gca cac agc cag agc ctg 3517
Phe Glu Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu
420 425 430
gac cgg ctg atg aat ccc ctc atc gac cag tac ttg tac tac ctg gcc 3565
Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ala
435 440 445
aga aca cag agt aac cca gga ggc aca gct ggc aat cgg gaa ctg cag 3613
Arg Thr Gln Ser Asn Pro Gly Gly Thr Ala Gly Asn Arg Glu Leu Gln
450 455 460
ttt tac cag ggc ggg cct tca act atg gcc gaa caa gcc aag aat tgg 3661
Phe Tyr Gln Gly Gly Pro Ser Thr Met Ala Glu Gln Ala Lys Asn Trp
465 470 475 480
tta cct gga cct tgc ttc cgg caa caa aga gtc tcc aaa acg ctg gat 3709
Leu Pro Gly Pro Cys Phe Arg Gln Gln Arg Val Ser Lys Thr Leu Asp
485 490 495
caa aac aac aac agc aac ttt gct tgg act ggt gcc acc aaa tat cac 3757
Gln Asn Asn Asn Ser Asn Phe Ala Trp Thr Gly Ala Thr Lys Tyr His
500 505 510
ctg aac ggc aga aac tcg ttg gtt aat ccc ggc gtc gcc atg gca act 3805
Leu Asn Gly Arg Asn Ser Leu Val Asn Pro Gly Val Ala Met Ala Thr
515 520 525
cac aag gac gac gag gac cgc ttt ttc cca tcc agc gga gtc ctg att 3853
His Lys Asp Asp Glu Asp Arg Phe Phe Pro Ser Ser Gly Val Leu Ile
530 535 540
ttt gga aaa act gga gca act aac aaa act aca ttg gaa aat gtg tta 3901
Phe Gly Lys Thr Gly Ala Thr Asn Lys Thr Thr Leu Glu Asn Val Leu
545 550 555 560
atg aca aat gaa gaa gaa att cgt cct act aat cct gta gcc acg gaa 3949
Met Thr Asn Glu Glu Glu Ile Arg Pro Thr Asn Pro Val Ala Thr Glu
565 570 575
gaa tac ggg ata gtc agc agc aac tta caa gcg gct aat act gca gcc 3997
Glu Tyr Gly Ile Val Ser Ser Asn Leu Gln Ala Ala Asn Thr Ala Ala
580 585 590
cag aca caa gtt gtc aac aac cag gga gcc tta cct ggc atg gtc tgg 4045
Gln Thr Gln Val Val Asn Asn Gln Gly Ala Leu Pro Gly Met Val Trp
595 600 605
cag aac cgg gac gtg tac ctg cag ggt ccc atc tgg gcc aag att cct 4093
Gln Asn Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro
610 615 620
cac acg gat ggc aac ttt cac ccg tct cct ttg atg ggc ggc ttt gga 4141
His Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly
625 630 635 640
ctt aaa cat ccg cct cct cag atc ctg atc aag aac act ccc gtt ccc 4189
Leu Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro
645 650 655
gct aat cct ccg gag gtg ttt act cct gcc aag ttt gct tcg ttc atc 4237
Ala Asn Pro Pro Glu Val Phe Thr Pro Ala Lys Phe Ala Ser Phe Ile
660 665 670
aca cag tac agc acc gga caa gtc agc gtg gaa atc gag tgg gag ctg 4285
Thr Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu
675 680 685
cag aag gaa aac agc aag cgc tgg aac ccg gag att cag tac acc tcc 4333
Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser
690 695 700
aac ttt gaa aag cag act ggt gtg gac ttt gcc gtt gac agc cag ggt 4381
Asn Phe Glu Lys Gln Thr Gly Val Asp Phe Ala Val Asp Ser Gln Gly
705 710 715 720
gtt tac tct gag cct cgc cct att ggc act cgt tac ctc acc cgt aat 4429
Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn
725 730 735
ctg taa ttgcatgtta atcaataaac cggttgattc gtttcagttg aactttggtc 4485
Leu
tcctgtgctt cttatcttat cggtttccat agcaactggt tacacattaa ctgcttgggt 4545
gcgcttcacg ataagaacac tgacgtcacc gcggtacccc tagtgatgga gttggccact 4605
ccctctatgc gcgctcgctc gctcggtggg gcctgcggac caaaggtccg cagacggcag 4665
agctctgctc tgccggcccc accgagcgag cgagcgcgca tagagggagt ggccaa 4721
<210> 26
<211> 737
<212> PRT
<213> adeno-associated virus 7
<400> 26
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Asp Leu Lys Pro Gly Ala Pro Lys Pro
20 25 30
Lys Ala Asn Gln Gln Lys Gln Asp Asn Gly Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Ala Lys Lys Arg
130 135 140
Pro Val Glu Pro Ser Pro Gln Arg Ser Pro Asp Ser Ser Thr Gly Ile
145 150 155 160
Gly Lys Lys Gly Gln Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln
165 170 175
Thr Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro
180 185 190
Pro Ala Ala Pro Ser Ser Val Gly Ser Gly Thr Val Ala Ala Gly Gly
195 200 205
Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn
210 215 220
Ala Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val
225 230 235 240
Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His
245 250 255
Leu Tyr Lys Gln Ile Ser Ser Glu Thr Ala Gly Ser Thr Asn Asp Asn
260 265 270
Thr Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg
275 280 285
Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn
290 295 300
Asn Trp Gly Phe Arg Pro Lys Lys Leu Arg Phe Lys Leu Phe Asn Ile
305 310 315 320
Gln Val Lys Glu Val Thr Thr Asn Asp Gly Val Thr Thr Ile Ala Asn
325 330 335
Asn Leu Thr Ser Thr Ile Gln Val Phe Ser Asp Ser Glu Tyr Gln Leu
340 345 350
Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro
355 360 365
Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn
370 375 380
Gly Ser Gln Ser Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe
385 390 395 400
Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Glu Phe Ser Tyr Ser
405 410 415
Phe Glu Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu
420 425 430
Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ala
435 440 445
Arg Thr Gln Ser Asn Pro Gly Gly Thr Ala Gly Asn Arg Glu Leu Gln
450 455 460
Phe Tyr Gln Gly Gly Pro Ser Thr Met Ala Glu Gln Ala Lys Asn Trp
465 470 475 480
Leu Pro Gly Pro Cys Phe Arg Gln Gln Arg Val Ser Lys Thr Leu Asp
485 490 495
Gln Asn Asn Asn Ser Asn Phe Ala Trp Thr Gly Ala Thr Lys Tyr His
500 505 510
Leu Asn Gly Arg Asn Ser Leu Val Asn Pro Gly Val Ala Met Ala Thr
515 520 525
His Lys Asp Asp Glu Asp Arg Phe Phe Pro Ser Ser Gly Val Leu Ile
530 535 540
Phe Gly Lys Thr Gly Ala Thr Asn Lys Thr Thr Leu Glu Asn Val Leu
545 550 555 560
Met Thr Asn Glu Glu Glu Ile Arg Pro Thr Asn Pro Val Ala Thr Glu
565 570 575
Glu Tyr Gly Ile Val Ser Ser Asn Leu Gln Ala Ala Asn Thr Ala Ala
580 585 590
Gln Thr Gln Val Val Asn Asn Gln Gly Ala Leu Pro Gly Met Val Trp
595 600 605
Gln Asn Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro
610 615 620
His Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly
625 630 635 640
Leu Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro
645 650 655
Ala Asn Pro Pro Glu Val Phe Thr Pro Ala Lys Phe Ala Ser Phe Ile
660 665 670
Thr Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu
675 680 685
Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser
690 695 700
Asn Phe Glu Lys Gln Thr Gly Val Asp Phe Ala Val Asp Ser Gln Gly
705 710 715 720
Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn
725 730 735
Leu
<210> 27
<211> 4393
<212> DNA
<213> adeno-associated virus 8
<220>
<221> CDS
<222> (2121)..(4337)
<223> AAV8 VP1
<220>
<221> misc_feature
<222> (2532)..(4337)
<223> AAV8 VP2
<220>
<221> misc_feature
<222> (2730)..(4337)
<223> AAV8 VP3
<400> 27
cagagaggga gtggccaact ccatcactag gggtagcgcg aagcgcctcc cacgctgccg 60
cgtcagcgct gacgtaaatt acgtcatagg ggagtggtcc tgtattagct gtcacgtgag 120
tgcttttgcg gcattttgcg acaccacgtg gccatttgag gtatatatgg ccgagtgagc 180
gagcaggatc tccattttga ccgcgaaatt tgaacgagca gcagccatgc cgggcttcta 240
cgagatcgtg atcaaggtgc cgagcgacct ggacgagcac ctgccgggca tttctgactc 300
gtttgtgaac tgggtggccg agaaggaatg ggagctgccc ccggattctg acatggatcg 360
gaatctgatc gagcaggcac ccctgaccgt ggccgagaag ctgcagcgcg acttcctggt 420
ccaatggcgc cgcgtgagta aggccccgga ggccctcttc tttgttcagt tcgagaaggg 480
cgagagctac tttcacctgc acgttctggt cgagaccacg ggggtcaagt ccatggtgct 540
aggccgcttc ctgagtcaga ttcgggaaaa gcttggtcca gaccatctac ccgcggggtc 600
gagccccacc ttgcccaact ggttcgcggt gaccaaagac gcggtaatgg cgccggcggg 660
ggggaacaag gtggtggacg agtgctacat ccccaactac ctcctgccca agactcagcc 720
cgagctgcag tgggcgtgga ctaacatgga ggagtatata agcgcgtgct tgaacctggc 780
cgagcgcaaa cggctcgtgg cgcagcacct gacccacgtc agccagacgc aggagcagaa 840
caaggagaat ctgaacccca attctgacgc gcccgtgatc aggtcaaaaa cctccgcgcg 900
ctatatggag ctggtcgggt ggctggtgga ccggggcatc acctccgaga agcagtggat 960
ccaggaggac caggcctcgt acatctcctt caacgccgcc tccaactcgc ggtcccagat 1020
caaggccgcg ctggacaatg ccggcaagat catggcgctg accaaatccg cgcccgacta 1080
cctggtgggg ccctcgctgc ccgcggacat tacccagaac cgcatctacc gcatcctcgc 1140
tctcaacggc tacgaccctg cctacgccgg ctccgtcttt ctcggctggg ctcagaaaaa 1200
gttcgggaaa cgcaacacca tctggctgtt tggacccgcc accaccggca agaccaacat 1260
tgcggaagcc atcgcccacg ccgtgccctt ctacggctgc gtcaactgga ccaatgagaa 1320
ctttcccttc aatgattgcg tcgacaagat ggtgatctgg tgggaggagg gcaagatgac 1380
ggccaaggtc gtggagtccg ccaaggccat tctcggcggc agcaaggtgc gcgtggacca 1440
aaagtgcaag tcgtccgccc agatcgaccc cacccccgtg atcgtcacct ccaacaccaa 1500
catgtgcgcc gtgattgacg ggaacagcac caccttcgag caccagcagc ctctccagga 1560
ccggatgttt aagttcgaac tcacccgccg tctggagcac gactttggca aggtgacaaa 1620
gcaggaagtc aaagagttct tccgctgggc cagtgatcac gtgaccgagg tggcgcatga 1680
gttttacgtc agaaagggcg gagccagcaa aagacccgcc cccgatgacg cggataaaag 1740
cgagcccaag cgggcctgcc cctcagtcgc ggatccatcg acgtcagacg cggaaggagc 1800
tccggtggac tttgccgaca ggtaccaaaa caaatgttct cgtcacgcgg gcatgcttca 1860
gatgctgttt ccctgcaaaa cgtgcgagag aatgaatcag aatttcaaca tttgcttcac 1920
acacggggtc agagactgct cagagtgttt ccccggcgtg tcagaatctc aaccggtcgt 1980
cagaaagagg acgtatcgga aactctgtgc gattcatcat ctgctggggc gggctcccga 2040
gattgcttgc tcggcctgcg atctggtcaa cgtggacctg gatgactgtg tttctgagca 2100
ataaatgact taaaccaggt atg gct gcc gat ggt tat ctt cca gat tgg ctc 2153
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu
1 5 10
gag gac aac ctc tct gag ggc att cgc gag tgg tgg gcg ctg aaa cct 2201
Glu Asp Asn Leu Ser Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro
15 20 25
gga gcc ccg aag ccc aaa gcc aac cag caa aag cag gac gac ggc cgg 2249
Gly Ala Pro Lys Pro Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg
30 35 40
ggt ctg gtg ctt cct ggc tac aag tac ctc gga ccc ttc aac gga ctc 2297
Gly Leu Val Leu Pro Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu
45 50 55
gac aag ggg gag ccc gtc aac gcg gcg gac gca gcg gcc ctc gag cac 2345
Asp Lys Gly Glu Pro Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His
60 65 70 75
gac aag gcc tac gac cag cag ctg cag gcg ggt gac aat ccg tac ctg 2393
Asp Lys Ala Tyr Asp Gln Gln Leu Gln Ala Gly Asp Asn Pro Tyr Leu
80 85 90
cgg tat aac cac gcc gac gcc gag ttt cag gag cgt ctg caa gaa gat 2441
Arg Tyr Asn His Ala Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp
95 100 105
acg tct ttt ggg ggc aac ctc ggg cga gca gtc ttc cag gcc aag aag 2489
Thr Ser Phe Gly Gly Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys
110 115 120
cgg gtt ctc gaa cct ctc ggt ctg gtt gag gaa ggc gct aag acg gct 2537
Arg Val Leu Glu Pro Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala
125 130 135
cct gga aag aag aga ccg gta gag cca tca ccc cag cgt tct cca gac 2585
Pro Gly Lys Lys Arg Pro Val Glu Pro Ser Pro Gln Arg Ser Pro Asp
140 145 150 155
tcc tct acg ggc atc ggc aag aaa ggc caa cag ccc gcc aga aaa aga 2633
Ser Ser Thr Gly Ile Gly Lys Lys Gly Gln Gln Pro Ala Arg Lys Arg
160 165 170
ctc aat ttt ggt cag act ggc gac tca gag tca gtt cca gac cct caa 2681
Leu Asn Phe Gly Gln Thr Gly Asp Ser Glu Ser Val Pro Asp Pro Gln
175 180 185
cct ctc gga gaa cct cca gca gcg ccc tct ggt gtg gga cct aat aca 2729
Pro Leu Gly Glu Pro Pro Ala Ala Pro Ser Gly Val Gly Pro Asn Thr
190 195 200
atg gct gca ggc ggt ggc gca cca atg gca gac aat aac gaa ggc gcc 2777
Met Ala Ala Gly Gly Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala
205 210 215
gac gga gtg ggt agt tcc tcg gga aat tgg cat tgc gat tcc aca tgg 2825
Asp Gly Val Gly Ser Ser Ser Gly Asn Trp His Cys Asp Ser Thr Trp
220 225 230 235
ctg ggc gac aga gtc atc acc acc agc acc cga acc tgg gcc ctg ccc 2873
Leu Gly Asp Arg Val Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro
240 245 250
acc tac aac aac cac ctc tac aag caa atc tcc aac ggg aca tcg gga 2921
Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile Ser Asn Gly Thr Ser Gly
255 260 265
gga gcc acc aac gac aac acc tac ttc ggc tac agc acc ccc tgg ggg 2969
Gly Ala Thr Asn Asp Asn Thr Tyr Phe Gly Tyr Ser Thr Pro Trp Gly
270 275 280
tat ttt gac ttt aac aga ttc cac tgc cac ttt tca cca cgt gac tgg 3017
Tyr Phe Asp Phe Asn Arg Phe His Cys His Phe Ser Pro Arg Asp Trp
285 290 295
cag cga ctc atc aac aac aac tgg gga ttc cgg ccc aag aga ctc agc 3065
Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe Arg Pro Lys Arg Leu Ser
300 305 310 315
ttc aag ctc ttc aac atc cag gtc aag gag gtc acg cag aat gaa ggc 3113
Phe Lys Leu Phe Asn Ile Gln Val Lys Glu Val Thr Gln Asn Glu Gly
320 325 330
acc aag acc atc gcc aat aac ctc acc agc acc atc cag gtg ttt acg 3161
Thr Lys Thr Ile Ala Asn Asn Leu Thr Ser Thr Ile Gln Val Phe Thr
335 340 345
gac tcg gag tac cag ctg ccg tac gtt ctc ggc tct gcc cac cag ggc 3209
Asp Ser Glu Tyr Gln Leu Pro Tyr Val Leu Gly Ser Ala His Gln Gly
350 355 360
tgc ctg cct ccg ttc ccg gcg gac gtg ttc atg att ccc cag tac ggc 3257
Cys Leu Pro Pro Phe Pro Ala Asp Val Phe Met Ile Pro Gln Tyr Gly
365 370 375
tac cta aca ctc aac aac ggt agt cag gcc gtg gga cgc tcc tcc ttc 3305
Tyr Leu Thr Leu Asn Asn Gly Ser Gln Ala Val Gly Arg Ser Ser Phe
380 385 390 395
tac tgc ctg gaa tac ttt cct tcg cag atg ctg aga acc ggc aac aac 3353
Tyr Cys Leu Glu Tyr Phe Pro Ser Gln Met Leu Arg Thr Gly Asn Asn
400 405 410
ttc cag ttt act tac acc ttc gag gac gtg cct ttc cac agc agc tac 3401
Phe Gln Phe Thr Tyr Thr Phe Glu Asp Val Pro Phe His Ser Ser Tyr
415 420 425
gcc cac agc cag agc ttg gac cgg ctg atg aat cct ctg att gac cag 3449
Ala His Ser Gln Ser Leu Asp Arg Leu Met Asn Pro Leu Ile Asp Gln
430 435 440
tac ctg tac tac ttg tct cgg act caa aca aca gga ggc acg gca aat 3497
Tyr Leu Tyr Tyr Leu Ser Arg Thr Gln Thr Thr Gly Gly Thr Ala Asn
445 450 455
acg cag act ctg ggc ttc agc caa ggt ggg cct aat aca atg gcc aat 3545
Thr Gln Thr Leu Gly Phe Ser Gln Gly Gly Pro Asn Thr Met Ala Asn
460 465 470 475
cag gca aag aac tgg ctg cca gga ccc tgt tac cgc caa caa cgc gtc 3593
Gln Ala Lys Asn Trp Leu Pro Gly Pro Cys Tyr Arg Gln Gln Arg Val
480 485 490
tca acg aca acc ggg caa aac aac aat agc aac ttt gcc tgg act gct 3641
Ser Thr Thr Thr Gly Gln Asn Asn Asn Ser Asn Phe Ala Trp Thr Ala
495 500 505
ggg acc aaa tac cat ctg aat gga aga aat tca ttg gct aat cct ggc 3689
Gly Thr Lys Tyr His Leu Asn Gly Arg Asn Ser Leu Ala Asn Pro Gly
510 515 520
atc gct atg gca aca cac aaa gac gac gag gag cgt ttt ttt ccc agt 3737
Ile Ala Met Ala Thr His Lys Asp Asp Glu Glu Arg Phe Phe Pro Ser
525 530 535
aac ggg atc ctg att ttt ggc aaa caa aat gct gcc aga gac aat gcg 3785
Asn Gly Ile Leu Ile Phe Gly Lys Gln Asn Ala Ala Arg Asp Asn Ala
540 545 550 555
gat tac agc gat gtc atg ctc acc agc gag gaa gaa atc aaa acc act 3833
Asp Tyr Ser Asp Val Met Leu Thr Ser Glu Glu Glu Ile Lys Thr Thr
560 565 570
aac cct gtg gct aca gag gaa tac ggt atc gtg gca gat aac ttg cag 3881
Asn Pro Val Ala Thr Glu Glu Tyr Gly Ile Val Ala Asp Asn Leu Gln
575 580 585
cag caa aac acg gct cct caa att gga act gtc aac agc cag ggg gcc 3929
Gln Gln Asn Thr Ala Pro Gln Ile Gly Thr Val Asn Ser Gln Gly Ala
590 595 600
tta ccc ggt atg gtc tgg cag aac cgg gac gtg tac ctg cag ggt ccc 3977
Leu Pro Gly Met Val Trp Gln Asn Arg Asp Val Tyr Leu Gln Gly Pro
605 610 615
atc tgg gcc aag att cct cac acg gac ggc aac ttc cac ccg tct ccg 4025
Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe His Pro Ser Pro
620 625 630 635
ctg atg ggc ggc ttt ggc ctg aaa cat cct ccg cct cag atc ctg atc 4073
Leu Met Gly Gly Phe Gly Leu Lys His Pro Pro Pro Gln Ile Leu Ile
640 645 650
aag aac acg cct gta cct gcg gat cct ccg acc acc ttc aac cag tca 4121
Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Thr Phe Asn Gln Ser
655 660 665
aag ctg aac tct ttc atc acg caa tac agc acc gga cag gtc agc gtg 4169
Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val Ser Val
670 675 680
gaa att gaa tgg gag ctg cag aag gaa aac agc aag cgc tgg aac ccc 4217
Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro
685 690 695
gag atc cag tac acc tcc aac tac tac aaa tct aca agt gtg gac ttt 4265
Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser Thr Ser Val Asp Phe
700 705 710 715
gct gtt aat aca gaa ggc gtg tac tct gaa ccc cgc ccc att ggc acc 4313
Ala Val Asn Thr Glu Gly Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr
720 725 730
cgt tac ctc acc cgt aat ctg taa ttgcctgtta atcaataaac cggttgattc 4367
Arg Tyr Leu Thr Arg Asn Leu
735
gtttcagttg aactttggtc tctgcg 4393
<210> 28
<211> 738
<212> PRT
<213> adeno-associated virus 8
<400> 28
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Lys Pro
20 25 30
Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Gln Gln Leu Gln Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
Pro Val Glu Pro Ser Pro Gln Arg Ser Pro Asp Ser Ser Thr Gly Ile
145 150 155 160
Gly Lys Lys Gly Gln Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln
165 170 175
Thr Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro
180 185 190
Pro Ala Ala Pro Ser Gly Val Gly Pro Asn Thr Met Ala Ala Gly Gly
195 200 205
Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser
210 215 220
Ser Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val
225 230 235 240
Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His
245 250 255
Leu Tyr Lys Gln Ile Ser Asn Gly Thr Ser Gly Gly Ala Thr Asn Asp
260 265 270
Asn Thr Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn
275 280 285
Arg Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn
290 295 300
Asn Asn Trp Gly Phe Arg Pro Lys Arg Leu Ser Phe Lys Leu Phe Asn
305 310 315 320
Ile Gln Val Lys Glu Val Thr Gln Asn Glu Gly Thr Lys Thr Ile Ala
325 330 335
Asn Asn Leu Thr Ser Thr Ile Gln Val Phe Thr Asp Ser Glu Tyr Gln
340 345 350
Leu Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe
355 360 365
Pro Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn
370 375 380
Asn Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr
385 390 395 400
Phe Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Thr Tyr
405 410 415
Thr Phe Glu Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser
420 425 430
Leu Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu
435 440 445
Ser Arg Thr Gln Thr Thr Gly Gly Thr Ala Asn Thr Gln Thr Leu Gly
450 455 460
Phe Ser Gln Gly Gly Pro Asn Thr Met Ala Asn Gln Ala Lys Asn Trp
465 470 475 480
Leu Pro Gly Pro Cys Tyr Arg Gln Gln Arg Val Ser Thr Thr Thr Gly
485 490 495
Gln Asn Asn Asn Ser Asn Phe Ala Trp Thr Ala Gly Thr Lys Tyr His
500 505 510
Leu Asn Gly Arg Asn Ser Leu Ala Asn Pro Gly Ile Ala Met Ala Thr
515 520 525
His Lys Asp Asp Glu Glu Arg Phe Phe Pro Ser Asn Gly Ile Leu Ile
530 535 540
Phe Gly Lys Gln Asn Ala Ala Arg Asp Asn Ala Asp Tyr Ser Asp Val
545 550 555 560
Met Leu Thr Ser Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr
565 570 575
Glu Glu Tyr Gly Ile Val Ala Asp Asn Leu Gln Gln Gln Asn Thr Ala
580 585 590
Pro Gln Ile Gly Thr Val Asn Ser Gln Gly Ala Leu Pro Gly Met Val
595 600 605
Trp Gln Asn Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile
610 615 620
Pro His Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe
625 630 635 640
Gly Leu Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val
645 650 655
Pro Ala Asp Pro Pro Thr Thr Phe Asn Gln Ser Lys Leu Asn Ser Phe
660 665 670
Ile Thr Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu
675 680 685
Leu Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr
690 695 700
Ser Asn Tyr Tyr Lys Ser Thr Ser Val Asp Phe Ala Val Asn Thr Glu
705 710 715 720
Gly Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg
725 730 735
Asn Leu
<210> 29
<211> 4385
<212> DNA
<213> adeno-associated virus 9
<220>
<221> CDS
<222> (2116)..(4329)
<223> AAV9 VP1
<220>
<221> misc_feature
<222> (2527)..(4329)
<223> AAV9 VP2
<220>
<221> misc_feature
<222> (2725)..(4329)
<223> AAV9 VP3
<400> 29
cagagaggga gtggccaact ccatcactag gggtaatcgc gaagcgcctc ccacgctgcc 60
gcgtcagcgc tgacgtagat tacgtcatag gggagtggtc ctgtattagc tgtcacgtga 120
gtgcttttgc gacattttgc gacaccacat ggccatttga ggtatatatg gccgagtgag 180
cgagcaggat ctccattttg accgcgaaat ttgaacgagc agcagccatg ccgggcttct 240
acgagattgt gatcaaggtg ccgagcgacc tggacgagca cctgccgggc atttctgact 300
cttttgtgaa ctgggtggcc gagaaggaat gggagctgcc cccggattct gacatggatc 360
ggaatctgat cgagcaggca cccctgaccg tggccgagaa gctgcagcgc gacttcctgg 420
tccaatggcg ccgcgtgagt aaggccccgg aggccctctt ctttgttcag ttcgagaagg 480
gcgagagcta ctttcacctg cacgttctgg tcgagaccac gggggtcaag tccatggtgc 540
taggccgctt cctgagtcag attcgggaga agctggtcca gaccatctac cgcgggatcg 600
agccgaccct gcccaactgg ttcgcggtga ccaagacgcg taatggcgcc ggcgggggga 660
acaaggtggt ggacgagtgc tacatcccca actacctcct gcccaagact cagcccgagc 720
tgcagtgggc gtggactaac atggaggagt atataagcgc gtgcttgaac ctggccgagc 780
gcaaacggct cgtggcgcag cacctgaccc acgtcagcca gacgcaggag cagaacaagg 840
agaatctgaa ccccaattct gacgcgcccg tgatcaggtc aaaaacctcc gcgcgctaca 900
tggagctggt cgggtggctg gtggaccggg gcatcacctc cgagaagcag tggatccagg 960
aggaccaggc ctcgtacatc tccttcaacg ccgcctccaa ctcgcggtcc cagatcaagg 1020
ccgcgctgga caatgccggc aagatcatgg cgctgaccaa atccgcgccc gactacctgg 1080
taggcccttc acttccggtg gacattacgc agaaccgcat ctaccgcatc ctgcagctca 1140
acggctacga ccctgcctac gccggctccg tctttctcgg ctgggcacaa aagaagttcg 1200
ggaaacgcaa caccatctgg ctgtttgggc cggccaccac gggaaagacc aacatcgcag 1260
aagccattgc ccacgccgtg cccttctacg gctgcgtcaa ctggaccaat gagaactttc 1320
ccttcaacga ttgcgtcgac aagatggtga tctggtggga ggagggcaag atgacggcca 1380
aggtcgtgga gtccgccaag gccattctcg gcggcagcaa ggtgcgcgtg gaccaaaagt 1440
gcaagtcgtc cgcccagatc gaccccactc ccgtgatcgt cacctccaac accaacatgt 1500
gcgccgtgat tgacgggaac agcaccacct tcgagcacca gcagcctctc caggaccgga 1560
tgtttaagtt cgaactcacc cgccgtctgg agcacgactt tggcaaggtg acaaagcagg 1620
aagtcaaaga gttcttccgc tgggccagtg atcacgtgac cgaggtggcg catgagtttt 1680
acgtcagaaa gggcggagcc agcaaaagac ccgcccccga tgacgcggat aaaagcgagc 1740
ccaagcgggc ctgcccctca gtcgcggatc catcgacgtc agacgcggaa ggagctccgg 1800
tggactttgc cgacaggtac caaaacaaat gttctcgtca cgcgggcatg cttcagatgc 1860
tgcttccctg caaaacgtgc gagagaatga atcagaattt caacatttgc ttcacacacg 1920
gggtcagaga ctgctcagag tgtttccccg gcgtgtcaga atctcaaccg gtcgtcagaa 1980
agaggacgta tcggaaactc tgtgcgattc atcatctgct ggggcgggct cccgagattg 2040
cttgctcggc ctgcgatctg gtcaacgtgg acctggatga ctgtgtttct gagcaataaa 2100
tgacttaaac caggt atg gct gcc gat ggt tat ctt cca gat tgg ctc gag 2151
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu
1 5 10
gac aac ctc tct gag ggc att cgc gag tgg tgg gcg ctg aaa cct gga 2199
Asp Asn Leu Ser Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly
15 20 25
gcc ccg aag ccc aaa gcc aac cag caa aag cag gac gac ggc cgg ggt 2247
Ala Pro Lys Pro Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly
30 35 40
ctg gtg ctt cct ggc tac aag tac ctc gga ccc ttc aac gga ctc gac 2295
Leu Val Leu Pro Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp
45 50 55 60
aag ggg gag ccc gtc aac gcg gcg gac gca gcg gcc ctc gag cac ggc 2343
Lys Gly Glu Pro Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Gly
65 70 75
aag gcc tac gac cag cag ctg cag gcg ggt gac aat ccg tac ctg cgg 2391
Lys Ala Tyr Asp Gln Gln Leu Gln Ala Gly Asp Asn Pro Tyr Leu Arg
80 85 90
tat aac cac gcc gac gcc gag ttt cag gag cgt ctg caa gaa gat acg 2439
Tyr Asn His Ala Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr
95 100 105
tct ttt ggg ggc aac ctc ggg cga gca gtc ttc cag gcc aag aag cgg 2487
Ser Phe Gly Gly Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg
110 115 120
gtt ctc gaa cct ctc ggt ctg gtt gag gaa ggc gct aag acg gct cct 2535
Val Leu Glu Pro Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro
125 130 135 140
gga aag aag aga ccg gta gag cca tca ccc cag cgt tct cca gac tcc 2583
Gly Lys Lys Arg Pro Val Glu Pro Ser Pro Gln Arg Ser Pro Asp Ser
145 150 155
tct acg ggc atc ggc aag aaa ggc caa cag ccc gcc aga aaa aga ctc 2631
Ser Thr Gly Ile Gly Lys Lys Gly Gln Gln Pro Ala Arg Lys Arg Leu
160 165 170
aat ttt ggt cag act ggc gac tca gag tca gtt cca gac cct caa cct 2679
Asn Phe Gly Gln Thr Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro
175 180 185
ctc gga gaa cct cca gca gcg ccc tct ggt gtg gga cct aat aca atg 2727
Leu Gly Glu Pro Pro Ala Ala Pro Ser Gly Val Gly Pro Asn Thr Met
190 195 200
gct gca ggc ggt ggc gca cca atg gca gac aat aac gaa ggc gcc gac 2775
Ala Ala Gly Gly Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp
205 210 215 220
gga gtg ggt aat tcc tcg gga aat tgg cat tgc gat tcc aca tgg ctg 2823
Gly Val Gly Asn Ser Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu
225 230 235
ggg gac aga gtc atc acc acc agc acc cga acc tgg gca ttg ccc acc 2871
Gly Asp Arg Val Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr
240 245 250
tac aac aac cac ctc tac aag caa atc tcc aat gga aca tcg gga gga 2919
Tyr Asn Asn His Leu Tyr Lys Gln Ile Ser Asn Gly Thr Ser Gly Gly
255 260 265
agc acc aac gac aac acc tac ttt ggc tac agc acc ccc tgg ggg tat 2967
Ser Thr Asn Asp Asn Thr Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr
270 275 280
ttt gac ttc aac aga ttc cac tgc cac ttc tca cca cgt gac tgg cag 3015
Phe Asp Phe Asn Arg Phe His Cys His Phe Ser Pro Arg Asp Trp Gln
285 290 295 300
cga ctc atc aac aac aac tgg gga ttc cgg cca aag aga ctc aac ttc 3063
Arg Leu Ile Asn Asn Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe
305 310 315
aag ctg ttc aac atc cag gtc aag gag gtt acg acg aac gaa ggc acc 3111
Lys Leu Phe Asn Ile Gln Val Lys Glu Val Thr Thr Asn Glu Gly Thr
320 325 330
aag acc atc gcc aat aac ctt acc agc acc gtc cag gtc ttt acg gac 3159
Lys Thr Ile Ala Asn Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp
335 340 345
tcg gag tac cag cta ccg tac gtc cta ggc tct gcc cac caa gga tgc 3207
Ser Glu Tyr Gln Leu Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys
350 355 360
ctg cca ccg ttt cct gca gac gtc ttc atg gtt cct cag tac ggc tac 3255
Leu Pro Pro Phe Pro Ala Asp Val Phe Met Val Pro Gln Tyr Gly Tyr
365 370 375 380
ctg acg ctc aac aat gga agt caa gcg tta gga cgt tct tct ttc tac 3303
Leu Thr Leu Asn Asn Gly Ser Gln Ala Leu Gly Arg Ser Ser Phe Tyr
385 390 395
tgt ctg gaa tac ttc cct tct cag atg ctg aga acc ggc aac aac ttt 3351
Cys Leu Glu Tyr Phe Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe
400 405 410
cag ttc agc tac act ttc gag gac gtg cct ttc cac agc agc tac gca 3399
Gln Phe Ser Tyr Thr Phe Glu Asp Val Pro Phe His Ser Ser Tyr Ala
415 420 425
cac agc cag agt cta gat cga ctg atg aac ccc ctc atc gac cag tac 3447
His Ser Gln Ser Leu Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr
430 435 440
cta tac tac ctg gtc aga aca cag aca act gga act ggg gga act caa 3495
Leu Tyr Tyr Leu Val Arg Thr Gln Thr Thr Gly Thr Gly Gly Thr Gln
445 450 455 460
act ttg gca ttc agc caa gca ggc cct agc tca atg gcc aat cag gct 3543
Thr Leu Ala Phe Ser Gln Ala Gly Pro Ser Ser Met Ala Asn Gln Ala
465 470 475
aga aac tgg gta ccc ggg cct tgc tac cgt cag cag cgc gtc tcc aca 3591
Arg Asn Trp Val Pro Gly Pro Cys Tyr Arg Gln Gln Arg Val Ser Thr
480 485 490
acc acc aac caa aat aac aac agc aac ttt gcg tgg acg gga gct gct 3639
Thr Thr Asn Gln Asn Asn Asn Ser Asn Phe Ala Trp Thr Gly Ala Ala
495 500 505
aaa ttc aag ctg aac ggg aga gac tcg cta atg aat cct ggc gtg gct 3687
Lys Phe Lys Leu Asn Gly Arg Asp Ser Leu Met Asn Pro Gly Val Ala
510 515 520
atg gca tcg cac aaa gac gac gag gac cgc ttc ttt cca tca agt ggc 3735
Met Ala Ser His Lys Asp Asp Glu Asp Arg Phe Phe Pro Ser Ser Gly
525 530 535 540
gtt ctc ata ttt ggc aag caa gga gcc ggg aac gat gga gtc gac tac 3783
Val Leu Ile Phe Gly Lys Gln Gly Ala Gly Asn Asp Gly Val Asp Tyr
545 550 555
agc cag gtg ctg att aca gat gag gaa gaa att aaa gcc acc aac cct 3831
Ser Gln Val Leu Ile Thr Asp Glu Glu Glu Ile Lys Ala Thr Asn Pro
560 565 570
gta gcc aca gag gaa tac gga gca gtg gcc atc aac aac cag gcc gct 3879
Val Ala Thr Glu Glu Tyr Gly Ala Val Ala Ile Asn Asn Gln Ala Ala
575 580 585
aac acg cag gcg caa act gga ctt gtg cat aac cag gga gtt att cct 3927
Asn Thr Gln Ala Gln Thr Gly Leu Val His Asn Gln Gly Val Ile Pro
590 595 600
ggt atg gtc tgg cag aac cgg gac gtg tac ctg cag ggc cct att tgg 3975
Gly Met Val Trp Gln Asn Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp
605 610 615 620
gct aaa ata cct cac aca gat ggc aac ttt cac ccg tct cct ctg atg 4023
Ala Lys Ile Pro His Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met
625 630 635
ggt gga ttt gga ctg aaa cac cca cct cca cag att cta att aaa aat 4071
Gly Gly Phe Gly Leu Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn
640 645 650
aca cca gtg ccg gca gat cct cct ctt acc ttc aat caa gcc aag ctg 4119
Thr Pro Val Pro Ala Asp Pro Pro Leu Thr Phe Asn Gln Ala Lys Leu
655 660 665
aac tct ttc atc acg cag tac agc acg gga caa gtc agc gtg gaa atc 4167
Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile
670 675 680
gag tgg gag ctg cag aaa gaa aac agc aag cgc tgg aat cca gag atc 4215
Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile
685 690 695 700
cag tat act tca aac tac tac aaa tct aca aat gtg gac ttt gct gtc 4263
Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser Thr Asn Val Asp Phe Ala Val
705 710 715
aat acc aaa ggt gtt tac tct gag cct cgc ccc att ggt act cgt tac 4311
Asn Thr Lys Gly Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr
720 725 730
ctc acc cgt aat ttg taa ttgcctgtta atcaataaac cggttaattc 4359
Leu Thr Arg Asn Leu
735
gtttcagttg aactttggtc tctgcg 4385
<210> 30
<211> 737
<212> PRT
<213> adeno-associated virus 9
<400> 30
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Lys Pro
20 25 30
Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Gly Lys Ala Tyr Asp
65 70 75 80
Gln Gln Leu Gln Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
Pro Val Glu Pro Ser Pro Gln Arg Ser Pro Asp Ser Ser Thr Gly Ile
145 150 155 160
Gly Lys Lys Gly Gln Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln
165 170 175
Thr Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro
180 185 190
Pro Ala Ala Pro Ser Gly Val Gly Pro Asn Thr Met Ala Ala Gly Gly
195 200 205
Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn
210 215 220
Ser Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val
225 230 235 240
Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His
245 250 255
Leu Tyr Lys Gln Ile Ser Asn Gly Thr Ser Gly Gly Ser Thr Asn Asp
260 265 270
Asn Thr Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn
275 280 285
Arg Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn
290 295 300
Asn Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn
305 310 315 320
Ile Gln Val Lys Glu Val Thr Thr Asn Glu Gly Thr Lys Thr Ile Ala
325 330 335
Asn Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Glu Tyr Gln
340 345 350
Leu Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe
355 360 365
Pro Ala Asp Val Phe Met Val Pro Gln Tyr Gly Tyr Leu Thr Leu Asn
370 375 380
Asn Gly Ser Gln Ala Leu Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr
385 390 395 400
Phe Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr
405 410 415
Thr Phe Glu Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser
420 425 430
Leu Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu
435 440 445
Val Arg Thr Gln Thr Thr Gly Thr Gly Gly Thr Gln Thr Leu Ala Phe
450 455 460
Ser Gln Ala Gly Pro Ser Ser Met Ala Asn Gln Ala Arg Asn Trp Val
465 470 475 480
Pro Gly Pro Cys Tyr Arg Gln Gln Arg Val Ser Thr Thr Thr Asn Gln
485 490 495
Asn Asn Asn Ser Asn Phe Ala Trp Thr Gly Ala Ala Lys Phe Lys Leu
500 505 510
Asn Gly Arg Asp Ser Leu Met Asn Pro Gly Val Ala Met Ala Ser His
515 520 525
Lys Asp Asp Glu Asp Arg Phe Phe Pro Ser Ser Gly Val Leu Ile Phe
530 535 540
Gly Lys Gln Gly Ala Gly Asn Asp Gly Val Asp Tyr Ser Gln Val Leu
545 550 555 560
Ile Thr Asp Glu Glu Glu Ile Lys Ala Thr Asn Pro Val Ala Thr Glu
565 570 575
Glu Tyr Gly Ala Val Ala Ile Asn Asn Gln Ala Ala Asn Thr Gln Ala
580 585 590
Gln Thr Gly Leu Val His Asn Gln Gly Val Ile Pro Gly Met Val Trp
595 600 605
Gln Asn Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro
610 615 620
His Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly
625 630 635 640
Leu Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro
645 650 655
Ala Asp Pro Pro Leu Thr Phe Asn Gln Ala Lys Leu Asn Ser Phe Ile
660 665 670
Thr Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu
675 680 685
Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser
690 695 700
Asn Tyr Tyr Lys Ser Thr Asn Val Asp Phe Ala Val Asn Thr Lys Gly
705 710 715 720
Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn
725 730 735
Leu
<210> 31
<211> 4102
<212> DNA
<213> adeno-associated virus 10
<220>
<221> CDS
<222> (1886)..(4102)
<223> AAV10 VP1
<220>
<221> misc_feature
<222> (2297)..(4102)
<223> AAV10 VP2
<220>
<221> misc_feature
<222> (2495)..(4102)
<223> AAV10 VP3
<400> 31
atgccgggct tctacgagat cgtgatcaag gtgccgagcg acctggacga gcacctgccg 60
ggcatttctg actcgtttgt gaactgggtg gccgagaagg aatgggagct gcccccggat 120
tctgacatgg atcggaatct gatcgagcag gcacccctga ccgtggccga gaagctgcag 180
cgcgacttcc tggtccactg gcgccgcgtg agtaaggccc cggaggccct cttctttgtt 240
cagttcgaga agggcgagtc ctactttcac ctgcacgttc tggtcgagac cacgggggtc 300
aagtccatgg tcctgggccg cttcctgagt cagatcagag acaggctggt gcagaccatc 360
taccgcgggg tagagcccac gctgcccaac tggttcgcgg tgaccaagac gcgaaatggc 420
gccggcgggg ggaacaaggt ggtggacgag tgctacatcc ccaactacct cctgcccaag 480
acgcagcccg agctgcagtg ggcgtggact aacatggagg agtatataag cgcgtgtctg 540
aacctcgcgg agcgtaaacg gctcgtggcg cagcacctga cccacgtcag ccagacgcag 600
gagcagaaca aggagaatct gaacccgaat tctgacgcgc ccgtgatcag gtcaaaaacc 660
tccgcgcgct acatggagct ggtcgggtgg ctggtggacc ggggcatcac ctccgagaag 720
cagtggatcc aggaggacca ggcctcgtac atctccttca acgccgcctc caactcgcgg 780
tcccagatca aggccgcgct ggacaatgcc ggaaagatca tggcgctgac caaatccgcg 840
cccgactacc tggtaggccc gtccttaccc gcggacatta aggccaaccg catctaccgc 900
atcctggagc tcaacggcta cgaccccgcc tacgccggct ccgtcttcct gggctgggcg 960
cagaaaaagt tcggtaaaag gaatacaatt tggctgttcg ggcccgccac caccggcaag 1020
accaacatcg cggaagccat cgcccacgcc gtgcccttct acggctgcgt caactggacc 1080
aatgagaact ttcccttcaa cgattgcgtc gacaagatgg tgatctggtg ggaggagggc 1140
aagatgaccg ccaaggtcgt ggagtccgcc aaggccattc tgggcggaag caaggtgcgc 1200
gtcgaccaaa agtgcaagtc ctcggcccag atcgacccca cgcccgtgat cgtcacctcc 1260
aacaccaaca tgtgcgccgt gatcgacggg aacagcacca ccttcgagca ccagcagccc 1320
ctgcaggacc gcatgttcaa gttcgagctc acccgccgtc tggagcacga ctttggcaag 1380
gtgaccaagc aggaagtcaa agagttcttc cgctgggctc aggatcacgt gactgaggtg 1440
acgcatgagt tctacgtcag aaagggcgga gccaccaaaa gacccgcccc cagtgacgcg 1500
gatataagcg agcccaagcg ggcctgcccc tcagttgcgg agccatcgac gtcagacgcg 1560
gaagcaccgg tggactttgc ggacaggtac caaaacaaat gttctcgtca cgcgggcatg 1620
cttcagatgc tgtttccctg caagacatgc gagagaatga atcagaattt caacgtctgc 1680
ttcacgcacg gggtcagaga ctgctcagag tgcttccccg gcgcgtcaga atctcaacct 1740
gtcgtcagaa aaaagacgta tcagaaactg tgcgcgattc atcatctgct ggggcgggca 1800
cccgagattg cgtgttcggc ctgcgatctc gtcaacgtgg acttggatga ctgtgtttct 1860
gagcaataaa tgacttaaac caggt atg gct gct gac ggt tat ctt cca gat 1912
Met Ala Ala Asp Gly Tyr Leu Pro Asp
1 5
tgg ctc gag gac aac ctc tct gag ggc att cgc gag tgg tgg gac ctg 1960
Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile Arg Glu Trp Trp Asp Leu
10 15 20 25
aaa cct gga gcc ccc aag ccc aag gcc aac cag cag aag cag gac gac 2008
Lys Pro Gly Ala Pro Lys Pro Lys Ala Asn Gln Gln Lys Gln Asp Asp
30 35 40
ggc cgg ggt ctg gtg ctt cct ggc tac aag tac ctc gga ccc ttc aac 2056
Gly Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr Leu Gly Pro Phe Asn
45 50 55
gga ctc gac aag ggg gag ccc gtc aac gcg gcg gac gca gcg gcc ctc 2104
Gly Leu Asp Lys Gly Glu Pro Val Asn Ala Ala Asp Ala Ala Ala Leu
60 65 70
gag cac gac aag gcc tac gac cag cag ctc aaa gcg ggt gac aat ccg 2152
Glu His Asp Lys Ala Tyr Asp Gln Gln Leu Lys Ala Gly Asp Asn Pro
75 80 85
tac ctg cgg tat aac cac gcc gac gcc gag ttt cag gag cgt ctg caa 2200
Tyr Leu Arg Tyr Asn His Ala Asp Ala Glu Phe Gln Glu Arg Leu Gln
90 95 100 105
gaa gat acg tct ttt ggg ggc aac ctc ggg cga gca gtc ttc cag gcc 2248
Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg Ala Val Phe Gln Ala
110 115 120
aag aag cgg gtt ctc gaa cct ctc ggt ctg gtt gag gaa gct gct aag 2296
Lys Lys Arg Val Leu Glu Pro Leu Gly Leu Val Glu Glu Ala Ala Lys
125 130 135
acg gct cct gga aag aag aga ccg gta gaa ccg tca cct cag cgt tcc 2344
Thr Ala Pro Gly Lys Lys Arg Pro Val Glu Pro Ser Pro Gln Arg Ser
140 145 150
ccc gac tcc tcc acg ggc atc ggc aag aaa ggc cag cag ccc gct aaa 2392
Pro Asp Ser Ser Thr Gly Ile Gly Lys Lys Gly Gln Gln Pro Ala Lys
155 160 165
aag aga ctg aac ttt ggg cag act ggc gag tca gag tca gtc ccc gac 2440
Lys Arg Leu Asn Phe Gly Gln Thr Gly Glu Ser Glu Ser Val Pro Asp
170 175 180 185
cct caa cca atc gga gaa cca cca gca ggc ccc tct ggt ctg gga tct 2488
Pro Gln Pro Ile Gly Glu Pro Pro Ala Gly Pro Ser Gly Leu Gly Ser
190 195 200
ggt aca atg gct gca ggc ggt ggc gct cca atg gca gac aat aac gaa 2536
Gly Thr Met Ala Ala Gly Gly Gly Ala Pro Met Ala Asp Asn Asn Glu
205 210 215
ggc gcc gac gga gtg ggt agt tcc tca gga aat tgg cat tgc gat tcc 2584
Gly Ala Asp Gly Val Gly Ser Ser Ser Gly Asn Trp His Cys Asp Ser
220 225 230
aca tgg ctg ggc gac aga gtc atc acc acc agc acc cga acc tgg gcc 2632
Thr Trp Leu Gly Asp Arg Val Ile Thr Thr Ser Thr Arg Thr Trp Ala
235 240 245
ctg ccc acc tac aac aac cac ctc tac aag caa atc tcc aac ggg aca 2680
Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile Ser Asn Gly Thr
250 255 260 265
tcg gga gga agc acc aac gac aac acc tac ttc ggc tac agc acc ccc 2728
Ser Gly Gly Ser Thr Asn Asp Asn Thr Tyr Phe Gly Tyr Ser Thr Pro
270 275 280
tgg ggg tat ttt gac ttc aac aga ttc cac tgc cac ttc tca cca cgt 2776
Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His Phe Ser Pro Arg
285 290 295
gac tgg cag cga ctc atc aac aac aac tgg gga ttc cgg cca aaa aga 2824
Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe Arg Pro Lys Arg
300 305 310
ctc agc ttc aag ctc ttc aac atc cag gtc aag gag gtc acg cag aat 2872
Leu Ser Phe Lys Leu Phe Asn Ile Gln Val Lys Glu Val Thr Gln Asn
315 320 325
gaa ggc acc aag acc atc gcc aat aac ctt acc agc acg att cag gta 2920
Glu Gly Thr Lys Thr Ile Ala Asn Asn Leu Thr Ser Thr Ile Gln Val
330 335 340 345
ttt acg gac tcg gaa tac cag ctg ccg tac gtc ctc ggc tcc gcg cac 2968
Phe Thr Asp Ser Glu Tyr Gln Leu Pro Tyr Val Leu Gly Ser Ala His
350 355 360
cag ggc tgc ctg cct ccg ttc ccg gcg gat gtc ttc atg att ccc cag 3016
Gln Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe Met Ile Pro Gln
365 370 375
tac ggc tac ctg aca ctg aac aat gga agt caa gcc gta ggc cgt tcc 3064
Tyr Gly Tyr Leu Thr Leu Asn Asn Gly Ser Gln Ala Val Gly Arg Ser
380 385 390
tcc ttc tac tgc ctg gaa tat ttt cca tct caa atg ctg cga act gga 3112
Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln Met Leu Arg Thr Gly
395 400 405
aac aat ttt gaa ttc agc tac acc ttc gag gac gtg cct ttc cac agc 3160
Asn Asn Phe Glu Phe Ser Tyr Thr Phe Glu Asp Val Pro Phe His Ser
410 415 420 425
agc tac gca cac agc cag agc ttg gac cga ctg atg aat cct ctc att 3208
Ser Tyr Ala His Ser Gln Ser Leu Asp Arg Leu Met Asn Pro Leu Ile
430 435 440
gac cag tac ctg tac tac tta tcc aga act cag tcc aca gga gga act 3256
Asp Gln Tyr Leu Tyr Tyr Leu Ser Arg Thr Gln Ser Thr Gly Gly Thr
445 450 455
caa ggt acc cag caa ttg tta ttt tct caa gct ggg cct gca aac atg 3304
Gln Gly Thr Gln Gln Leu Leu Phe Ser Gln Ala Gly Pro Ala Asn Met
460 465 470
tcg gct cag gcc aag aac tgg ctg cct gga cct tgc tac cgg cag cag 3352
Ser Ala Gln Ala Lys Asn Trp Leu Pro Gly Pro Cys Tyr Arg Gln Gln
475 480 485
cga gtc tcc acg aca ctg tcg caa aac aac aac agc aac ttt gct tgg 3400
Arg Val Ser Thr Thr Leu Ser Gln Asn Asn Asn Ser Asn Phe Ala Trp
490 495 500 505
act ggt gcc acc aaa tat cac ctg aac gga aga gac tct ctg gtg aat 3448
Thr Gly Ala Thr Lys Tyr His Leu Asn Gly Arg Asp Ser Leu Val Asn
510 515 520
ccc ggt gtc gcc atg gca acc cac aag gac gac gag gaa cgc ttc ttc 3496
Pro Gly Val Ala Met Ala Thr His Lys Asp Asp Glu Glu Arg Phe Phe
525 530 535
ccg tcg agc gga gtc ctg atg ttt gga aaa cag ggt gct gga aga gac 3544
Pro Ser Ser Gly Val Leu Met Phe Gly Lys Gln Gly Ala Gly Arg Asp
540 545 550
aat gtg gac tac agc agc gtt atg cta aca agc gaa gaa gaa att aaa 3592
Asn Val Asp Tyr Ser Ser Val Met Leu Thr Ser Glu Glu Glu Ile Lys
555 560 565
acc act aac cct gta gcc aca gaa caa tac ggc gtg gtg gct gac aac 3640
Thr Thr Asn Pro Val Ala Thr Glu Gln Tyr Gly Val Val Ala Asp Asn
570 575 580 585
ttg cag caa gcc aat aca ggg cct att gtg gga aat gtc aac agc caa 3688
Leu Gln Gln Ala Asn Thr Gly Pro Ile Val Gly Asn Val Asn Ser Gln
590 595 600
gga gcc tta cct ggc atg gtc tgg cag aac cga gac gtg tac ctg cag 3736
Gly Ala Leu Pro Gly Met Val Trp Gln Asn Arg Asp Val Tyr Leu Gln
605 610 615
ggt ccc atc tgg gcc aag att cct cac acg gac ggc aac ttt cac ccg 3784
Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe His Pro
620 625 630
tct cct ctg atg ggc ggc ttt gga ctt aaa cac ccg cct cca cag atc 3832
Ser Pro Leu Met Gly Gly Phe Gly Leu Lys His Pro Pro Pro Gln Ile
635 640 645
ctg atc aag aac acg ccg gta cct gcg gat cct cca aca acg ttc agc 3880
Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Thr Phe Ser
650 655 660 665
cag gcg aaa ttg gct tcc ttc atc acg cag tac agc acc gga cag gtc 3928
Gln Ala Lys Leu Ala Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val
670 675 680
agc gtg gaa atc gag tgg gag ctg cag aag gag aac agc aaa cgc tgg 3976
Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys Arg Trp
685 690 695
aac cca gag att cag tac act tca aac tac tac aaa tct aca aat gtg 4024
Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser Thr Asn Val
700 705 710
gac ttt gct gtc aat aca gag gga act tat tct gag cct cgc ccc att 4072
Asp Phe Ala Val Asn Thr Glu Gly Thr Tyr Ser Glu Pro Arg Pro Ile
715 720 725
ggt act cgt tat ctg aca cgt aat ctg taa 4102
Gly Thr Arg Tyr Leu Thr Arg Asn Leu
730 735
<210> 32
<211> 738
<212> PRT
<213> adeno-associated virus 10
<400> 32
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Asp Leu Lys Pro Gly Ala Pro Lys Pro
20 25 30
Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
Pro Val Glu Pro Ser Pro Gln Arg Ser Pro Asp Ser Ser Thr Gly Ile
145 150 155 160
Gly Lys Lys Gly Gln Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln
165 170 175
Thr Gly Glu Ser Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro
180 185 190
Pro Ala Gly Pro Ser Gly Leu Gly Ser Gly Thr Met Ala Ala Gly Gly
195 200 205
Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser
210 215 220
Ser Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val
225 230 235 240
Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His
245 250 255
Leu Tyr Lys Gln Ile Ser Asn Gly Thr Ser Gly Gly Ser Thr Asn Asp
260 265 270
Asn Thr Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn
275 280 285
Arg Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn
290 295 300
Asn Asn Trp Gly Phe Arg Pro Lys Arg Leu Ser Phe Lys Leu Phe Asn
305 310 315 320
Ile Gln Val Lys Glu Val Thr Gln Asn Glu Gly Thr Lys Thr Ile Ala
325 330 335
Asn Asn Leu Thr Ser Thr Ile Gln Val Phe Thr Asp Ser Glu Tyr Gln
340 345 350
Leu Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe
355 360 365
Pro Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn
370 375 380
Asn Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr
385 390 395 400
Phe Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Glu Phe Ser Tyr
405 410 415
Thr Phe Glu Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser
420 425 430
Leu Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu
435 440 445
Ser Arg Thr Gln Ser Thr Gly Gly Thr Gln Gly Thr Gln Gln Leu Leu
450 455 460
Phe Ser Gln Ala Gly Pro Ala Asn Met Ser Ala Gln Ala Lys Asn Trp
465 470 475 480
Leu Pro Gly Pro Cys Tyr Arg Gln Gln Arg Val Ser Thr Thr Leu Ser
485 490 495
Gln Asn Asn Asn Ser Asn Phe Ala Trp Thr Gly Ala Thr Lys Tyr His
500 505 510
Leu Asn Gly Arg Asp Ser Leu Val Asn Pro Gly Val Ala Met Ala Thr
515 520 525
His Lys Asp Asp Glu Glu Arg Phe Phe Pro Ser Ser Gly Val Leu Met
530 535 540
Phe Gly Lys Gln Gly Ala Gly Arg Asp Asn Val Asp Tyr Ser Ser Val
545 550 555 560
Met Leu Thr Ser Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr
565 570 575
Glu Gln Tyr Gly Val Val Ala Asp Asn Leu Gln Gln Ala Asn Thr Gly
580 585 590
Pro Ile Val Gly Asn Val Asn Ser Gln Gly Ala Leu Pro Gly Met Val
595 600 605
Trp Gln Asn Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile
610 615 620
Pro His Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe
625 630 635 640
Gly Leu Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val
645 650 655
Pro Ala Asp Pro Pro Thr Thr Phe Ser Gln Ala Lys Leu Ala Ser Phe
660 665 670
Ile Thr Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu
675 680 685
Leu Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr
690 695 700
Ser Asn Tyr Tyr Lys Ser Thr Asn Val Asp Phe Ala Val Asn Thr Glu
705 710 715 720
Gly Thr Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg
725 730 735
Asn Leu
<210> 33
<211> 4087
<212> DNA
<213> adeno-associated virus 11
<220>
<221> CDS
<222> (1886)..(4087)
<223> AAV11 VP1
<220>
<221> misc_feature
<222> (2297)..(4087)
<223> AAV11 VP2
<220>
<221> misc_feature
<222> (2474)..(4087)
<223> AAV11 VP3
<400> 33
atgccgggct tctacgagat cgtgatcaag gtgccgagcg acctggacga gcacctgccg 60
ggcatttctg actcgtttgt gaactgggtg gccgagaagg aatgggagct gcccccggat 120
tctgacatgg atcggaatct gatcgagcag gcacccctga ccgtggccga gaagctgcag 180
cgcgacttcc tggtccactg gcgccgcgtg agtaaggccc cggaggccct cttctttgtt 240
cagttcgaga agggcgagtc ctacttccac ctccacgttc tcgtcgagac cacgggggtc 300
aagtccatgg tcctgggccg cttcctgagt cagatcagag acaggctggt gcagaccatc 360
taccgcgggg tcgagcccac gctgcccaac tggttcgcgg tgaccaagac gcgaaatggc 420
gccggcgggg ggaacaaggt ggtggacgag tgctacatcc ccaactacct cctgcccaag 480
acccagcccg agctgcagtg ggcgtggact aacatggagg agtatataag cgcgtgtcta 540
aacctcgcgg agcgtaaacg gctcgtggcg cagcacctga cccacgtcag ccagacgcag 600
gagcagaaca aggagaatct gaacccgaat tctgacgcgc ccgtgatcag gtcaaaaacc 660
tccgcgcgct acatggagct ggtcgggtgg ctggtggacc ggggcatcac ctccgagaag 720
cagtggatcc aggaggacca ggcctcgtac atctccttca acgccgcctc caactcgcgg 780
tcccagatca aggccgcgct ggacaatgcc ggaaagatca tggcgctgac caaatccgcg 840
cccgactacc tggtaggccc gtccttaccc gcggacatta aggccaaccg catctaccgc 900
atcctggagc tcaacggcta cgaccccgcc tacgccggct ccgtcttcct gggctgggcg 960
cagaaaaagt tcggtaaacg caacaccatc tggctgtttg ggcccgccac caccggcaag 1020
accaacatcg cggaagccat agcccacgcc gtgcccttct acggctgcgt gaactggacc 1080
aatgagaact ttcccttcaa cgattgcgtc gacaagatgg tgatctggtg ggaggagggc 1140
aagatgaccg ccaaggtcgt ggagtccgcc aaggccattc tgggcggaag caaggtgcgc 1200
gtggaccaaa agtgcaagtc ctcggcccag atcgacccca cgcccgtgat cgtcacctcc 1260
aacaccaaca tgtgcgccgt gatcgacggg aacagcacca ccttcgagca ccagcagccg 1320
ctgcaggacc gcatgttcaa gttcgagctc acccgccgtc tggagcacga ctttggcaag 1380
gtgaccaagc aggaagtcaa agagttcttc cgctgggctc aggatcacgt gactgaggtg 1440
gcgcatgagt tctacgtcag aaagggcgga gccaccaaaa gacccgcccc cagtgacgcg 1500
gatataagcg agcccaagcg ggcctgcccc tcagttccgg agccatcgac gtcagacgcg 1560
gaagcaccgg tggactttgc ggacaggtac caaaacaaat gttctcgtca cgcgggcatg 1620
cttcagatgc tgtttccctg caagacatgc gagagaatga atcagaattt caacgtctgc 1680
ttcacgcacg gggtcagaga ctgctcagag tgcttccccg gcgcgtcaga atctcaaccc 1740
gtcgtcagaa aaaagacgta tcagaaactg tgcgcgattc atcatctgct ggggcgggca 1800
cccgagattg cgtgttcggc ctgcgatctc gtcaacgtgg acttggatga ctgtgtttct 1860
gagcaataaa tgacttaaac caggt atg gct gct gac ggt tat ctt cca gat 1912
Met Ala Ala Asp Gly Tyr Leu Pro Asp
1 5
tgg ctc gag gac aac ctc tct gag ggc att cgc gag tgg tgg gac ctg 1960
Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile Arg Glu Trp Trp Asp Leu
10 15 20 25
aaa cct gga gcc ccg aag ccc aag gcc aac cag cag aag cag gac gac 2008
Lys Pro Gly Ala Pro Lys Pro Lys Ala Asn Gln Gln Lys Gln Asp Asp
30 35 40
ggc cgg ggt ctg gtg ctt cct ggc tac aag tac ctc gga ccc ttc aac 2056
Gly Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr Leu Gly Pro Phe Asn
45 50 55
gga ctc gac aag ggg gag ccc gtc aac gcg gcg gac gca gcg gcc ctc 2104
Gly Leu Asp Lys Gly Glu Pro Val Asn Ala Ala Asp Ala Ala Ala Leu
60 65 70
gag cac gac aag gcc tac gac cag cag ctc aaa gcg ggt gac aat ccg 2152
Glu His Asp Lys Ala Tyr Asp Gln Gln Leu Lys Ala Gly Asp Asn Pro
75 80 85
tac ctg cgg tat aac cac gcc gac gcc gag ttt cag gag cgt ctg caa 2200
Tyr Leu Arg Tyr Asn His Ala Asp Ala Glu Phe Gln Glu Arg Leu Gln
90 95 100 105
gaa gat acg tct ttt ggg ggc aac ctc ggg cga gca gtc ttc cag gcc 2248
Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg Ala Val Phe Gln Ala
110 115 120
aag aag agg gta ctc gaa cct ctg ggc ctg gtt gaa gaa ggt gct aaa 2296
Lys Lys Arg Val Leu Glu Pro Leu Gly Leu Val Glu Glu Gly Ala Lys
125 130 135
acg gct cct gga aag aag aga ccg tta gag tca cca caa gag ccc gac 2344
Thr Ala Pro Gly Lys Lys Arg Pro Leu Glu Ser Pro Gln Glu Pro Asp
140 145 150
tcc tcc tcg ggc atc ggc aaa aaa ggc aaa caa cca gcc aga aag agg 2392
Ser Ser Ser Gly Ile Gly Lys Lys Gly Lys Gln Pro Ala Arg Lys Arg
155 160 165
ctc aac ttt gaa gag gac act gga gcc gga gac gga ccc cct gaa gga 2440
Leu Asn Phe Glu Glu Asp Thr Gly Ala Gly Asp Gly Pro Pro Glu Gly
170 175 180 185
tca gat acc agc gcc atg tct tca gac att gaa atg cgt gca gca ccg 2488
Ser Asp Thr Ser Ala Met Ser Ser Asp Ile Glu Met Arg Ala Ala Pro
190 195 200
ggc gga aat gct gtc gat gcg gga caa ggt tcc gat gga gtg ggt aat 2536
Gly Gly Asn Ala Val Asp Ala Gly Gln Gly Ser Asp Gly Val Gly Asn
205 210 215
gcc tcg ggt gat tgg cat tgc gat tcc acc tgg tct gag ggc aag gtc 2584
Ala Ser Gly Asp Trp His Cys Asp Ser Thr Trp Ser Glu Gly Lys Val
220 225 230
aca aca acc tcg acc aga acc tgg gtc ttg ccc acc tac aac aac cac 2632
Thr Thr Thr Ser Thr Arg Thr Trp Val Leu Pro Thr Tyr Asn Asn His
235 240 245
ttg tac ctg cgt ctc gga aca aca tca agc agc aac acc tac aac gga 2680
Leu Tyr Leu Arg Leu Gly Thr Thr Ser Ser Ser Asn Thr Tyr Asn Gly
250 255 260 265
ttc tcc acc ccc tgg gga tat ttt gac ttc aac aga ttc cac tgt cac 2728
Phe Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His
270 275 280
ttc tca cca cgt gac tgg caa aga ctc atc aac aac aac tgg gga cta 2776
Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Leu
285 290 295
cga cca aaa gcc atg cgc gtt aaa atc ttc aat atc caa gtt aag gag 2824
Arg Pro Lys Ala Met Arg Val Lys Ile Phe Asn Ile Gln Val Lys Glu
300 305 310
gtc aca acg tcg aac ggc gag act acg gtc gct aat aac ctt acc agc 2872
Val Thr Thr Ser Asn Gly Glu Thr Thr Val Ala Asn Asn Leu Thr Ser
315 320 325
acg gtt cag ata ttt gcg gac tcg tcg tat gag ctc ccg tac gtg atg 2920
Thr Val Gln Ile Phe Ala Asp Ser Ser Tyr Glu Leu Pro Tyr Val Met
330 335 340 345
gac gct gga caa gag ggg agc ctg cct cct ttc ccc aat gac gtg ttc 2968
Asp Ala Gly Gln Glu Gly Ser Leu Pro Pro Phe Pro Asn Asp Val Phe
350 355 360
atg gtg cct caa tat ggc tac tgt ggc atc gtg act ggc gag aat cag 3016
Met Val Pro Gln Tyr Gly Tyr Cys Gly Ile Val Thr Gly Glu Asn Gln
365 370 375
aac caa acg gac aga aac gct ttc tac tgc ctg gag tat ttt cct tcg 3064
Asn Gln Thr Asp Arg Asn Ala Phe Tyr Cys Leu Glu Tyr Phe Pro Ser
380 385 390
caa atg ttg aga act ggc aac aac ttt gaa atg gct tac aac ttt gag 3112
Gln Met Leu Arg Thr Gly Asn Asn Phe Glu Met Ala Tyr Asn Phe Glu
395 400 405
aag gtg ccg ttc cac tca atg tat gct cac agc cag agc ctg gac aga 3160
Lys Val Pro Phe His Ser Met Tyr Ala His Ser Gln Ser Leu Asp Arg
410 415 420 425
ctg atg aat ccc ctc ctg gac cag tac ctg tgg cac tta cag tcg act 3208
Leu Met Asn Pro Leu Leu Asp Gln Tyr Leu Trp His Leu Gln Ser Thr
430 435 440
acc tct gga gag act ctg aat caa ggc aat gca gca acc aca ttt gga 3256
Thr Ser Gly Glu Thr Leu Asn Gln Gly Asn Ala Ala Thr Thr Phe Gly
445 450 455
aaa atc agg agt gga gac ttt gcc ttt tac aga aag aac tgg ctg cct 3304
Lys Ile Arg Ser Gly Asp Phe Ala Phe Tyr Arg Lys Asn Trp Leu Pro
460 465 470
ggg cct tgt gtt aaa cag cag aga ttc tca aaa act gcc agt caa aat 3352
Gly Pro Cys Val Lys Gln Gln Arg Phe Ser Lys Thr Ala Ser Gln Asn
475 480 485
tac aag att cct gcc agc ggg ggc aac gct ctg tta aag tat gac acc 3400
Tyr Lys Ile Pro Ala Ser Gly Gly Asn Ala Leu Leu Lys Tyr Asp Thr
490 495 500 505
cac tat acc tta aac aac cgc tgg agc aac atc gcg ccc gga cct cca 3448
His Tyr Thr Leu Asn Asn Arg Trp Ser Asn Ile Ala Pro Gly Pro Pro
510 515 520
atg gcc aca gcc gga cct tcg gat ggg gac ttc agt aac gcc cag ctt 3496
Met Ala Thr Ala Gly Pro Ser Asp Gly Asp Phe Ser Asn Ala Gln Leu
525 530 535
ata ttc cct gga cca tct gtt acc gga aat aca aca act tca gcc aac 3544
Ile Phe Pro Gly Pro Ser Val Thr Gly Asn Thr Thr Thr Ser Ala Asn
540 545 550
aat ctg ttg ttt aca tca gaa gaa gaa att gct gcc acc aac cca aga 3592
Asn Leu Leu Phe Thr Ser Glu Glu Glu Ile Ala Ala Thr Asn Pro Arg
555 560 565
gac acg gac atg ttt ggc cag att gct gac aat aat cag aat gct aca 3640
Asp Thr Asp Met Phe Gly Gln Ile Ala Asp Asn Asn Gln Asn Ala Thr
570 575 580 585
act gct ccc ata acc ggc aac gtg act gct atg gga gtg ctg cct ggc 3688
Thr Ala Pro Ile Thr Gly Asn Val Thr Ala Met Gly Val Leu Pro Gly
590 595 600
atg gtg tgg caa aac aga gac att tac tac caa ggg cca att tgg gcc 3736
Met Val Trp Gln Asn Arg Asp Ile Tyr Tyr Gln Gly Pro Ile Trp Ala
605 610 615
aag atc cca cac gcg gac gga cat ttt cat cct tca ccg ctg att ggt 3784
Lys Ile Pro His Ala Asp Gly His Phe His Pro Ser Pro Leu Ile Gly
620 625 630
ggg ttt gga ctg aaa cac ccg cct ccc cag ata ttc atc aag aac act 3832
Gly Phe Gly Leu Lys His Pro Pro Pro Gln Ile Phe Ile Lys Asn Thr
635 640 645
ccc gta cct gcc aat cct gcg aca acc ttc act gca gcc aga gtg gac 3880
Pro Val Pro Ala Asn Pro Ala Thr Thr Phe Thr Ala Ala Arg Val Asp
650 655 660 665
tct ttc atc aca caa tac agc acc ggc cag gtc gct gtt cag att gaa 3928
Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val Ala Val Gln Ile Glu
670 675 680
tgg gaa att gaa aag gaa cgc tcc aaa cgc tgg aat cct gaa gtg cag 3976
Trp Glu Ile Glu Lys Glu Arg Ser Lys Arg Trp Asn Pro Glu Val Gln
685 690 695
ttt act tca aac tat ggg aac cag tct tct atg ttg tgg gct cct gat 4024
Phe Thr Ser Asn Tyr Gly Asn Gln Ser Ser Met Leu Trp Ala Pro Asp
700 705 710
aca act ggg aag tat aca gag ccg cgg gtt att ggc tct cgt tat ttg 4072
Thr Thr Gly Lys Tyr Thr Glu Pro Arg Val Ile Gly Ser Arg Tyr Leu
715 720 725
act aat cat ttg taa 4087
Thr Asn His Leu
730
<210> 34
<211> 733
<212> PRT
<213> adeno-associated virus 11
<400> 34
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Asp Leu Lys Pro Gly Ala Pro Lys Pro
20 25 30
Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
Pro Leu Glu Ser Pro Gln Glu Pro Asp Ser Ser Ser Gly Ile Gly Lys
145 150 155 160
Lys Gly Lys Gln Pro Ala Arg Lys Arg Leu Asn Phe Glu Glu Asp Thr
165 170 175
Gly Ala Gly Asp Gly Pro Pro Glu Gly Ser Asp Thr Ser Ala Met Ser
180 185 190
Ser Asp Ile Glu Met Arg Ala Ala Pro Gly Gly Asn Ala Val Asp Ala
195 200 205
Gly Gln Gly Ser Asp Gly Val Gly Asn Ala Ser Gly Asp Trp His Cys
210 215 220
Asp Ser Thr Trp Ser Glu Gly Lys Val Thr Thr Thr Ser Thr Arg Thr
225 230 235 240
Trp Val Leu Pro Thr Tyr Asn Asn His Leu Tyr Leu Arg Leu Gly Thr
245 250 255
Thr Ser Ser Ser Asn Thr Tyr Asn Gly Phe Ser Thr Pro Trp Gly Tyr
260 265 270
Phe Asp Phe Asn Arg Phe His Cys His Phe Ser Pro Arg Asp Trp Gln
275 280 285
Arg Leu Ile Asn Asn Asn Trp Gly Leu Arg Pro Lys Ala Met Arg Val
290 295 300
Lys Ile Phe Asn Ile Gln Val Lys Glu Val Thr Thr Ser Asn Gly Glu
305 310 315 320
Thr Thr Val Ala Asn Asn Leu Thr Ser Thr Val Gln Ile Phe Ala Asp
325 330 335
Ser Ser Tyr Glu Leu Pro Tyr Val Met Asp Ala Gly Gln Glu Gly Ser
340 345 350
Leu Pro Pro Phe Pro Asn Asp Val Phe Met Val Pro Gln Tyr Gly Tyr
355 360 365
Cys Gly Ile Val Thr Gly Glu Asn Gln Asn Gln Thr Asp Arg Asn Ala
370 375 380
Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln Met Leu Arg Thr Gly Asn
385 390 395 400
Asn Phe Glu Met Ala Tyr Asn Phe Glu Lys Val Pro Phe His Ser Met
405 410 415
Tyr Ala His Ser Gln Ser Leu Asp Arg Leu Met Asn Pro Leu Leu Asp
420 425 430
Gln Tyr Leu Trp His Leu Gln Ser Thr Thr Ser Gly Glu Thr Leu Asn
435 440 445
Gln Gly Asn Ala Ala Thr Thr Phe Gly Lys Ile Arg Ser Gly Asp Phe
450 455 460
Ala Phe Tyr Arg Lys Asn Trp Leu Pro Gly Pro Cys Val Lys Gln Gln
465 470 475 480
Arg Phe Ser Lys Thr Ala Ser Gln Asn Tyr Lys Ile Pro Ala Ser Gly
485 490 495
Gly Asn Ala Leu Leu Lys Tyr Asp Thr His Tyr Thr Leu Asn Asn Arg
500 505 510
Trp Ser Asn Ile Ala Pro Gly Pro Pro Met Ala Thr Ala Gly Pro Ser
515 520 525
Asp Gly Asp Phe Ser Asn Ala Gln Leu Ile Phe Pro Gly Pro Ser Val
530 535 540
Thr Gly Asn Thr Thr Thr Ser Ala Asn Asn Leu Leu Phe Thr Ser Glu
545 550 555 560
Glu Glu Ile Ala Ala Thr Asn Pro Arg Asp Thr Asp Met Phe Gly Gln
565 570 575
Ile Ala Asp Asn Asn Gln Asn Ala Thr Thr Ala Pro Ile Thr Gly Asn
580 585 590
Val Thr Ala Met Gly Val Leu Pro Gly Met Val Trp Gln Asn Arg Asp
595 600 605
Ile Tyr Tyr Gln Gly Pro Ile Trp Ala Lys Ile Pro His Ala Asp Gly
610 615 620
His Phe His Pro Ser Pro Leu Ile Gly Gly Phe Gly Leu Lys His Pro
625 630 635 640
Pro Pro Gln Ile Phe Ile Lys Asn Thr Pro Val Pro Ala Asn Pro Ala
645 650 655
Thr Thr Phe Thr Ala Ala Arg Val Asp Ser Phe Ile Thr Gln Tyr Ser
660 665 670
Thr Gly Gln Val Ala Val Gln Ile Glu Trp Glu Ile Glu Lys Glu Arg
675 680 685
Ser Lys Arg Trp Asn Pro Glu Val Gln Phe Thr Ser Asn Tyr Gly Asn
690 695 700
Gln Ser Ser Met Leu Trp Ala Pro Asp Thr Thr Gly Lys Tyr Thr Glu
705 710 715 720
Pro Arg Val Ile Gly Ser Arg Tyr Leu Thr Asn His Leu
725 730
<210> 35
<211> 4213
<212> DNA
<213> adeno-associated virus 12
<220>
<221> CDS
<222> (1985)..(4213)
<223> AAV12 VP1
<220>
<221> misc_feature
<222> (2396)..(4213)
<223> AAV12 VP2
<220>
<221> misc_feature
<222> (2600)..(4213)
<223> AAV12 VP3
<400> 35
ttgcgacagt ttgcgacacc atgtggtcac aagaggtata taaccgcgag tgagccagcg 60
aggagctcca ttttgcccgc gaagtttgaa cgagcagcag ccatgccggg gttctacgag 120
gtggtgatca aggtgcccag cgacctggac gagcacctgc ccggcatttc tgactccttt 180
gtgaactggg tggccgagaa ggaatgggag ttgcccccgg attctgacat ggatcagaat 240
ctgattgagc aggcacccct gaccgtggcc gagaagctgc agcgcgagtt cctggtggaa 300
tggcgccgag tgagtaaatt tctggaggcc aagttttttg tgcagtttga aaagggggac 360
tcgtactttc atttgcatat tctgattgaa attaccggcg tgaaatccat ggtggtgggc 420
cgctacgtga gtcagattag ggataaactg atccagcgca tctaccgcgg ggtcgagccc 480
cagctgccca actggttcgc ggtcacaaag acccgaaatg gcgccggagg cgggaacaag 540
gtggtggacg agtgctacat ccccaactac ctgctcccca aggtccagcc cgagcttcag 600
tgggcgtgga ctaacatgga ggagtatata agcgcctgtt tgaacctcgc ggagcgtaaa 660
cggctcgtgg cgcagcacct gacgcacgtc tcccagaccc aggagggcga caaggagaat 720
ctgaacccga attctgacgc gccggtgatc cggtcaaaaa cctccgccag gtacatggag 780
ctggtcgggt ggctggtgga caagggcatc acgtccgaga agcagtggat ccaggaggac 840
caggcctcgt acatctcctt caacgcggcc tccaactccc ggtcgcagat caaggcggcc 900
ctggacaatg cctccaaaat catgagcctc accaaaacgg ctccggacta tctcatcggg 960
cagcagcccg tgggggacat taccaccaac cggatctaca aaatcctgga actgaacggg 1020
tacgaccccc agtacgccgc ctccgtcttt ctcggctggg cccagaaaaa gtttggaaag 1080
cgcaacacca tctggctgtt tgggcccgcc accaccggca agaccaacat cgcggaagcc 1140
atcgcccacg cggtcccctt ctacggctgc gtcaactgga ccaatgagaa ctttcccttc 1200
aacgactgcg tcgacaaaat ggtgatttgg tgggaggagg gcaagatgac cgccaaggtc 1260
gtagagtccg ccaaggccat tctgggcggc agcaaggtgc gcgtggacca aaaatgcaag 1320
gcctctgcgc agatcgaccc cacccccgtg atcgtcacct ccaacaccaa catgtgcgcc 1380
gtgattgacg ggaacagcac caccttcgag caccagcagc ccctgcagga ccggatgttc 1440
aagtttgaac tcacccgccg cctcgaccac gactttggca aggtcaccaa gcaggaagtc 1500
aaggactttt tccggtgggc ggctgatcac gtgactgacg tggctcatga gttttacgtc 1560
acaaagggtg gagctaagaa aaggcccgcc ccctctgacg aggatataag cgagcccaag 1620
cggccgcgcg tgtcatttgc gcagccggag acgtcagacg cggaagctcc cggagacttc 1680
gccgacaggt accaaaacaa atgttctcgt cacgcgggta tgctgcagat gctctttccc 1740
tgcaagacgt gcgagagaat gaatcagaat tccaacgtct gcttcacgca cggtcagaaa 1800
gattgcgggg agtgctttcc cgggtcagaa tctcaaccgg tttctgtcgt cagaaaaacg 1860
tatcagaaac tgtgcatcct tcatcagctc cggggggcac ccgagatcgc ctgctctgct 1920
tgcgaccaac tcaaccccga tttggacgat tgccaatttg agcaataaat gactgaaatc 1980
aggt atg gct gct gac ggt tat ctt cca gat tgg ctc gag gac aac ctc 2029
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu
1 5 10 15
tct gaa ggc att cgc gag tgg tgg gcg ctg aaa cct gga gct cca caa 2077
Ser Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln
20 25 30
ccc aag gcc aac caa cag cat cag gac aac ggc agg ggt ctt gtg ctt 2125
Pro Lys Ala Asn Gln Gln His Gln Asp Asn Gly Arg Gly Leu Val Leu
35 40 45
cct ggg tac aag tac ctc gga ccc ttc aac gga ctc gac aag gga gag 2173
Pro Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu
50 55 60
ccg gtc aac gag gca gac gcc gcg gcc ctc gag cac gac aag gcc tac 2221
Pro Val Asn Glu Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr
65 70 75
gac aag cag ctc gag cag ggg gac aac ccg tat ctc aag tac aac cac 2269
Asp Lys Gln Leu Glu Gln Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His
80 85 90 95
gcc gac gcc gag ttc cag cag cgc ttg gcg acc gac acc tct ttt ggg 2317
Ala Asp Ala Glu Phe Gln Gln Arg Leu Ala Thr Asp Thr Ser Phe Gly
100 105 110
ggc aac ctc ggg cga gca gtc ttc cag gcc aaa aag agg att ctc gag 2365
Gly Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Ile Leu Glu
115 120 125
cct ctg ggt ctg gtt gaa gag ggc gtt aaa acg gct cct gga aag aaa 2413
Pro Leu Gly Leu Val Glu Glu Gly Val Lys Thr Ala Pro Gly Lys Lys
130 135 140
cgc cca tta gaa aag act cca aat cgg ccg acc aac ccg gac tct ggg 2461
Arg Pro Leu Glu Lys Thr Pro Asn Arg Pro Thr Asn Pro Asp Ser Gly
145 150 155
aag gcc ccg gcc aag aaa aag caa aaa gac ggc gaa cca gcc gac tct 2509
Lys Ala Pro Ala Lys Lys Lys Gln Lys Asp Gly Glu Pro Ala Asp Ser
160 165 170 175
gct aga agg aca ctc gac ttt gaa gac tct gga gca gga gac gga ccc 2557
Ala Arg Arg Thr Leu Asp Phe Glu Asp Ser Gly Ala Gly Asp Gly Pro
180 185 190
cct gag gga tca tct tcc gga gaa atg tct cat gat gct gag atg cgt 2605
Pro Glu Gly Ser Ser Ser Gly Glu Met Ser His Asp Ala Glu Met Arg
195 200 205
gcg gcg cca ggc gga aat gct gtc gag gcg gga caa ggt gcc gat gga 2653
Ala Ala Pro Gly Gly Asn Ala Val Glu Ala Gly Gln Gly Ala Asp Gly
210 215 220
gtg ggt aat gcc tcc ggt gat tgg cat tgc gat tcc acc tgg tca gag 2701
Val Gly Asn Ala Ser Gly Asp Trp His Cys Asp Ser Thr Trp Ser Glu
225 230 235
ggc cga gtc acc acc acc agc acc cga acc tgg gtc cta ccc acg tac 2749
Gly Arg Val Thr Thr Thr Ser Thr Arg Thr Trp Val Leu Pro Thr Tyr
240 245 250 255
aac aac cac ctg tac ctg cga atc gga aca acg gcc aac agc aac acc 2797
Asn Asn His Leu Tyr Leu Arg Ile Gly Thr Thr Ala Asn Ser Asn Thr
260 265 270
tac aac gga ttc tcc acc ccc tgg gga tac ttt gac ttt aac cgc ttc 2845
Tyr Asn Gly Phe Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe
275 280 285
cac tgc cac ttt tcc cca cgc gac tgg cag cga ctc atc aac aac aac 2893
His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn
290 295 300
tgg gga ctc agg ccg aaa tcg atg cgt gtt aaa atc ttc aac ata cag 2941
Trp Gly Leu Arg Pro Lys Ser Met Arg Val Lys Ile Phe Asn Ile Gln
305 310 315
gtc aag gag gtc acg acg tca aac ggc gag act acg gtc gct aat aac 2989
Val Lys Glu Val Thr Thr Ser Asn Gly Glu Thr Thr Val Ala Asn Asn
320 325 330 335
ctt acc agc acg gtt cag atc ttt gcg gat tcg acg tat gaa ctc cca 3037
Leu Thr Ser Thr Val Gln Ile Phe Ala Asp Ser Thr Tyr Glu Leu Pro
340 345 350
tac gtg atg gac gcc ggt cag gag ggg agc ttt cct ccg ttt ccc aac 3085
Tyr Val Met Asp Ala Gly Gln Glu Gly Ser Phe Pro Pro Phe Pro Asn
355 360 365
gac gtc ttt atg gtt ccc caa tac gga tac tgc gga gtt gtc act gga 3133
Asp Val Phe Met Val Pro Gln Tyr Gly Tyr Cys Gly Val Val Thr Gly
370 375 380
aaa aac cag aac cag aca gac aga aat gcc ttt tac tgc ctg gaa tac 3181
Lys Asn Gln Asn Gln Thr Asp Arg Asn Ala Phe Tyr Cys Leu Glu Tyr
385 390 395
ttt cca tcc caa atg cta aga act ggc aac aat ttt gaa gtc agt tac 3229
Phe Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Glu Val Ser Tyr
400 405 410 415
caa ttt gaa aaa gtt cct ttc cat tca atg tac gcg cac agc cag agc 3277
Gln Phe Glu Lys Val Pro Phe His Ser Met Tyr Ala His Ser Gln Ser
420 425 430
ctg gac aga atg atg aat cct tta ctg gat cag tac ctg tgg cat ctg 3325
Leu Asp Arg Met Met Asn Pro Leu Leu Asp Gln Tyr Leu Trp His Leu
435 440 445
caa tcg acc act acc gga aat tcc ctt aat caa gga aca gct acc acc 3373
Gln Ser Thr Thr Thr Gly Asn Ser Leu Asn Gln Gly Thr Ala Thr Thr
450 455 460
acg tac ggg aaa att acc act gga gac ttt gcc tac tac agg aaa aac 3421
Thr Tyr Gly Lys Ile Thr Thr Gly Asp Phe Ala Tyr Tyr Arg Lys Asn
465 470 475
tgg ttg cct gga gcc tgc att aaa caa caa aaa ttt tca aag aat gcc 3469
Trp Leu Pro Gly Ala Cys Ile Lys Gln Gln Lys Phe Ser Lys Asn Ala
480 485 490 495
aat caa aac tac aag att ccc gcc agc ggg gga gac gcc ctt tta aag 3517
Asn Gln Asn Tyr Lys Ile Pro Ala Ser Gly Gly Asp Ala Leu Leu Lys
500 505 510
tat gac acg cat acc act cta aat ggg cga tgg agt aac atg gct cct 3565
Tyr Asp Thr His Thr Thr Leu Asn Gly Arg Trp Ser Asn Met Ala Pro
515 520 525
gga cct cca atg gca acc gca ggt gcc ggg gac tcg gat ttt agc aac 3613
Gly Pro Pro Met Ala Thr Ala Gly Ala Gly Asp Ser Asp Phe Ser Asn
530 535 540
agc cag ctg atc ttt gcc gga ccc aat ccg agc ggt aac acg acc aca 3661
Ser Gln Leu Ile Phe Ala Gly Pro Asn Pro Ser Gly Asn Thr Thr Thr
545 550 555
tct tca aac aat ttg ttg ttt acc tca gaa gag gag att gcc aca aca 3709
Ser Ser Asn Asn Leu Leu Phe Thr Ser Glu Glu Glu Ile Ala Thr Thr
560 565 570 575
aac cca cga gac acg gac atg ttt gga cag att gca gat aat aat caa 3757
Asn Pro Arg Asp Thr Asp Met Phe Gly Gln Ile Ala Asp Asn Asn Gln
580 585 590
aat gcc acc acc gcc cct cac atc gct aac ctg gac gct atg gga att 3805
Asn Ala Thr Thr Ala Pro His Ile Ala Asn Leu Asp Ala Met Gly Ile
595 600 605
gtt ccc gga atg gtc tgg caa aac aga gac atc tac tac cag ggc cct 3853
Val Pro Gly Met Val Trp Gln Asn Arg Asp Ile Tyr Tyr Gln Gly Pro
610 615 620
att tgg gcc aag gtc cct cac acg gac gga cac ttt cac cct tcg ccg 3901
Ile Trp Ala Lys Val Pro His Thr Asp Gly His Phe His Pro Ser Pro
625 630 635
ctg atg gga gga ttt gga ctg aaa cac ccg cct cca cag att ttc atc 3949
Leu Met Gly Gly Phe Gly Leu Lys His Pro Pro Pro Gln Ile Phe Ile
640 645 650 655
aaa aac acc ccc gta ccc gcc aat ccc aat act acc ttt agc gct gca 3997
Lys Asn Thr Pro Val Pro Ala Asn Pro Asn Thr Thr Phe Ser Ala Ala
660 665 670
agg att aat tct ttt ctg acg cag tac agc acc gga caa gtt gcc gtt 4045
Arg Ile Asn Ser Phe Leu Thr Gln Tyr Ser Thr Gly Gln Val Ala Val
675 680 685
cag atc gac tgg gaa att cag aag gag cat tcc aaa cgc tgg aat ccc 4093
Gln Ile Asp Trp Glu Ile Gln Lys Glu His Ser Lys Arg Trp Asn Pro
690 695 700
gaa gtt caa ttt act tca aac tac ggc act caa aat tct atg ctg tgg 4141
Glu Val Gln Phe Thr Ser Asn Tyr Gly Thr Gln Asn Ser Met Leu Trp
705 710 715
gct ccc gac aat gct ggc aac tac cac gaa ctc cgg gct att ggg tcc 4189
Ala Pro Asp Asn Ala Gly Asn Tyr His Glu Leu Arg Ala Ile Gly Ser
720 725 730 735
cgt ttc ctc acc cac cac ttg taa 4213
Arg Phe Leu Thr His His Leu
740
<210> 36
<211> 742
<212> PRT
<213> adeno-associated virus 12
<400> 36
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro
20 25 30
Lys Ala Asn Gln Gln His Gln Asp Asn Gly Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Glu Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Lys Gln Leu Glu Gln Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Gln Arg Leu Ala Thr Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Ile Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Gly Val Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
Pro Leu Glu Lys Thr Pro Asn Arg Pro Thr Asn Pro Asp Ser Gly Lys
145 150 155 160
Ala Pro Ala Lys Lys Lys Gln Lys Asp Gly Glu Pro Ala Asp Ser Ala
165 170 175
Arg Arg Thr Leu Asp Phe Glu Asp Ser Gly Ala Gly Asp Gly Pro Pro
180 185 190
Glu Gly Ser Ser Ser Gly Glu Met Ser His Asp Ala Glu Met Arg Ala
195 200 205
Ala Pro Gly Gly Asn Ala Val Glu Ala Gly Gln Gly Ala Asp Gly Val
210 215 220
Gly Asn Ala Ser Gly Asp Trp His Cys Asp Ser Thr Trp Ser Glu Gly
225 230 235 240
Arg Val Thr Thr Thr Ser Thr Arg Thr Trp Val Leu Pro Thr Tyr Asn
245 250 255
Asn His Leu Tyr Leu Arg Ile Gly Thr Thr Ala Asn Ser Asn Thr Tyr
260 265 270
Asn Gly Phe Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His
275 280 285
Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp
290 295 300
Gly Leu Arg Pro Lys Ser Met Arg Val Lys Ile Phe Asn Ile Gln Val
305 310 315 320
Lys Glu Val Thr Thr Ser Asn Gly Glu Thr Thr Val Ala Asn Asn Leu
325 330 335
Thr Ser Thr Val Gln Ile Phe Ala Asp Ser Thr Tyr Glu Leu Pro Tyr
340 345 350
Val Met Asp Ala Gly Gln Glu Gly Ser Phe Pro Pro Phe Pro Asn Asp
355 360 365
Val Phe Met Val Pro Gln Tyr Gly Tyr Cys Gly Val Val Thr Gly Lys
370 375 380
Asn Gln Asn Gln Thr Asp Arg Asn Ala Phe Tyr Cys Leu Glu Tyr Phe
385 390 395 400
Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Glu Val Ser Tyr Gln
405 410 415
Phe Glu Lys Val Pro Phe His Ser Met Tyr Ala His Ser Gln Ser Leu
420 425 430
Asp Arg Met Met Asn Pro Leu Leu Asp Gln Tyr Leu Trp His Leu Gln
435 440 445
Ser Thr Thr Thr Gly Asn Ser Leu Asn Gln Gly Thr Ala Thr Thr Thr
450 455 460
Tyr Gly Lys Ile Thr Thr Gly Asp Phe Ala Tyr Tyr Arg Lys Asn Trp
465 470 475 480
Leu Pro Gly Ala Cys Ile Lys Gln Gln Lys Phe Ser Lys Asn Ala Asn
485 490 495
Gln Asn Tyr Lys Ile Pro Ala Ser Gly Gly Asp Ala Leu Leu Lys Tyr
500 505 510
Asp Thr His Thr Thr Leu Asn Gly Arg Trp Ser Asn Met Ala Pro Gly
515 520 525
Pro Pro Met Ala Thr Ala Gly Ala Gly Asp Ser Asp Phe Ser Asn Ser
530 535 540
Gln Leu Ile Phe Ala Gly Pro Asn Pro Ser Gly Asn Thr Thr Thr Ser
545 550 555 560
Ser Asn Asn Leu Leu Phe Thr Ser Glu Glu Glu Ile Ala Thr Thr Asn
565 570 575
Pro Arg Asp Thr Asp Met Phe Gly Gln Ile Ala Asp Asn Asn Gln Asn
580 585 590
Ala Thr Thr Ala Pro His Ile Ala Asn Leu Asp Ala Met Gly Ile Val
595 600 605
Pro Gly Met Val Trp Gln Asn Arg Asp Ile Tyr Tyr Gln Gly Pro Ile
610 615 620
Trp Ala Lys Val Pro His Thr Asp Gly His Phe His Pro Ser Pro Leu
625 630 635 640
Met Gly Gly Phe Gly Leu Lys His Pro Pro Pro Gln Ile Phe Ile Lys
645 650 655
Asn Thr Pro Val Pro Ala Asn Pro Asn Thr Thr Phe Ser Ala Ala Arg
660 665 670
Ile Asn Ser Phe Leu Thr Gln Tyr Ser Thr Gly Gln Val Ala Val Gln
675 680 685
Ile Asp Trp Glu Ile Gln Lys Glu His Ser Lys Arg Trp Asn Pro Glu
690 695 700
Val Gln Phe Thr Ser Asn Tyr Gly Thr Gln Asn Ser Met Leu Trp Ala
705 710 715 720
Pro Asp Asn Ala Gly Asn Tyr His Glu Leu Arg Ala Ile Gly Ser Arg
725 730 735
Phe Leu Thr His His Leu
740
<210> 37
<211> 4180
<212> DNA
<213> adeno-associated virus 13
<220>
<221> CDS
<222> (1948)..(4149)
<223> AAV13 VP1
<220>
<221> misc_feature
<222> (2356)..(4149)
<223> AAV13 VP2
<220>
<221> misc_feature
<222> (2551)..(4149)
<223> AAV13 VP3
<400> 37
ccgcgagtga gcgaaccagg agctccattt tgcccgcgaa ttttgaacga gcagcagcca 60
tgccgggatt ctacgagatt gtcctgaagg tgcccagcga cctggacgag cacctgcctg 120
gcatttctga ctcttttgta aactgggtgg cggagaagga atgggagctg ccgccggatt 180
ctgacatgga tctgaatctg attgagcagg cacccctaac cgtggccgaa aagctgcaac 240
gcgaattcct ggtcgagtgg cgccgcgtga gtaaggcccc ggaggccctc ttctttgttc 300
agttcgagaa gggggacagc tacttccacc tacacattct ggtggagacc gtgggcgtga 360
aatccatggt ggtgggccgc tacgtgagcc agattaaaga gaagctggtg acccgcatct 420
accgcggggt cgagccgcag cttccgaact ggttcgcggt gaccaagacg cgtaatggcg 480
ccggaggcgg gaacaaggtg gtggacgact gctacatccc caactacctg ctccccaaga 540
cccagcccga gctccagtgg gcgtggacta atatggacca gtatttaagc gcctgtttga 600
atctcgcgga gcgtaaacgg ctggtggcgc agcatctgac gcacgtgtcg cagacgcagg 660
agcagaacaa agagaaccag aatcccaatt ctgacgcgcc ggtgatcaga tcaaaaacct 720
ccgcgaggta catggagctg gtcgggtggc tggtggaccg cgggatcacg tcagaaaagc 780
aatggatcca ggaggaccag gcctcttaca tctccttcaa cgccgcctcc aactcgcggt 840
cacaaatcaa ggccgcactg gacaatgcct ccaaatttat gagcctgaca aaaacggctc 900
cggactacct ggtgggaaac aacccgccgg aggacattac cagcaaccgg atctacaaaa 960
tcctcgagat gaacgggtac gatccgcagt acgcggcctc cgtcttcctg ggctgggcgc 1020
aaaagaagtt cgggaagagg aacaccatct ggctctttgg gccggccacg acgggtaaaa 1080
ccaacatcgc tgaagctatc gcccacgccg tgccctttta cggctgcgtg aactggacca 1140
atgagaactt tccgttcaac gattgcgtcg acaagatggt gatctggtgg gaggagggca 1200
agatgacggc caaggtcgtg gagtccgcca aggccattct gggcggaagc aaggtgcgcg 1260
tggaccaaaa gtgcaagtca tcggcccaga tcgacccaac tcccgtcatc gtcacctcca 1320
acaccaacat gtgcgcggtc atcgacggaa attccaccac cttcgagcac caacaaccac 1380
tccaagaccg gatgttcaag ttcgagctca ccaagcgcct ggagcacgac tttggcaagg 1440
tcaccaagca ggaagtcaag gactttttcc ggtgggcgtc agatcacgtg actgaggtgt 1500
ctcacgagtt ttacgtcaga aagggtggag ctagaaagag gcccgccccc aatgacgcag 1560
atataagtga gcccaagcgg gcctgtccgt cagttgcgca gccatcgacg tcagacgcgg 1620
aagctccggt ggactacgcg gacaggtacc aaaacaaatg ttctcgtcac gtgggcatga 1680
atctgatgct ttttccctgc cggcaatgcg agagaatgaa tcagaatgtg gacatttgct 1740
tcacgcacgg ggtcatggac tgtgccgagt gcttccccgt gtcagaatct caacccgtgt 1800
ctgtcgtcag aaagcggaca tatcagaaac tgtgtccgat tcatcacatc atggggaggg 1860
cgcccgaggt ggcttgttcg gcctgcgatc tggccaatgt ggacttggat gactgtgaca 1920
tggagcaata aatgactcaa accagat atg act gac ggt tac ctt cca gat tgg 1974
Met Thr Asp Gly Tyr Leu Pro Asp Trp
1 5
cta gag gac aac ctc tct gaa ggc gtt cga gag tgg tgg gcg ctg caa 2022
Leu Glu Asp Asn Leu Ser Glu Gly Val Arg Glu Trp Trp Ala Leu Gln
10 15 20 25
cct gga gcc cct aaa ccc aag gca aat caa caa cat cag gac aac gct 2070
Pro Gly Ala Pro Lys Pro Lys Ala Asn Gln Gln His Gln Asp Asn Ala
30 35 40
cgg ggt ctt gtg ctt ccg ggt tac aaa tac ctc gga ccc ggc aac gga 2118
Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly
45 50 55
ctt gac aag ggg gaa ccc gtc aac gca gcg gac gcg gca gcc ctc gaa 2166
Leu Asp Lys Gly Glu Pro Val Asn Ala Ala Asp Ala Ala Ala Leu Glu
60 65 70
cac gac aag gcc tac gac cag cag ctc aag gcc ggt gac aac ccc tac 2214
His Asp Lys Ala Tyr Asp Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr
75 80 85
ctc aag tac aac cac gcc gac gcc gag ttt cag gag cgt ctt caa gaa 2262
Leu Lys Tyr Asn His Ala Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu
90 95 100 105
gat acg tct ttt ggg ggc aac ctc gga cga gca gtc ttc cag gcc aaa 2310
Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg Ala Val Phe Gln Ala Lys
110 115 120
aag agg atc ctt gag cct ctg ggt ctg gtt gag gaa gcg gct aag acg 2358
Lys Arg Ile Leu Glu Pro Leu Gly Leu Val Glu Glu Ala Ala Lys Thr
125 130 135
gct cct gga aaa aag aga cct gta gag caa tct cca gca gaa ccg gac 2406
Ala Pro Gly Lys Lys Arg Pro Val Glu Gln Ser Pro Ala Glu Pro Asp
140 145 150
tcc tct tcg ggc atc ggc aaa tca ggc cag cag ccc gct aga aaa aga 2454
Ser Ser Ser Gly Ile Gly Lys Ser Gly Gln Gln Pro Ala Arg Lys Arg
155 160 165
ctg aat ttt ggt cag act ggc gac aca gag tca gtc cca gac cct caa 2502
Leu Asn Phe Gly Gln Thr Gly Asp Thr Glu Ser Val Pro Asp Pro Gln
170 175 180 185
cca ctc gga caa cct ccc gca gcc ccc tct ggt gtg gga tct act aca 2550
Pro Leu Gly Gln Pro Pro Ala Ala Pro Ser Gly Val Gly Ser Thr Thr
190 195 200
atg gct tca ggc ggt ggc gca cca atg gca gac aat aac gag ggt gcc 2598
Met Ala Ser Gly Gly Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala
205 210 215
gat gga gtg ggt aat tcc tca gga aat tgg cat tgc gat tcc caa tgg 2646
Asp Gly Val Gly Asn Ser Ser Gly Asn Trp His Cys Asp Ser Gln Trp
220 225 230
ctg ggc gac aga gtc atc acc acc agc acc cgc acc tgg gcc ctg ccc 2694
Leu Gly Asp Arg Val Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro
235 240 245
acc tac aac aat cac ctc tac aag caa atc tcc agc caa tca gga gcc 2742
Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile Ser Ser Gln Ser Gly Ala
250 255 260 265
acc aac gac aac cac tac ttt ggc tac agc acc ccc tgg ggg tat ttt 2790
Thr Asn Asp Asn His Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe
270 275 280
gac ttc aac aga ttc cac tgc cac ttt tca cca cgt gac tgg caa aga 2838
Asp Phe Asn Arg Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg
285 290 295
ctc atc aac aac aac tgg gga ttc cga ccc aag aga ctc aac ttc aag 2886
Leu Ile Asn Asn Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys
300 305 310
ctc ttt aac att caa gtc aaa gag gtc acg cag aat gac ggt acg acg 2934
Leu Phe Asn Ile Gln Val Lys Glu Val Thr Gln Asn Asp Gly Thr Thr
315 320 325
acg att gcc aat aac ctt acc agc acg gtt cag gtg ttt act gac tcc 2982
Thr Ile Ala Asn Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser
330 335 340 345
gag tac cag ctc ccg tac gtc ctc ggc tcg gcg cat cag gga tgc ctc 3030
Glu Tyr Gln Leu Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu
350 355 360
ccg ccg ttc cca gca gac gtc ttc atg gtc cca cag tat gga tac ctc 3078
Pro Pro Phe Pro Ala Asp Val Phe Met Val Pro Gln Tyr Gly Tyr Leu
365 370 375
acc ctg aac aac ggg agt cag gcg gta gga cgc tct tcc ttt tac tgc 3126
Thr Leu Asn Asn Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys
380 385 390
ctg gag tac ttt cct tct cag atg ctg cgt act gga aac aac ttt cag 3174
Leu Glu Tyr Phe Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln
395 400 405
ttt agc tac act ttt gaa gac gtg cct ttc cac agc agc tac gct cac 3222
Phe Ser Tyr Thr Phe Glu Asp Val Pro Phe His Ser Ser Tyr Ala His
410 415 420 425
agc caa agt ctg gac cgt ctc atg aat cct ctg atc gac cag tac ctg 3270
Ser Gln Ser Leu Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu
430 435 440
tac tat ctg aac agg aca caa aca gcc agt gga act cag cag tct cgg 3318
Tyr Tyr Leu Asn Arg Thr Gln Thr Ala Ser Gly Thr Gln Gln Ser Arg
445 450 455
cta ctg ttt agc caa gct gga ccc acc agt atg tct ctt caa gct aaa 3366
Leu Leu Phe Ser Gln Ala Gly Pro Thr Ser Met Ser Leu Gln Ala Lys
460 465 470
aac tgg ctg cct gga cct tgc tac aga cag cag cgt ctg tca aag cag 3414
Asn Trp Leu Pro Gly Pro Cys Tyr Arg Gln Gln Arg Leu Ser Lys Gln
475 480 485
gca aac gac aac aac aac agc aac ttt ccc tgg act ggt gcc acc aaa 3462
Ala Asn Asp Asn Asn Asn Ser Asn Phe Pro Trp Thr Gly Ala Thr Lys
490 495 500 505
tat cat ctg aat ggc cgg gac tca ttg gtg aac ccg ggc cct gct atg 3510
Tyr His Leu Asn Gly Arg Asp Ser Leu Val Asn Pro Gly Pro Ala Met
510 515 520
gcc agt cac aag gat gac aaa gaa aag ttt ttc ccc atg cat gga acc 3558
Ala Ser His Lys Asp Asp Lys Glu Lys Phe Phe Pro Met His Gly Thr
525 530 535
ctg ata ttt ggt aaa gaa gga aca aat gcc aac aac gcg gat ttg gaa 3606
Leu Ile Phe Gly Lys Glu Gly Thr Asn Ala Asn Asn Ala Asp Leu Glu
540 545 550
aat gtc atg att aca gat gaa gaa gaa atc cgc acc acc aat ccc gtg 3654
Asn Val Met Ile Thr Asp Glu Glu Glu Ile Arg Thr Thr Asn Pro Val
555 560 565
gct acg gag cag tac ggg act gtg tca aat aat ttg caa aac tca aac 3702
Ala Thr Glu Gln Tyr Gly Thr Val Ser Asn Asn Leu Gln Asn Ser Asn
570 575 580 585
gct ggt cca act act gga act gtc aat cac caa gga gcg tta cct ggt 3750
Ala Gly Pro Thr Thr Gly Thr Val Asn His Gln Gly Ala Leu Pro Gly
590 595 600
atg gtg tgg cag gat cga gac gtg tac ctg cag gga ccc att tgg gcc 3798
Met Val Trp Gln Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala
605 610 615
aag att cct cac acc gat gga cac ttt cat cct tct cca ctg atg gga 3846
Lys Ile Pro His Thr Asp Gly His Phe His Pro Ser Pro Leu Met Gly
620 625 630
ggt ttt ggg ctc aaa cac ccg cct cct cag atc atg atc aaa aac act 3894
Gly Phe Gly Leu Lys His Pro Pro Pro Gln Ile Met Ile Lys Asn Thr
635 640 645
ccc gtt cca gcc aat cct ccc aca aac ttt agt gcg gca aag ttt gct 3942
Pro Val Pro Ala Asn Pro Pro Thr Asn Phe Ser Ala Ala Lys Phe Ala
650 655 660 665
tcc ttc atc aca cag tac tcc acg ggg cag gtc agc gtg gag atc gag 3990
Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu
670 675 680
tgg gag ctg cag aag gag aac agc aaa cgc tgg aat ccc gaa att cag 4038
Trp Glu Leu Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln
685 690 695
tac act tcc aac tac aac aaa tct gtt aat gtg gac ttt act gtg gac 4086
Tyr Thr Ser Asn Tyr Asn Lys Ser Val Asn Val Asp Phe Thr Val Asp
700 705 710
act aat ggt gtg tat tca gag cct cgc ccc att ggc acc aga tac ctg 4134
Thr Asn Gly Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu
715 720 725
act cgt aat ctg taa ttgcttgtta atcaataaac cggttaattc g 4180
Thr Arg Asn Leu
730
<210> 38
<211> 733
<212> PRT
<213> adeno-associated virus 13
<400> 38
Met Thr Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu
1 5 10 15
Gly Val Arg Glu Trp Trp Ala Leu Gln Pro Gly Ala Pro Lys Pro Lys
20 25 30
Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro Gly
35 40 45
Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val
50 55 60
Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln
65 70 75 80
Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp
85 90 95
Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly Asn
100 105 110
Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Ile Leu Glu Pro Leu
115 120 125
Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg Pro
130 135 140
Val Glu Gln Ser Pro Ala Glu Pro Asp Ser Ser Ser Gly Ile Gly Lys
145 150 155 160
Ser Gly Gln Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln Thr Gly
165 170 175
Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Gln Pro Pro Ala
180 185 190
Ala Pro Ser Gly Val Gly Ser Thr Thr Met Ala Ser Gly Gly Gly Ala
195 200 205
Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn Ser Ser
210 215 220
Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile Thr
225 230 235 240
Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr
245 250 255
Lys Gln Ile Ser Ser Gln Ser Gly Ala Thr Asn Asp Asn His Tyr Phe
260 265 270
Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys
275 280 285
His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly
290 295 300
Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val Lys
305 310 315 320
Glu Val Thr Gln Asn Asp Gly Thr Thr Thr Ile Ala Asn Asn Leu Thr
325 330 335
Ser Thr Val Gln Val Phe Thr Asp Ser Glu Tyr Gln Leu Pro Tyr Val
340 345 350
Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro Ala Asp Val
355 360 365
Phe Met Val Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn Gly Ser Gln
370 375 380
Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln
385 390 395 400
Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Thr Phe Glu Asp
405 410 415
Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp Arg Leu
420 425 430
Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Asn Arg Thr Gln
435 440 445
Thr Ala Ser Gly Thr Gln Gln Ser Arg Leu Leu Phe Ser Gln Ala Gly
450 455 460
Pro Thr Ser Met Ser Leu Gln Ala Lys Asn Trp Leu Pro Gly Pro Cys
465 470 475 480
Tyr Arg Gln Gln Arg Leu Ser Lys Gln Ala Asn Asp Asn Asn Asn Ser
485 490 495
Asn Phe Pro Trp Thr Gly Ala Thr Lys Tyr His Leu Asn Gly Arg Asp
500 505 510
Ser Leu Val Asn Pro Gly Pro Ala Met Ala Ser His Lys Asp Asp Lys
515 520 525
Glu Lys Phe Phe Pro Met His Gly Thr Leu Ile Phe Gly Lys Glu Gly
530 535 540
Thr Asn Ala Asn Asn Ala Asp Leu Glu Asn Val Met Ile Thr Asp Glu
545 550 555 560
Glu Glu Ile Arg Thr Thr Asn Pro Val Ala Thr Glu Gln Tyr Gly Thr
565 570 575
Val Ser Asn Asn Leu Gln Asn Ser Asn Ala Gly Pro Thr Thr Gly Thr
580 585 590
Val Asn His Gln Gly Ala Leu Pro Gly Met Val Trp Gln Asp Arg Asp
595 600 605
Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly
610 615 620
His Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Leu Lys His Pro
625 630 635 640
Pro Pro Gln Ile Met Ile Lys Asn Thr Pro Val Pro Ala Asn Pro Pro
645 650 655
Thr Asn Phe Ser Ala Ala Lys Phe Ala Ser Phe Ile Thr Gln Tyr Ser
660 665 670
Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn
675 680 685
Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Asn Lys
690 695 700
Ser Val Asn Val Asp Phe Thr Val Asp Thr Asn Gly Val Tyr Ser Glu
705 710 715 720
Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu
725 730
<210> 39
<211> 2175
<212> DNA
<213> adeno-associated virus 5
<220>
<221> CDS
<222> (1)..(2175)
<223> AAV5 VP1
<400> 39
atg tct ttt gtt gat cac cca ccc gat tgg ttg gaa gaa gtt ggt gaa 48
Met Ser Phe Val Asp His Pro Pro Asp Trp Leu Glu Glu Val Gly Glu
1 5 10 15
ggt ctt cgc gag ttt ttg ggc ctt gaa gcg ggc cca ccg aaa cca aaa 96
Gly Leu Arg Glu Phe Leu Gly Leu Glu Ala Gly Pro Pro Lys Pro Lys
20 25 30
ccc aat cag cag cat caa gat caa gcc cgt ggt ctt gtg ctg cct ggt 144
Pro Asn Gln Gln His Gln Asp Gln Ala Arg Gly Leu Val Leu Pro Gly
35 40 45
tat aac tat ctc gga ccc gga aac ggt ctc gat cga gga gag cct gtc 192
Tyr Asn Tyr Leu Gly Pro Gly Asn Gly Leu Asp Arg Gly Glu Pro Val
50 55 60
aac agg gca gac gag gtc gcg cga gag cac gac atc tcg tac aac gag 240
Asn Arg Ala Asp Glu Val Ala Arg Glu His Asp Ile Ser Tyr Asn Glu
65 70 75 80
cag ctt gag gcg gga gac aac ccc tac ctc aag tac aac cac gcg gac 288
Gln Leu Glu Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp
85 90 95
gcc gag ttt cag gag aag ctc gcc gac gac aca tcc ttc ggg gga aac 336
Ala Glu Phe Gln Glu Lys Leu Ala Asp Asp Thr Ser Phe Gly Gly Asn
100 105 110
ctc gga aag gca gtc ttt cag gcc aag aaa agg gtt ctc gaa cct ttt 384
Leu Gly Lys Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro Phe
115 120 125
ggc ctg gtt gaa gag ggt gct aag acg gcc cct acc gga aag cgg ata 432
Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Thr Gly Lys Arg Ile
130 135 140
gac gac cac ttt cca aaa aga aag aag gct cgg acc gaa gag gac tcc 480
Asp Asp His Phe Pro Lys Arg Lys Lys Ala Arg Thr Glu Glu Asp Ser
145 150 155 160
aag cct tcc acc tcg tca gac gcc gaa gct gga ccc agc gga tcc cag 528
Lys Pro Ser Thr Ser Ser Asp Ala Glu Ala Gly Pro Ser Gly Ser Gln
165 170 175
cag ctg caa atc cca gcc caa cca gcc tca agt ttg gga gct gat aca 576
Gln Leu Gln Ile Pro Ala Gln Pro Ala Ser Ser Leu Gly Ala Asp Thr
180 185 190
atg tct gcg gga ggt ggc ggc cca ttg ggc gac aat aac caa ggt gcc 624
Met Ser Ala Gly Gly Gly Gly Pro Leu Gly Asp Asn Asn Gln Gly Ala
195 200 205
gat gga gtg ggc aat gcc tcg gga gat tgg cat tgc gat tcc acg tgg 672
Asp Gly Val Gly Asn Ala Ser Gly Asp Trp His Cys Asp Ser Thr Trp
210 215 220
atg ggg gac aga gtc gtc acc aag tcc acc cga acc tgg gtg ctg ccc 720
Met Gly Asp Arg Val Val Thr Lys Ser Thr Arg Thr Trp Val Leu Pro
225 230 235 240
agc tac aac aac cac cag tac cga gag atc aaa agc ggc tcc gtc gac 768
Ser Tyr Asn Asn His Gln Tyr Arg Glu Ile Lys Ser Gly Ser Val Asp
245 250 255
gga agc aac gcc aac gcc tac ttt gga tac agc acc ccc tgg ggg tac 816
Gly Ser Asn Ala Asn Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr
260 265 270
ttt gac ttt aac cgc ttc cac agc cac tgg agc ccc cga gac tgg caa 864
Phe Asp Phe Asn Arg Phe His Ser His Trp Ser Pro Arg Asp Trp Gln
275 280 285
aga ctc atc aac aac tac tgg ggc ttc aga ccc cgg tcc ctc aga gtc 912
Arg Leu Ile Asn Asn Tyr Trp Gly Phe Arg Pro Arg Ser Leu Arg Val
290 295 300
aaa atc ttc aac att caa gtc aaa gag gtc acg gtg cag gac tcc acc 960
Lys Ile Phe Asn Ile Gln Val Lys Glu Val Thr Val Gln Asp Ser Thr
305 310 315 320
acc acc atc gcc aac aac ctc acc tcc acc gtc caa gtg ttt acg gac 1008
Thr Thr Ile Ala Asn Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp
325 330 335
gac gac tac cag ctg ccc tac gtc gtc ggc aac ggg acc gag gga tgc 1056
Asp Asp Tyr Gln Leu Pro Tyr Val Val Gly Asn Gly Thr Glu Gly Cys
340 345 350
ctg ccg gcc ttc cct ccg cag gtc ttt acg ctg ccg cag tac ggt tac 1104
Leu Pro Ala Phe Pro Pro Gln Val Phe Thr Leu Pro Gln Tyr Gly Tyr
355 360 365
gcg acg ctg aac cgc gac aac aca gaa aat ccc acc gag agg agc agc 1152
Ala Thr Leu Asn Arg Asp Asn Thr Glu Asn Pro Thr Glu Arg Ser Ser
370 375 380
ttc ttc tgc cta gag tac ttt ccc agc aag atg ctg aga acg ggc aac 1200
Phe Phe Cys Leu Glu Tyr Phe Pro Ser Lys Met Leu Arg Thr Gly Asn
385 390 395 400
aac ttt gag ttt acc tac aac ttt gag gag gtg ccc ttc cac tcc agc 1248
Asn Phe Glu Phe Thr Tyr Asn Phe Glu Glu Val Pro Phe His Ser Ser
405 410 415
ttc gct ccc agt cag aac ctg ttc aag ctg gcc aac ccg ctg gtg gac 1296
Phe Ala Pro Ser Gln Asn Leu Phe Lys Leu Ala Asn Pro Leu Val Asp
420 425 430
cag tac ttg tac cgc ttc gtg agc aca aat aac act ggc gga gtc cag 1344
Gln Tyr Leu Tyr Arg Phe Val Ser Thr Asn Asn Thr Gly Gly Val Gln
435 440 445
ttc aac aag aac ctg gcc ggg aga tac gcc aac acc tac aaa aac tgg 1392
Phe Asn Lys Asn Leu Ala Gly Arg Tyr Ala Asn Thr Tyr Lys Asn Trp
450 455 460
ttc ccg ggg ccc atg ggc cga acc cag ggc tgg aac ctg ggc tcc ggg 1440
Phe Pro Gly Pro Met Gly Arg Thr Gln Gly Trp Asn Leu Gly Ser Gly
465 470 475 480
gtc aac cgc gcc agt gtc agc gcc ttc gcc acg acc aat agg atg gag 1488
Val Asn Arg Ala Ser Val Ser Ala Phe Ala Thr Thr Asn Arg Met Glu
485 490 495
ctc gag ggc gcg agt tac cag gtg ccc ccg cag ccg aac ggc atg acc 1536
Leu Glu Gly Ala Ser Tyr Gln Val Pro Pro Gln Pro Asn Gly Met Thr
500 505 510
aac aac ctc cag ggc agc aac acc tat gcc ctg gag aac act atg atc 1584
Asn Asn Leu Gln Gly Ser Asn Thr Tyr Ala Leu Glu Asn Thr Met Ile
515 520 525
ttc aac agc cag ccg gcg aac ccg ggc acc acc gcc acg tac ctc gag 1632
Phe Asn Ser Gln Pro Ala Asn Pro Gly Thr Thr Ala Thr Tyr Leu Glu
530 535 540
ggc aac atg ctc atc acc agc gag agc gag acg cag ccg gtg aac cgc 1680
Gly Asn Met Leu Ile Thr Ser Glu Ser Glu Thr Gln Pro Val Asn Arg
545 550 555 560
gtg gcg tac aac gtc ggc ggg cag atg gcc acc aac aac cag agc tcc 1728
Val Ala Tyr Asn Val Gly Gly Gln Met Ala Thr Asn Asn Gln Ser Ser
565 570 575
acc act gcc ccc gcg acc ggc acg tac aac ctc cag gaa atc gtg ccc 1776
Thr Thr Ala Pro Ala Thr Gly Thr Tyr Asn Leu Gln Glu Ile Val Pro
580 585 590
ggc agc gtg tgg atg gag agg gac gtg tac ctc caa gga ccc atc tgg 1824
Gly Ser Val Trp Met Glu Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp
595 600 605
gcc aag atc cca gag acg ggg gcg cac ttt cac ccc tct ccg gcc atg 1872
Ala Lys Ile Pro Glu Thr Gly Ala His Phe His Pro Ser Pro Ala Met
610 615 620
ggc gga ttc gga ctc aaa cac cca ccg ccc atg atg ctc atc aag aac 1920
Gly Gly Phe Gly Leu Lys His Pro Pro Pro Met Met Leu Ile Lys Asn
625 630 635 640
acg cct gtg ccc gga aat atc acc agc ttc tcg gac gtg ccc gtc agc 1968
Thr Pro Val Pro Gly Asn Ile Thr Ser Phe Ser Asp Val Pro Val Ser
645 650 655
agc ttc atc acc cag tac agc acc ggg cag gtc acc gtg gag atg gag 2016
Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val Thr Val Glu Met Glu
660 665 670
tgg gag ctc aag aag gaa aac tcc aag agg tgg aac cca gag atc cag 2064
Trp Glu Leu Lys Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln
675 680 685
tac aca aac aac tac aac gac ccc cag ttt gtg gac ttt gcc ccg gac 2112
Tyr Thr Asn Asn Tyr Asn Asp Pro Gln Phe Val Asp Phe Ala Pro Asp
690 695 700
agc acc ggg gaa tac aga acc acc aga cct atc gga acc cga tac ctt 2160
Ser Thr Gly Glu Tyr Arg Thr Thr Arg Pro Ile Gly Thr Arg Tyr Leu
705 710 715 720
acc cga ccc ctt taa 2175
Thr Arg Pro Leu
<210> 40
<211> 724
<212> PRT
<213> adeno-associated virus 5
<400> 40
Met Ser Phe Val Asp His Pro Pro Asp Trp Leu Glu Glu Val Gly Glu
1 5 10 15
Gly Leu Arg Glu Phe Leu Gly Leu Glu Ala Gly Pro Pro Lys Pro Lys
20 25 30
Pro Asn Gln Gln His Gln Asp Gln Ala Arg Gly Leu Val Leu Pro Gly
35 40 45
Tyr Asn Tyr Leu Gly Pro Gly Asn Gly Leu Asp Arg Gly Glu Pro Val
50 55 60
Asn Arg Ala Asp Glu Val Ala Arg Glu His Asp Ile Ser Tyr Asn Glu
65 70 75 80
Gln Leu Glu Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp
85 90 95
Ala Glu Phe Gln Glu Lys Leu Ala Asp Asp Thr Ser Phe Gly Gly Asn
100 105 110
Leu Gly Lys Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro Phe
115 120 125
Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Thr Gly Lys Arg Ile
130 135 140
Asp Asp His Phe Pro Lys Arg Lys Lys Ala Arg Thr Glu Glu Asp Ser
145 150 155 160
Lys Pro Ser Thr Ser Ser Asp Ala Glu Ala Gly Pro Ser Gly Ser Gln
165 170 175
Gln Leu Gln Ile Pro Ala Gln Pro Ala Ser Ser Leu Gly Ala Asp Thr
180 185 190
Met Ser Ala Gly Gly Gly Gly Pro Leu Gly Asp Asn Asn Gln Gly Ala
195 200 205
Asp Gly Val Gly Asn Ala Ser Gly Asp Trp His Cys Asp Ser Thr Trp
210 215 220
Met Gly Asp Arg Val Val Thr Lys Ser Thr Arg Thr Trp Val Leu Pro
225 230 235 240
Ser Tyr Asn Asn His Gln Tyr Arg Glu Ile Lys Ser Gly Ser Val Asp
245 250 255
Gly Ser Asn Ala Asn Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr
260 265 270
Phe Asp Phe Asn Arg Phe His Ser His Trp Ser Pro Arg Asp Trp Gln
275 280 285
Arg Leu Ile Asn Asn Tyr Trp Gly Phe Arg Pro Arg Ser Leu Arg Val
290 295 300
Lys Ile Phe Asn Ile Gln Val Lys Glu Val Thr Val Gln Asp Ser Thr
305 310 315 320
Thr Thr Ile Ala Asn Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp
325 330 335
Asp Asp Tyr Gln Leu Pro Tyr Val Val Gly Asn Gly Thr Glu Gly Cys
340 345 350
Leu Pro Ala Phe Pro Pro Gln Val Phe Thr Leu Pro Gln Tyr Gly Tyr
355 360 365
Ala Thr Leu Asn Arg Asp Asn Thr Glu Asn Pro Thr Glu Arg Ser Ser
370 375 380
Phe Phe Cys Leu Glu Tyr Phe Pro Ser Lys Met Leu Arg Thr Gly Asn
385 390 395 400
Asn Phe Glu Phe Thr Tyr Asn Phe Glu Glu Val Pro Phe His Ser Ser
405 410 415
Phe Ala Pro Ser Gln Asn Leu Phe Lys Leu Ala Asn Pro Leu Val Asp
420 425 430
Gln Tyr Leu Tyr Arg Phe Val Ser Thr Asn Asn Thr Gly Gly Val Gln
435 440 445
Phe Asn Lys Asn Leu Ala Gly Arg Tyr Ala Asn Thr Tyr Lys Asn Trp
450 455 460
Phe Pro Gly Pro Met Gly Arg Thr Gln Gly Trp Asn Leu Gly Ser Gly
465 470 475 480
Val Asn Arg Ala Ser Val Ser Ala Phe Ala Thr Thr Asn Arg Met Glu
485 490 495
Leu Glu Gly Ala Ser Tyr Gln Val Pro Pro Gln Pro Asn Gly Met Thr
500 505 510
Asn Asn Leu Gln Gly Ser Asn Thr Tyr Ala Leu Glu Asn Thr Met Ile
515 520 525
Phe Asn Ser Gln Pro Ala Asn Pro Gly Thr Thr Ala Thr Tyr Leu Glu
530 535 540
Gly Asn Met Leu Ile Thr Ser Glu Ser Glu Thr Gln Pro Val Asn Arg
545 550 555 560
Val Ala Tyr Asn Val Gly Gly Gln Met Ala Thr Asn Asn Gln Ser Ser
565 570 575
Thr Thr Ala Pro Ala Thr Gly Thr Tyr Asn Leu Gln Glu Ile Val Pro
580 585 590
Gly Ser Val Trp Met Glu Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp
595 600 605
Ala Lys Ile Pro Glu Thr Gly Ala His Phe His Pro Ser Pro Ala Met
610 615 620
Gly Gly Phe Gly Leu Lys His Pro Pro Pro Met Met Leu Ile Lys Asn
625 630 635 640
Thr Pro Val Pro Gly Asn Ile Thr Ser Phe Ser Asp Val Pro Val Ser
645 650 655
Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val Thr Val Glu Met Glu
660 665 670
Trp Glu Leu Lys Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln
675 680 685
Tyr Thr Asn Asn Tyr Asn Asp Pro Gln Phe Val Asp Phe Ala Pro Asp
690 695 700
Ser Thr Gly Glu Tyr Arg Thr Thr Arg Pro Ile Gly Thr Arg Tyr Leu
705 710 715 720
Thr Arg Pro Leu
<210> 41
<211> 2184
<212> DNA
<213> Artificial
<220>
<223> construct based on AAV5
<220>
<221> misc_feature
<222> (1)..(9)
<223> VP2 initiator context
<220>
<221> misc_feature
<222> (10)..(12)
<223> suboptimal translation initiation codon
<220>
<221> misc_feature
<222> (30)..(30)
<223> splicing site
<220>
<221> misc_feature
<222> (33)..(33)
<223> splicing site
<400> 41
cctgttaaga cgtcttttgt tgatcaccca cccgattggt tggaagaagt tggtgaaggt 60
cttcgcgagt ttttgggcct tgaagcgggc ccaccgaaac caaaacccaa tcagcagcat 120
caagatcaag cccgtggtct tgtgctgcct ggttataact atctcggacc cggaaacggt 180
ctcgatcgag gagagcctgt caacagggca gacgaggtcg cgcgagagca cgacatctcg 240
tacaacgagc agcttgaggc gggagacaac ccctacctca agtacaacca cgcggacgcc 300
gagtttcagg agaagctcgc cgacgacaca tccttcgggg gaaacctcgg aaaggcagtc 360
tttcaggcca agaaaagggt tctcgaacct tttggcctgg ttgaagaggg tgctaagacg 420
gcccctaccg gaaagcggat agacgaccac tttccaaaaa gaaagaaggc tcggaccgaa 480
gaggactcca agccttccac ctcgtcagac gccgaagctg gacccagcgg atcccagcag 540
ctgcaaatcc cagcccaacc agcctcaagt ttgggagctg atacaatgtc tgcgggaggt 600
ggcggcccat tgggcgacaa taaccaaggt gccgatggag tgggcaatgc ctcgggagat 660
tggcattgcg attccacgtg gatgggggac agagtcgtca ccaagtccac ccgaacctgg 720
gtgctgccca gctacaacaa ccaccagtac cgagagatca aaagcggctc cgtcgacgga 780
agcaacgcca acgcctactt tggatacagc accccctggg ggtactttga ctttaaccgc 840
ttccacagcc actggagccc ccgagactgg caaagactca tcaacaacta ctggggcttc 900
agaccccggt ccctcagagt caaaatcttc aacattcaag tcaaagaggt cacggtgcag 960
gactccacca ccaccatcgc caacaacctc acctccaccg tccaagtgtt tacggacgac 1020
gactaccagc tgccctacgt cgtcggcaac gggaccgagg gatgcctgcc ggccttccct 1080
ccgcaggtct ttacgctgcc gcagtacggt tacgcgacgc tgaaccgcga caacacagaa 1140
aatcccaccg agaggagcag cttcttctgc ctagagtact ttcccagcaa gatgctgaga 1200
acgggcaaca actttgagtt tacctacaac tttgaggagg tgcccttcca ctccagcttc 1260
gctcccagtc agaacctctt caagctggcc aacccgctgg tggaccagta cttgtaccgc 1320
ttcgtgagca caaataacac tggcggagtc cagttcaaca agaacctggc cgggagatac 1380
gccaacacct acaaaaactg gttcccgggg cccatgggcc gaacccaggg ctggaacctg 1440
ggctccgggg tcaaccgcgc cagtgtcagc gccttcgcca cgaccaatag gatggagctc 1500
gagggcgcga gttaccaggt gcccccgcag ccgaacggca tgaccaacaa cctccagggc 1560
agcaacacct atgccctgga gaacactatg atcttcaaca gccagccggc gaacccgggc 1620
accaccgcca cgtacctcga gggcaacatg ctcatcacca gcgagagcga gacgcagccg 1680
gtgaaccgcg tggcgtacaa cgtcggcggg cagatggcca ccaacaacca gagctccacc 1740
actgcccccg cgaccggcac gtacaacctc caggaaatcg tgcccggcag cgtgtggatg 1800
gagagggacg tgtacctcca aggacccatc tgggccaaga tcccagagac gggggcgcac 1860
tttcacccct ctccggccat gggcggattc ggactcaaac acccaccgcc catgatgctc 1920
atcaagaaca cgcctgtgcc cggaaatatc accagcttct cggacgtgcc cgtcagcagc 1980
ttcatcaccc agtacagcac cgggcaggtc accgtggaga tggagtggga gctcaagaag 2040
gaaaactcca agaggtggaa cccagagatc cagtacacaa acaactacaa cgacccccag 2100
tttgtggact ttgccccgga cagcaccggg gaatacagaa ccaccagacc tatcggaacc 2160
cgatacctta cccgacccct ttaa 2184
<210> 42
<211> 2187
<212> DNA
<213> Artificial
<220>
<223> artificial sequence based on AAV5
<220>
<221> misc_feature
<222> (1)..(9)
<223> VP2 initiator context
<220>
<221> misc_feature
<222> (10)..(12)
<223> suboptimal translation initiation codon
<220>
<221> misc_feature
<222> (13)..(15)
<223> additional triplet added to sequence
<220>
<221> mutation
<222> (33)..(33)
<223> remove splice site
<220>
<221> mutation
<222> (36)..(36)
<223> remove splice site
<400> 42
cctgttaaga cggcttcttt tgttgatcac ccacccgatt ggttggaaga agttggtgaa 60
ggtcttcgcg agtttttggg ccttgaagcg ggcccaccga aaccaaaacc caatcagcag 120
catcaagatc aagcccgtgg tcttgtgctg cctggttata actatctcgg acccggaaac 180
ggtctcgatc gaggagagcc tgtcaacagg gcagacgagg tcgcgcgaga gcacgacatc 240
tcgtacaacg agcagcttga ggcgggagac aacccctacc tcaagtacaa ccacgcggac 300
gccgagtttc aggagaagct cgccgacgac acatccttcg ggggaaacct cggaaaggca 360
gtctttcagg ccaagaaaag ggttctcgaa ccttttggcc tggttgaaga gggtgctaag 420
acggccccta ccggaaagcg gatagacgac cactttccaa aaagaaagaa ggctcggacc 480
gaagaggact ccaagccttc cacctcgtca gacgccgaag ctggacccag cggatcccag 540
cagctgcaaa tcccagccca accagcctca agtttgggag ctgatacaat gtctgcggga 600
ggtggcggcc cattgggcga caataaccaa ggtgccgatg gagtgggcaa tgcctcggga 660
gattggcatt gcgattccac gtggatgggg gacagagtcg tcaccaagtc cacccgaacc 720
tgggtgctgc ccagctacaa caaccaccag taccgagaga tcaaaagcgg ctccgtcgac 780
ggaagcaacg ccaacgccta ctttggatac agcaccccct gggggtactt tgactttaac 840
cgcttccaca gccactggag cccccgagac tggcaaagac tcatcaacaa ctactggggc 900
ttcagacccc ggtccctcag agtcaaaatc ttcaacattc aagtcaaaga ggtcacggtg 960
caggactcca ccaccaccat cgccaacaac ctcacctcca ccgtccaagt gtttacggac 1020
gacgactacc agctgcccta cgtcgtcggc aacgggaccg agggatgcct gccggccttc 1080
cctccgcagg tctttacgct gccgcagtac ggttacgcga cgctgaaccg cgacaacaca 1140
gaaaatccca ccgagaggag cagcttcttc tgcctagagt actttcccag caagatgctg 1200
agaacgggca acaactttga gtttacctac aactttgagg aggtgccctt ccactccagc 1260
ttcgctccca gtcagaacct cttcaagctg gccaacccgc tggtggacca gtacttgtac 1320
cgcttcgtga gcacaaataa cactggcgga gtccagttca acaagaacct ggccgggaga 1380
tacgccaaca cctacaaaaa ctggttcccg gggcccatgg gccgaaccca gggctggaac 1440
ctgggctccg gggtcaaccg cgccagtgtc agcgccttcg ccacgaccaa taggatggag 1500
ctcgagggcg cgagttacca ggtgcccccg cagccgaacg gcatgaccaa caacctccag 1560
ggcagcaaca cctatgccct ggagaacact atgatcttca acagccagcc ggcgaacccg 1620
ggcaccaccg ccacgtacct cgagggcaac atgctcatca ccagcgagag cgagacgcag 1680
ccggtgaacc gcgtggcgta caacgtcggc gggcagatgg ccaccaacaa ccagagctcc 1740
accactgccc ccgcgaccgg cacgtacaac ctccaggaaa tcgtgcccgg cagcgtgtgg 1800
atggagaggg acgtgtacct ccaaggaccc atctgggcca agatcccaga gacgggggcg 1860
cactttcacc cctctccggc catgggcgga ttcggactca aacacccacc gcccatgatg 1920
ctcatcaaga acacgcctgt gcccggaaat atcaccagct tctcggacgt gcccgtcagc 1980
agcttcatca cccagtacag caccgggcag gtcaccgtgg agatggagtg ggagctcaag 2040
aaggaaaact ccaagaggtg gaacccagag atccagtaca caaacaacta caacgacccc 2100
cagtttgtgg actttgcccc ggacagcacc ggggaataca gaaccaccag acctatcgga 2160
acccgatacc ttacccgacc cctttaa 2187
<210> 43
<211> 2184
<212> DNA
<213> Artificial
<220>
<223> artificial sequence based on AAV5
<220>
<221> misc_feature
<222> (1)..(9)
<223> VP2 initiator context
<220>
<221> misc_feature
<222> (10)..(12)
<223> suboptimal translation initiation codon
<220>
<221> mutation
<222> (13)..(13)
<223> point mutation to G
<220>
<221> mutation
<222> (30)..(30)
<223> point mutation to remove splice site
<220>
<221> mutation
<222> (33)..(33)
<223> point mutation to remove splice site
<400> 43
cctgttaaga cggcttttgt tgatcaccca cccgattggt tggaagaagt tggtgaaggt 60
cttcgcgagt ttttgggcct tgaagcgggc ccaccgaaac caaaacccaa tcagcagcat 120
caagatcaag cccgtggtct tgtgctgcct ggttataact atctcggacc cggaaacggt 180
ctcgatcgag gagagcctgt caacagggca gacgaggtcg cgcgagagca cgacatctcg 240
tacaacgagc agcttgaggc gggagacaac ccctacctca agtacaacca cgcggacgcc 300
gagtttcagg agaagctcgc cgacgacaca tccttcgggg gaaacctcgg aaaggcagtc 360
tttcaggcca agaaaagggt tctcgaacct tttggcctgg ttgaagaggg tgctaagacg 420
gcccctaccg gaaagcggat agacgaccac tttccaaaaa gaaagaaggc tcggaccgaa 480
gaggactcca agccttccac ctcgtcagac gccgaagctg gacccagcgg atcccagcag 540
ctgcaaatcc cagcccaacc agcctcaagt ttgggagctg atacaatgtc tgcgggaggt 600
ggcggcccat tgggcgacaa taaccaaggt gccgatggag tgggcaatgc ctcgggagat 660
tggcattgcg attccacgtg gatgggggac agagtcgtca ccaagtccac ccgaacctgg 720
gtgctgccca gctacaacaa ccaccagtac cgagagatca aaagcggctc cgtcgacgga 780
agcaacgcca acgcctactt tggatacagc accccctggg ggtactttga ctttaaccgc 840
ttccacagcc actggagccc ccgagactgg caaagactca tcaacaacta ctggggcttc 900
agaccccggt ccctcagagt caaaatcttc aacattcaag tcaaagaggt cacggtgcag 960
gactccacca ccaccatcgc caacaacctc acctccaccg tccaagtgtt tacggacgac 1020
gactaccagc tgccctacgt cgtcggcaac gggaccgagg gatgcctgcc ggccttccct 1080
ccgcaggtct ttacgctgcc gcagtacggt tacgcgacgc tgaaccgcga caacacagaa 1140
aatcccaccg agaggagcag cttcttctgc ctagagtact ttcccagcaa gatgctgaga 1200
acgggcaaca actttgagtt tacctacaac tttgaggagg tgcccttcca ctccagcttc 1260
gctcccagtc agaacctctt caagctggcc aacccgctgg tggaccagta cttgtaccgc 1320
ttcgtgagca caaataacac tggcggagtc cagttcaaca agaacctggc cgggagatac 1380
gccaacacct acaaaaactg gttcccgggg cccatgggcc gaacccaggg ctggaacctg 1440
ggctccgggg tcaaccgcgc cagtgtcagc gccttcgcca cgaccaatag gatggagctc 1500
gagggcgcga gttaccaggt gcccccgcag ccgaacggca tgaccaacaa cctccagggc 1560
agcaacacct atgccctgga gaacactatg atcttcaaca gccagccggc gaacccgggc 1620
accaccgcca cgtacctcga gggcaacatg ctcatcacca gcgagagcga gacgcagccg 1680
gtgaaccgcg tggcgtacaa cgtcggcggg cagatggcca ccaacaacca gagctccacc 1740
actgcccccg cgaccggcac gtacaacctc caggaaatcg tgcccggcag cgtgtggatg 1800
gagagggacg tgtacctcca aggacccatc tgggccaaga tcccagagac gggggcgcac 1860
tttcacccct ctccggccat gggcggattc ggactcaaac acccaccgcc catgatgctc 1920
atcaagaaca cgcctgtgcc cggaaatatc accagcttct cggacgtgcc cgtcagcagc 1980
ttcatcaccc agtacagcac cgggcaggtc accgtggaga tggagtggga gctcaagaag 2040
gaaaactcca agaggtggaa cccagagatc cagtacacaa acaactacaa cgacccccag 2100
tttgtggact ttgccccgga cagcaccggg gaatacagaa ccaccagacc tatcggaacc 2160
cgatacctta cccgacccct ttaa 2184
<210> 44
<211> 2184
<212> DNA
<213> Artificial
<220>
<223> artificial sequence based on AAV5
<220>
<221> misc_feature
<222> (1)..(9)
<223> VP2 initiator context
<220>
<221> misc_feature
<222> (10)..(12)
<223> suboptimal translation initiation codon
<220>
<221> mutation
<222> (13)..(13)
<223> point mutation to threonine
<220>
<221> mutation
<222> (30)..(30)
<223> point mutation to remove splice site
<220>
<221> mutation
<222> (33)..(33)
<223> point mutation to remove splice site
<400> 44
cctgttaagc tgacttttgt tgatcaccca cccgattggt tggaagaagt tggtgaaggt 60
cttcgcgagt ttttgggcct tgaagcgggc ccaccgaaac caaaacccaa tcagcagcat 120
caagatcaag cccgtggtct tgtgctgcct ggttataact atctcggacc cggaaacggt 180
ctcgatcgag gagagcctgt caacagggca gacgaggtcg cgcgagagca cgacatctcg 240
tacaacgagc agcttgaggc gggagacaac ccctacctca agtacaacca cgcggacgcc 300
gagtttcagg agaagctcgc cgacgacaca tccttcgggg gaaacctcgg aaaggcagtc 360
tttcaggcca agaaaagggt tctcgaacct tttggcctgg ttgaagaggg tgctaagacg 420
gcccctaccg gaaagcggat agacgaccac tttccaaaaa gaaagaaggc tcggaccgaa 480
gaggactcca agccttccac ctcgtcagac gccgaagctg gacccagcgg atcccagcag 540
ctgcaaatcc cagcccaacc agcctcaagt ttgggagctg atacaatgtc tgcgggaggt 600
ggcggcccat tgggcgacaa taaccaaggt gccgatggag tgggcaatgc ctcgggagat 660
tggcattgcg attccacgtg gatgggggac agagtcgtca ccaagtccac ccgaacctgg 720
gtgctgccca gctacaacaa ccaccagtac cgagagatca aaagcggctc cgtcgacgga 780
agcaacgcca acgcctactt tggatacagc accccctggg ggtactttga ctttaaccgc 840
ttccacagcc actggagccc ccgagactgg caaagactca tcaacaacta ctggggcttc 900
agaccccggt ccctcagagt caaaatcttc aacattcaag tcaaagaggt cacggtgcag 960
gactccacca ccaccatcgc caacaacctc acctccaccg tccaagtgtt tacggacgac 1020
gactaccagc tgccctacgt cgtcggcaac gggaccgagg gatgcctgcc ggccttccct 1080
ccgcaggtct ttacgctgcc gcagtacggt tacgcgacgc tgaaccgcga caacacagaa 1140
aatcccaccg agaggagcag cttcttctgc ctagagtact ttcccagcaa gatgctgaga 1200
acgggcaaca actttgagtt tacctacaac tttgaggagg tgcccttcca ctccagcttc 1260
gctcccagtc agaacctctt caagctggcc aacccgctgg tggaccagta cttgtaccgc 1320
ttcgtgagca caaataacac tggcggagtc cagttcaaca agaacctggc cgggagatac 1380
gccaacacct acaaaaactg gttcccgggg cccatgggcc gaacccaggg ctggaacctg 1440
ggctccgggg tcaaccgcgc cagtgtcagc gccttcgcca cgaccaatag gatggagctc 1500
gagggcgcga gttaccaggt gcccccgcag ccgaacggca tgaccaacaa cctccagggc 1560
agcaacacct atgccctgga gaacactatg atcttcaaca gccagccggc gaacccgggc 1620
accaccgcca cgtacctcga gggcaacatg ctcatcacca gcgagagcga gacgcagccg 1680
gtgaaccgcg tggcgtacaa cgtcggcggg cagatggcca ccaacaacca gagctccacc 1740
actgcccccg cgaccggcac gtacaacctc caggaaatcg tgcccggcag cgtgtggatg 1800
gagagggacg tgtacctcca aggacccatc tgggccaaga tcccagagac gggggcgcac 1860
tttcacccct ctccggccat gggcggattc ggactcaaac acccaccgcc catgatgctc 1920
atcaagaaca cgcctgtgcc cggaaatatc accagcttct cggacgtgcc cgtcagcagc 1980
ttcatcaccc agtacagcac cgggcaggtc accgtggaga tggagtggga gctcaagaag 2040
gaaaactcca agaggtggaa cccagagatc cagtacacaa acaactacaa cgacccccag 2100
tttgtggact ttgccccgga cagcaccggg gaatacagaa ccaccagacc tatcggaacc 2160
cgatacctta cccgacccct ttaa 2184
<210> 45
<211> 2187
<212> DNA
<213> Artificial
<220>
<223> artificial sequence based on AAV5: construct 163
<220>
<221> misc_feature
<222> (1)..(9)
<223> VP2 initiator context
<220>
<221> misc_feature
<222> (10)..(12)
<223> suboptimal translation initiation codon
<220>
<221> mutation
<222> (13)..(15)
<223> insertion of triplet as compared to native AAV5 sequence
<220>
<221> mutation
<222> (16)..(18)
<223> mutation of triplet as compared to native AAV5 sequence
<220>
<221> mutation
<222> (33)..(33)
<223> point mutation to remove splice site
<220>
<221> mutation
<222> (36)..(36)
<223> point mutation to remove splice site
<400> 45
cctgttaagc tgactagctt tgttgatcac ccacccgatt ggttggaaga agttggtgaa 60
ggtcttcgcg agtttttggg ccttgaagcg ggcccaccga aaccaaaacc caatcagcag 120
catcaagatc aagcccgtgg tcttgtgctg cctggttata actatctcgg acccggaaac 180
ggtctcgatc gaggagagcc tgtcaacagg gcagacgagg tcgcgcgaga gcacgacatc 240
tcgtacaacg agcagcttga ggcgggagac aacccctacc tcaagtacaa ccacgcggac 300
gccgagtttc aggagaagct cgccgacgac acatccttcg ggggaaacct cggaaaggca 360
gtctttcagg ccaagaaaag ggttctcgaa ccttttggcc tggttgaaga gggtgctaag 420
acggccccta ccggaaagcg gatagacgac cactttccaa aaagaaagaa ggctcggacc 480
gaagaggact ccaagccttc cacctcgtca gacgccgaag ctggacccag cggatcccag 540
cagctgcaaa tcccagccca accagcctca agtttgggag ctgatacaat gtctgcggga 600
ggtggcggcc cattgggcga caataaccaa ggtgccgatg gagtgggcaa tgcctcggga 660
gattggcatt gcgattccac gtggatgggg gacagagtcg tcaccaagtc cacccgaacc 720
tgggtgctgc ccagctacaa caaccaccag taccgagaga tcaaaagcgg ctccgtcgac 780
ggaagcaacg ccaacgccta ctttggatac agcaccccct gggggtactt tgactttaac 840
cgcttccaca gccactggag cccccgagac tggcaaagac tcatcaacaa ctactggggc 900
ttcagacccc ggtccctcag agtcaaaatc ttcaacattc aagtcaaaga ggtcacggtg 960
caggactcca ccaccaccat cgccaacaac ctcacctcca ccgtccaagt gtttacggac 1020
gacgactacc agctgcccta cgtcgtcggc aacgggaccg agggatgcct gccggccttc 1080
cctccgcagg tctttacgct gccgcagtac ggttacgcga cgctgaaccg cgacaacaca 1140
gaaaatccca ccgagaggag cagcttcttc tgcctagagt actttcccag caagatgctg 1200
agaacgggca acaactttga gtttacctac aactttgagg aggtgccctt ccactccagc 1260
ttcgctccca gtcagaacct cttcaagctg gccaacccgc tggtggacca gtacttgtac 1320
cgcttcgtga gcacaaataa cactggcgga gtccagttca acaagaacct ggccgggaga 1380
tacgccaaca cctacaaaaa ctggttcccg gggcccatgg gccgaaccca gggctggaac 1440
ctgggctccg gggtcaaccg cgccagtgtc agcgccttcg ccacgaccaa taggatggag 1500
ctcgagggcg cgagttacca ggtgcccccg cagccgaacg gcatgaccaa caacctccag 1560
ggcagcaaca cctatgccct ggagaacact atgatcttca acagccagcc ggcgaacccg 1620
ggcaccaccg ccacgtacct cgagggcaac atgctcatca ccagcgagag cgagacgcag 1680
ccggtgaacc gcgtggcgta caacgtcggc gggcagatgg ccaccaacaa ccagagctcc 1740
accactgccc ccgcgaccgg cacgtacaac ctccaggaaa tcgtgcccgg cagcgtgtgg 1800
atggagaggg acgtgtacct ccaaggaccc atctgggcca agatcccaga gacgggggcg 1860
cactttcacc cctctccggc catgggcgga ttcggactca aacacccacc gcccatgatg 1920
ctcatcaaga acacgcctgt gcccggaaat atcaccagct tctcggacgt gcccgtcagc 1980
agcttcatca cccagtacag caccgggcag gtcaccgtgg agatggagtg ggagctcaag 2040
aaggaaaact ccaagaggtg gaacccagag atccagtaca caaacaacta caacgacccc 2100
cagtttgtgg actttgcccc ggacagcacc ggggaataca gaaccaccag acctatcgga 2160
acccgatacc ttacccgacc cctttaa 2187
<210> 46
<211> 2184
<212> DNA
<213> Artificial
<220>
<223> sequence based on AAV5: construct 164
<220>
<221> misc_feature
<222> (1)..(9)
<223> VP2 initiator context
<220>
<221> misc_feature
<222> (10)..(12)
<223> suboptimal translation initiation codon
<220>
<221> mutation
<222> (13)..(15)
<223> mutation of triplet to serine codon
<220>
<221> mutation
<222> (30)..(30)
<223> point mutation to remove splice site
<220>
<221> mutation
<222> (33)..(33)
<223> point mutation to remove splice site
<400> 46
cctgttaagc tgagttttgt tgatcaccca cccgattggt tggaagaagt tggtgaaggt 60
cttcgcgagt ttttgggcct tgaagcgggc ccaccgaaac caaaacccaa tcagcagcat 120
caagatcaag cccgtggtct tgtgctgcct ggttataact atctcggacc cggaaacggt 180
ctcgatcgag gagagcctgt caacagggca gacgaggtcg cgcgagagca cgacatctcg 240
tacaacgagc agcttgaggc gggagacaac ccctacctca agtacaacca cgcggacgcc 300
gagtttcagg agaagctcgc cgacgacaca tccttcgggg gaaacctcgg aaaggcagtc 360
tttcaggcca agaaaagggt tctcgaacct tttggcctgg ttgaagaggg tgctaagacg 420
gcccctaccg gaaagcggat agacgaccac tttccaaaaa gaaagaaggc tcggaccgaa 480
gaggactcca agccttccac ctcgtcagac gccgaagctg gacccagcgg atcccagcag 540
ctgcaaatcc cagcccaacc agcctcaagt ttgggagctg atacaatgtc tgcgggaggt 600
ggcggcccat tgggcgacaa taaccaaggt gccgatggag tgggcaatgc ctcgggagat 660
tggcattgcg attccacgtg gatgggggac agagtcgtca ccaagtccac ccgaacctgg 720
gtgctgccca gctacaacaa ccaccagtac cgagagatca aaagcggctc cgtcgacgga 780
agcaacgcca acgcctactt tggatacagc accccctggg ggtactttga ctttaaccgc 840
ttccacagcc actggagccc ccgagactgg caaagactca tcaacaacta ctggggcttc 900
agaccccggt ccctcagagt caaaatcttc aacattcaag tcaaagaggt cacggtgcag 960
gactccacca ccaccatcgc caacaacctc acctccaccg tccaagtgtt tacggacgac 1020
gactaccagc tgccctacgt cgtcggcaac gggaccgagg gatgcctgcc ggccttccct 1080
ccgcaggtct ttacgctgcc gcagtacggt tacgcgacgc tgaaccgcga caacacagaa 1140
aatcccaccg agaggagcag cttcttctgc ctagagtact ttcccagcaa gatgctgaga 1200
acgggcaaca actttgagtt tacctacaac tttgaggagg tgcccttcca ctccagcttc 1260
gctcccagtc agaacctctt caagctggcc aacccgctgg tggaccagta cttgtaccgc 1320
ttcgtgagca caaataacac tggcggagtc cagttcaaca agaacctggc cgggagatac 1380
gccaacacct acaaaaactg gttcccgggg cccatgggcc gaacccaggg ctggaacctg 1440
ggctccgggg tcaaccgcgc cagtgtcagc gccttcgcca cgaccaatag gatggagctc 1500
gagggcgcga gttaccaggt gcccccgcag ccgaacggca tgaccaacaa cctccagggc 1560
agcaacacct atgccctgga gaacactatg atcttcaaca gccagccggc gaacccgggc 1620
accaccgcca cgtacctcga gggcaacatg ctcatcacca gcgagagcga gacgcagccg 1680
gtgaaccgcg tggcgtacaa cgtcggcggg cagatggcca ccaacaacca gagctccacc 1740
actgcccccg cgaccggcac gtacaacctc caggaaatcg tgcccggcag cgtgtggatg 1800
gagagggacg tgtacctcca aggacccatc tgggccaaga tcccagagac gggggcgcac 1860
tttcacccct ctccggccat gggcggattc ggactcaaac acccaccgcc catgatgctc 1920
atcaagaaca cgcctgtgcc cggaaatatc accagcttct cggacgtgcc cgtcagcagc 1980
ttcatcaccc agtacagcac cgggcaggtc accgtggaga tggagtggga gctcaagaag 2040
gaaaactcca agaggtggaa cccagagatc cagtacacaa acaactacaa cgacccccag 2100
tttgtggact ttgccccgga cagcaccggg gaatacagaa ccaccagacc tatcggaacc 2160
cgatacctta cccgacccct ttaa 2184
<210> 47
<211> 2187
<212> DNA
<213> Artificial
<220>
<223> sequence based on AAV5: sequence 761
<220>
<221> misc_feature
<222> (1)..(9)
<223> VP2 initiator context
<220>
<221> misc_feature
<222> (10)..(12)
<223> suboptimal translation initiation codon
<220>
<221> mutation
<222> (13)..(15)
<223> insertion of triplet encoding alanine
<220>
<221> mutation
<222> (33)..(33)
<223> point mutation to remove splice site
<220>
<221> mutation
<222> (36)..(36)
<223> point mutation to remove splice site
<400> 47
cctgttaaga cggcttcttt tgttgatcac ccacccgatt ggttggaaga agttggtgaa 60
ggtcttcgcg agtttttggg ccttgaagcg ggcccaccga aaccaaaacc caatcagcag 120
catcaagatc aagcccgtgg tcttgtgctg cctggttata actatctcgg acccggaaac 180
ggtctcgatc gaggagagcc tgtcaacagg gcagacgagg tcgcgcgaga gcacgacatc 240
tcgtacaacg agcagcttga ggcgggagac aacccctacc tcaagtacaa ccacgcggac 300
gccgagtttc aggagaagct cgccgacgac acatccttcg ggggaaacct cggaaaggca 360
gtctttcagg ccaagaaaag ggttctcgaa ccttttggcc tggttgaaga gggtgctaag 420
acggccccta ccggaaagcg gatagacgac cactttccaa aaagaaagaa ggctcggacc 480
gaagaggact ccaagccttc cacctcgtca gacgccgaag ctggacccag cggatcccag 540
cagctgcaaa tcccagccca accagcctca agtttgggag ctgatacaat gtctgcggga 600
ggtggcggcc cattgggcga caataaccaa ggtgccgatg gagtgggcaa tgcctcggga 660
gattggcatt gcgattccac gtggatgggg gacagagtcg tcaccaagtc cacccgaacc 720
tgggtgctgc ccagctacaa caaccaccag taccgagaga tcaaaagcgg ctccgtcgac 780
ggaagcaacg ccaacgccta ctttggatac agcaccccct gggggtactt tgactttaac 840
cgcttccaca gccactggag cccccgagac tggcaaagac tcatcaacaa ctactggggc 900
ttcagacccc ggtccctcag agtcaaaatc ttcaacattc aagtcaaaga ggtcacggtg 960
caggactcca ccaccaccat cgccaacaac ctcacctcca ccgtccaagt gtttacggac 1020
gacgactacc agctgcccta cgtcgtcggc aacgggaccg agggatgcct gccggccttc 1080
cctccgcagg tctttacgct gccgcagtac ggttacgcga cgctgaaccg cgacaacaca 1140
gaaaatccca ccgagaggag cagcttcttc tgcctagagt actttcccag caagatgctg 1200
agaacgggca acaactttga gtttacctac aactttgagg aggtgccctt ccactccagc 1260
ttcgctccca gtcagaacct gttcaagctg gccaacccgc tggtggacca gtacttgtac 1320
cgcttcgtga gcacaaataa cactggcgga gtccagttca acaagaacct ggccgggaga 1380
tacgccaaca cctacaaaaa ctggttcccg gggcccatgg gccgaaccca gggctggaac 1440
ctgggctccg gggtcaaccg cgccagtgtc agcgccttcg ccacgaccaa taggatggag 1500
ctcgagggcg cgagttacca ggtgcccccg cagccgaacg gcatgaccaa caacctccag 1560
ggcagcaaca cctatgccct ggagaacact atgatcttca acagccagcc ggcgaacccg 1620
ggcaccaccg ccacgtacct cgagggcaac atgctcatca ccagcgagag cgagacgcag 1680
ccggtgaacc gcgtggcgta caacgtcggc gggcagatgg ccaccaacaa ccagagctcc 1740
accactgccc ccgcgaccgg cacgtacaac ctccaggaaa tcgtgcccgg cagcgtgtgg 1800
atggagaggg acgtgtacct ccaaggaccc atctgggcca agatcccaga gacgggggcg 1860
cactttcacc cctctccggc catgggcgga ttcggactca aacacccacc gcccatgatg 1920
ctcatcaaga acacgcctgt gcccggaaat atcaccagct tctcggacgt gcccgtcagc 1980
agcttcatca cccagtacag caccgggcag gtcaccgtgg agatggagtg ggagctcaag 2040
aaggaaaact ccaagaggtg gaacccagag atccagtaca caaacaacta caacgacccc 2100
cagtttgtgg actttgcccc ggacagcacc ggggaataca gaaccaccag acctatcgga 2160
acccgatacc ttacccgacc cctttaa 2187
<210> 48
<211> 2178
<212> DNA
<213> Artificial
<220>
<223> sequence based on AAV5: sequence 762
<220>
<221> misc_feature
<222> (1)..(3)
<223> suboptimal translation initiation codon
<220>
<221> mutation
<222> (4)..(6)
<223> insertion of triplet encoding alanine
<220>
<221> mutation
<222> (24)..(24)
<223> point mutation to remove splice site
<220>
<221> mutation
<222> (27)..(27)
<223> point mutation to remove splice site
<400> 48
acggcttctt ttgttgatca cccacccgat tggttggaag aagttggtga aggtcttcgc 60
gagtttttgg gccttgaagc gggcccaccg aaaccaaaac ccaatcagca gcatcaagat 120
caagcccgtg gtcttgtgct gcctggttat aactatctcg gacccggaaa cggtctcgat 180
cgaggagagc ctgtcaacag ggcagacgag gtcgcgcgag agcacgacat ctcgtacaac 240
gagcagcttg aggcgggaga caacccctac ctcaagtaca accacgcgga cgccgagttt 300
caggagaagc tcgccgacga cacatccttc gggggaaacc tcggaaaggc agtctttcag 360
gccaagaaaa gggttctcga accttttggc ctggttgaag agggtgctaa gacggcccct 420
accggaaagc ggatagacga ccactttcca aaaagaaaga aggctcggac cgaagaggac 480
tccaagcctt ccacctcgtc agacgccgaa gctggaccca gcggatccca gcagctgcaa 540
atcccagccc aaccagcctc aagtttggga gctgatacaa tgtctgcggg aggtggcggc 600
ccattgggcg acaataacca aggtgccgat ggagtgggca atgcctcggg agattggcat 660
tgcgattcca cgtggatggg ggacagagtc gtcaccaagt ccacccgaac ctgggtgctg 720
cccagctaca acaaccacca gtaccgagag atcaaaagcg gctccgtcga cggaagcaac 780
gccaacgcct actttggata cagcaccccc tgggggtact ttgactttaa ccgcttccac 840
agccactgga gcccccgaga ctggcaaaga ctcatcaaca actactgggg cttcagaccc 900
cggtccctca gagtcaaaat cttcaacatt caagtcaaag aggtcacggt gcaggactcc 960
accaccacca tcgccaacaa cctcacctcc accgtccaag tgtttacgga cgacgactac 1020
cagctgccct acgtcgtcgg caacgggacc gagggatgcc tgccggcctt ccctccgcag 1080
gtctttacgc tgccgcagta cggttacgcg acgctgaacc gcgacaacac agaaaatccc 1140
accgagagga gcagcttctt ctgcctagag tactttccca gcaagatgct gagaacgggc 1200
aacaactttg agtttaccta caactttgag gaggtgccct tccactccag cttcgctccc 1260
agtcagaacc tgttcaagct ggccaacccg ctggtggacc agtacttgta ccgcttcgtg 1320
agcacaaata acactggcgg agtccagttc aacaagaacc tggccgggag atacgccaac 1380
acctacaaaa actggttccc ggggcccatg ggccgaaccc agggctggaa cctgggctcc 1440
ggggtcaacc gcgccagtgt cagcgccttc gccacgacca ataggatgga gctcgagggc 1500
gcgagttacc aggtgccccc gcagccgaac ggcatgacca acaacctcca gggcagcaac 1560
acctatgccc tggagaacac tatgatcttc aacagccagc cggcgaaccc gggcaccacc 1620
gccacgtacc tcgagggcaa catgctcatc accagcgaga gcgagacgca gccggtgaac 1680
cgcgtggcgt acaacgtcgg cgggcagatg gccaccaaca accagagctc caccactgcc 1740
cccgcgaccg gcacgtacaa cctccaggaa atcgtgcccg gcagcgtgtg gatggagagg 1800
gacgtgtacc tccaaggacc catctgggcc aagatcccag agacgggggc gcactttcac 1860
ccctctccgg ccatgggcgg attcggactc aaacacccac cgcccatgat gctcatcaag 1920
aacacgcctg tgcccggaaa tatcaccagc ttctcggacg tgcccgtcag cagcttcatc 1980
acccagtaca gcaccgggca ggtcaccgtg gagatggagt gggagctcaa gaaggaaaac 2040
tccaagaggt ggaacccaga gatccagtac acaaacaact acaacgaccc ccagtttgtg 2100
gactttgccc cggacagcac cggggaatac agaaccacca gacctatcgg aacccgatac 2160
cttacccgac ccctttaa 2178
<210> 49
<211> 2175
<212> DNA
<213> adeno-associated virus 5
<220>
<221> misc_feature
<222> (1)..(3)
<223> start codon
<400> 49
atgtcttttg ttgatcaccc tccagattgg ttggaagaag ttggtgaagg tcttcgcgag 60
tttttgggcc ttgaagcggg cccaccgaaa ccaaaaccca atcagcagca tcaagatcaa 120
gcccgtggtc ttgtgctgcc tggttataac tatctcggac ccggaaacgg tctcgatcga 180
ggagagcctg tcaacagggc agacgaggtc gcgcgagagc acgacatctc gtacaacgag 240
cagcttgagg cgggagacaa cccctacctc aagtacaacc acgcggacgc cgagtttcag 300
gagaagctcg ccgacgacac atccttcggg ggaaacctcg gaaaggcagt ctttcaggcc 360
aagaaaaggg ttctcgaacc ttttggcctg gttgaagagg gtgctaagac ggcccctacc 420
ggaaagcgga tagacgacca ctttccaaaa agaaagaagg ctcggaccga agaggactcc 480
aagccttcca cctcgtcaga cgccgaagct ggacccagcg gatcccagca gctgcaaatc 540
ccagcccaac cagcctcaag tttgggagct gatacaatgt ctgcgggagg tggcggccca 600
ttgggcgaca ataaccaagg tgccgatgga gtgggcaatg cctcgggaga ttggcattgc 660
gattccacgt ggatggggga cagagtcgtc accaagtcca cccgaacctg ggtgctgccc 720
agctacaaca accaccagta ccgagagatc aaaagcggct ccgtcgacgg aagcaacgcc 780
aacgcctact ttggatacag caccccctgg gggtactttg actttaaccg cttccacagc 840
cactggagcc cccgagactg gcaaagactc atcaacaact actggggctt cagaccccgg 900
tccctcagag tcaaaatctt caacattcaa gtcaaagagg tcacggtgca ggactccacc 960
accaccatcg ccaacaacct cacctccacc gtccaagtgt ttacggacga cgactaccag 1020
ctgccctacg tcgtcggcaa cgggaccgag ggatgcctgc cggccttccc tccgcaggtc 1080
tttacgctgc cgcagtacgg ttacgcgacg ctgaaccgcg acaacacaga aaatcccacc 1140
gagaggagca gcttcttctg cctagagtac tttcccagca agatgctgag aacgggcaac 1200
aactttgagt ttacctacaa ctttgaggag gtgcccttcc actccagctt cgctcccagt 1260
cagaacctgt tcaagctggc caacccgctg gtggaccagt acttgtaccg cttcgtgagc 1320
acaaataaca ctggcggagt ccagttcaac aagaacctgg ccgggagata cgccaacacc 1380
tacaaaaact ggttcccggg gcccatgggc cgaacccagg gctggaacct gggctccggg 1440
gtcaaccgcg ccagtgtcag cgccttcgcc acgaccaata ggatggagct cgagggcgcg 1500
agttaccagg tgcccccgca gccgaacggc atgaccaaca acctccaggg cagcaacacc 1560
tatgccctgg agaacactat gatcttcaac agccagccgg cgaacccggg caccaccgcc 1620
acgtacctcg agggcaacat gctcatcacc agcgagagcg agacgcagcc ggtgaaccgc 1680
gtggcgtaca acgtcggcgg gcagatggcc accaacaacc agagctccac cactgccccc 1740
gcgaccggca cgtacaacct ccaggaaatc gtgcccggca gcgtgtggat ggagagggac 1800
gtgtacctcc aaggacccat ctgggccaag atcccagaga cgggggcgca ctttcacccc 1860
tctccggcca tgggcggatt cggactcaaa cacccaccgc ccatgatgct catcaagaac 1920
acgcctgtgc ccggaaatat caccagcttc tcggacgtgc ccgtcagcag cttcatcacc 1980
cagtacagca ccgggcaggt caccgtggag atggagtggg agctcaagaa ggaaaactcc 2040
aagaggtgga acccagagat ccagtacaca aacaactaca acgaccccca gtttgtggac 2100
tttgccccgg acagcaccgg ggaatacaga accaccagac ctatcggaac ccgatacctt 2160
acccgacccc tttaa 2175
<210> 50
<211> 2178
<212> DNA
<213> Artificial
<220>
<223> sequence based on AAV5: sequence 764
<220>
<221> misc_feature
<222> (1)..(3)
<223> suboptimal translation initiation codon
<220>
<221> mutation
<222> (4)..(6)
<223> insertion of triplet encoding alanine
<220>
<221> mutation
<222> (24)..(24)
<223> point mutation to remove splice site
<220>
<221> mutation
<222> (27)..(27)
<223> point mutation to remove splice site
<400> 50
ttggcttctt ttgttgatca cccacccgat tggttggaag aagttggtga aggtcttcgc 60
gagtttttgg gccttgaagc gggcccaccg aaaccaaaac ccaatcagca gcatcaagat 120
caagcccgtg gtcttgtgct gcctggttat aactatctcg gacccggaaa cggtctcgat 180
cgaggagagc ctgtcaacag ggcagacgag gtcgcgcgag agcacgacat ctcgtacaac 240
gagcagcttg aggcgggaga caacccctac ctcaagtaca accacgcgga cgccgagttt 300
caggagaagc tcgccgacga cacatccttc gggggaaacc tcggaaaggc agtctttcag 360
gccaagaaaa gggttctcga accttttggc ctggttgaag agggtgctaa gacggcccct 420
accggaaagc ggatagacga ccactttcca aaaagaaaga aggctcggac cgaagaggac 480
tccaagcctt ccacctcgtc agacgccgaa gctggaccca gcggatccca gcagctgcaa 540
atcccagccc aaccagcctc aagtttggga gctgatacaa tgtctgcggg aggtggcggc 600
ccattgggcg acaataacca aggtgccgat ggagtgggca atgcctcggg agattggcat 660
tgcgattcca cgtggatggg ggacagagtc gtcaccaagt ccacccgaac ctgggtgctg 720
cccagctaca acaaccacca gtaccgagag atcaaaagcg gctccgtcga cggaagcaac 780
gccaacgcct actttggata cagcaccccc tgggggtact ttgactttaa ccgcttccac 840
agccactgga gcccccgaga ctggcaaaga ctcatcaaca actactgggg cttcagaccc 900
cggtccctca gagtcaaaat cttcaacatt caagtcaaag aggtcacggt gcaggactcc 960
accaccacca tcgccaacaa cctcacctcc accgtccaag tgtttacgga cgacgactac 1020
cagctgccct acgtcgtcgg caacgggacc gagggatgcc tgccggcctt ccctccgcag 1080
gtctttacgc tgccgcagta cggttacgcg acgctgaacc gcgacaacac agaaaatccc 1140
accgagagga gcagcttctt ctgcctagag tactttccca gcaagatgct gagaacgggc 1200
aacaactttg agtttaccta caactttgag gaggtgccct tccactccag cttcgctccc 1260
agtcagaacc tgttcaagct ggccaacccg ctggtggacc agtacttgta ccgcttcgtg 1320
agcacaaata acactggcgg agtccagttc aacaagaacc tggccgggag atacgccaac 1380
acctacaaaa actggttccc ggggcccatg ggccgaaccc agggctggaa cctgggctcc 1440
ggggtcaacc gcgccagtgt cagcgccttc gccacgacca ataggatgga gctcgagggc 1500
gcgagttacc aggtgccccc gcagccgaac ggcatgacca acaacctcca gggcagcaac 1560
acctatgccc tggagaacac tatgatcttc aacagccagc cggcgaaccc gggcaccacc 1620
gccacgtacc tcgagggcaa catgctcatc accagcgaga gcgagacgca gccggtgaac 1680
cgcgtggcgt acaacgtcgg cgggcagatg gccaccaaca accagagctc caccactgcc 1740
cccgcgaccg gcacgtacaa cctccaggaa atcgtgcccg gcagcgtgtg gatggagagg 1800
gacgtgtacc tccaaggacc catctgggcc aagatcccag agacgggggc gcactttcac 1860
ccctctccgg ccatgggcgg attcggactc aaacacccac cgcccatgat gctcatcaag 1920
aacacgcctg tgcccggaaa tatcaccagc ttctcggacg tgcccgtcag cagcttcatc 1980
acccagtaca gcaccgggca ggtcaccgtg gagatggagt gggagctcaa gaaggaaaac 2040
tccaagaggt ggaacccaga gatccagtac acaaacaact acaacgaccc ccagtttgtg 2100
gactttgccc cggacagcac cggggaatac agaaccacca gacctatcgg aacccgatac 2160
cttacccgac ccctttaa 2178
<210> 51
<211> 2178
<212> DNA
<213> Artificial
<220>
<223> sequence based on AAV5: sequence 765
<220>
<221> misc_feature
<222> (1)..(3)
<223> suboptimal translation initiation codon
<220>
<221> mutation
<222> (4)..(6)
<223> insertion of triplet encoding alanine
<220>
<221> mutation
<222> (24)..(24)
<223> point mutation to remove splice site
<220>
<221> mutation
<222> (27)..(27)
<223> point mutation to remove splice site
<400> 51
ctggcttctt ttgttgatca cccacccgat tggttggaag aagttggtga aggtcttcgc 60
gagtttttgg gccttgaagc gggcccaccg aaaccaaaac ccaatcagca gcatcaagat 120
caagcccgtg gtcttgtgct gcctggttat aactatctcg gacccggaaa cggtctcgat 180
cgaggagagc ctgtcaacag ggcagacgag gtcgcgcgag agcacgacat ctcgtacaac 240
gagcagcttg aggcgggaga caacccctac ctcaagtaca accacgcgga cgccgagttt 300
caggagaagc tcgccgacga cacatccttc gggggaaacc tcggaaaggc agtctttcag 360
gccaagaaaa gggttctcga accttttggc ctggttgaag agggtgctaa gacggcccct 420
accggaaagc ggatagacga ccactttcca aaaagaaaga aggctcggac cgaagaggac 480
tccaagcctt ccacctcgtc agacgccgaa gctggaccca gcggatccca gcagctgcaa 540
atcccagccc aaccagcctc aagtttggga gctgatacaa tgtctgcggg aggtggcggc 600
ccattgggcg acaataacca aggtgccgat ggagtgggca atgcctcggg agattggcat 660
tgcgattcca cgtggatggg ggacagagtc gtcaccaagt ccacccgaac ctgggtgctg 720
cccagctaca acaaccacca gtaccgagag atcaaaagcg gctccgtcga cggaagcaac 780
gccaacgcct actttggata cagcaccccc tgggggtact ttgactttaa ccgcttccac 840
agccactgga gcccccgaga ctggcaaaga ctcatcaaca actactgggg cttcagaccc 900
cggtccctca gagtcaaaat cttcaacatt caagtcaaag aggtcacggt gcaggactcc 960
accaccacca tcgccaacaa cctcacctcc accgtccaag tgtttacgga cgacgactac 1020
cagctgccct acgtcgtcgg caacgggacc gagggatgcc tgccggcctt ccctccgcag 1080
gtctttacgc tgccgcagta cggttacgcg acgctgaacc gcgacaacac agaaaatccc 1140
accgagagga gcagcttctt ctgcctagag tactttccca gcaagatgct gagaacgggc 1200
aacaactttg agtttaccta caactttgag gaggtgccct tccactccag cttcgctccc 1260
agtcagaacc tgttcaagct ggccaacccg ctggtggacc agtacttgta ccgcttcgtg 1320
agcacaaata acactggcgg agtccagttc aacaagaacc tggccgggag atacgccaac 1380
acctacaaaa actggttccc ggggcccatg ggccgaaccc agggctggaa cctgggctcc 1440
ggggtcaacc gcgccagtgt cagcgccttc gccacgacca ataggatgga gctcgagggc 1500
gcgagttacc aggtgccccc gcagccgaac ggcatgacca acaacctcca gggcagcaac 1560
acctatgccc tggagaacac tatgatcttc aacagccagc cggcgaaccc gggcaccacc 1620
gccacgtacc tcgagggcaa catgctcatc accagcgaga gcgagacgca gccggtgaac 1680
cgcgtggcgt acaacgtcgg cgggcagatg gccaccaaca accagagctc caccactgcc 1740
cccgcgaccg gcacgtacaa cctccaggaa atcgtgcccg gcagcgtgtg gatggagagg 1800
gacgtgtacc tccaaggacc catctgggcc aagatcccag agacgggggc gcactttcac 1860
ccctctccgg ccatgggcgg attcggactc aaacacccac cgcccatgat gctcatcaag 1920
aacacgcctg tgcccggaaa tatcaccagc ttctcggacg tgcccgtcag cagcttcatc 1980
acccagtaca gcaccgggca ggtcaccgtg gagatggagt gggagctcaa gaaggaaaac 2040
tccaagaggt ggaacccaga gatccagtac acaaacaact acaacgaccc ccagtttgtg 2100
gactttgccc cggacagcac cggggaatac agaaccacca gacctatcgg aacccgatac 2160
cttacccgac ccctttaa 2178
<210> 52
<211> 2178
<212> DNA
<213> Artificial
<220>
<223> sequence based on AAV5: sequence 766
<220>
<221> misc_feature
<222> (1)..(3)
<223> suboptimal translation initiation codon
<220>
<221> mutation
<222> (4)..(6)
<223> insertion of triplet encoding alanine
<220>
<221> mutation
<222> (24)..(24)
<223> point mutation to remove splice site
<220>
<221> mutation
<222> (27)..(27)
<223> point mutation to remove splice site
<400> 52
gtggcttctt ttgttgatca cccacccgat tggttggaag aagttggtga aggtcttcgc 60
gagtttttgg gccttgaagc gggcccaccg aaaccaaaac ccaatcagca gcatcaagat 120
caagcccgtg gtcttgtgct gcctggttat aactatctcg gacccggaaa cggtctcgat 180
cgaggagagc ctgtcaacag ggcagacgag gtcgcgcgag agcacgacat ctcgtacaac 240
gagcagcttg aggcgggaga caacccctac ctcaagtaca accacgcgga cgccgagttt 300
caggagaagc tcgccgacga cacatccttc gggggaaacc tcggaaaggc agtctttcag 360
gccaagaaaa gggttctcga accttttggc ctggttgaag agggtgctaa gacggcccct 420
accggaaagc ggatagacga ccactttcca aaaagaaaga aggctcggac cgaagaggac 480
tccaagcctt ccacctcgtc agacgccgaa gctggaccca gcggatccca gcagctgcaa 540
atcccagccc aaccagcctc aagtttggga gctgatacaa tgtctgcggg aggtggcggc 600
ccattgggcg acaataacca aggtgccgat ggagtgggca atgcctcggg agattggcat 660
tgcgattcca cgtggatggg ggacagagtc gtcaccaagt ccacccgaac ctgggtgctg 720
cccagctaca acaaccacca gtaccgagag atcaaaagcg gctccgtcga cggaagcaac 780
gccaacgcct actttggata cagcaccccc tgggggtact ttgactttaa ccgcttccac 840
agccactgga gcccccgaga ctggcaaaga ctcatcaaca actactgggg cttcagaccc 900
cggtccctca gagtcaaaat cttcaacatt caagtcaaag aggtcacggt gcaggactcc 960
accaccacca tcgccaacaa cctcacctcc accgtccaag tgtttacgga cgacgactac 1020
cagctgccct acgtcgtcgg caacgggacc gagggatgcc tgccggcctt ccctccgcag 1080
gtctttacgc tgccgcagta cggttacgcg acgctgaacc gcgacaacac agaaaatccc 1140
accgagagga gcagcttctt ctgcctagag tactttccca gcaagatgct gagaacgggc 1200
aacaactttg agtttaccta caactttgag gaggtgccct tccactccag cttcgctccc 1260
agtcagaacc tgttcaagct ggccaacccg ctggtggacc agtacttgta ccgcttcgtg 1320
agcacaaata acactggcgg agtccagttc aacaagaacc tggccgggag atacgccaac 1380
acctacaaaa actggttccc ggggcccatg ggccgaaccc agggctggaa cctgggctcc 1440
ggggtcaacc gcgccagtgt cagcgccttc gccacgacca ataggatgga gctcgagggc 1500
gcgagttacc aggtgccccc gcagccgaac ggcatgacca acaacctcca gggcagcaac 1560
acctatgccc tggagaacac tatgatcttc aacagccagc cggcgaaccc gggcaccacc 1620
gccacgtacc tcgagggcaa catgctcatc accagcgaga gcgagacgca gccggtgaac 1680
cgcgtggcgt acaacgtcgg cgggcagatg gccaccaaca accagagctc caccactgcc 1740
cccgcgaccg gcacgtacaa cctccaggaa atcgtgcccg gcagcgtgtg gatggagagg 1800
gacgtgtacc tccaaggacc catctgggcc aagatcccag agacgggggc gcactttcac 1860
ccctctccgg ccatgggcgg attcggactc aaacacccac cgcccatgat gctcatcaag 1920
aacacgcctg tgcccggaaa tatcaccagc ttctcggacg tgcccgtcag cagcttcatc 1980
acccagtaca gcaccgggca ggtcaccgtg gagatggagt gggagctcaa gaaggaaaac 2040
tccaagaggt ggaacccaga gatccagtac acaaacaact acaacgaccc ccagtttgtg 2100
gactttgccc cggacagcac cggggaatac agaaccacca gacctatcgg aacccgatac 2160
cttacccgac ccctttaa 2178
<210> 53
<211> 250
<212> DNA
<213> Artificial
<220>
<223> polH promoter long
<400> 53
tgtaatgaga cgcacaaact aatatcacaa actggaaatg tctatcaata tatagttgct 60
gatctatgca tcagctgcta gtactccgga atattaatag atcatggaga taattaaaat 120
gataaccatc tcgcaaataa ataagtattt tactgttttc gtaacagttt tgtaataaaa 180
aaacctataa atattccgga ttattcatac cgtcccacca tcgggcgcgg atcgtaccgg 240
gcccaagctt 250
<210> 54
<211> 155
<212> DNA
<213> Artificial
<220>
<223> polH promoter short
<400> 54
tgtaatgaga cgcacaaact aatatcacaa actggaaatg tctatcaata tatagttgct 60
gatatcatgg agataattaa aatgataacc atctcgcaaa taaataagta ttttactgtt 120
ttcgtaacag ttttgtaata aaaaaaccta taaat 155
<210> 55
<211> 19
<212> DNA
<213> Artificial
<220>
<223> primer
<400> 55
aatgggcggt aggcgtgta 19
<210> 56
<211> 22
<212> DNA
<213> Artificial
<220>
<223> primer
<400> 56
aggcgatctg acggttcact aa 22
<210> 57
<211> 20
<212> DNA
<213> Artificial
<220>
<223> probe
<400> 57
tgggaggtct atataagcag 20
<210> 58
<211> 26
<212> DNA
<213> Artificial
<220>
<223> primer
<400> 58
caagtatggc atctacacca aagtct 26
<210> 59
<211> 25
<212> DNA
<213> Artificial
<220>
<223> primer
<400> 59
gcaatagcat cacaaatttc acaaa 25
<210> 60
<211> 29
<212> DNA
<213> Artificial
<220>
<223> probe
<400> 60
tgtgaactgg atcaaggaga agaccaagc 29
<210> 61
<211> 28
<212> DNA
<213> Artificial
<220>
<223> 5' part of AAV5 capsid sequence
<220>
<221> CDS
<222> (1)..(27)
<400> 61
tct ttt gtt gat cac cct cca gat tgg t 28
Ser Phe Val Asp His Pro Pro Asp Trp
1 5
<210> 62
<211> 9
<212> PRT
<213> Artificial
<220>
<223> Synthetic Construct
<400> 62
Ser Phe Val Asp His Pro Pro Asp Trp
1 5
<210> 63
<211> 28
<212> DNA
<213> Artificial
<220>
<223> 5' part of AAV5 capsid sequence with splice sites removed
<220>
<221> CDS
<222> (1)..(27)
<400> 63
tct ttt gtt gat cac cca ccc gat tgg t 28
Ser Phe Val Asp His Pro Pro Asp Trp
1 5
<210> 64
<211> 9
<212> PRT
<213> Artificial
<220>
<223> Synthetic Construct
<400> 64
Ser Phe Val Asp His Pro Pro Asp Trp
1 5
<210> 65
<211> 28
<212> DNA
<213> Artificial
<220>
<223> 5' part of AAV5 capsid sequence with splice sites removed and
alanine substitution
<400> 65
gcttttgttg atcacccacc cgattggt 28
<210> 66
<211> 28
<212> DNA
<213> Artificial
<220>
<223> 5' part of AAV5 capsid sequence with splice sites removed and
threonine substitution
<400> 66
acttttgttg atcacccacc cgattggt 28
<210> 67
<211> 28
<212> DNA
<213> Artificial
<220>
<223> 5' part of AAV5 capsid sequence with splice sites removed and
point mutations at positions 4-6
<400> 67
agctttgttg atcacccacc cgattggt 28
<210> 68
<211> 28
<212> DNA
<213> Artificial
<220>
<223> 5' part of AAV5 capsid sequence with splice sites removed and
point mutations at positions 4 and 5
<400> 68
agttttgttg atcacccacc cgattggt 28
<210> 69
<211> 2187
<212> DNA
<213> Artificial
<220>
<223> artificial sequence based on AAV5
<220>
<221> misc_feature
<222> (1)..(9)
<223> VP2 initiator context
<220>
<221> misc_feature
<222> (10)..(12)
<223> suboptimal translation initiation codon
<220>
<221> misc_feature
<222> (13)..(15)
<223> additional triplet added to sequence
<220>
<221> mutation
<222> (33)..(33)
<223> remove splice site
<220>
<221> mutation
<222> (36)..(36)
<223> remove splice site
<400> 69
cctgttaagc tggcttcttt tgttgatcac ccacccgatt ggttggaaga agttggtgaa 60
ggtcttcgcg agtttttggg ccttgaagcg ggcccaccga aaccaaaacc caatcagcag 120
catcaagatc aagcccgtgg tcttgtgctg cctggttata actatctcgg acccggaaac 180
ggtctcgatc gaggagagcc tgtcaacagg gcagacgagg tcgcgcgaga gcacgacatc 240
tcgtacaacg agcagcttga ggcgggagac aacccctacc tcaagtacaa ccacgcggac 300
gccgagtttc aggagaagct cgccgacgac acatccttcg ggggaaacct cggaaaggca 360
gtctttcagg ccaagaaaag ggttctcgaa ccttttggcc tggttgaaga gggtgctaag 420
acggccccta ccggaaagcg gatagacgac cactttccaa aaagaaagaa ggctcggacc 480
gaagaggact ccaagccttc cacctcgtca gacgccgaag ctggacccag cggatcccag 540
cagctgcaaa tcccagccca accagcctca agtttgggag ctgatacaat gtctgcggga 600
ggtggcggcc cattgggcga caataaccaa ggtgccgatg gagtgggcaa tgcctcggga 660
gattggcatt gcgattccac gtggatgggg gacagagtcg tcaccaagtc cacccgaacc 720
tgggtgctgc ccagctacaa caaccaccag taccgagaga tcaaaagcgg ctccgtcgac 780
ggaagcaacg ccaacgccta ctttggatac agcaccccct gggggtactt tgactttaac 840
cgcttccaca gccactggag cccccgagac tggcaaagac tcatcaacaa ctactggggc 900
ttcagacccc ggtccctcag agtcaaaatc ttcaacattc aagtcaaaga ggtcacggtg 960
caggactcca ccaccaccat cgccaacaac ctcacctcca ccgtccaagt gtttacggac 1020
gacgactacc agctgcccta cgtcgtcggc aacgggaccg agggatgcct gccggccttc 1080
cctccgcagg tctttacgct gccgcagtac ggttacgcga cgctgaaccg cgacaacaca 1140
gaaaatccca ccgagaggag cagcttcttc tgcctagagt actttcccag caagatgctg 1200
agaacgggca acaactttga gtttacctac aactttgagg aggtgccctt ccactccagc 1260
ttcgctccca gtcagaacct cttcaagctg gccaacccgc tggtggacca gtacttgtac 1320
cgcttcgtga gcacaaataa cactggcgga gtccagttca acaagaacct ggccgggaga 1380
tacgccaaca cctacaaaaa ctggttcccg gggcccatgg gccgaaccca gggctggaac 1440
ctgggctccg gggtcaaccg cgccagtgtc agcgccttcg ccacgaccaa taggatggag 1500
ctcgagggcg cgagttacca ggtgcccccg cagccgaacg gcatgaccaa caacctccag 1560
ggcagcaaca cctatgccct ggagaacact atgatcttca acagccagcc ggcgaacccg 1620
ggcaccaccg ccacgtacct cgagggcaac atgctcatca ccagcgagag cgagacgcag 1680
ccggtgaacc gcgtggcgta caacgtcggc gggcagatgg ccaccaacaa ccagagctcc 1740
accactgccc ccgcgaccgg cacgtacaac ctccaggaaa tcgtgcccgg cagcgtgtgg 1800
atggagaggg acgtgtacct ccaaggaccc atctgggcca agatcccaga gacgggggcg 1860
cactttcacc cctctccggc catgggcgga ttcggactca aacacccacc gcccatgatg 1920
ctcatcaaga acacgcctgt gcccggaaat atcaccagct tctcggacgt gcccgtcagc 1980
agcttcatca cccagtacag caccgggcag gtcaccgtgg agatggagtg ggagctcaag 2040
aaggaaaact ccaagaggtg gaacccagag atccagtaca caaacaacta caacgacccc 2100
cagtttgtgg actttgcccc ggacagcacc ggggaataca gaaccaccag acctatcgga 2160
acccgatacc ttacccgacc cctttaa 2187
<210> 70
<211> 4382
<212> DNA
<213> adeno-associated virus 9
<220>
<221> CDS
<222> (2116)..(4326)
<223> coding sequence for VP1
<220>
<221> misc_feature
<222> (2527)..(4326)
<223> coding sequence for VP2
<220>
<221> misc_feature
<222> (2722)..(4326)
<223> coding sequence for VP3
<400> 70
cagagaggga gtggccaact ccatcactag gggtaatcgc gaagcgcctc ccacgctgcc 60
gcgtcagcgc tgacgtagat tacgtcatag gggagtggtc ctgtattagc tgtcacgtga 120
gtgcttttgc gacattttgc gacaccacat ggccatttga ggtatatatg gccgagtgag 180
cgagcaggat ctccattttg accgcgaaat ttgaacgagc agcagccatg ccgggcttct 240
acgagattgt gatcaaggtg ccgagcgacc tggacgagca cctgccgggc atttctgact 300
cttttgtgaa ctgggtggcc gagaaggaat gggagctgcc cccggattct gacatggatc 360
ggaatctgat cgagcaggca cccctgaccg tggccgagaa gctgtagcgc gacttcctgg 420
tccaatggcg ccgcgtgagt aaggccccgg aggccctctt ctttgttcag ttcgagaagg 480
gcgagagcta ctttcacctg cacgttctgg tcgagaccac gggggtcaag tccatggtgc 540
taggccgctt cctgagtcag attcgggaga agctggtcca gaccatctac cgcgggatcg 600
agccgaccct gcccaactgg ttcgcggtga ccaagacgcg taatggcgcc ggcgggggga 660
acaaggtggt ggacgagtgc tacatcccca actacctcct gcccaagact cagcccgagc 720
tgcagtgggc gtggactaac atggaggagt atataagcgc gtgcttgaac ctggccgagc 780
gcaaacggct cgtggcgcag cacctgaccc acgtcagcca gacgcaggag cagaacaagg 840
agaatctgaa ccccaattct gacgcgcccg tgatcaggtc aaaaacctcc gcgcgctaca 900
tggagctggt cgggtggctg gtggaccggg gcatcacctc cgagaagcag tggatccagg 960
aggaccaggc ctcgtacatc tccttcaacg ccgcctccaa ctcgcggtcc cagatcaagg 1020
ccgcgctgga caatgccggc aagatcatgg cgctgaccaa atccgcgccc gactacctgg 1080
taggcccttc acttccggtg gacattacgc agaaccgcat ctaccgcatc ctgcagctca 1140
acggctacga ccctgcctac gccggctccg tctttctcgg ctgggcacaa aagaagttcg 1200
ggaaacgcaa caccatctgg ctgtttgggc cggccaccac gggaaagacc aacatcgcag 1260
aagccattgc ccacgccgtg cccttctacg gctgcgtcaa ctggaccaat gagaactttc 1320
ccttcaacga ttgcgtcgac aagatggtga tctggtggga ggagggcaag atgacggcca 1380
aggtcgtgga gtccgccaag gccattctcg gcggcagcaa ggtgcgcgtg gaccaaaagt 1440
gcaagtcgtc cgcccagatc gaccccactc ccgtgatcgt cacctccaac accaacatgt 1500
gcgccgtgat tgacgggaac agcaccacct tcgagcacca gcagcctctc caggaccgga 1560
tgtttaagtt cgaactcacc cgccgtctgg agcacgactt tggcaaggtg acaaagcagg 1620
aagtcaaaga gttcttccgc tgggccagtg atcacgtgac cgaggtggcg catgagtttt 1680
acgtcagaaa gggcggagcc agcaaaagac ccgcccccga tgacgcggat aaaagcgagc 1740
ccaagcgggc ctgcccctca gtcgcggatc catcgacgtc agacgcggaa ggagctccgg 1800
tggactttgc cgacaggtac caaaacaaat gttctcgtca cgcgggcatg cttcagatgc 1860
tgcttccctg caaaacgtgc gagagaatga atcagaattt caacatttgc ttcacacacg 1920
gggtcagaga ctgctcagag tgtttccccg gcgtgtcaga atctcaaccg gtcgtcagaa 1980
agaggacgta tcggaaactc tgtgcgattc atcatctgct ggggcgggct cccgagattg 2040
cttgctcggc ctgcgatctg gtcaacgtgg acctggatga ctgtgtttct gagcaataaa 2100
tgacttaaac caggt atg gct gcc gat ggt tat ctt cca gat tgg ctc gag 2151
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu
1 5 10
gac aac ctc tct gag ggc att cgc gag tgg tgg gac ctg aaa cct gga 2199
Asp Asn Leu Ser Glu Gly Ile Arg Glu Trp Trp Asp Leu Lys Pro Gly
15 20 25
gcc ccg aaa ccc aaa gcc aac cag caa aag cag gac gac ggc cgg ggt 2247
Ala Pro Lys Pro Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly
30 35 40
ctg gtg ctt cct ggc tac aag tac ctc gga ccc ttc aac gga ctc gac 2295
Leu Val Leu Pro Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp
45 50 55 60
aag ggg gag ccc gtc aac gcg gcg gac gca gcg gcc ctc gag cac gac 2343
Lys Gly Glu Pro Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp
65 70 75
aag gcc tac gac cag cag ctc aaa gcg ggt gac aat ccg tac ctg cgg 2391
Lys Ala Tyr Asp Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Arg
80 85 90
tat aac cac gcc gac gcc gag ttt cag gag cgt ctg caa gaa gat acg 2439
Tyr Asn His Ala Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr
95 100 105
tct ttt ggg ggc aac ctc ggg cga gca gtc ttc cag gcc aag aag cgg 2487
Ser Phe Gly Gly Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg
110 115 120
gtt ctc gaa cct ctc ggt ctg gtt gag gaa ggc gct aag acg gct cct 2535
Val Leu Glu Pro Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro
125 130 135 140
gga aag aag aga ccg gta gag cag tca ccc caa gaa cca gac tca tcc 2583
Gly Lys Lys Arg Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser
145 150 155
tcg ggc atc ggc aaa tca ggc cag cag ccc gct aaa aag aga ctc aat 2631
Ser Gly Ile Gly Lys Ser Gly Gln Gln Pro Ala Lys Lys Arg Leu Asn
160 165 170
ttt ggt cag act ggc gac tca gag tca gtc ccc gac cca caa cct ctc 2679
Phe Gly Gln Thr Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu
175 180 185
gga gaa cct cca gaa gcc ccc tca ggt ctg gga cct aat aca atg gct 2727
Gly Glu Pro Pro Glu Ala Pro Ser Gly Leu Gly Pro Asn Thr Met Ala
190 195 200
tca ggc ggt ggc gct cca atg gca gac aat aac gaa ggc gcc gac gga 2775
Ser Gly Gly Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly
205 210 215 220
gtg ggt aat tcc tcg gga aat tgg cat tgc gat tcc aca tgg ctg ggg 2823
Val Gly Asn Ser Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly
225 230 235
gac aga gtc atc acc acc agc acc cga acc tgg gca ttg ccc acc tac 2871
Asp Arg Val Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr
240 245 250
aac aac cac ctc tac aag caa atc tcc aat gga aca tcg gga gga agc 2919
Asn Asn His Leu Tyr Lys Gln Ile Ser Asn Gly Thr Ser Gly Gly Ser
255 260 265
acc aac gac aac acc tac ttt ggc tac agc acc ccc tgg ggg tat ttt 2967
Thr Asn Asp Asn Thr Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe
270 275 280
gac ttc aac aga ttc cac tgc cac ttc tca cca cgt gac tgg cag cga 3015
Asp Phe Asn Arg Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg
285 290 295 300
ctc atc aac aac aac tgg gga ttc cgg cca aag aga ctc aac ttc aag 3063
Leu Ile Asn Asn Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys
305 310 315
ctg ttc aac atc cag gtc aag gag gtt acg acg aac gaa ggc acc aag 3111
Leu Phe Asn Ile Gln Val Lys Glu Val Thr Thr Asn Glu Gly Thr Lys
320 325 330
acc atc gcc aat aac ctt acc agc acc gtc cag gtc ttt acg gac tcg 3159
Thr Ile Ala Asn Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser
335 340 345
gag tac cag cta ccg tac gtc cta ggc tct gcc cac caa gga tgc ctg 3207
Glu Tyr Gln Leu Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu
350 355 360
cca ccg ttt cct gca gac gtc ttc atg gtt cct cag tac ggc tac ctg 3255
Pro Pro Phe Pro Ala Asp Val Phe Met Val Pro Gln Tyr Gly Tyr Leu
365 370 375 380
acg ctc aac aat gga agt caa gcg tta gga cgt tct tct ttc tac tgt 3303
Thr Leu Asn Asn Gly Ser Gln Ala Leu Gly Arg Ser Ser Phe Tyr Cys
385 390 395
ctg gaa tac ttc cct tct cag atg ctg aga acc ggc aac aac ttt cag 3351
Leu Glu Tyr Phe Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln
400 405 410
ttc agc tac act ttc gag gac gtg cct ttc cac agc agc tac gca cac 3399
Phe Ser Tyr Thr Phe Glu Asp Val Pro Phe His Ser Ser Tyr Ala His
415 420 425
agc cag agt cta gat cga ctg atg aac ccc ctc atc gac cag tac cta 3447
Ser Gln Ser Leu Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu
430 435 440
tac tac ctg gtc aga aca cag aca act gga act ggg gga act caa act 3495
Tyr Tyr Leu Val Arg Thr Gln Thr Thr Gly Thr Gly Gly Thr Gln Thr
445 450 455 460
ttg gca ttc agc caa gca ggc cct agc tca atg gcc aat cag gct aga 3543
Leu Ala Phe Ser Gln Ala Gly Pro Ser Ser Met Ala Asn Gln Ala Arg
465 470 475
aac tgg gta ccc ggg cct tgc tac cgt cag cag cgc gtc tcc aca acc 3591
Asn Trp Val Pro Gly Pro Cys Tyr Arg Gln Gln Arg Val Ser Thr Thr
480 485 490
acc aac caa aat aac aac agc aac ttt gcg tgg acg gga gct gct aaa 3639
Thr Asn Gln Asn Asn Asn Ser Asn Phe Ala Trp Thr Gly Ala Ala Lys
495 500 505
ttc aag ctg aac ggg aga gac tcg cta atg aat cct ggc gtg gct atg 3687
Phe Lys Leu Asn Gly Arg Asp Ser Leu Met Asn Pro Gly Val Ala Met
510 515 520
gca tcg cac aaa gac gac gag gac cgc ttc ttt cca tca agt ggc gtt 3735
Ala Ser His Lys Asp Asp Glu Asp Arg Phe Phe Pro Ser Ser Gly Val
525 530 535 540
ctc ata ttt ggc aag caa gga gcc ggg aac gat gga gtc gac tac agc 3783
Leu Ile Phe Gly Lys Gln Gly Ala Gly Asn Asp Gly Val Asp Tyr Ser
545 550 555
cag gtg ctg att aca gat gag gaa gaa att aaa gcc acc aac cct gta 3831
Gln Val Leu Ile Thr Asp Glu Glu Glu Ile Lys Ala Thr Asn Pro Val
560 565 570
gcc aca gag gaa tac gga gca gtg gcc atc aac aac cag gcc gct aac 3879
Ala Thr Glu Glu Tyr Gly Ala Val Ala Ile Asn Asn Gln Ala Ala Asn
575 580 585
acg cag gcg caa act gga ctt gtg cat aac cag gga gtt att cct ggt 3927
Thr Gln Ala Gln Thr Gly Leu Val His Asn Gln Gly Val Ile Pro Gly
590 595 600
atg gtc tgg cag aac cgg gac gtg tac ctg cag ggc cct att tgg gct 3975
Met Val Trp Gln Asn Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala
605 610 615 620
aaa ata cct cac aca gat ggc aac ttt cac ccg tct cct ctg atg ggt 4023
Lys Ile Pro His Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly
625 630 635
gga ttt gga ctg aaa cac cca cct cca cag att cta att aaa aat aca 4071
Gly Phe Gly Leu Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr
640 645 650
cca gtg ccg gca gat cct cct ctt acc ttc aat caa gcc aag ctg aac 4119
Pro Val Pro Ala Asp Pro Pro Leu Thr Phe Asn Gln Ala Lys Leu Asn
655 660 665
tct ttc atc acg cag tac agc acg gga caa gtc agc gtg gaa atc gag 4167
Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu
670 675 680
tgg gag ctg cag aaa gaa aac agc aag cgc tgg aat cca gag atc cag 4215
Trp Glu Leu Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln
685 690 695 700
tat act tca aac tac tac aaa tct aca aat gtg gac ttt gct gtc aat 4263
Tyr Thr Ser Asn Tyr Tyr Lys Ser Thr Asn Val Asp Phe Ala Val Asn
705 710 715
acc gaa ggt gtt tac tct gag cct cgc ccc att ggt act cgt tac ctc 4311
Thr Glu Gly Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu
720 725 730
acc cgt aat ttg taa ttgcctgtta atcaataaac cggttaattc gtttcagttg 4366
Thr Arg Asn Leu
735
aactttggtc tctgcg 4382
<210> 71
<211> 736
<212> PRT
<213> adeno-associated virus 9
<400> 71
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Asp Leu Lys Pro Gly Ala Pro Lys Pro
20 25 30
Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ser Gly Ile Gly
145 150 155 160
Lys Ser Gly Gln Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr
165 170 175
Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro Pro
180 185 190
Glu Ala Pro Ser Gly Leu Gly Pro Asn Thr Met Ala Ser Gly Gly Gly
195 200 205
Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Asn Ser
210 215 220
Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val Ile
225 230 235 240
Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu
245 250 255
Tyr Lys Gln Ile Ser Asn Gly Thr Ser Gly Gly Ser Thr Asn Asp Asn
260 265 270
Thr Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg
275 280 285
Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn
290 295 300
Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile
305 310 315 320
Gln Val Lys Glu Val Thr Thr Asn Glu Gly Thr Lys Thr Ile Ala Asn
325 330 335
Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Glu Tyr Gln Leu
340 345 350
Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe Pro
355 360 365
Ala Asp Val Phe Met Val Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asn
370 375 380
Gly Ser Gln Ala Leu Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe
385 390 395 400
Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Thr
405 410 415
Phe Glu Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu
420 425 430
Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Val
435 440 445
Arg Thr Gln Thr Thr Gly Thr Gly Gly Thr Gln Thr Leu Ala Phe Ser
450 455 460
Gln Ala Gly Pro Ser Ser Met Ala Asn Gln Ala Arg Asn Trp Val Pro
465 470 475 480
Gly Pro Cys Tyr Arg Gln Gln Arg Val Ser Thr Thr Thr Asn Gln Asn
485 490 495
Asn Asn Ser Asn Phe Ala Trp Thr Gly Ala Ala Lys Phe Lys Leu Asn
500 505 510
Gly Arg Asp Ser Leu Met Asn Pro Gly Val Ala Met Ala Ser His Lys
515 520 525
Asp Asp Glu Asp Arg Phe Phe Pro Ser Ser Gly Val Leu Ile Phe Gly
530 535 540
Lys Gln Gly Ala Gly Asn Asp Gly Val Asp Tyr Ser Gln Val Leu Ile
545 550 555 560
Thr Asp Glu Glu Glu Ile Lys Ala Thr Asn Pro Val Ala Thr Glu Glu
565 570 575
Tyr Gly Ala Val Ala Ile Asn Asn Gln Ala Ala Asn Thr Gln Ala Gln
580 585 590
Thr Gly Leu Val His Asn Gln Gly Val Ile Pro Gly Met Val Trp Gln
595 600 605
Asn Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His
610 615 620
Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Leu
625 630 635 640
Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala
645 650 655
Asp Pro Pro Leu Thr Phe Asn Gln Ala Lys Leu Asn Ser Phe Ile Thr
660 665 670
Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln
675 680 685
Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn
690 695 700
Tyr Tyr Lys Ser Thr Asn Val Asp Phe Ala Val Asn Thr Glu Gly Val
705 710 715 720
Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu
725 730 735
<210> 72
<211> 2211
<212> DNA
<213> adeno-associated virus AAV9
<220>
<221> CDS
<222> (1)..(2211)
<223> coding sequence for VP1
<220>
<221> misc_feature
<222> (412)..(2211)
<223> coding sequence for VP2
<220>
<221> misc_feature
<222> (607)..(2211)
<223> coding sequence for VP3
<400> 72
atg gct gcc gat ggt tat ctt cca gat tgg ctc gag gac aac ctt agt 48
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
gaa gga att cgc gag tgg tgg gct ttg aaa cct gga gcc cct caa ccc 96
Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro
20 25 30
aag gca aat caa caa cat caa gac aac gct cga ggt ctt gtg ctt ccg 144
Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro
35 40 45
ggt tac aaa tac ctt gga ccc ggc aac gga ctc gac aag ggg gag ccg 192
Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
gtc aac gca gca gac gcg gcg gcc ctc gag cac gac aag gcc tac gac 240
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
cag cag ctc aag gcc gga gac aac ccg tac ctc aag tac aac cac gcc 288
Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala
85 90 95
gac gcc gag ttc cag gag cgg ctc aaa gaa gat acg tct ttt ggg ggc 336
Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly
100 105 110
aac ctc ggg cga gca gtc ttc cag gcc aaa aag agg ctt ctt gaa cct 384
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro
115 120 125
ctt ggt ctg gtt gag gaa gcg gct aag acg gct cct gga aag aag agg 432
Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
cct gta gag cag tct cct cag gaa ccg gac tcc tcc gcg ggt att ggc 480
Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly
145 150 155 160
aaa tcg ggt gca cag ccc gct aaa aag aga ctc aat ttc ggt cag act 528
Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr
165 170 175
ggc gac aca gag tca gtc cca gac cct caa cca atc gga gaa cct ccc 576
Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro
180 185 190
gca gcc ccc tca ggt gtg gga tct ctt aca atg gct tca ggt ggt ggc 624
Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly
195 200 205
gca cca gtg gca gac aat aac gaa ggt gcc gat gga gtg ggt agt tcc 672
Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser
210 215 220
tcg gga aat tgg cat tgc gat tcc caa tgg ctg ggg gac aga gtc atc 720
Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile
225 230 235 240
acc acc agc acc cga acc tgg gcc ctg ccc acc tac aac aat cac ctc 768
Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu
245 250 255
tac aag caa atc tcc aac agc aca tct gga gga tct tca aat gac aac 816
Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn
260 265 270
gcc tac ttc ggc tac agc acc ccc tgg ggg tat ttt gac ttc aac aga 864
Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg
275 280 285
ttc cac tgc cac ttc tca cca cgt gac tgg cag cga ctc atc aac aac 912
Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn
290 295 300
aac tgg gga ttc cgg cct aag cga ctc aac ttc aag ctc ttc aac att 960
Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile
305 310 315 320
cag gtc aaa gag gtt acg gac aac aat gga gtc aag acc atc gcc aat 1008
Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn
325 330 335
aac ctt acc agc acg gtc cag gtc ttc acg gac tca gac tat cag ctc 1056
Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu
340 345 350
ccg tac gtg ctc ggg tcg gct cac gag ggc tgc ctc ccg ccg ttc cca 1104
Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro
355 360 365
gcg gac gtt ttc atg att cct cag tac ggg tat ctg acg ctt aat gat 1152
Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp
370 375 380
gga agc cag gcc gtg ggt cgt tcg tcc ttt tac tgc ctg gaa tat ttc 1200
Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe
385 390 395 400
ccg tcg caa atg cta aga acg ggt aac aac ttc cag ttc agc tac gag 1248
Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu
405 410 415
ttt gag aac gta cct ttc cat agc agc tac gct cac agc caa agc ctg 1296
Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu
420 425 430
gac cga cta atg aat cca ctc atc gac caa tac ttg tac tat ctc tca 1344
Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser
435 440 445
aag act att aac ggt tct gga cag aat caa caa acg cta aaa ttc agt 1392
Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser
450 455 460
gtg gcc gga ccc agc aac atg gct gtc cag gga aga aac tac ata cct 1440
Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro
465 470 475 480
gga ccc agc tac cga caa caa cgt gtc tca acc act gtg act caa aac 1488
Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn
485 490 495
aac aac agc gaa ttt gct tgg cct gga gct tct tct tgg gct ctc aat 1536
Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn
500 505 510
gga cgt aat agc ttg atg aat cct gga cct gct atg gcc agc cac aaa 1584
Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys
515 520 525
gaa gga gag gac cgt ttc ttt cct ttg tct gga tct tta att ttt ggc 1632
Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly
530 535 540
aaa caa gga act gga aga gac aac gtg gat gcg gac aaa gtc atg ata 1680
Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile
545 550 555 560
acc aac gaa gaa gaa att aaa act act aac ccg gta gca acg gag tcc 1728
Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser
565 570 575
tat gga caa gtg gcc aca aac cac cag agt gcc caa gca cag gcg cag 1776
Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ala Gln Ala Gln
580 585 590
acc ggc tgg gtt caa aac caa gga ata ctt ccg ggt atg gtt tgg cag 1824
Thr Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln
595 600 605
gac aga gat gtg tac ctg caa gga ccc att tgg gcc aaa att cct cac 1872
Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His
610 615 620
acg gac ggc aac ttt cac cct tct ccg ctg atg gga ggg ttt gga atg 1920
Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met
625 630 635 640
aag cac ccg cct cct cag atc ctc atc aaa aac aca cct gta cct gcg 1968
Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala
645 650 655
gat cct cca acg gcc ttc aac aag gac aag ctg aac tct ttc atc acc 2016
Asp Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr
660 665 670
cag tat tct act ggc caa gtc agc gtg gag atc gag tgg gag ctg cag 2064
Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln
675 680 685
aag gaa aac agc aag cgc tgg aac ccg gag atc cag tac act tcc aac 2112
Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn
690 695 700
tat tac aag tct aat aat gtt gaa ttt gct gtt aat act gaa ggt gta 2160
Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val
705 710 715 720
tat agt gaa ccc cgc ccc att ggc acc aga tac ctg act cgt aat ctg 2208
Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu
725 730 735
taa 2211
<210> 73
<211> 736
<212> PRT
<213> adeno-associated virus AAV9
<400> 73
Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser
1 5 10 15
Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro
20 25 30
Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro
35 40 45
Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro
50 55 60
Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp
65 70 75 80
Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala
85 90 95
Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly
100 105 110
Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro
115 120 125
Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg
130 135 140
Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly
145 150 155 160
Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr
165 170 175
Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro
180 185 190
Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly
195 200 205
Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser
210 215 220
Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile
225 230 235 240
Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu
245 250 255
Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn
260 265 270
Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg
275 280 285
Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn
290 295 300
Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile
305 310 315 320
Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn
325 330 335
Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu
340 345 350
Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro
355 360 365
Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp
370 375 380
Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe
385 390 395 400
Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu
405 410 415
Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu
420 425 430
Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser
435 440 445
Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser
450 455 460
Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro
465 470 475 480
Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn
485 490 495
Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn
500 505 510
Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys
515 520 525
Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly
530 535 540
Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile
545 550 555 560
Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser
565 570 575
Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ala Gln Ala Gln
580 585 590
Thr Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln
595 600 605
Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His
610 615 620
Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met
625 630 635 640
Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala
645 650 655
Asp Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr
660 665 670
Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln
675 680 685
Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn
690 695 700
Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val
705 710 715 720
Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu
725 730 735
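As an illustrative cross-check of the listing above (not part of the patent text), the final codon row of the nucleotide listing that precedes SEQ ID NO: 73 can be translated with the standard genetic code and compared with the residue row printed beneath it. The sketch below is a minimal example; the helper name `translate_row` and the use of `Ter` for a stop codon are assumptions made for illustration.

```python
# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter residue names; "Ter" marks a stop codon
# (an illustrative choice, the listing itself prints no residue for "taa").
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_row(row: str) -> str:
    """Translate a whitespace-separated row of DNA codons to residue names."""
    codons = row.upper().split()
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


# Final codon row of the nucleotide listing, copied from the text above.
last_row = "tat agt gaa ccc cgc ccc att ggc acc aga tac ctg act cgt aat ctg"
translated = translate_row(last_row)

# 736 residues plus one stop codon should account for every nucleotide.
expected_length = 736 * 3 + 3

print(translated)
print(expected_length)
```

The same arithmetic explains the running totals in the listing: 736 codons end at nucleotide 2208, and the `taa` stop codon brings the total to 2211.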
Claims (22)
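- A nucleic acid molecule having a nucleotide sequence comprising an open reading frame, wherein the reading frame comprises, in 5' to 3' order:
(i) a first codon, which is a suboptimal translation initiation codon selected from the group consisting of CTG and ACG;
(ii) a second codon encoding alanine; and
(iii) a sequence encoding an adeno-associated virus (AAV) capsid protein, wherein the sequence lacks only the VP1 translation initiation codon,
wherein the AAV capsid protein is an AAV serotype 5 capsid protein.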
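- The nucleic acid molecule according to claim 1,
wherein the reading frame further comprises, in 5' to 3' order following the second codon, one or more codons encoding additional amino acid residues.
- The nucleic acid molecule according to claim 1,
wherein the capsid protein has the amino acid sequence of SEQ ID NO: 22.
- The nucleic acid molecule according to claim 1,
wherein the second codon is selected from the group consisting of GCT, GCC, GCA and GCG, or wherein the codon is GCT.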
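- A nucleic acid construct comprising the nucleic acid molecule according to claim 1, wherein the nucleotide sequence of the reading frame encoding the adeno-associated virus (AAV) capsid protein is operably linked to expression control sequences for expression in an insect cell.
- The nucleic acid construct according to claim 5,
wherein the nucleotide sequence of the reading frame is operably linked to a promoter selected from the group consisting of a polyhedron promoter, a p10 promoter, a 4xHsp27 EcRE+minimal Hsp70 promoter, a deltaE1 promoter, and an E1 promoter.
- The nucleic acid construct according to claim 5,
wherein the construct is an insect-compatible vector or a baculoviral vector.
- The nucleic acid construct according to claim 5,
wherein the nucleic acid molecule comprises an open reading frame selected from the group consisting of SEQ ID NO: 51, 69, 42, 47 and 48, or the open reading frame of SEQ ID NO: 51.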
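- An insect cell comprising the nucleic acid construct according to claim 5.
- The insect cell according to claim 9,
wherein the insect cell further comprises:
(a) a second nucleotide sequence comprising at least one AAV inverted terminal repeat (ITR) nucleotide sequence; and
(b) a third nucleotide sequence comprising a Rep78 or Rep68 coding sequence operably linked to expression control sequences for expression in an insect cell.
- The insect cell according to claim 10,
wherein the insect cell further comprises:
(c) a fourth nucleotide sequence comprising a Rep52 or Rep40 coding sequence operably linked to expression control sequences for expression in an insect cell.
- The insect cell according to claim 10,
wherein the insect cell comprises:
(a) a first nucleic acid construct according to any one of claims 5 to 8, wherein the first nucleic acid construct further comprises the third nucleotide sequence as defined in (b) of claim 10 and the fourth nucleotide sequence as defined in (c) of claim 11; and
(b) a second nucleic acid construct comprising the second nucleotide sequence as defined in (a) of claim 10, or wherein the second nucleic acid construct is an insect cell-compatible vector or a baculoviral vector.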
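- The insect cell according to claim 10,
wherein the second nucleotide sequence further comprises at least one nucleotide sequence encoding a gene product of interest for expression in a mammalian cell, and wherein the at least one nucleotide sequence encoding the gene product of interest becomes incorporated into the genome of the AAV serotype 5 produced in the insect cell.
- The insect cell according to claim 13,
wherein the second nucleotide sequence comprises two AAV ITR nucleotide sequences, and wherein the at least one nucleotide sequence encoding the gene product of interest is located between the two AAV ITR nucleotide sequences.
- The insect cell according to claim 10,
wherein the first nucleotide sequence, the second nucleotide sequence and the third nucleotide sequence are stably integrated into the genome of the insect cell.
- The insect cell according to claim 11,
wherein the fourth nucleotide sequence is stably integrated into the genome of the insect cell.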
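- An AAV virion comprising in its genome at least one nucleotide sequence encoding a gene product of interest, wherein the at least one nucleotide sequence is not a native AAV nucleotide sequence, and wherein the AAV VP1 capsid protein comprises, from N-terminus to C-terminus:
(i) a first amino acid residue encoded by a translation initiation codon, or by a suboptimal translation initiation codon selected from the group consisting of CTG and ACG;
(ii) a second amino acid residue, which is alanine; and
(iii) the amino acid sequence of an AAV serotype 5 VP1 capsid protein, wherein the sequence lacks only the amino acid residue encoded by the VP1 translation initiation codon.
- The AAV virion according to claim 17,
wherein the AAV VP1 capsid protein further comprises, from N-terminus to C-terminus,
one or more additional amino acid residues following the second amino acid residue.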
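- A method for producing an AAV in an insect cell, comprising the step of (a) culturing the insect cell as defined in claim 9 under conditions such that AAV is produced.
- The method for producing an AAV in an insect cell according to claim 19,
further comprising the step of (b) recovering the AAV.
- The AAV virion according to claim 17,
wherein the gene product of interest is a Factor IX or a Factor VIII protein.
- (deleted)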
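The codon layout recited in claims 1, 2 and 4 can be illustrated with a short sketch. This is not taken from the patent: the function name `has_claimed_layout` and the toy sequences are hypothetical, the real SEQ ID NO sequences are not reproduced, and the check only inspects the order of codons (suboptimal start codon, alanine codon, optional extra codons, then the capsid coding sequence without its own VP1 start codon).

```python
# Codon sets named in the claims.
SUBOPTIMAL_STARTS = {"CTG", "ACG"}
ALANINE_CODONS = {"GCT", "GCC", "GCA", "GCG"}


def split_codons(seq: str) -> list[str]:
    """Split a DNA string into complete codons, ignoring trailing bases."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def has_claimed_layout(orf: str, native_cds: str) -> bool:
    """Check the 5'-to-3' codon order described in the claims.

    orf        -- candidate open reading frame (hypothetical input)
    native_cds -- capsid coding sequence including its own VP1 start codon
    """
    orf_codons = split_codons(orf)
    if len(orf_codons) < 2:
        return False
    if orf_codons[0] not in SUBOPTIMAL_STARTS:
        return False
    if orf_codons[1] not in ALANINE_CODONS:
        return False
    # Drop only the native VP1 start codon from the capsid sequence.
    tail = split_codons(native_cds)[1:]
    # Extra codons may sit between the alanine codon and the capsid
    # sequence, so the capsid codons only need to close the frame.
    rest = orf_codons[2:]
    return len(rest) >= len(tail) and rest[len(rest) - len(tail):] == tail


# Toy capsid coding sequence (hypothetical, five codons including the stop).
toy_native = "ATGTCTTTTGTTTAA"

minimal_orf = "CTG" + "GCT" + toy_native[3:]
extended_orf = "ACG" + "GCG" + "AAA" + toy_native[3:]
wrong_start = "ATG" + "GCT" + toy_native[3:]

print(has_claimed_layout(minimal_orf, toy_native))
print(has_claimed_layout(extended_orf, toy_native))
print(has_claimed_layout(wrong_start, toy_native))
```

The check is purely structural; it says nothing about expression levels or VP1:VP2:VP3 ratios, which the claims do not quantify.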
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP14158610.7 | 2014-03-10 | ||
EP14158610 | 2014-03-10 | ||
PCT/NL2015/050149 WO2015137802A1 (en) | 2014-03-10 | 2015-03-10 | Further improved aav vectors produced in insect cells |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20160131032A (ko) | 2016-11-15 |
KR102572449B1 (ko) | 2023-08-31 |
Family
ID=50238253
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020167026098A KR102572449B1 (ko) | 2014-03-10 | 2015-03-10 | Further improved AAV vectors produced in insect cells |
Country Status (17)
Country | Link |
---|---|
US (2) | US10837027B2 (ko) |
EP (2) | EP3117005B1 (ko) |
JP (2) | JP6683397B2 (ko) |
KR (1) | KR102572449B1 (ko) |
CN (1) | CN106459984B (ko) |
AU (1) | AU2015230094B2 (ko) |
BR (1) | BR112016020783A2 (ko) |
CA (1) | CA2942289C (ko) |
DK (1) | DK3117005T3 (ko) |
EA (1) | EA201691809A1 (ko) |
FI (1) | FI3117005T3 (ko) |
IL (1) | IL247729B2 (ko) |
MX (1) | MX2016011585A (ko) |
PT (1) | PT3117005T (ko) |
UA (1) | UA120923C2 (ko) |
WO (1) | WO2015137802A1 (ko) |
ZA (1) | ZA201606552B (ko) |
Families Citing this family (59)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3151866B1 (en) | 2014-06-09 | 2023-03-08 | Voyager Therapeutics, Inc. | Chimeric capsids |
SG10202007103TA (en) | 2014-11-05 | 2020-09-29 | Voyager Therapeutics Inc | Aadc polynucleotides for the treatment of parkinson's disease |
CN114717264A (zh) | 2014-11-14 | 2022-07-08 | 沃雅戈治疗公司 | Compositions and methods of treating amyotrophic lateral sclerosis (ALS) |
DK3218386T3 (da) | 2014-11-14 | 2021-06-07 | Voyager Therapeutics Inc | Modulatory polynucleotide |
EP3230441A4 (en) | 2014-12-12 | 2018-10-03 | Voyager Therapeutics, Inc. | Compositions and methods for the production of scaav |
CN109311932B (zh) | 2016-04-16 | 2022-05-03 | 佛罗里达大学研究基金会有限公司 | Methods of enhancing biological potency of baculovirus system-produced recombinant adeno-associated virus |
US11299751B2 (en) | 2016-04-29 | 2022-04-12 | Voyager Therapeutics, Inc. | Compositions for the treatment of disease |
WO2017189959A1 (en) | 2016-04-29 | 2017-11-02 | Voyager Therapeutics, Inc. | Compositions for the treatment of disease |
RU2758488C2 (ru) | 2016-05-18 | 2021-10-28 | Вояджер Терапьютикс, Инк. | Modulatory polynucleotides |
BR112018073472A2 (pt) | 2016-05-18 | 2019-08-27 | Voyager Therapeutics Inc | Compositions and methods of treating Huntington's disease |
IL312792A (en) | 2016-08-15 | 2024-07-01 | Genzyme Corp | Methods for determining the serotype and heterogeneity of a viral particle and for improving the stability of RAAV particles and of AAV capsid protein variants, and the compositions containing them |
AU2017315679B2 (en) | 2016-08-23 | 2023-12-14 | Akouos, Inc. | Compositions and methods for treating non-age-associated hearing impairment in a human subject |
CA3035522A1 (en) | 2016-08-30 | 2018-03-08 | The Regents Of The University Of California | Methods for biomedical targeting and delivery and devices and systems for practicing the same |
CN111108198A (zh) | 2017-05-05 | 2020-05-05 | 沃雅戈治疗公司 | Compositions and methods of treating Huntington's disease |
SG11201909870SA (en) | 2017-05-05 | 2019-11-28 | Voyager Therapeutics Inc | Compositions and methods of treating amyotrophic lateral sclerosis (als) |
JOP20190269A1 (ar) | 2017-06-15 | 2019-11-20 | Voyager Therapeutics Inc | AADC polynucleotides for the treatment of Parkinson's disease |
EP3652326B1 (en) | 2017-07-10 | 2024-10-09 | uniQure IP B.V. | Means and methods for aav gene therapy in humans |
WO2019018342A1 (en) | 2017-07-17 | 2019-01-24 | Voyager Therapeutics, Inc. | NETWORK EQUIPMENT TRACK GUIDE SYSTEM |
EP3655538A1 (en) | 2017-07-20 | 2020-05-27 | uniQure IP B.V. | Improved aav capsid production in insect cells |
MX2020001187A (es) | 2017-08-03 | 2020-10-05 | Voyager Therapeutics Inc | Compositions and methods for the delivery of adeno-associated viruses |
WO2019079242A1 (en) | 2017-10-16 | 2019-04-25 | Voyager Therapeutics, Inc. | TREATMENT OF AMYOTROPHIC LATERAL SCLEROSIS (ALS) |
TW202413649A (zh) | 2017-10-16 | 2024-04-01 | 美商航海家醫療公司 | Treatment of amyotrophic lateral sclerosis (ALS) |
AU2018394287B2 (en) * | 2017-12-29 | 2024-09-26 | Uniqure Ip B.V. | Modified viral vectors and methods of making and using the same |
GB201800903D0 (en) | 2018-01-19 | 2018-03-07 | Oxford Genetics Ltd | Vectors |
EP3807404A1 (en) | 2018-06-13 | 2021-04-21 | Voyager Therapeutics, Inc. | Engineered 5' untranslated regions (5' utr) for aav production |
US11702673B2 (en) | 2018-10-18 | 2023-07-18 | University Of Florida Research Foundation, Incorporated | Methods of enhancing biological potency of baculovirus system-produced recombinant adeno-associated virus |
WO2020104469A1 (en) | 2018-11-19 | 2020-05-28 | Uniqure Ip B.V. | Method and means to deliver mirna to target cells |
WO2020104480A1 (en) | 2018-11-19 | 2020-05-28 | Uniqure Biopharma B.V. | Adeno-associated virus vectors for expressing fviii mimetics and uses thereof |
CA3119721A1 (en) | 2018-11-19 | 2020-05-28 | Uniqure Ip B.V. | A companion diagnostic to monitor the effects of gene therapy |
US20220064671A1 (en) | 2019-01-18 | 2022-03-03 | Voyager Therapeutics, Inc. | Methods and systems for producing aav particles |
WO2020168222A1 (en) * | 2019-02-15 | 2020-08-20 | Generation Bio Co. | Modulation of rep protein activity in closed-ended dna (cedna) production |
CN111084888B (zh) * | 2019-03-15 | 2021-12-28 | 北京锦篮基因科技有限公司 | Gene drug for treating severe hypertriglyceridemia |
BR112021017603A2 (pt) * | 2019-04-24 | 2021-11-16 | Takara Bio Inc | Mutant adeno-associated virus (AAV) having a brain-targeting property |
WO2020223236A1 (en) * | 2019-04-29 | 2020-11-05 | The Trustees Of The University Of Pennsylvania | Novel aav capsids and compositions containing same |
US20220241430A1 (en) * | 2019-05-24 | 2022-08-04 | Regeneron Pharmaceuticals, Inc. | Modified viral particles and uses thereof |
CN114072510A (zh) * | 2019-06-27 | 2022-02-18 | X-化学有限公司 | Recombinant transfer vectors for protein expression in insect cells and mammalian cells |
US10557149B1 (en) * | 2019-07-15 | 2020-02-11 | Vigene Biosciences, Inc. | Recombinantly-modified adeno-associated virus helper vectors and their use to improve the packaging efficiency of recombinantly-modified adeno-associated virus |
WO2021053018A1 (en) | 2019-09-16 | 2021-03-25 | Uniqure Ip B.V. | Targeting misspliced transcripts in genetic disorders |
WO2021168362A1 (en) | 2020-02-21 | 2021-08-26 | Akouos, Inc. | Compositions and methods for treating non-age-associated hearing impairment in a human subject |
WO2021195491A2 (en) * | 2020-03-26 | 2021-09-30 | Asklepios Biopharmaceutical, Inc. | Inducible promoter for viral vector production |
JP2023519138A (ja) | 2020-04-02 | 2023-05-10 | ユニキュアー バイオファーマ ビー.ブイ. | Novel cell line |
CN115997006A (zh) | 2020-04-02 | 2023-04-21 | 优尼科生物制药有限公司 | Dual bifunctional vectors for AAV production |
EP4305157A1 (en) | 2021-03-09 | 2024-01-17 | Huidagene Therapeutics (Singapore) Pte. Ltd. | Engineered crispr/cas13 system and uses thereof |
EP4314258A1 (en) | 2021-04-02 | 2024-02-07 | uniQure biopharma B.V. | Methods for producing single insect cell clones |
AU2022285138A1 (en) | 2021-06-02 | 2023-11-30 | Uniqure Biopharma B.V. | Adeno-associated virus vectors modified to bind high-density lipoprotein |
WO2022253955A2 (en) | 2021-06-02 | 2022-12-08 | Uniqure Biopharma B.V. | Insect cell production of parvoviral vectors with modified capsid proteins |
WO2022268811A1 (en) | 2021-06-21 | 2022-12-29 | Uniqure Biopharma B.V. | Improved lysis procedures |
WO2023283962A1 (en) | 2021-07-16 | 2023-01-19 | Huigene Therapeutics Co., Ltd. | Modified aav capsid for gene therapy and methods thereof |
EP4392434A1 (en) | 2021-08-26 | 2024-07-03 | uniQure biopharma B.V. | Insect cell-produced high potency aav vectors with cns-tropism |
CN114703203B (zh) * | 2022-02-11 | 2024-08-06 | 上海渤因生物科技有限公司 | Baculovirus vector and use thereof |
WO2023198745A1 (en) | 2022-04-12 | 2023-10-19 | Uniqure Biopharma B.V. | Nucleic acid regulation of apoe |
WO2023198702A1 (en) | 2022-04-12 | 2023-10-19 | Uniqure Biopharma B.V. | Nucleic acid regulation of c9orf72 |
WO2023198662A1 (en) | 2022-04-12 | 2023-10-19 | Uniqure Biopharma B.V. | Novel systems for nucleic acid regulation |
WO2023198663A1 (en) | 2022-04-12 | 2023-10-19 | Uniqure Biopharma B.V. | Nucleic acid regulation of snca |
CN117343943A (zh) * | 2022-10-13 | 2024-01-05 | 康霖生物科技(杭州)有限公司 | Method for modifying the capsid protein-coding gene of adeno-associated virus |
US20240358820A1 (en) * | 2023-03-23 | 2024-10-31 | Carbon Biosciences, Inc. | Protoparvovirus compositions comprising a protoparvovirus variant vp1 capsid polypeptide and related methods |
WO2024196965A1 (en) * | 2023-03-23 | 2024-09-26 | Carbon Biosciences, Inc. | Parvovirus compositions and related methods for gene therapy |
WO2024218204A1 (en) | 2023-04-18 | 2024-10-24 | Uniqure Biopharma B.V. | Gene delivery vehicles comprising rna and antibodies |
WO2024218192A1 (en) | 2023-04-18 | 2024-10-24 | Uniqure Biopharma B.V. | Novel neurotropic adeno-associated virus capsids with detargeting of peripheral organs |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2003042361A3 (en) | 2001-11-09 | 2006-01-19 | Ment Of Health And Human Servi | Production of adeno-associated virus in insect cells |
WO2007046703A2 (en) * | 2005-10-20 | 2007-04-26 | Amsterdam Molecular Therapeutics B.V. | Improved aav vectors produced in insect cells |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4745051A (en) | 1983-05-27 | 1988-05-17 | The Texas A&M University System | Method for producing a recombinant baculovirus expression vector |
PT1200117E (pt) | 1999-06-24 | 2008-11-26 | Univ British Columbia | Treatment based on lipoprotein lipase variants |
US6723551B2 (en) | 2001-11-09 | 2004-04-20 | The United States Of America As Represented By The Department Of Health And Human Services | Production of adeno-associated virus in insect cells |
DK1463805T3 (en) | 2001-12-17 | 2015-01-19 | Univ Pennsylvania | Sequences of adeno-associated virus (AAV) serotype 9, vectors containing these as well as their uses |
WO2003074714A1 (en) | 2002-03-05 | 2003-09-12 | Stichting Voor De Technische Wetenschappen | Baculovirus expression system |
SI3211085T1 (sl) | 2003-09-30 | 2021-12-31 | The Trustees Of The University Of Pennsylvania | Adeno-associated virus (AAV) genotype groups, sequences, vectors containing them, and their use |
ES2381796T3 (es) | 2004-09-22 | 2012-05-31 | St. Jude Children's Research Hospital | Improved expression of Factor IX in gene therapy vectors |
JP5364903B2 (ja) * | 2006-06-21 | 2013-12-11 | ユニキュアー アイピー ビー.ブイ. | Vectors with a modified initiation codon for the translation of AAV-Rep78 useful for production of AAV in insect cells |
EP3561063A1 (en) * | 2007-07-26 | 2019-10-30 | uniQure IP B.V. | Baculoviral vectors comprising repeated coding sequences with differential codon biases |
EP2297185A1 (en) | 2008-06-17 | 2011-03-23 | Amsterdam Molecular Therapeutics (AMT) B.V. | Parvoviral capsid with incorporated gly-ala repeat region |
WO2011122950A1 (en) | 2010-04-01 | 2011-10-06 | Amsterdam Molecular Therapeutics (Amt) Ip B.V. | Monomeric duplex aav vectors |
-
2015
- 2015-03-10 PT PT157154766T patent/PT3117005T/pt unknown
- 2015-03-10 AU AU2015230094A patent/AU2015230094B2/en active Active
- 2015-03-10 DK DK15715476.6T patent/DK3117005T3/da active
- 2015-03-10 EP EP15715476.6A patent/EP3117005B1/en active Active
- 2015-03-10 CA CA2942289A patent/CA2942289C/en active Active
- 2015-03-10 FI FIEP15715476.6T patent/FI3117005T3/fi active
- 2015-03-10 CN CN201580019215.0A patent/CN106459984B/zh active Active
- 2015-03-10 EP EP24184886.0A patent/EP4450630A2/en active Pending
- 2015-03-10 IL IL247729A patent/IL247729B2/en unknown
- 2015-03-10 JP JP2016555770A patent/JP6683397B2/ja active Active
- 2015-03-10 UA UAA201609350A patent/UA120923C2/uk unknown
- 2015-03-10 EA EA201691809A patent/EA201691809A1/ru unknown
- 2015-03-10 WO PCT/NL2015/050149 patent/WO2015137802A1/en active Application Filing
- 2015-03-10 MX MX2016011585A patent/MX2016011585A/es unknown
- 2015-03-10 US US15/124,139 patent/US10837027B2/en active Active
- 2015-03-10 BR BR112016020783A patent/BR112016020783A2/pt not_active Application Discontinuation
- 2015-03-10 KR KR1020167026098A patent/KR102572449B1/ko active IP Right Grant
-
2016
- 2016-09-22 ZA ZA2016/06552A patent/ZA201606552B/en unknown
-
2020
- 2020-01-06 JP JP2020000121A patent/JP2020062045A/ja active Pending
- 2020-11-05 US US17/090,807 patent/US20210222198A1/en active Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2003042361A3 (en) | 2001-11-09 | 2006-01-19 | Ment Of Health And Human Servi | Production of adeno-associated virus in insect cells |
WO2007046703A2 (en) * | 2005-10-20 | 2007-04-26 | Amsterdam Molecular Therapeutics B.V. | Improved aav vectors produced in insect cells |
Non-Patent Citations (1)
Title |
---|
The American Society for Microbiology, (2006) Vol.80, No. 4, pp1874-1885 |
Also Published As
Publication number | Publication date |
---|---|
BR112016020783A2 (pt) | 2017-10-03 |
WO2015137802A1 (en) | 2015-09-17 |
EA201691809A1 (ru) | 2017-01-30 |
AU2015230094A9 (en) | 2016-09-22 |
KR20160131032A (ko) | 2016-11-15 |
UA120923C2 (uk) | 2020-03-10 |
JP2017510264A (ja) | 2017-04-13 |
CN106459984A (zh) | 2017-02-22 |
US20170356008A1 (en) | 2017-12-14 |
ZA201606552B (en) | 2017-11-29 |
AU2015230094A1 (en) | 2016-09-15 |
AU2015230094B2 (en) | 2021-05-27 |
EP3117005B1 (en) | 2024-07-03 |
IL247729B1 (en) | 2023-05-01 |
EP3117005A1 (en) | 2017-01-18 |
JP6683397B2 (ja) | 2020-04-22 |
PT3117005T (pt) | 2024-07-30 |
JP2020062045A (ja) | 2020-04-23 |
US10837027B2 (en) | 2020-11-17 |
CN106459984B (zh) | 2021-09-07 |
CA2942289C (en) | 2024-05-21 |
DK3117005T3 (da) | 2024-08-12 |
US20210222198A1 (en) | 2021-07-22 |
CA2942289A1 (en) | 2015-09-17 |
EP4450630A2 (en) | 2024-10-23 |
MX2016011585A (es) | 2016-11-29 |
IL247729A0 (en) | 2016-11-30 |
FI3117005T3 (fi) | 2024-08-29 |
IL247729B2 (en) | 2023-09-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102572449B1 (ko) | Further improved AAV vectors produced in insect cells | |
US20240093231A1 (en) | Aav capsid production in insect cells | |
JP5305913B2 (ja) | Improved AAV vectors produced in insect cells | |
KR101597695B1 (ko) | Baculoviral vectors comprising repeated coding sequences with differential codon biases | |
EP2035564B1 (en) | Vectors with modified initiation codon for the translation of aav-rep78 useful for production of aav in insect cells | |
KR20180019543A (ko) | Capsid | |
EP2297185A1 (en) | Parvoviral capsid with incorporated gly-ala repeat region | |
EP4392434A1 (en) | Insect cell-produced high potency aav vectors with cns-tropism | |
AU2013254897B2 (en) | Vectors with modified initiation codon for the translation of AAV-Rep78 useful for production of AAV in insect cells | |
EA042960B1 (ru) | Nucleic acid molecule, nucleic acid construct, insect cell and method for producing AAV in an insect cell |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E902 | Notification of reason for refusal | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right |