CN114539365B - Modified human papilloma virus 52 type L1 protein and application thereof
- Publication number
- CN114539365B (application CN202011351390.9A)
- Authority
- CN
- China
- Prior art keywords
- gly
- leu
- ser
- thr
- pro
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 47
- 102000004169 proteins and genes Human genes 0.000 title abstract description 20
- 241000701603 Human papillomavirus type 52 Species 0.000 title description 6
- 239000002245 particle Substances 0.000 claims abstract description 36
- 229960005486 vaccine Drugs 0.000 claims abstract description 31
- 108700042300 human papillomavirus type 52 L1 Proteins 0.000 claims abstract description 30
- 208000009608 Papillomavirus Infections Diseases 0.000 claims abstract description 11
- 201000010099 disease Diseases 0.000 claims abstract description 7
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims abstract description 7
- 239000013598 vector Substances 0.000 claims abstract description 7
- 230000002265 prevention Effects 0.000 claims abstract description 3
- 150000001413 amino acids Chemical class 0.000 claims description 60
- 210000004027 cell Anatomy 0.000 claims description 35
- 235000001014 amino acid Nutrition 0.000 claims description 27
- 241000238631 Hexapoda Species 0.000 claims description 26
- 238000000034 method Methods 0.000 claims description 17
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 claims description 16
- 108091033319 polynucleotide Proteins 0.000 claims description 9
- 239000002157 polynucleotide Substances 0.000 claims description 9
- 102000040430 polynucleotide Human genes 0.000 claims description 9
- 239000002671 adjuvant Substances 0.000 claims description 8
- 108020004705 Codon Proteins 0.000 claims description 7
- 210000004899 c-terminal region Anatomy 0.000 claims description 7
- 238000012986 modification Methods 0.000 claims description 7
- 230000004048 modification Effects 0.000 claims description 7
- 241000701447 unidentified baculovirus Species 0.000 claims description 7
- 238000006467 substitution reaction Methods 0.000 claims description 6
- 239000004220 glutamic acid Substances 0.000 claims description 5
- 230000035772 mutation Effects 0.000 claims description 5
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 claims description 4
- 235000003704 aspartic acid Nutrition 0.000 claims description 4
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 claims description 4
- 235000013922 glutamic acid Nutrition 0.000 claims description 4
- 239000000546 pharmaceutical excipient Substances 0.000 claims description 3
- 238000002360 preparation method Methods 0.000 claims description 3
- 210000005253 yeast cell Anatomy 0.000 claims description 3
- 238000004519 manufacturing process Methods 0.000 claims description 2
- 239000013612 plasmid Substances 0.000 claims description 2
- 229920000642 polymer Polymers 0.000 claims 2
- 239000000203 mixture Substances 0.000 claims 1
- 239000002773 nucleotide Substances 0.000 abstract description 87
- 125000003729 nucleotide group Chemical group 0.000 abstract description 87
- 239000012646 vaccine adjuvant Substances 0.000 abstract description 2
- 229940124931 vaccine adjuvant Drugs 0.000 abstract description 2
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 54
- 108700042769 prolyl-leucyl-glycine Proteins 0.000 description 52
- 108010050848 glycylleucine Proteins 0.000 description 47
- 230000014509 gene expression Effects 0.000 description 44
- DLZKEQQWXODGGZ-KWQFWETISA-N Tyr-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DLZKEQQWXODGGZ-KWQFWETISA-N 0.000 description 34
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 33
- 108010090894 prolylleucine Proteins 0.000 description 32
- 108010068265 aspartyltyrosine Proteins 0.000 description 31
- 108010005834 tyrosyl-alanyl-glycine Proteins 0.000 description 31
- VQPPIMUZCZCOIL-GUBZILKMSA-N Leu-Gln-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O VQPPIMUZCZCOIL-GUBZILKMSA-N 0.000 description 29
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 29
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 28
- HHQCBFGKQDMWSP-GUBZILKMSA-N Gln-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HHQCBFGKQDMWSP-GUBZILKMSA-N 0.000 description 27
- ODUQLUADRKMHOZ-JYJNAYRXSA-N Lys-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)O ODUQLUADRKMHOZ-JYJNAYRXSA-N 0.000 description 27
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 27
- 108010045383 histidyl-glycyl-glutamic acid Proteins 0.000 description 27
- 108010009298 lysylglutamic acid Proteins 0.000 description 27
- TWTPDFFBLQEBOE-IUCAKERBSA-N Gly-Leu-Gln Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O TWTPDFFBLQEBOE-IUCAKERBSA-N 0.000 description 26
- MROIJTGJGIDEEJ-RCWTZXSCSA-N Thr-Pro-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 MROIJTGJGIDEEJ-RCWTZXSCSA-N 0.000 description 26
- 108010020688 glycylhistidine Proteins 0.000 description 26
- SHAUZYVSXAMYAZ-JYJNAYRXSA-N Gln-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N SHAUZYVSXAMYAZ-JYJNAYRXSA-N 0.000 description 25
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 25
- 108010077245 asparaginyl-proline Proteins 0.000 description 25
- 238000010276 construction Methods 0.000 description 25
- YFGONBOFGGWKKY-VHSXEESVSA-N Gly-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)CN)C(=O)O YFGONBOFGGWKKY-VHSXEESVSA-N 0.000 description 24
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 24
- 108010071207 serylmethionine Proteins 0.000 description 24
- 108010038745 tryptophylglycine Proteins 0.000 description 24
- UQJUGHFKNKGHFQ-VZFHVOOUSA-N Ala-Cys-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UQJUGHFKNKGHFQ-VZFHVOOUSA-N 0.000 description 23
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 23
- NTXYXFDMIHXTHE-WDSOQIARSA-N Leu-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 NTXYXFDMIHXTHE-WDSOQIARSA-N 0.000 description 23
- VNGKMNPAENRGDC-JYJNAYRXSA-N Val-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 VNGKMNPAENRGDC-JYJNAYRXSA-N 0.000 description 23
- 108010027338 isoleucylcysteine Proteins 0.000 description 23
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 23
- 108010061238 threonyl-glycine Proteins 0.000 description 23
- WXVGISRWSYGEDK-KKUMJFAQSA-N Asn-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N WXVGISRWSYGEDK-KKUMJFAQSA-N 0.000 description 22
- ULZCYBYDTUMHNF-IUCAKERBSA-N Gly-Leu-Glu Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ULZCYBYDTUMHNF-IUCAKERBSA-N 0.000 description 22
- 108010065920 Insulin Lispro Proteins 0.000 description 22
- 241000880493 Leptailurus serval Species 0.000 description 22
- 108010017391 lysylvaline Proteins 0.000 description 22
- 108010077112 prolyl-proline Proteins 0.000 description 22
- NHJKZMDIMMTVCK-QXEWZRGKSA-N Ile-Gly-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N NHJKZMDIMMTVCK-QXEWZRGKSA-N 0.000 description 21
- YFNOUBWUIIJQHF-LPEHRKFASA-N Pro-Asp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O YFNOUBWUIIJQHF-LPEHRKFASA-N 0.000 description 21
- OFTXTCGQJXTNQS-XGEHTFHBSA-N Val-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N)O OFTXTCGQJXTNQS-XGEHTFHBSA-N 0.000 description 21
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 21
- IEWBEPKLKUXQBU-VOAKCMCISA-N Leu-Leu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IEWBEPKLKUXQBU-VOAKCMCISA-N 0.000 description 20
- 108010038320 lysylphenylalanine Proteins 0.000 description 20
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 19
- LXAARTARZJJCMB-CIQUZCHMSA-N Ala-Ile-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LXAARTARZJJCMB-CIQUZCHMSA-N 0.000 description 19
- JJQGZGOEDSSHTE-FOHZUACHSA-N Asp-Thr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JJQGZGOEDSSHTE-FOHZUACHSA-N 0.000 description 19
- 241000084490 Esenbeckia delta Species 0.000 description 19
- PABFFPWEJMEVEC-JGVFFNPUSA-N Gly-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)CN)C(=O)O PABFFPWEJMEVEC-JGVFFNPUSA-N 0.000 description 19
- PDUHNKAFQXQNLH-ZETCQYMHSA-N Gly-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)NCC(O)=O PDUHNKAFQXQNLH-ZETCQYMHSA-N 0.000 description 19
- IIKJNQWOQIWWMR-CIUDSAMLSA-N Leu-Cys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(C)C)N IIKJNQWOQIWWMR-CIUDSAMLSA-N 0.000 description 19
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 19
- QQPSCXKFDSORFT-IHRRRGAJSA-N Lys-Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN QQPSCXKFDSORFT-IHRRRGAJSA-N 0.000 description 19
- GZGWILAQHOVXTD-DCAQKATOSA-N Lys-Met-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O GZGWILAQHOVXTD-DCAQKATOSA-N 0.000 description 19
- PDIDTSZKKFEDMB-UWVGGRQHSA-N Lys-Pro-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O PDIDTSZKKFEDMB-UWVGGRQHSA-N 0.000 description 19
- GTMSCDVFQLNEOY-BZSNNMDCSA-N Phe-Tyr-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N GTMSCDVFQLNEOY-BZSNNMDCSA-N 0.000 description 19
- CGSOWZUPLOKYOR-AVGNSLFASA-N Pro-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 CGSOWZUPLOKYOR-AVGNSLFASA-N 0.000 description 19
- FDMKYQQYJKYCLV-GUBZILKMSA-N Pro-Pro-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 FDMKYQQYJKYCLV-GUBZILKMSA-N 0.000 description 19
- OYEDZGNMSBZCIM-XGEHTFHBSA-N Ser-Arg-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OYEDZGNMSBZCIM-XGEHTFHBSA-N 0.000 description 19
- SWSRFJZZMNLMLY-ZKWXMUAHSA-N Ser-Asp-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O SWSRFJZZMNLMLY-ZKWXMUAHSA-N 0.000 description 19
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 19
- RKDFEMGVMMYYNG-WDCWCFNPSA-N Thr-Gln-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O RKDFEMGVMMYYNG-WDCWCFNPSA-N 0.000 description 19
- RFKVQLIXNVEOMB-WEDXCCLWSA-N Thr-Leu-Gly Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N)O RFKVQLIXNVEOMB-WEDXCCLWSA-N 0.000 description 19
- YVXIAOOYAKBAAI-SZMVWBNQSA-N Trp-Leu-Gln Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O)=CNC2=C1 YVXIAOOYAKBAAI-SZMVWBNQSA-N 0.000 description 19
- NKUGCYDFQKFVOJ-JYJNAYRXSA-N Tyr-Leu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NKUGCYDFQKFVOJ-JYJNAYRXSA-N 0.000 description 19
- GMOLURHJBLOBFW-ONGXEEELSA-N Val-Gly-His Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N GMOLURHJBLOBFW-ONGXEEELSA-N 0.000 description 19
- 108010005233 alanylglutamic acid Proteins 0.000 description 19
- 108010029020 prolylglycine Proteins 0.000 description 19
- KXFCBAHYSLJCCY-ZLUOBGJFSA-N Asn-Asn-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O KXFCBAHYSLJCCY-ZLUOBGJFSA-N 0.000 description 18
- IDDMGSKZQDEDGA-SRVKXCTJSA-N Asp-Phe-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 IDDMGSKZQDEDGA-SRVKXCTJSA-N 0.000 description 18
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 18
- JDDYEZGPYBBPBN-JRQIVUDYSA-N Asp-Thr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JDDYEZGPYBBPBN-JRQIVUDYSA-N 0.000 description 18
- XMVZMBGFIOQONW-GARJFASQSA-N Cys-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CS)N)C(=O)O XMVZMBGFIOQONW-GARJFASQSA-N 0.000 description 18
- YNJBLTDKTMKEET-ZLUOBGJFSA-N Cys-Ser-Ser Chemical compound SC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O YNJBLTDKTMKEET-ZLUOBGJFSA-N 0.000 description 18
- JILRMFFFCHUUTJ-ACZMJKKPSA-N Gln-Ser-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O JILRMFFFCHUUTJ-ACZMJKKPSA-N 0.000 description 18
- PBFGQTGPSKWHJA-QEJZJMRPSA-N Glu-Asp-Trp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O PBFGQTGPSKWHJA-QEJZJMRPSA-N 0.000 description 18
- RQNYYRHRKSVKAB-GUBZILKMSA-N Glu-Cys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O RQNYYRHRKSVKAB-GUBZILKMSA-N 0.000 description 18
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 18
- DAHLWSFUXOHMIA-FXQIFTODSA-N Glu-Ser-Gln Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O DAHLWSFUXOHMIA-FXQIFTODSA-N 0.000 description 18
- JWNZHMSRZXXGTM-XKBZYTNZSA-N Glu-Ser-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWNZHMSRZXXGTM-XKBZYTNZSA-N 0.000 description 18
- YPHPEHMXOYTEQG-LAEOZQHASA-N Glu-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O YPHPEHMXOYTEQG-LAEOZQHASA-N 0.000 description 18
- FZQLXNIMCPJVJE-YUMQZZPRSA-N Gly-Asp-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O FZQLXNIMCPJVJE-YUMQZZPRSA-N 0.000 description 18
- AVQOSMRPITVTRB-CIUDSAMLSA-N His-Asn-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AVQOSMRPITVTRB-CIUDSAMLSA-N 0.000 description 18
- OVDKXUDMKXAZIV-ZPFDUUQYSA-N Ile-Lys-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OVDKXUDMKXAZIV-ZPFDUUQYSA-N 0.000 description 18
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 18
- LZHHZYDPMZEMRX-STQMWFEESA-N Pro-Tyr-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O LZHHZYDPMZEMRX-STQMWFEESA-N 0.000 description 18
- ZMLRZBWCXPQADC-TUAOUCFPSA-N Pro-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ZMLRZBWCXPQADC-TUAOUCFPSA-N 0.000 description 18
- MFQMZDPAZRZAPV-NAKRPEOUSA-N Ser-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CO)N MFQMZDPAZRZAPV-NAKRPEOUSA-N 0.000 description 18
- NJEMRSFGDNECGF-GCJQMDKQSA-N Thr-Ala-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O NJEMRSFGDNECGF-GCJQMDKQSA-N 0.000 description 18
- XOTBWOCSLMBGMF-SUSMZKCASA-N Thr-Glu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOTBWOCSLMBGMF-SUSMZKCASA-N 0.000 description 18
- UGFMVXRXULGLNO-XPUUQOCRSA-N Val-Ser-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O UGFMVXRXULGLNO-XPUUQOCRSA-N 0.000 description 18
- IFTVANMRTIHKML-WDSKDSINSA-N Ala-Gln-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O IFTVANMRTIHKML-WDSKDSINSA-N 0.000 description 17
- MSWSRLGNLKHDEI-ACZMJKKPSA-N Ala-Ser-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O MSWSRLGNLKHDEI-ACZMJKKPSA-N 0.000 description 17
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 17
- UXJCMQFPDWCHKX-DCAQKATOSA-N Arg-Arg-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O UXJCMQFPDWCHKX-DCAQKATOSA-N 0.000 description 17
- MSILNNHVVMMTHZ-UWVGGRQHSA-N Arg-His-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CN=CN1 MSILNNHVVMMTHZ-UWVGGRQHSA-N 0.000 description 17
- INOIAEUXVVNJKA-XGEHTFHBSA-N Arg-Thr-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O INOIAEUXVVNJKA-XGEHTFHBSA-N 0.000 description 17
- CASGONAXMZPHCK-FXQIFTODSA-N Asp-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)CN=C(N)N CASGONAXMZPHCK-FXQIFTODSA-N 0.000 description 17
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 17
- MWTGQXBHVRTCOR-GLLZPBPUSA-N Glu-Thr-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MWTGQXBHVRTCOR-GLLZPBPUSA-N 0.000 description 17
- IANBSEOVTQNGBZ-BQBZGAKWSA-N Gly-Cys-Met Chemical compound [H]NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CCSC)C(O)=O IANBSEOVTQNGBZ-BQBZGAKWSA-N 0.000 description 17
- IEGFSKKANYKBDU-QWHCGFSZSA-N Gly-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)CN)C(=O)O IEGFSKKANYKBDU-QWHCGFSZSA-N 0.000 description 17
- BGZIJZJBXRVBGJ-SXTJYALSSA-N Ile-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N BGZIJZJBXRVBGJ-SXTJYALSSA-N 0.000 description 17
- FZWVCYCYWCLQDH-NHCYSSNCSA-N Ile-Leu-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N FZWVCYCYWCLQDH-NHCYSSNCSA-N 0.000 description 17
- KBDIBHQICWDGDL-PPCPHDFISA-N Ile-Thr-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N KBDIBHQICWDGDL-PPCPHDFISA-N 0.000 description 17
- NXRNRBOKDBIVKQ-CXTHYWKRSA-N Ile-Tyr-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N NXRNRBOKDBIVKQ-CXTHYWKRSA-N 0.000 description 17
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 17
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 17
- ZWBCVBHKXHPCEI-BVSLBCMMSA-N Met-Phe-Trp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O)N ZWBCVBHKXHPCEI-BVSLBCMMSA-N 0.000 description 17
- ZENDEDYRYVHBEG-SRVKXCTJSA-N Phe-Asp-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZENDEDYRYVHBEG-SRVKXCTJSA-N 0.000 description 17
- LUGOKRWYNMDGTD-FXQIFTODSA-N Pro-Cys-Asn Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)N)C(=O)O LUGOKRWYNMDGTD-FXQIFTODSA-N 0.000 description 17
- FIODMZKLZFLYQP-GUBZILKMSA-N Pro-Val-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FIODMZKLZFLYQP-GUBZILKMSA-N 0.000 description 17
- ZIFYDQAFEMIZII-GUBZILKMSA-N Ser-Leu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZIFYDQAFEMIZII-GUBZILKMSA-N 0.000 description 17
- ADPHPKGWVDHWML-PPCPHDFISA-N Thr-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N ADPHPKGWVDHWML-PPCPHDFISA-N 0.000 description 17
- VYVBSMCZNHOZGD-RCWTZXSCSA-N Thr-Val-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O VYVBSMCZNHOZGD-RCWTZXSCSA-N 0.000 description 17
- NVZVJIUDICCMHZ-BZSNNMDCSA-N Tyr-Phe-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O NVZVJIUDICCMHZ-BZSNNMDCSA-N 0.000 description 17
- YMTOEGGOCHVGEH-IHRRRGAJSA-N Val-Lys-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O YMTOEGGOCHVGEH-IHRRRGAJSA-N 0.000 description 17
- BGXVHVMJZCSOCA-AVGNSLFASA-N Val-Pro-Lys Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N BGXVHVMJZCSOCA-AVGNSLFASA-N 0.000 description 17
- 108010013835 arginine glutamate Proteins 0.000 description 17
- 108010031719 prolyl-serine Proteins 0.000 description 17
- 108010003137 tyrosyltyrosine Proteins 0.000 description 17
- MMLHRUJLOUSRJX-CIUDSAMLSA-N Ala-Ser-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN MMLHRUJLOUSRJX-CIUDSAMLSA-N 0.000 description 16
- KWKQGHSSNHPGOW-BQBZGAKWSA-N Arg-Ala-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)NCC(O)=O KWKQGHSSNHPGOW-BQBZGAKWSA-N 0.000 description 16
- VITDJIPIJZAVGC-VEVYYDQMSA-N Asn-Met-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VITDJIPIJZAVGC-VEVYYDQMSA-N 0.000 description 16
- DJCAHYVLMSRBFR-QXEWZRGKSA-N Asp-Met-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC(O)=O DJCAHYVLMSRBFR-QXEWZRGKSA-N 0.000 description 16
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 16
- SBYVDRJAXWSXQL-AVGNSLFASA-N Glu-Asn-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SBYVDRJAXWSXQL-AVGNSLFASA-N 0.000 description 16
- CKOFNWCLWRYUHK-XHNCKOQMSA-N Glu-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N)C(=O)O CKOFNWCLWRYUHK-XHNCKOQMSA-N 0.000 description 16
- ZGKXAUIVGIBISK-SZMVWBNQSA-N Glu-His-Trp Chemical compound N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O ZGKXAUIVGIBISK-SZMVWBNQSA-N 0.000 description 16
- HBMRTXJZQDVRFT-DZKIICNBSA-N Glu-Tyr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O HBMRTXJZQDVRFT-DZKIICNBSA-N 0.000 description 16
- GGEJHJIXRBTJPD-BYPYZUCNSA-N Gly-Asn-Gly Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GGEJHJIXRBTJPD-BYPYZUCNSA-N 0.000 description 16
- SUDUYJOBLHQAMI-WHFBIAKZSA-N Gly-Asp-Cys Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CS)C(O)=O SUDUYJOBLHQAMI-WHFBIAKZSA-N 0.000 description 16
- IALQAMYQJBZNSK-WHFBIAKZSA-N Gly-Ser-Asn Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O IALQAMYQJBZNSK-WHFBIAKZSA-N 0.000 description 16
- 208000022361 Human papillomavirus infectious disease Diseases 0.000 description 16
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 16
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 16
- LUAJJLPHUXPQLH-KKUMJFAQSA-N Lys-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCCN)N LUAJJLPHUXPQLH-KKUMJFAQSA-N 0.000 description 16
- MIROMRNASYKZNL-ULQDDVLXSA-N Lys-Pro-Tyr Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 MIROMRNASYKZNL-ULQDDVLXSA-N 0.000 description 16
- BWTKUQPNOMMKMA-FIRPJDEBSA-N Phe-Ile-Phe Chemical compound C([C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 BWTKUQPNOMMKMA-FIRPJDEBSA-N 0.000 description 16
- AIOWVDNPESPXRB-YTWAJWBKSA-N Pro-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2)O AIOWVDNPESPXRB-YTWAJWBKSA-N 0.000 description 16
- RVMNUBQWPVOUKH-HEIBUPTGSA-N Thr-Ser-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMNUBQWPVOUKH-HEIBUPTGSA-N 0.000 description 16
- FNOQJVHFVLVMOS-AAEUAGOBSA-N Trp-Gly-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N FNOQJVHFVLVMOS-AAEUAGOBSA-N 0.000 description 16
- DZKFGCNKEVMXFA-JUKXBJQTSA-N Tyr-Ile-His Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O DZKFGCNKEVMXFA-JUKXBJQTSA-N 0.000 description 16
- BYAKMYBZADCNMN-JYJNAYRXSA-N Tyr-Lys-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O BYAKMYBZADCNMN-JYJNAYRXSA-N 0.000 description 16
- XJPXTYLVMUZGNW-IHRRRGAJSA-N Tyr-Pro-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O XJPXTYLVMUZGNW-IHRRRGAJSA-N 0.000 description 16
- 108010047495 alanylglycine Proteins 0.000 description 16
- 108010060199 cysteinylproline Proteins 0.000 description 16
- 230000008696 hypoxemic pulmonary vasoconstriction Effects 0.000 description 16
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 16
- 108010016686 methionyl-alanyl-serine Proteins 0.000 description 16
- 235000018102 proteins Nutrition 0.000 description 16
- JAQNUEWEJWBVAY-WBAXXEDZSA-N Ala-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 JAQNUEWEJWBVAY-WBAXXEDZSA-N 0.000 description 15
- SLQQPJBDBVPVQV-JYJNAYRXSA-N Arg-Phe-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O SLQQPJBDBVPVQV-JYJNAYRXSA-N 0.000 description 15
- LRPZJPMQGKGHSG-XGEHTFHBSA-N Arg-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCN=C(N)N)N)O LRPZJPMQGKGHSG-XGEHTFHBSA-N 0.000 description 15
- XLILXFRAKOYEJX-GUBZILKMSA-N Asp-Leu-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLILXFRAKOYEJX-GUBZILKMSA-N 0.000 description 15
- QSFHZPQUAAQHAQ-CIUDSAMLSA-N Asp-Ser-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O QSFHZPQUAAQHAQ-CIUDSAMLSA-N 0.000 description 15
- WQWMZOIPXWSZNE-WDSKDSINSA-N Gln-Asp-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O WQWMZOIPXWSZNE-WDSKDSINSA-N 0.000 description 15
- DFRYZTUPVZNRLG-KKUMJFAQSA-N Gln-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N DFRYZTUPVZNRLG-KKUMJFAQSA-N 0.000 description 15
- QBEWLBKBGXVVPD-RYUDHWBXSA-N Gln-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N QBEWLBKBGXVVPD-RYUDHWBXSA-N 0.000 description 15
- XRTDOIOIBMAXCT-NKWVEPMBSA-N Gly-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)CN)C(=O)O XRTDOIOIBMAXCT-NKWVEPMBSA-N 0.000 description 15
- LUJVWKKYHSLULQ-ZKWXMUAHSA-N Gly-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN LUJVWKKYHSLULQ-ZKWXMUAHSA-N 0.000 description 15
- HAXARWKYFIIHKD-ZKWXMUAHSA-N Gly-Ile-Ser Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HAXARWKYFIIHKD-ZKWXMUAHSA-N 0.000 description 15
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 15
- KUEVMUXNILMJTK-JYJNAYRXSA-N Leu-Gln-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KUEVMUXNILMJTK-JYJNAYRXSA-N 0.000 description 15
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 15
- NKKFVJRLCCUJNA-QWRGUYRKSA-N Lys-Gly-Lys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN NKKFVJRLCCUJNA-QWRGUYRKSA-N 0.000 description 15
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 15
- XYHMFGGWNOFUOU-QXEWZRGKSA-N Pro-Ile-Gly Chemical compound OC(=O)CNC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H]1CCCN1 XYHMFGGWNOFUOU-QXEWZRGKSA-N 0.000 description 15
- VGNYHOBZJKWRGI-CIUDSAMLSA-N Ser-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO VGNYHOBZJKWRGI-CIUDSAMLSA-N 0.000 description 15
- KJKQUQXDEKMPDK-FXQIFTODSA-N Ser-Met-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O KJKQUQXDEKMPDK-FXQIFTODSA-N 0.000 description 15
- KQNDIKOYWZTZIX-FXQIFTODSA-N Ser-Ser-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQNDIKOYWZTZIX-FXQIFTODSA-N 0.000 description 15
- WUXCHQZLUHBSDJ-LKXGYXEUSA-N Ser-Thr-Asp Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WUXCHQZLUHBSDJ-LKXGYXEUSA-N 0.000 description 15
- HOVLHEKTGVIKAP-WDCWCFNPSA-N Thr-Leu-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HOVLHEKTGVIKAP-WDCWCFNPSA-N 0.000 description 15
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 15
- AXWBYOVVDRBOGU-SIUGBPQLSA-N Tyr-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N AXWBYOVVDRBOGU-SIUGBPQLSA-N 0.000 description 15
- GITNQBVCEQBDQC-KKUMJFAQSA-N Tyr-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O GITNQBVCEQBDQC-KKUMJFAQSA-N 0.000 description 15
- XIFAHCUNWWKUDE-DCAQKATOSA-N Val-Cys-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N XIFAHCUNWWKUDE-DCAQKATOSA-N 0.000 description 15
- QPPZEDOTPZOSEC-RCWTZXSCSA-N Val-Met-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](C(C)C)N)O QPPZEDOTPZOSEC-RCWTZXSCSA-N 0.000 description 15
- 108010038633 aspartylglutamate Proteins 0.000 description 15
- 108010078144 glutaminyl-glycine Proteins 0.000 description 15
- 108010012058 leucyltyrosine Proteins 0.000 description 15
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 14
- OWSLLRKCHLTUND-BZSNNMDCSA-N Phe-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OWSLLRKCHLTUND-BZSNNMDCSA-N 0.000 description 14
- CBENHWCORLVGEQ-HJOGWXRNSA-N Phe-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 CBENHWCORLVGEQ-HJOGWXRNSA-N 0.000 description 14
- HNWQUBBOBKSFQV-AVGNSLFASA-N Val-Arg-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HNWQUBBOBKSFQV-AVGNSLFASA-N 0.000 description 14
- 108010092114 histidylphenylalanine Proteins 0.000 description 14
- 108010053037 kyotorphin Proteins 0.000 description 14
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 14
- BABSVXFGKFLIGW-UWVGGRQHSA-N Leu-Gly-Arg Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N BABSVXFGKFLIGW-UWVGGRQHSA-N 0.000 description 13
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 13
- LNMKRJJLEFASGA-BZSNNMDCSA-N Lys-Phe-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LNMKRJJLEFASGA-BZSNNMDCSA-N 0.000 description 13
- FXGIMYRVJJEIIM-UWVGGRQHSA-N Pro-Leu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 FXGIMYRVJJEIIM-UWVGGRQHSA-N 0.000 description 13
- 108010041407 alanylaspartic acid Proteins 0.000 description 13
- 108010062796 arginyllysine Proteins 0.000 description 13
- 108010015796 prolylisoleucine Proteins 0.000 description 13
- SMIDBHKWSYUBRZ-ACZMJKKPSA-N Ser-Glu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O SMIDBHKWSYUBRZ-ACZMJKKPSA-N 0.000 description 12
- BPGDJSUFQKWUBK-KJEVXHAQSA-N Thr-Val-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 BPGDJSUFQKWUBK-KJEVXHAQSA-N 0.000 description 12
- 108010093581 aspartyl-proline Proteins 0.000 description 12
- 108010057821 leucylproline Proteins 0.000 description 12
- 108010079317 prolyl-tyrosine Proteins 0.000 description 12
- OZEQPCDLCDRCGY-SOUVJXGZSA-N Gln-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CCC(=O)N)N)C(=O)O OZEQPCDLCDRCGY-SOUVJXGZSA-N 0.000 description 11
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 11
- GGXZOTSDJJTDGB-GUBZILKMSA-N Met-Ser-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O GGXZOTSDJJTDGB-GUBZILKMSA-N 0.000 description 11
- 108010066427 N-valyltryptophan Proteins 0.000 description 11
- VIWQOOBRKCGSDK-RYQLBKOJSA-N Trp-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O VIWQOOBRKCGSDK-RYQLBKOJSA-N 0.000 description 11
- OVLIFGQSBSNGHY-KKHAAJSZSA-N Val-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N)O OVLIFGQSBSNGHY-KKHAAJSZSA-N 0.000 description 11
- FVBZXNSRIDVYJS-AVGNSLFASA-N Arg-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N FVBZXNSRIDVYJS-AVGNSLFASA-N 0.000 description 10
- FJUKMPUELVROGK-IHRRRGAJSA-N Leu-Arg-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N FJUKMPUELVROGK-IHRRRGAJSA-N 0.000 description 10
- HFBCHNRFRYLZNV-GUBZILKMSA-N Leu-Glu-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HFBCHNRFRYLZNV-GUBZILKMSA-N 0.000 description 10
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 10
- MWQXFDIQXIXPMS-UNQGMJICSA-N Phe-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC1=CC=CC=C1)N)O MWQXFDIQXIXPMS-UNQGMJICSA-N 0.000 description 10
- XQLBWXHVZVBNJM-FXQIFTODSA-N Pro-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 XQLBWXHVZVBNJM-FXQIFTODSA-N 0.000 description 10
- ULIWFCCJIOEHMU-BQBZGAKWSA-N Pro-Gly-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 ULIWFCCJIOEHMU-BQBZGAKWSA-N 0.000 description 10
- BRKHVZNDAOMAHX-BIIVOSGPSA-N Ser-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N BRKHVZNDAOMAHX-BIIVOSGPSA-N 0.000 description 10
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 10
- CEXFELBFVHLYDZ-XGEHTFHBSA-N Thr-Arg-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CEXFELBFVHLYDZ-XGEHTFHBSA-N 0.000 description 10
- 230000028993 immune response Effects 0.000 description 10
- 230000003472 neutralizing effect Effects 0.000 description 10
- 230000001681 protective effect Effects 0.000 description 10
- 238000000746 purification Methods 0.000 description 10
- OEUQMKNNOWJREN-AVGNSLFASA-N Asp-Gln-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N OEUQMKNNOWJREN-AVGNSLFASA-N 0.000 description 9
- HHWQMFIGMMOVFK-WDSKDSINSA-N Gln-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O HHWQMFIGMMOVFK-WDSKDSINSA-N 0.000 description 9
- KFMBRBPXHVMDFN-UWVGGRQHSA-N Gly-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCNC(N)=N KFMBRBPXHVMDFN-UWVGGRQHSA-N 0.000 description 9
- 241000341655 Human papillomavirus type 16 Species 0.000 description 9
- FDBTVENULFNTAL-XQQFMLRXSA-N Leu-Val-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N FDBTVENULFNTAL-XQQFMLRXSA-N 0.000 description 9
- QUCDKEKDPYISNX-HJGDQZAQSA-N Lys-Asn-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QUCDKEKDPYISNX-HJGDQZAQSA-N 0.000 description 9
- 101150056860 N13 gene Proteins 0.000 description 9
- YTILBRIUASDGBL-BZSNNMDCSA-N Phe-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 YTILBRIUASDGBL-BZSNNMDCSA-N 0.000 description 9
- WWPAHTZOWURIMR-ULQDDVLXSA-N Phe-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 WWPAHTZOWURIMR-ULQDDVLXSA-N 0.000 description 9
- 108010034529 leucyl-lysine Proteins 0.000 description 9
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 8
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 8
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 8
- VNFSAYFQLXPHPY-CIQUZCHMSA-N Ala-Thr-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNFSAYFQLXPHPY-CIQUZCHMSA-N 0.000 description 8
- GXCSUJQOECMKPV-CIUDSAMLSA-N Arg-Ala-Gln Chemical compound C[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GXCSUJQOECMKPV-CIUDSAMLSA-N 0.000 description 8
- OLVIPTLKNSAYRJ-YUMQZZPRSA-N Asn-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N OLVIPTLKNSAYRJ-YUMQZZPRSA-N 0.000 description 8
- NTWOPSIUJBMNRI-KKUMJFAQSA-N Asn-Lys-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTWOPSIUJBMNRI-KKUMJFAQSA-N 0.000 description 8
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 8
- GKKUBLFXKRDMFC-BQBZGAKWSA-N Asn-Pro-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O GKKUBLFXKRDMFC-BQBZGAKWSA-N 0.000 description 8
- KVPHTGVUMJGMCX-BIIVOSGPSA-N Asp-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N)C(=O)O KVPHTGVUMJGMCX-BIIVOSGPSA-N 0.000 description 8
- WBDWQKRLTVCDSY-WHFBIAKZSA-N Asp-Gly-Asp Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O WBDWQKRLTVCDSY-WHFBIAKZSA-N 0.000 description 8
- CMBDUPIBCOEWNE-BJDJZHNGSA-N Asp-Leu-Asp-Gln Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O CMBDUPIBCOEWNE-BJDJZHNGSA-N 0.000 description 8
- YFGUZQQCSDZRBN-DCAQKATOSA-N Asp-Pro-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O YFGUZQQCSDZRBN-DCAQKATOSA-N 0.000 description 8
- DINOVZWPTMGSRF-QXEWZRGKSA-N Asp-Pro-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O DINOVZWPTMGSRF-QXEWZRGKSA-N 0.000 description 8
- QPDUWAUSSWGJSB-NGZCFLSTSA-N Asp-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N QPDUWAUSSWGJSB-NGZCFLSTSA-N 0.000 description 8
- VKAWJBQTFCBHQY-GUBZILKMSA-N Cys-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CS)N VKAWJBQTFCBHQY-GUBZILKMSA-N 0.000 description 8
- OXFOKRAFNYSREH-BJDJZHNGSA-N Cys-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CS)N OXFOKRAFNYSREH-BJDJZHNGSA-N 0.000 description 8
- JUUMIGUJJRFQQR-KKUMJFAQSA-N Cys-Lys-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CS)N)O JUUMIGUJJRFQQR-KKUMJFAQSA-N 0.000 description 8
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 8
- UUTGYDAKPISJAO-JYJNAYRXSA-N Glu-Tyr-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=C(O)C=C1 UUTGYDAKPISJAO-JYJNAYRXSA-N 0.000 description 8
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 8
- WJZLEENECIOOSA-WDSKDSINSA-N Gly-Asn-Gln Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)O WJZLEENECIOOSA-WDSKDSINSA-N 0.000 description 8
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 8
- SOEATRRYCIPEHA-BQBZGAKWSA-N Gly-Glu-Glu Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SOEATRRYCIPEHA-BQBZGAKWSA-N 0.000 description 8
- GMTXWRIDLGTVFC-IUCAKERBSA-N Gly-Lys-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMTXWRIDLGTVFC-IUCAKERBSA-N 0.000 description 8
- YABRDIBSPZONIY-BQBZGAKWSA-N Gly-Ser-Met Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O YABRDIBSPZONIY-BQBZGAKWSA-N 0.000 description 8
- VCBWXASUBZIFLQ-IHRRRGAJSA-N His-Pro-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O VCBWXASUBZIFLQ-IHRRRGAJSA-N 0.000 description 8
- JVEKQAYXFGIISZ-HOCLYGCPSA-N His-Trp-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)NCC(O)=O)C1=CN=CN1 JVEKQAYXFGIISZ-HOCLYGCPSA-N 0.000 description 8
- NBJAAWYRLGCJOF-UGYAYLCHSA-N Ile-Asp-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NBJAAWYRLGCJOF-UGYAYLCHSA-N 0.000 description 8
- ZIPOVLBRVPXWJQ-SPOWBLRKSA-N Ile-Cys-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N ZIPOVLBRVPXWJQ-SPOWBLRKSA-N 0.000 description 8
- CYHJCEKUMCNDFG-LAEOZQHASA-N Ile-Gln-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N CYHJCEKUMCNDFG-LAEOZQHASA-N 0.000 description 8
- LPFBXFILACZHIB-LAEOZQHASA-N Ile-Gly-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)O)N LPFBXFILACZHIB-LAEOZQHASA-N 0.000 description 8
- YBGTWSFIGHUWQE-MXAVVETBSA-N Ile-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CN=CN1 YBGTWSFIGHUWQE-MXAVVETBSA-N 0.000 description 8
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 8
- RKQAYOWLSFLJEE-SVSWQMSJSA-N Ile-Thr-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)O)N RKQAYOWLSFLJEE-SVSWQMSJSA-N 0.000 description 8
- POJPZSMTTMLSTG-SRVKXCTJSA-N Leu-Asn-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N POJPZSMTTMLSTG-SRVKXCTJSA-N 0.000 description 8
- NHHKSOGJYNQENP-SRVKXCTJSA-N Leu-Cys-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)O)N NHHKSOGJYNQENP-SRVKXCTJSA-N 0.000 description 8
- VPKIQULSKFVCSM-SRVKXCTJSA-N Leu-Gln-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VPKIQULSKFVCSM-SRVKXCTJSA-N 0.000 description 8
- AXZGZMGRBDQTEY-SRVKXCTJSA-N Leu-Gln-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O AXZGZMGRBDQTEY-SRVKXCTJSA-N 0.000 description 8
- LAPSXOAUPNOINL-YUMQZZPRSA-N Leu-Gly-Asp Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O LAPSXOAUPNOINL-YUMQZZPRSA-N 0.000 description 8
- FYPWFNKQVVEELI-ULQDDVLXSA-N Leu-Phe-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 FYPWFNKQVVEELI-ULQDDVLXSA-N 0.000 description 8
- AIQWYVFNBNNOLU-RHYQMDGZSA-N Leu-Thr-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O AIQWYVFNBNNOLU-RHYQMDGZSA-N 0.000 description 8
- QYOXSYXPHUHOJR-GUBZILKMSA-N Lys-Asn-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QYOXSYXPHUHOJR-GUBZILKMSA-N 0.000 description 8
- HEWWNLVEWBJBKA-WDCWCFNPSA-N Lys-Gln-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN HEWWNLVEWBJBKA-WDCWCFNPSA-N 0.000 description 8
- IMAKMJCBYCSMHM-AVGNSLFASA-N Lys-Glu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN IMAKMJCBYCSMHM-AVGNSLFASA-N 0.000 description 8
- CANPXOLVTMKURR-WEDXCCLWSA-N Lys-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN CANPXOLVTMKURR-WEDXCCLWSA-N 0.000 description 8
- NCZIQZYZPUPMKY-PPCPHDFISA-N Lys-Ile-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NCZIQZYZPUPMKY-PPCPHDFISA-N 0.000 description 8
- TWPCWKVOZDUYAA-KKUMJFAQSA-N Lys-Phe-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O TWPCWKVOZDUYAA-KKUMJFAQSA-N 0.000 description 8
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 8
- WXHHTBVYQOSYSL-FXQIFTODSA-N Met-Ala-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O WXHHTBVYQOSYSL-FXQIFTODSA-N 0.000 description 8
- UZVWDRPUTHXQAM-FXQIFTODSA-N Met-Asp-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O UZVWDRPUTHXQAM-FXQIFTODSA-N 0.000 description 8
- PNDCUTDWYVKBHX-IHRRRGAJSA-N Met-Asp-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PNDCUTDWYVKBHX-IHRRRGAJSA-N 0.000 description 8
- NSMXRFMGZYTFEX-KJEVXHAQSA-N Met-Thr-Tyr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCSC)N)O NSMXRFMGZYTFEX-KJEVXHAQSA-N 0.000 description 8
- OVTOTTGZBWXLFU-QXEWZRGKSA-N Met-Val-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O OVTOTTGZBWXLFU-QXEWZRGKSA-N 0.000 description 8
- IIHMNTBFPMRJCN-RCWTZXSCSA-N Met-Val-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IIHMNTBFPMRJCN-RCWTZXSCSA-N 0.000 description 8
- KAHUBGWSIQNZQQ-KKUMJFAQSA-N Phe-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KAHUBGWSIQNZQQ-KKUMJFAQSA-N 0.000 description 8
- RIYZXJVARWJLKS-KKUMJFAQSA-N Phe-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 RIYZXJVARWJLKS-KKUMJFAQSA-N 0.000 description 8
- MMYUOSCXBJFUNV-QWRGUYRKSA-N Phe-Gly-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N MMYUOSCXBJFUNV-QWRGUYRKSA-N 0.000 description 8
- APJPXSFJBMMOLW-KBPBESRZSA-N Phe-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 APJPXSFJBMMOLW-KBPBESRZSA-N 0.000 description 8
- WKLMCMXFMQEKCX-SLFFLAALSA-N Phe-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)[C@H](CC3=CC=CC=C3)N)C(=O)O WKLMCMXFMQEKCX-SLFFLAALSA-N 0.000 description 8
- AAERWTUHZKLDLC-IHRRRGAJSA-N Phe-Pro-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O AAERWTUHZKLDLC-IHRRRGAJSA-N 0.000 description 8
- AFNJAQVMTIQTCB-DLOVCJGASA-N Phe-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 AFNJAQVMTIQTCB-DLOVCJGASA-N 0.000 description 8
- DEDANIDYQAPTFI-IHRRRGAJSA-N Pro-Asp-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O DEDANIDYQAPTFI-IHRRRGAJSA-N 0.000 description 8
- UEHYFUCOGHWASA-HJGDQZAQSA-N Pro-Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 UEHYFUCOGHWASA-HJGDQZAQSA-N 0.000 description 8
- FEPSEIDIPBMIOS-QXEWZRGKSA-N Pro-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEPSEIDIPBMIOS-QXEWZRGKSA-N 0.000 description 8
- AQGUSRZKDZYGGV-GMOBBJLQSA-N Pro-Ile-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O AQGUSRZKDZYGGV-GMOBBJLQSA-N 0.000 description 8
- GURGCNUWVSDYTP-SRVKXCTJSA-N Pro-Leu-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GURGCNUWVSDYTP-SRVKXCTJSA-N 0.000 description 8
- POQFNPILEQEODH-FXQIFTODSA-N Pro-Ser-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O POQFNPILEQEODH-FXQIFTODSA-N 0.000 description 8
- DYJTXTCEXMCPBF-UFYCRDLUSA-N Pro-Tyr-Phe Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CC3=CC=CC=C3)C(=O)O DYJTXTCEXMCPBF-UFYCRDLUSA-N 0.000 description 8
- QMABBZHZMDXHKU-FKBYEOEOSA-N Pro-Tyr-Trp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O QMABBZHZMDXHKU-FKBYEOEOSA-N 0.000 description 8
- WDXYVIIVDIDOSX-DCAQKATOSA-N Ser-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N WDXYVIIVDIDOSX-DCAQKATOSA-N 0.000 description 8
- RDFQNDHEHVSONI-ZLUOBGJFSA-N Ser-Asn-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDFQNDHEHVSONI-ZLUOBGJFSA-N 0.000 description 8
- MQQBBLVOUUJKLH-HJPIBITLSA-N Ser-Ile-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MQQBBLVOUUJKLH-HJPIBITLSA-N 0.000 description 8
- FPCGZYMRFFIYIH-CIUDSAMLSA-N Ser-Lys-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O FPCGZYMRFFIYIH-CIUDSAMLSA-N 0.000 description 8
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 8
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 8
- XJDMUQCLVSCRSJ-VZFHVOOUSA-N Ser-Thr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O XJDMUQCLVSCRSJ-VZFHVOOUSA-N 0.000 description 8
- CAJFZCICSVBOJK-SHGPDSBTSA-N Thr-Ala-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAJFZCICSVBOJK-SHGPDSBTSA-N 0.000 description 8
- QNJZOAHSYPXTAB-VEVYYDQMSA-N Thr-Asn-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O QNJZOAHSYPXTAB-VEVYYDQMSA-N 0.000 description 8
- YBXMGKCLOPDEKA-NUMRIWBASA-N Thr-Asp-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YBXMGKCLOPDEKA-NUMRIWBASA-N 0.000 description 8
- ZTPXSEUVYNNZRB-CDMKHQONSA-N Thr-Gly-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZTPXSEUVYNNZRB-CDMKHQONSA-N 0.000 description 8
- DNCUODYZAMHLCV-XGEHTFHBSA-N Thr-Pro-Cys Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)O)N)O DNCUODYZAMHLCV-XGEHTFHBSA-N 0.000 description 8
- KERCOYANYUPLHJ-XGEHTFHBSA-N Thr-Pro-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O KERCOYANYUPLHJ-XGEHTFHBSA-N 0.000 description 8
- WKGAAMOJPMBBMC-IXOXFDKPSA-N Thr-Ser-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WKGAAMOJPMBBMC-IXOXFDKPSA-N 0.000 description 8
- VTFWAGGJDRSQFG-MELADBBJSA-N Tyr-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O VTFWAGGJDRSQFG-MELADBBJSA-N 0.000 description 8
- FWOVTJKVUCGVND-UFYCRDLUSA-N Tyr-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N FWOVTJKVUCGVND-UFYCRDLUSA-N 0.000 description 8
- RVGVIWNHABGIFH-IHRRRGAJSA-N Tyr-Val-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O RVGVIWNHABGIFH-IHRRRGAJSA-N 0.000 description 8
- HTONZBWRYUKUKC-RCWTZXSCSA-N Val-Thr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HTONZBWRYUKUKC-RCWTZXSCSA-N 0.000 description 8
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 8
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 8
- 238000005516 engineering process Methods 0.000 description 8
- 108010051242 phenylalanylserine Proteins 0.000 description 8
- 108010080629 tryptophan-leucine Proteins 0.000 description 8
- 108010020532 tyrosyl-proline Proteins 0.000 description 8
- UCIYCBSJBQGDGM-LPEHRKFASA-N Ala-Arg-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N UCIYCBSJBQGDGM-LPEHRKFASA-N 0.000 description 7
- XUGATJVGQUGQKY-ULQDDVLXSA-N Arg-Lys-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XUGATJVGQUGQKY-ULQDDVLXSA-N 0.000 description 7
- RYAOJUMWLWUGNW-QMMMGPOBSA-N Gly-Val-Gly Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O RYAOJUMWLWUGNW-QMMMGPOBSA-N 0.000 description 7
- VOCZPDONPURUHV-QEWYBTABSA-N Ile-Phe-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VOCZPDONPURUHV-QEWYBTABSA-N 0.000 description 7
- GILLQRYAWOMHED-DCAQKATOSA-N Lys-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN GILLQRYAWOMHED-DCAQKATOSA-N 0.000 description 7
- HXSUFWQYLPKEHF-IHRRRGAJSA-N Phe-Asn-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HXSUFWQYLPKEHF-IHRRRGAJSA-N 0.000 description 7
- QSPOLEBZTMESFY-SRVKXCTJSA-N Val-Pro-Val Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O QSPOLEBZTMESFY-SRVKXCTJSA-N 0.000 description 7
- 108010047857 aspartylglycine Proteins 0.000 description 7
- 108010054155 lysyllysine Proteins 0.000 description 7
- NYZGVTGOMPHSJW-CIUDSAMLSA-N Arg-Glu-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N NYZGVTGOMPHSJW-CIUDSAMLSA-N 0.000 description 6
- YNSGXDWWPCGGQS-YUMQZZPRSA-N Arg-Gly-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O YNSGXDWWPCGGQS-YUMQZZPRSA-N 0.000 description 6
- ACRYGQFHAQHDSF-ZLUOBGJFSA-N Asn-Asn-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ACRYGQFHAQHDSF-ZLUOBGJFSA-N 0.000 description 6
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 6
- HPBNLFLSSQDFQW-WHFBIAKZSA-N Asn-Ser-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O HPBNLFLSSQDFQW-WHFBIAKZSA-N 0.000 description 6
- NCXTYSVDWLAQGZ-ZKWXMUAHSA-N Asn-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O NCXTYSVDWLAQGZ-ZKWXMUAHSA-N 0.000 description 6
- JBDLMLZNDRLDIX-HJGDQZAQSA-N Asn-Thr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O JBDLMLZNDRLDIX-HJGDQZAQSA-N 0.000 description 6
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 6
- AMGQTNHANMRPOE-LKXGYXEUSA-N Asn-Thr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O AMGQTNHANMRPOE-LKXGYXEUSA-N 0.000 description 6
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 6
- CJUKAWUWBZCTDQ-SRVKXCTJSA-N Asp-Leu-Lys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O CJUKAWUWBZCTDQ-SRVKXCTJSA-N 0.000 description 6
- GWWSUMLEWKQHLR-NUMRIWBASA-N Asp-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O GWWSUMLEWKQHLR-NUMRIWBASA-N 0.000 description 6
- GXIUDSXIUSTSLO-QXEWZRGKSA-N Asp-Val-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)O)N GXIUDSXIUSTSLO-QXEWZRGKSA-N 0.000 description 6
- 229940124957 Cervarix Drugs 0.000 description 6
- GRNOCLDFUNCIDW-ACZMJKKPSA-N Cys-Ala-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CS)N GRNOCLDFUNCIDW-ACZMJKKPSA-N 0.000 description 6
- 229940124897 Gardasil Drugs 0.000 description 6
- PRBLYKYHAJEABA-SRVKXCTJSA-N Gln-Arg-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O PRBLYKYHAJEABA-SRVKXCTJSA-N 0.000 description 6
- WHVLABLIJYGVEK-QEWYBTABSA-N Gln-Phe-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WHVLABLIJYGVEK-QEWYBTABSA-N 0.000 description 6
- GTBXHETZPUURJE-KKUMJFAQSA-N Gln-Tyr-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GTBXHETZPUURJE-KKUMJFAQSA-N 0.000 description 6
- SBCYJMOOHUDWDA-NUMRIWBASA-N Glu-Asp-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SBCYJMOOHUDWDA-NUMRIWBASA-N 0.000 description 6
- ITBHUUMCJJQUSC-LAEOZQHASA-N Glu-Ile-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O ITBHUUMCJJQUSC-LAEOZQHASA-N 0.000 description 6
- ARIORLIIMJACKZ-KKUMJFAQSA-N Glu-Pro-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ARIORLIIMJACKZ-KKUMJFAQSA-N 0.000 description 6
- LCNXZQROPKFGQK-WHFBIAKZSA-N Gly-Asp-Ser Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O LCNXZQROPKFGQK-WHFBIAKZSA-N 0.000 description 6
- LGQZOQRDEUIZJY-YUMQZZPRSA-N Gly-Cys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CS)NC(=O)CN)C(O)=O LGQZOQRDEUIZJY-YUMQZZPRSA-N 0.000 description 6
- AYBKPDHHVADEDA-YUMQZZPRSA-N Gly-His-Asn Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O AYBKPDHHVADEDA-YUMQZZPRSA-N 0.000 description 6
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 6
- DURWCDDDAWVPOP-JBDRJPRFSA-N Ile-Cys-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)O)N DURWCDDDAWVPOP-JBDRJPRFSA-N 0.000 description 6
- GECLQMBTZCPAFY-PEFMBERDSA-N Ile-Gln-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GECLQMBTZCPAFY-PEFMBERDSA-N 0.000 description 6
- GLBNEGIOFRVRHO-JYJNAYRXSA-N Leu-Gln-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLBNEGIOFRVRHO-JYJNAYRXSA-N 0.000 description 6
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 6
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 6
- ARNIBBOXIAWUOP-MGHWNKPDSA-N Leu-Tyr-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ARNIBBOXIAWUOP-MGHWNKPDSA-N 0.000 description 6
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 6
- XOMXAVJBLRROMC-IHRRRGAJSA-N Met-Asp-Phe Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XOMXAVJBLRROMC-IHRRRGAJSA-N 0.000 description 6
- 108010021466 Mutant Proteins Proteins 0.000 description 6
- 102000008300 Mutant Proteins Human genes 0.000 description 6
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 6
- ABSSTGUCBCDKMU-UWVGGRQHSA-N Pro-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 ABSSTGUCBCDKMU-UWVGGRQHSA-N 0.000 description 6
- DWPXHLIBFQLKLK-CYDGBPFRSA-N Pro-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 DWPXHLIBFQLKLK-CYDGBPFRSA-N 0.000 description 6
- KBUAPZAZPWNYSW-SRVKXCTJSA-N Pro-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 KBUAPZAZPWNYSW-SRVKXCTJSA-N 0.000 description 6
- IDQFQFVEWMWRQQ-DLOVCJGASA-N Ser-Ala-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IDQFQFVEWMWRQQ-DLOVCJGASA-N 0.000 description 6
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 6
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 6
- PMCMLDNPAZUYGI-DCAQKATOSA-N Ser-Lys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMCMLDNPAZUYGI-DCAQKATOSA-N 0.000 description 6
- SGZVZUCRAVSPKQ-FXQIFTODSA-N Ser-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N SGZVZUCRAVSPKQ-FXQIFTODSA-N 0.000 description 6
- STUAPCLEDMKXKL-LKXGYXEUSA-N Thr-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O STUAPCLEDMKXKL-LKXGYXEUSA-N 0.000 description 6
- CJEHCEOXPLASCK-MEYUZBJRSA-N Thr-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=C(O)C=C1 CJEHCEOXPLASCK-MEYUZBJRSA-N 0.000 description 6
- AFSYEUHJBVCPEL-JBACZVJFSA-N Trp-Gln-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=CC=C1 AFSYEUHJBVCPEL-JBACZVJFSA-N 0.000 description 6
- DVWAIHZOPSYMSJ-ZVZYQTTQSA-N Trp-Glu-Val Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O)=CNC2=C1 DVWAIHZOPSYMSJ-ZVZYQTTQSA-N 0.000 description 6
- XHALUUQSNXSPLP-UFYCRDLUSA-N Tyr-Arg-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 XHALUUQSNXSPLP-UFYCRDLUSA-N 0.000 description 6
- MWUYSCVVPVITMW-IGNZVWTISA-N Tyr-Tyr-Ala Chemical compound C([C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 MWUYSCVVPVITMW-IGNZVWTISA-N 0.000 description 6
- PWRITNSESKQTPW-NRPADANISA-N Val-Gln-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N PWRITNSESKQTPW-NRPADANISA-N 0.000 description 6
- PMDOQZFYGWZSTK-LSJOCFKGSA-N Val-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C PMDOQZFYGWZSTK-LSJOCFKGSA-N 0.000 description 6
- UJMCYJKPDFQLHX-XGEHTFHBSA-N Val-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N)O UJMCYJKPDFQLHX-XGEHTFHBSA-N 0.000 description 6
- IRAUYEAFPFPVND-UVBJJODRSA-N Val-Trp-Ala Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](C)C(O)=O)=CNC2=C1 IRAUYEAFPFPVND-UVBJJODRSA-N 0.000 description 6
- 108010064235 lysylglycine Proteins 0.000 description 6
- 239000006228 supernatant Substances 0.000 description 6
- IKKVASZHTMKJIR-ZKWXMUAHSA-N Ala-Asp-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IKKVASZHTMKJIR-ZKWXMUAHSA-N 0.000 description 5
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 5
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 5
- MYOHQBFRJQFIDZ-KKUMJFAQSA-N Asp-Leu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYOHQBFRJQFIDZ-KKUMJFAQSA-N 0.000 description 5
- LLRJPYJQNBMOOO-QEJZJMRPSA-N Asp-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N LLRJPYJQNBMOOO-QEJZJMRPSA-N 0.000 description 5
- SRIRHERUAMYIOQ-CIUDSAMLSA-N Cys-Leu-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SRIRHERUAMYIOQ-CIUDSAMLSA-N 0.000 description 5
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 5
- MXJYXYDREQWUMS-XKBZYTNZSA-N Glu-Thr-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O MXJYXYDREQWUMS-XKBZYTNZSA-N 0.000 description 5
- 241000701806 Human papillomavirus Species 0.000 description 5
- JNDYEOUZBLOVOF-AVGNSLFASA-N Leu-Leu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JNDYEOUZBLOVOF-AVGNSLFASA-N 0.000 description 5
- MSSJJDVQTFTLIF-KBPBESRZSA-N Lys-Phe-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O MSSJJDVQTFTLIF-KBPBESRZSA-N 0.000 description 5
- UQJOKDAYFULYIX-AVGNSLFASA-N Lys-Pro-Pro Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 UQJOKDAYFULYIX-AVGNSLFASA-N 0.000 description 5
- QQPMHUCGDRJFQK-RHYQMDGZSA-N Met-Thr-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QQPMHUCGDRJFQK-RHYQMDGZSA-N 0.000 description 5
- HTKNPQZCMLBOTQ-XVSYOHENSA-N Phe-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N)O HTKNPQZCMLBOTQ-XVSYOHENSA-N 0.000 description 5
- PCWLNNZTBJTZRN-AVGNSLFASA-N Pro-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 PCWLNNZTBJTZRN-AVGNSLFASA-N 0.000 description 5
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 5
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 5
- VZQRNAYURWAEFE-KKUMJFAQSA-N Ser-Leu-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VZQRNAYURWAEFE-KKUMJFAQSA-N 0.000 description 5
- VGQVAVQWKJLIRM-FXQIFTODSA-N Ser-Ser-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O VGQVAVQWKJLIRM-FXQIFTODSA-N 0.000 description 5
- ZKOKTQPHFMRSJP-YJRXYDGGSA-N Ser-Thr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKOKTQPHFMRSJP-YJRXYDGGSA-N 0.000 description 5
- BEZTUFWTPVOROW-KJEVXHAQSA-N Thr-Tyr-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O BEZTUFWTPVOROW-KJEVXHAQSA-N 0.000 description 5
- QYSBJAUCUKHSLU-JYJNAYRXSA-N Tyr-Arg-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O QYSBJAUCUKHSLU-JYJNAYRXSA-N 0.000 description 5
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 5
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 5
- WNZSAUMKZQXHNC-UKJIMTQDSA-N Val-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N WNZSAUMKZQXHNC-UKJIMTQDSA-N 0.000 description 5
- LLJLBRRXKZTTRD-GUBZILKMSA-N Val-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N LLJLBRRXKZTTRD-GUBZILKMSA-N 0.000 description 5
- 108010011559 alanylphenylalanine Proteins 0.000 description 5
- 239000013604 expression vector Substances 0.000 description 5
- 108010013768 glutamyl-aspartyl-proline Proteins 0.000 description 5
- 108010081551 glycylphenylalanine Proteins 0.000 description 5
- 108010073101 phenylalanylleucine Proteins 0.000 description 5
- 210000002966 serum Anatomy 0.000 description 5
- 239000011780 sodium chloride Substances 0.000 description 5
- 108010051110 tyrosyl-lysine Proteins 0.000 description 5
- OMMDTNGURYRDAC-NRPADANISA-N Ala-Glu-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OMMDTNGURYRDAC-NRPADANISA-N 0.000 description 4
- IRRMIGDCPOPZJW-ULQDDVLXSA-N Arg-His-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IRRMIGDCPOPZJW-ULQDDVLXSA-N 0.000 description 4
- LKDHUGLXOHYINY-XUXIUFHCSA-N Arg-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LKDHUGLXOHYINY-XUXIUFHCSA-N 0.000 description 4
- HGKHPCFTRQDHCU-IUCAKERBSA-N Arg-Pro-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HGKHPCFTRQDHCU-IUCAKERBSA-N 0.000 description 4
- BDMIFVIWCNLDCT-CIUDSAMLSA-N Asn-Arg-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O BDMIFVIWCNLDCT-CIUDSAMLSA-N 0.000 description 4
- XVAPVJNJGLWGCS-ACZMJKKPSA-N Asn-Glu-Asn Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVAPVJNJGLWGCS-ACZMJKKPSA-N 0.000 description 4
- GIQCDTKOIPUDSG-GARJFASQSA-N Asn-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N)C(=O)O GIQCDTKOIPUDSG-GARJFASQSA-N 0.000 description 4
- BFOYULZBKYOKAN-OLHMAJIHSA-N Asp-Asp-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFOYULZBKYOKAN-OLHMAJIHSA-N 0.000 description 4
- RRKCPMGSRIDLNC-AVGNSLFASA-N Asp-Glu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RRKCPMGSRIDLNC-AVGNSLFASA-N 0.000 description 4
- LBFYTUPYYZENIR-GHCJXIJMSA-N Asp-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N LBFYTUPYYZENIR-GHCJXIJMSA-N 0.000 description 4
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 4
- BJDHEININLSZOT-KKUMJFAQSA-N Asp-Tyr-Lys Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(O)=O BJDHEININLSZOT-KKUMJFAQSA-N 0.000 description 4
- 206010008342 Cervix carcinoma Diseases 0.000 description 4
- KIQKJXYVGSYDFS-ZLUOBGJFSA-N Cys-Asn-Asn Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O KIQKJXYVGSYDFS-ZLUOBGJFSA-N 0.000 description 4
- LWYKPOCGGTYAIH-FXQIFTODSA-N Cys-Met-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N LWYKPOCGGTYAIH-FXQIFTODSA-N 0.000 description 4
- ALNKNYKSZPSLBD-ZDLURKLDSA-N Cys-Thr-Gly Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O ALNKNYKSZPSLBD-ZDLURKLDSA-N 0.000 description 4
- BUAUGQJXGNRTQE-AAEUAGOBSA-N Cys-Trp-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CS)N BUAUGQJXGNRTQE-AAEUAGOBSA-N 0.000 description 4
- SHERTACNJPYHAR-ACZMJKKPSA-N Gln-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O SHERTACNJPYHAR-ACZMJKKPSA-N 0.000 description 4
- LVSYIKGMLRHKME-IUCAKERBSA-N Gln-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N LVSYIKGMLRHKME-IUCAKERBSA-N 0.000 description 4
- SMLDOQHTOAAFJQ-WDSKDSINSA-N Gln-Gly-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SMLDOQHTOAAFJQ-WDSKDSINSA-N 0.000 description 4
- XQDGOJPVMSWZSO-SRVKXCTJSA-N Gln-Pro-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N XQDGOJPVMSWZSO-SRVKXCTJSA-N 0.000 description 4
- FYBSCGZLICNOBA-XQXXSGGOSA-N Glu-Ala-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FYBSCGZLICNOBA-XQXXSGGOSA-N 0.000 description 4
- MFNUFCFRAZPJFW-JYJNAYRXSA-N Glu-Lys-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MFNUFCFRAZPJFW-JYJNAYRXSA-N 0.000 description 4
- ZRZILYKEJBMFHY-BQBZGAKWSA-N Gly-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN ZRZILYKEJBMFHY-BQBZGAKWSA-N 0.000 description 4
- JUBDONGMHASUCN-IUCAKERBSA-N Gly-Glu-His Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O JUBDONGMHASUCN-IUCAKERBSA-N 0.000 description 4
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 4
- HDODQNPMSHDXJT-GHCJXIJMSA-N Ile-Asn-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O HDODQNPMSHDXJT-GHCJXIJMSA-N 0.000 description 4
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 4
- QJUWBDPGGYVRHY-YUMQZZPRSA-N Leu-Gly-Cys Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N QJUWBDPGGYVRHY-YUMQZZPRSA-N 0.000 description 4
- MJWVXZABPOKJJF-ACRUOGEOSA-N Leu-Phe-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MJWVXZABPOKJJF-ACRUOGEOSA-N 0.000 description 4
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 4
- LLSUNJYOSCOOEB-GUBZILKMSA-N Lys-Glu-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O LLSUNJYOSCOOEB-GUBZILKMSA-N 0.000 description 4
- UQRZFMQQXXJTTF-AVGNSLFASA-N Lys-Lys-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O UQRZFMQQXXJTTF-AVGNSLFASA-N 0.000 description 4
- SUZVLFWOCKHWET-CQDKDKBSSA-N Lys-Tyr-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O SUZVLFWOCKHWET-CQDKDKBSSA-N 0.000 description 4
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 4
- XIGAHPDZLAYQOS-SRVKXCTJSA-N Met-Pro-Pro Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 XIGAHPDZLAYQOS-SRVKXCTJSA-N 0.000 description 4
- 241000699670 Mus sp. Species 0.000 description 4
- NHCKESBLOMHIIE-IRXDYDNUSA-N Phe-Gly-Phe Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NHCKESBLOMHIIE-IRXDYDNUSA-N 0.000 description 4
- AXIOGMQCDYVTNY-ACRUOGEOSA-N Phe-Phe-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 AXIOGMQCDYVTNY-ACRUOGEOSA-N 0.000 description 4
- XOHJOMKCRLHGCY-UNQGMJICSA-N Phe-Pro-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOHJOMKCRLHGCY-UNQGMJICSA-N 0.000 description 4
- JXQVYPWVGUOIDV-MXAVVETBSA-N Phe-Ser-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JXQVYPWVGUOIDV-MXAVVETBSA-N 0.000 description 4
- YTGGLKWSVIRECD-JBACZVJFSA-N Phe-Trp-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=CC=C1 YTGGLKWSVIRECD-JBACZVJFSA-N 0.000 description 4
- MTHRMUXESFIAMS-DCAQKATOSA-N Pro-Asn-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O MTHRMUXESFIAMS-DCAQKATOSA-N 0.000 description 4
- SFECXGVELZFBFJ-VEVYYDQMSA-N Pro-Asp-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SFECXGVELZFBFJ-VEVYYDQMSA-N 0.000 description 4
- RMODQFBNDDENCP-IHRRRGAJSA-N Pro-Lys-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O RMODQFBNDDENCP-IHRRRGAJSA-N 0.000 description 4
- 241001112090 Pseudovirus Species 0.000 description 4
- UFKPDBLKLOBMRH-XHNCKOQMSA-N Ser-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)C(=O)O UFKPDBLKLOBMRH-XHNCKOQMSA-N 0.000 description 4
- IXCHOHLPHNGFTJ-YUMQZZPRSA-N Ser-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N IXCHOHLPHNGFTJ-YUMQZZPRSA-N 0.000 description 4
- SQHKXWODKJDZRC-LKXGYXEUSA-N Ser-Thr-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQHKXWODKJDZRC-LKXGYXEUSA-N 0.000 description 4
- QILPDQCTQZDHFM-HJGDQZAQSA-N Thr-Gln-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QILPDQCTQZDHFM-HJGDQZAQSA-N 0.000 description 4
- ODXKUIGEPAGKKV-KATARQTJSA-N Thr-Leu-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)O)N)O ODXKUIGEPAGKKV-KATARQTJSA-N 0.000 description 4
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 4
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 4
- RPECVQBNONKZAT-WZLNRYEVSA-N Thr-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H]([C@@H](C)O)N RPECVQBNONKZAT-WZLNRYEVSA-N 0.000 description 4
- QGVBFDIREUUSHX-IFFSRLJSSA-N Thr-Val-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O QGVBFDIREUUSHX-IFFSRLJSSA-N 0.000 description 4
- RMRFSFXLFWWAJZ-HJOGWXRNSA-N Tyr-Tyr-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 RMRFSFXLFWWAJZ-HJOGWXRNSA-N 0.000 description 4
- 208000006105 Uterine Cervical Neoplasms Diseases 0.000 description 4
- SJRUJQFQVLMZFW-WPRPVWTQSA-N Val-Pro-Gly Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SJRUJQFQVLMZFW-WPRPVWTQSA-N 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 4
- 239000000872 buffer Substances 0.000 description 4
- 201000010881 cervical cancer Diseases 0.000 description 4
- UQLDLKMNUJERMK-UHFFFAOYSA-L di(octadecanoyloxy)lead Chemical compound [Pb+2].CCCCCCCCCCCCCCCCCC([O-])=O.CCCCCCCCCCCCCCCCCC([O-])=O UQLDLKMNUJERMK-UHFFFAOYSA-L 0.000 description 4
- 108010089804 glycyl-threonine Proteins 0.000 description 4
- 208000015181 infectious disease Diseases 0.000 description 4
- 239000000463 material Substances 0.000 description 4
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 4
- 235000010482 polyoxyethylene sorbitan monooleate Nutrition 0.000 description 4
- 229920000053 polysorbate 80 Polymers 0.000 description 4
- 230000009465 prokaryotic expression Effects 0.000 description 4
- KTXKIYXZQFWJKB-VZFHVOOUSA-N Ala-Thr-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O KTXKIYXZQFWJKB-VZFHVOOUSA-N 0.000 description 3
- 241000588724 Escherichia coli Species 0.000 description 3
- LHIPZASLKPYDPI-AVGNSLFASA-N Glu-Phe-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LHIPZASLKPYDPI-AVGNSLFASA-N 0.000 description 3
- GGLIDLCEPDHEJO-BQBZGAKWSA-N Gly-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)CN GGLIDLCEPDHEJO-BQBZGAKWSA-N 0.000 description 3
- FNXSYBOHALPRHV-ONGXEEELSA-N Gly-Val-Lys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN FNXSYBOHALPRHV-ONGXEEELSA-N 0.000 description 3
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 3
- HUURTRNKPBHHKZ-JYJNAYRXSA-N Met-Phe-Val Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 HUURTRNKPBHHKZ-JYJNAYRXSA-N 0.000 description 3
- NHXXGBXJTLRGJI-GUBZILKMSA-N Met-Pro-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O NHXXGBXJTLRGJI-GUBZILKMSA-N 0.000 description 3
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 3
- QTPQHINADBYBNA-DCAQKATOSA-N Val-Ser-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN QTPQHINADBYBNA-DCAQKATOSA-N 0.000 description 3
- JXWGBRRVTRAZQA-ULQDDVLXSA-N Val-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N JXWGBRRVTRAZQA-ULQDDVLXSA-N 0.000 description 3
- 108010087924 alanylproline Proteins 0.000 description 3
- 108010060035 arginylproline Proteins 0.000 description 3
- 238000012217 deletion Methods 0.000 description 3
- 230000037430 deletion Effects 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 238000002296 dynamic light scattering Methods 0.000 description 3
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 3
- 239000006166 lysate Substances 0.000 description 3
- 108010053725 prolylvaline Proteins 0.000 description 3
- 229940021993 prophylactic vaccine Drugs 0.000 description 3
- 239000000523 sample Substances 0.000 description 3
- 238000012216 screening Methods 0.000 description 3
- 210000001519 tissue Anatomy 0.000 description 3
- 239000004475 Arginine Substances 0.000 description 2
- YUOXLJYVSZYPBJ-CIUDSAMLSA-N Asn-Pro-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O YUOXLJYVSZYPBJ-CIUDSAMLSA-N 0.000 description 2
- WLVLIYYBPPONRJ-GCJQMDKQSA-N Asn-Thr-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O WLVLIYYBPPONRJ-GCJQMDKQSA-N 0.000 description 2
- KVMPVNGOKHTUHZ-GCJQMDKQSA-N Asp-Ala-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KVMPVNGOKHTUHZ-GCJQMDKQSA-N 0.000 description 2
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 2
- YXPNKXFOBHRUBL-BJDJZHNGSA-N Cys-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CS)N YXPNKXFOBHRUBL-BJDJZHNGSA-N 0.000 description 2
- 101000872083 Danio rerio Delta-like protein C Proteins 0.000 description 2
- 238000002965 ELISA Methods 0.000 description 2
- INKFLNZBTSNFON-CIUDSAMLSA-N Gln-Ala-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O INKFLNZBTSNFON-CIUDSAMLSA-N 0.000 description 2
- RGXXLQWXBFNXTG-CIUDSAMLSA-N Gln-Arg-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O RGXXLQWXBFNXTG-CIUDSAMLSA-N 0.000 description 2
- KLKYKPXITJBSNI-CIUDSAMLSA-N Gln-Met-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O KLKYKPXITJBSNI-CIUDSAMLSA-N 0.000 description 2
- QEJKKJNDDDPSMU-KKUMJFAQSA-N Glu-Tyr-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCSC)C(O)=O QEJKKJNDDDPSMU-KKUMJFAQSA-N 0.000 description 2
- RJIVPOXLQFJRTG-LURJTMIESA-N Gly-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N RJIVPOXLQFJRTG-LURJTMIESA-N 0.000 description 2
- LXXLEUBUOMCAMR-NKWVEPMBSA-N Gly-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)CN)C(=O)O LXXLEUBUOMCAMR-NKWVEPMBSA-N 0.000 description 2
- UTYGDAHJBBDPBA-BYULHYEWSA-N Gly-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN UTYGDAHJBBDPBA-BYULHYEWSA-N 0.000 description 2
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 2
- IGOYNRWLWHWAQO-JTQLQIEISA-N Gly-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IGOYNRWLWHWAQO-JTQLQIEISA-N 0.000 description 2
- MYXNLWDWWOTERK-BHNWBGBOSA-N Gly-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN)O MYXNLWDWWOTERK-BHNWBGBOSA-N 0.000 description 2
- DNVDEMWIYLVIQU-RCOVLWMOSA-N Gly-Val-Asp Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O DNVDEMWIYLVIQU-RCOVLWMOSA-N 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- HAPWZEVRQYGLSG-IUCAKERBSA-N His-Gly-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O HAPWZEVRQYGLSG-IUCAKERBSA-N 0.000 description 2
- CHIAUHSHDARFBD-ULQDDVLXSA-N His-Pro-Tyr Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 CHIAUHSHDARFBD-ULQDDVLXSA-N 0.000 description 2
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 2
- PPTAQBNUFKTJKA-BJDJZHNGSA-N Leu-Cys-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PPTAQBNUFKTJKA-BJDJZHNGSA-N 0.000 description 2
- LOLUPZNNADDTAA-AVGNSLFASA-N Leu-Gln-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LOLUPZNNADDTAA-AVGNSLFASA-N 0.000 description 2
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 2
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 2
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 2
- RMLLCGYYVZKKRT-CIUDSAMLSA-N Met-Ser-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O RMLLCGYYVZKKRT-CIUDSAMLSA-N 0.000 description 2
- KAGCQPSEVAETCA-JYJNAYRXSA-N Phe-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N KAGCQPSEVAETCA-JYJNAYRXSA-N 0.000 description 2
- WOIFYRZPIORBRY-AVGNSLFASA-N Pro-Lys-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WOIFYRZPIORBRY-AVGNSLFASA-N 0.000 description 2
- YQHZVYJAGWMHES-ZLUOBGJFSA-N Ser-Ala-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YQHZVYJAGWMHES-ZLUOBGJFSA-N 0.000 description 2
- HBZBPFLJNDXRAY-FXQIFTODSA-N Ser-Ala-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O HBZBPFLJNDXRAY-FXQIFTODSA-N 0.000 description 2
- ASGYVPAVFNDZMA-GUBZILKMSA-N Ser-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CO)N ASGYVPAVFNDZMA-GUBZILKMSA-N 0.000 description 2
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 2
- QAOWNCQODCNURD-UHFFFAOYSA-N Sulfuric acid Chemical compound OS(O)(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-N 0.000 description 2
- UTCFSBBXPWKLTG-XKBZYTNZSA-N Thr-Cys-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O UTCFSBBXPWKLTG-XKBZYTNZSA-N 0.000 description 2
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 2
- QQWNRERCGGZOKG-WEDXCCLWSA-N Thr-Gly-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O QQWNRERCGGZOKG-WEDXCCLWSA-N 0.000 description 2
- NQQMWWVVGIXUOX-SVSWQMSJSA-N Thr-Ser-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NQQMWWVVGIXUOX-SVSWQMSJSA-N 0.000 description 2
- 239000004473 Threonine Substances 0.000 description 2
- USLVEJAHTBLSIL-CYDGBPFRSA-N Val-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C USLVEJAHTBLSIL-CYDGBPFRSA-N 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- 230000002378 acidificating effect Effects 0.000 description 2
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 2
- 108010044940 alanylglutamine Proteins 0.000 description 2
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 2
- 108091007433 antigens Proteins 0.000 description 2
- 230000005540 biological transmission Effects 0.000 description 2
- 229940031416 bivalent vaccine Drugs 0.000 description 2
- 238000011097 chromatography purification Methods 0.000 description 2
- 238000003776 cleavage reaction Methods 0.000 description 2
- 229910052802 copper Inorganic materials 0.000 description 2
- 239000010949 copper Substances 0.000 description 2
- 238000007865 diluting Methods 0.000 description 2
- 238000010790 dilution Methods 0.000 description 2
- 239000012895 dilution Substances 0.000 description 2
- 239000002552 dosage form Substances 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000001493 electron microscopy Methods 0.000 description 2
- 238000001962 electrophoresis Methods 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 238000000855 fermentation Methods 0.000 description 2
- 230000004151 fermentation Effects 0.000 description 2
- 229940102767 gardasil 9 Drugs 0.000 description 2
- 108010010147 glycylglutamine Proteins 0.000 description 2
- 108010036413 histidylglycine Proteins 0.000 description 2
- 230000003053 immunization Effects 0.000 description 2
- 238000002649 immunization Methods 0.000 description 2
- 230000001939 inductive effect Effects 0.000 description 2
- 238000002347 injection Methods 0.000 description 2
- 239000007924 injection Substances 0.000 description 2
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 2
- 230000003902 lesion Effects 0.000 description 2
- 108010068488 methionylphenylalanine Proteins 0.000 description 2
- 231100000590 oncogenic Toxicity 0.000 description 2
- 230000002246 oncogenic effect Effects 0.000 description 2
- 238000003921 particle size analysis Methods 0.000 description 2
- YBYRMVIVWMBXKQ-UHFFFAOYSA-N phenylmethanesulfonyl fluoride Chemical compound FS(=O)(=O)CC1=CC=CC=C1 YBYRMVIVWMBXKQ-UHFFFAOYSA-N 0.000 description 2
- 238000003259 recombinant expression Methods 0.000 description 2
- 230000007017 scission Effects 0.000 description 2
- 108010026333 seryl-proline Proteins 0.000 description 2
- 239000011734 sodium Substances 0.000 description 2
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 2
- 239000004094 surface-active agent Substances 0.000 description 2
- 108010073969 valyllysine Proteins 0.000 description 2
- 238000001262 western blot Methods 0.000 description 2
- RRBGTUQJDFBWNN-MUGJNUQGSA-N (2s)-6-amino-2-[[(2s)-6-amino-2-[[(2s)-6-amino-2-[[(2s)-2,6-diaminohexanoyl]amino]hexanoyl]amino]hexanoyl]amino]hexanoic acid Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O RRBGTUQJDFBWNN-MUGJNUQGSA-N 0.000 description 1
- NHBKXEKEPDILRR-UHFFFAOYSA-N 2,3-bis(butanoylsulfanyl)propyl butanoate Chemical compound CCCC(=O)OCC(SC(=O)CCC)CSC(=O)CCC NHBKXEKEPDILRR-UHFFFAOYSA-N 0.000 description 1
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 1
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 1
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 1
- MQIGTEQXYCRLGK-BQBZGAKWSA-N Ala-Gly-Pro Chemical compound C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O MQIGTEQXYCRLGK-BQBZGAKWSA-N 0.000 description 1
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 1
- 206010061424 Anal cancer Diseases 0.000 description 1
- 208000007860 Anus Neoplasms Diseases 0.000 description 1
- MZRBYBIQTIKERR-GUBZILKMSA-N Arg-Glu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MZRBYBIQTIKERR-GUBZILKMSA-N 0.000 description 1
- YCYXHLZRUSJITQ-SRVKXCTJSA-N Arg-Pro-Pro Chemical compound NC(=N)NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 YCYXHLZRUSJITQ-SRVKXCTJSA-N 0.000 description 1
- ATABBWFGOHKROJ-GUBZILKMSA-N Arg-Pro-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O ATABBWFGOHKROJ-GUBZILKMSA-N 0.000 description 1
- WPOLSNAQGVHROR-GUBZILKMSA-N Asn-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N WPOLSNAQGVHROR-GUBZILKMSA-N 0.000 description 1
- NWAHPBGBDIFUFD-KKUMJFAQSA-N Asp-Tyr-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O NWAHPBGBDIFUFD-KKUMJFAQSA-N 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- 244000050510 Cunninghamia lanceolata Species 0.000 description 1
- KSMSFCBQBQPFAD-GUBZILKMSA-N Cys-Pro-Pro Chemical compound SC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 KSMSFCBQBQPFAD-GUBZILKMSA-N 0.000 description 1
- YRHZWVKUFWCEPW-GLLZPBPUSA-N Gln-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O YRHZWVKUFWCEPW-GLLZPBPUSA-N 0.000 description 1
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 1
- OLPPXYMMIARYAL-QMMMGPOBSA-N Gly-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)CN OLPPXYMMIARYAL-QMMMGPOBSA-N 0.000 description 1
- VEPBEGNDJYANCF-QWRGUYRKSA-N Gly-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN VEPBEGNDJYANCF-QWRGUYRKSA-N 0.000 description 1
- WDEHMRNSGHVNOH-VHSXEESVSA-N Gly-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)CN)C(=O)O WDEHMRNSGHVNOH-VHSXEESVSA-N 0.000 description 1
- NSVOVKWEKGEOQB-LURJTMIESA-N Gly-Pro-Gly Chemical compound NCC(=O)N1CCC[C@H]1C(=O)NCC(O)=O NSVOVKWEKGEOQB-LURJTMIESA-N 0.000 description 1
- ZZWUYQXMIFTIIY-WEDXCCLWSA-N Gly-Thr-Leu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O ZZWUYQXMIFTIIY-WEDXCCLWSA-N 0.000 description 1
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 239000007995 HEPES buffer Substances 0.000 description 1
- DPQIPEAHIYMUEJ-IHRRRGAJSA-N His-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N DPQIPEAHIYMUEJ-IHRRRGAJSA-N 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 1
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 1
- HPBCTWSUJOGJSH-MNXVOIDGSA-N Leu-Glu-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HPBCTWSUJOGJSH-MNXVOIDGSA-N 0.000 description 1
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 1
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 1
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 1
- MYZMQWHPDAYKIE-SRVKXCTJSA-N Lys-Leu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O MYZMQWHPDAYKIE-SRVKXCTJSA-N 0.000 description 1
- MGKFCQFVPKOWOL-CIUDSAMLSA-N Lys-Ser-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N MGKFCQFVPKOWOL-CIUDSAMLSA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 101710135729 Major capsid protein L1 Proteins 0.000 description 1
- 208000032271 Malignant tumor of penis Diseases 0.000 description 1
- HUKLXYYPZWPXCC-KZVJFYERSA-N Met-Ala-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HUKLXYYPZWPXCC-KZVJFYERSA-N 0.000 description 1
- ZAJNRWKGHWGPDQ-SDDRHHMPSA-N Met-Arg-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N ZAJNRWKGHWGPDQ-SDDRHHMPSA-N 0.000 description 1
- BJPQKNHZHUCQNQ-SRVKXCTJSA-N Met-Pro-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCSC)N BJPQKNHZHUCQNQ-SRVKXCTJSA-N 0.000 description 1
- DSZFTPCSFVWMKP-DCAQKATOSA-N Met-Ser-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN DSZFTPCSFVWMKP-DCAQKATOSA-N 0.000 description 1
- PHURAEXVWLDIGT-LPEHRKFASA-N Met-Ser-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N PHURAEXVWLDIGT-LPEHRKFASA-N 0.000 description 1
- VYDLZDRMOFYOGV-TUAOUCFPSA-N Met-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCSC)N VYDLZDRMOFYOGV-TUAOUCFPSA-N 0.000 description 1
- OTKQHDPECKUDSB-SZMVWBNQSA-N Met-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCSC)C(O)=O)=CNC2=C1 OTKQHDPECKUDSB-SZMVWBNQSA-N 0.000 description 1
- PVSPJQWHEIQTEH-JYJNAYRXSA-N Met-Val-Tyr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PVSPJQWHEIQTEH-JYJNAYRXSA-N 0.000 description 1
- IQJMEDDVOGMTKT-SRVKXCTJSA-N Met-Val-Val Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IQJMEDDVOGMTKT-SRVKXCTJSA-N 0.000 description 1
- 241000699666 Mus <mouse, genus> Species 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 208000002471 Penile Neoplasms Diseases 0.000 description 1
- 206010034299 Penile cancer Diseases 0.000 description 1
- 208000037581 Persistent Infection Diseases 0.000 description 1
- LGBVMDMZZFYSFW-HJWJTTGWSA-N Phe-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CC=CC=C1)N LGBVMDMZZFYSFW-HJWJTTGWSA-N 0.000 description 1
- WLYPRKLMRIYGPP-JYJNAYRXSA-N Phe-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 WLYPRKLMRIYGPP-JYJNAYRXSA-N 0.000 description 1
- XYSXOCIWCPFOCG-IHRRRGAJSA-N Pro-Leu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XYSXOCIWCPFOCG-IHRRRGAJSA-N 0.000 description 1
- MRYUJHGPZQNOAD-IHRRRGAJSA-N Pro-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 MRYUJHGPZQNOAD-IHRRRGAJSA-N 0.000 description 1
- SBVPYBFMIGDIDX-SRVKXCTJSA-N Pro-Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1N(C(=O)[C@H]2NCCC2)CCC1 SBVPYBFMIGDIDX-SRVKXCTJSA-N 0.000 description 1
- BGWKULMLUIUPKY-BQBZGAKWSA-N Pro-Ser-Gly Chemical compound OC(=O)CNC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BGWKULMLUIUPKY-BQBZGAKWSA-N 0.000 description 1
- 101000702488 Rattus norvegicus High affinity cationic amino acid transporter 1 Proteins 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- LVVBAKCGXXUHFO-ZLUOBGJFSA-N Ser-Ala-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O LVVBAKCGXXUHFO-ZLUOBGJFSA-N 0.000 description 1
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 1
- 108700005078 Synthetic Genes Proteins 0.000 description 1
- 230000005867 T cell response Effects 0.000 description 1
- IGROJMCBGRFRGI-YTLHQDLWSA-N Thr-Ala-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 1
- PXQUBKWZENPDGE-CIQUZCHMSA-N Thr-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)O)N PXQUBKWZENPDGE-CIQUZCHMSA-N 0.000 description 1
- QYDKSNXSBXZPFK-ZJDVBMNYSA-N Thr-Thr-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYDKSNXSBXZPFK-ZJDVBMNYSA-N 0.000 description 1
- AKHDFZHUPGVFEJ-YEPSODPASA-N Thr-Val-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AKHDFZHUPGVFEJ-YEPSODPASA-N 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- WLBZWXXGSOLJBA-HOCLYGCPSA-N Trp-Gly-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 WLBZWXXGSOLJBA-HOCLYGCPSA-N 0.000 description 1
- BXJQKVDPRMLGKN-PMVMPFDFSA-N Tyr-Trp-Leu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(O)=O)C1=CC=C(O)C=C1 BXJQKVDPRMLGKN-PMVMPFDFSA-N 0.000 description 1
- GVJUTBOZZBTBIG-AVGNSLFASA-N Val-Lys-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N GVJUTBOZZBTBIG-AVGNSLFASA-N 0.000 description 1
- AJNUKMZFHXUBMK-GUBZILKMSA-N Val-Ser-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N AJNUKMZFHXUBMK-GUBZILKMSA-N 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 206010047741 Vulval cancer Diseases 0.000 description 1
- ABUBSBSOTTXVPV-UHFFFAOYSA-H [U+6].CC([O-])=O.CC([O-])=O.CC([O-])=O.CC([O-])=O.CC([O-])=O.CC([O-])=O Chemical compound [U+6].CC([O-])=O.CC([O-])=O.CC([O-])=O.CC([O-])=O.CC([O-])=O.CC([O-])=O ABUBSBSOTTXVPV-UHFFFAOYSA-H 0.000 description 1
- 238000002835 absorbance Methods 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 1
- BFNBIHQBYMNNAN-UHFFFAOYSA-N ammonium sulfate Chemical class N.N.OS(O)(=O)=O BFNBIHQBYMNNAN-UHFFFAOYSA-N 0.000 description 1
- 238000005571 anion exchange chromatography Methods 0.000 description 1
- 125000000129 anionic group Chemical group 0.000 description 1
- 239000003945 anionic surfactant Substances 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 201000011165 anus cancer Diseases 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 201000011510 cancer Diseases 0.000 description 1
- 238000005277 cation exchange chromatography Methods 0.000 description 1
- 125000002091 cationic group Chemical group 0.000 description 1
- 239000003093 cationic surfactant Substances 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 238000000576 coating method Methods 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 239000002158 endotoxin Substances 0.000 description 1
- 210000000981 epithelium Anatomy 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 108010085059 glutamyl-arginyl-proline Proteins 0.000 description 1
- 108010077435 glycyl-phenylalanyl-glycine Proteins 0.000 description 1
- 108010077515 glycylproline Proteins 0.000 description 1
- 201000010536 head and neck cancer Diseases 0.000 description 1
- 208000014829 head and neck neoplasm Diseases 0.000 description 1
- 238000000703 high-speed centrifugation Methods 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 238000000265 homogenisation Methods 0.000 description 1
- 206010020718 hyperplasia Diseases 0.000 description 1
- 230000005965 immune activity Effects 0.000 description 1
- 210000003000 inclusion body Anatomy 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 230000001788 irregular Effects 0.000 description 1
- 210000003292 kidney cell Anatomy 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 210000001806 memory b lymphocyte Anatomy 0.000 description 1
- 238000002715 modification method Methods 0.000 description 1
- 238000006386 neutralization reaction Methods 0.000 description 1
- 239000002736 nonionic surfactant Substances 0.000 description 1
- 239000003002 pH adjusting agent Substances 0.000 description 1
- 239000008363 phosphate buffer Substances 0.000 description 1
- 239000000244 polyoxyethylene sorbitan monooleate Substances 0.000 description 1
- 229940068968 polysorbate 80 Drugs 0.000 description 1
- 230000002516 postimmunization Effects 0.000 description 1
- 239000000047 product Substances 0.000 description 1
- 239000012460 protein solution Substances 0.000 description 1
- 239000012264 purified product Substances 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000012827 research and development Methods 0.000 description 1
- 125000003607 serino group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C(O[H])([H])[H] 0.000 description 1
- 239000000243 solution Substances 0.000 description 1
- 230000006641 stabilisation Effects 0.000 description 1
- 238000011105 stabilization Methods 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 229940031351 tetravalent vaccine Drugs 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 238000004627 transmission electron microscopy Methods 0.000 description 1
- 206010046885 vaginal cancer Diseases 0.000 description 1
- 208000013139 vaginal neoplasm Diseases 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
- 201000005102 vulva cancer Diseases 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/005—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
- A61K39/12—Viral antigens
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P31/00—Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
- A61P31/12—Antivirals
- A61P31/20—Antivirals for DNA viruses
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P35/00—Antineoplastic agents
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/005—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
- C07K14/01—DNA viruses
- C07K14/025—Papovaviridae, e.g. papillomavirus, polyomavirus, SV40, BK virus, JC virus
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/86—Viral vectors
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/86—Viral vectors
- C12N15/866—Baculoviral vectors
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N7/00—Viruses; Bacteriophages; Compositions thereof; Preparation or purification thereof
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N7/00—Viruses; Bacteriophages; Compositions thereof; Preparation or purification thereof
- C12N7/04—Inactivation or attenuation; Producing viral sub-units
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
- A61K2039/51—Medicinal preparations containing antigens or antibodies comprising whole cells, viruses or DNA/RNA
- A61K2039/525—Virus
- A61K2039/5258—Virus-like particles
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K39/00—Medicinal preparations containing antigens or antibodies
- A61K2039/57—Medicinal preparations containing antigens or antibodies characterised by the type of response, e.g. Th1, Th2
- A61K2039/575—Medicinal preparations containing antigens or antibodies characterised by the type of response, e.g. Th1, Th2 humoral response
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2710/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
- C12N2710/00011—Details
- C12N2710/14011—Baculoviridae
- C12N2710/14041—Use of virus, viral particle or viral elements as a vector
- C12N2710/14043—Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vectore
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2710/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
- C12N2710/00011—Details
- C12N2710/20011—Papillomaviridae
- C12N2710/20022—New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2710/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
- C12N2710/00011—Details
- C12N2710/20011—Papillomaviridae
- C12N2710/20023—Virus like particles [VLP]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2710/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
- C12N2710/00011—Details
- C12N2710/20011—Papillomaviridae
- C12N2710/20034—Use of virus or viral component as vaccine, e.g. live-attenuated or inactivated virus, VLP, viral protein
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2710/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
- C12N2710/00011—Details
- C12N2710/20011—Papillomaviridae
- C12N2710/20071—Demonstrated in vivo effect
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/22—Vectors comprising a coding region that has been codon optimised for expression in a respective host
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Genetics & Genomics (AREA)
- General Health & Medical Sciences (AREA)
- Virology (AREA)
- Engineering & Computer Science (AREA)
- Medicinal Chemistry (AREA)
- Biotechnology (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Microbiology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Animal Behavior & Ethology (AREA)
- Pharmacology & Pharmacy (AREA)
- Veterinary Medicine (AREA)
- Public Health (AREA)
- Biophysics (AREA)
- Immunology (AREA)
- Gastroenterology & Hepatology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Epidemiology (AREA)
- Mycology (AREA)
- Plant Pathology (AREA)
- Physics & Mathematics (AREA)
- Communicable Diseases (AREA)
- Oncology (AREA)
- Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
- Peptides Or Proteins (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
The application relates to an engineered human papillomavirus type 52 (HPV52) L1 protein and applications thereof. In particular, the application relates to HPV52 L1 proteins, the nucleotides encoding them, vectors comprising said nucleotides, cells comprising said vectors, pentamers or virus-like particles consisting of said HPV52 L1 proteins, vaccines comprising the pentamers or virus-like particles and vaccine adjuvants, and their use in the prevention of HPV infections and diseases associated with HPV infections.
Description
Technical Field
The application relates to the field of biotechnology, in particular to novel human papillomavirus proteins, pentamers or virus-like particles formed from these proteins, and the use of these proteins, pentamers or virus-like particles in preparing vaccines for preventing papillomavirus infection and infection-induced diseases.
Background
Human papillomavirus (HPV) is a small, non-enveloped DNA virus that infects epithelial tissue; more than 200 types have been identified to date. By site of infection, HPV is classified into mucosal and cutaneous types, with mucosal HPV mainly infecting the skin and mucosa of the urogenital, perianal and oropharyngeal regions. By oncogenicity, it is classified into oncogenic types with transforming activity and low-risk types that induce benign hyperplasia. There are more than 20 oncogenic types, including 12 common high-risk types (HPV16, -18, -31, -33, -35, -39, -45, -51, -52, -56, -58, -59) and more than 10 relatively rare possible or suspected high-risk types (HPV26, -30, -34, -53, -66, -67, -68, -69, -70, -73, -82). Persistent infection with oncogenic types induces about 100% of cervical cancers, 88% of anal cancers, 70% of vaginal cancers, 50% of penile cancers, 43% of vulvar cancers and 72% of head and neck cancers. Cervical cancer is the third most common malignant tumor in women worldwide (the second most common in women aged 15-44), with about 310,000 deaths per year, about 80% of them in developing countries.
HPV52 is a relatively common, predominant epidemic type worldwide, with a detection rate of 3.5% in cervical cancer tissue, ranking sixth. Notably, in China the detection rate of HPV52 in normal cervical tissue and in low-grade cervical lesions reaches 2.8% and 16%, respectively, ranking first in both; in cervical cancer tissue from southern China, the detection rate of HPV52 is second only to HPV16 and HPV18, ranking third.
Virus-like particles (VLPs) self-assembled from the HPV major capsid protein L1 mainly induce specific neutralizing antibodies and protective activity. The four HPV prophylactic vaccines currently on the market are all L1 VLP vaccines: the HPV16/-18 L1 VLP bivalent vaccine (Cervarix), produced using an insect expression system; the HPV16/-18/-6/-11 L1 VLP tetravalent vaccine (Gardasil) and the HPV16/-18/-31/-33/-45/-52/-58/-6/-11 L1 VLP nonavalent vaccine (Gardasil-9), produced using a yeast expression system; and the HPV16/-18 bivalent vaccine (Cecolin), produced using a prokaryotic expression system. Of these, only Gardasil-9 contains HPV52 L1 VLPs.
The VLP expression systems in common use today are prokaryotic, yeast and insect expression systems. A comparison of clinical data for the marketed vaccines Cervarix and Gardasil found that the HPV16 L1 VLP content of Cervarix (20 μg) is only half that of Gardasil (40 μg) and that the HPV18 L1 VLP content is the same in both (20 μg), yet Cervarix induced higher type-specific neutralizing antibody titers against HPV16 and HPV18, higher cross-neutralizing activity, more memory B cells and a stronger CD4+ T cell response than Gardasil, indicating that Cervarix has higher immune activity than Gardasil. Insect cell expression systems also have several advantages. Compared with prokaryotic expression systems, insect cells are genetically closer to the natural host cells of the virus (both are multicellular eukaryotes), contain no endotoxin, and express proteins mainly in soluble form, which avoids the problem of inclusion bodies. Compared with yeast expression systems, insect cells are easy to lyse and the purification process is relatively simple, whereas disrupting the yeast cell wall requires high-pressure homogenization, more host proteins are present, and purification is correspondingly more difficult. The insect expression system is therefore advantageous for vaccine development. However, fermentation in insect expression systems is relatively costly, so it is particularly important to increase the expression level and yield of L1 VLPs and thereby reduce vaccine production costs.
Studies have found that optimizing an antigen gene according to the codon preference of the host cell can increase its expression level. For example, optimizing the HPV11 L1 gene with codons preferred by mammalian cells increased its expression in human embryonic kidney cells (293T) by at least 100-fold. When the expression level and VLP yield of HPV16 L1 variants were analyzed and compared in insect and yeast expression systems, mutating the high-frequency variant site to the dominant amino acid increased both L1 expression and VLP yield, but when the high-frequency variant site was mutated in combination with other sites, the effect on L1 expression was uncertain. In an insect expression system, BPV1 L1 was modified by C-terminal truncation, and the assembly efficiency of the truncated BPV1 L1 improved 3-fold; however, no effect of C-terminal truncation on protein expression level has yet been reported. In a prokaryotic expression system, the L1 proteins of HPV16, -18, -31, -33, -45, -52, -58, -6 and -11 were modified by N-terminal truncation, and the number of truncated N-terminal amino acids that up-regulates L1 expression was found to differ between types, with no regular pattern.
According to the present application, the expression level and yield of HPV52 L1 VLPs can be markedly improved by optimizing and modifying the N-terminus, the C-terminus and the high-frequency mutation site of L1, and the resulting HPV52 L1 VLPs can induce high-titer type-specific neutralizing antibodies.
Disclosure of Invention
Some embodiments of the present application provide a novel, optimally engineered HPV52 L1 protein, pentamers or virus-like particles composed thereof, vaccines comprising the pentamers or virus-like particles, and the use of the vaccines in preventing HPV infection and infection-related diseases.
The present inventors have unexpectedly found that appropriate amino acid substitution at the high-frequency mutation site of the HPV52 L1 protein, and partial deletion or amino acid substitution at its N- and/or C-terminus, can increase the expression level of the HPV52 L1 protein in an insect cell expression system, and that the optimally engineered protein can assemble into VLPs and induce a protective immune response against HPV52.
Thus, according to some embodiments of the present application, there is provided an optimally engineered HPV52 L1 protein comprising a modification selected from the group consisting of:
mutation of the amino acid at position 447 from aspartic acid to glutamic acid;
deletion of 1 to 20 consecutive or non-consecutive amino acids at the N-terminus;
deletion of 1 to 25 consecutive or non-consecutive amino acids at the C-terminus;
substitution of one or more amino acids at positions 1 to 20 from the N-terminus;
substitution of one or more amino acids at positions 1 to 25 from the C-terminus.
In particular, according to some embodiments of the present application, there is provided an optimally engineered HPV52 L1 protein having any one or a combination of the following characteristics:
mutation of amino acid 447 from aspartic acid (D) to glutamic acid (E);
deletion of 1-20 contiguous or non-contiguous amino acids at the N-terminus (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20);
deletion of 13 amino acids at the N-terminus and substitution with serine (S), serine-glutamic acid (SE), serine-glutamic acid-arginine (SER), or proline-serine-glutamic acid-alanine-threonine (PSEAT);
deletion of 1-25 contiguous or non-contiguous amino acids at the C-terminus (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 or 25);
substitution of one or more basic amino acids among amino acids 1-23 from the C-terminus with polar uncharged amino acids, nonpolar amino acids and/or acidic amino acids.
In a specific embodiment, the basic amino acid is arginine (R) and/or lysine (K).
In particular embodiments, the polar uncharged amino acid is glycine (G), serine (S) and/or threonine (T).
In a specific embodiment, the nonpolar amino acid is alanine (A) and/or valine (V).
In a specific embodiment, the acidic amino acid is aspartic acid (D) and/or glutamic acid (E).
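The modification types listed above (point mutation at position 447, N-terminal and C-terminal deletion, and substitution of basic residues near the C-terminus) can be illustrated with a minimal Python sketch. The sequence `TOY_L1` is a synthetic placeholder and not SEQ ID No.1, and all function names are hypothetical helpers introduced only for illustration; the sketch also assumes contiguous deletions, although the text permits non-contiguous ones.

```python
# Synthetic placeholder sequence (NOT SEQ ID No.1): 493 residues with
# aspartic acid (D) at position 447 and a few basic residues near the C-terminus.
TOY_L1 = "M" + "A" * 445 + "D" + "S" * 40 + "KRAKRL"

BASIC = set("RK")  # arginine (R) and lysine (K)


def point_mutation(seq, position, old, new):
    """Replace residue `old` at 1-based `position` with `new`."""
    found = seq[position - 1]
    if found != old:
        raise ValueError(f"expected {old} at position {position}, found {found}")
    return seq[:position - 1] + new + seq[position:]


def delete_n_terminal(seq, count, replacement=""):
    """Delete the first `count` residues; optionally put `replacement` in their place."""
    return replacement + seq[count:]


def delete_c_terminal(seq, count):
    """Delete the last `count` residues."""
    return seq[:len(seq) - count]


def substitute_c_terminal_basic(seq, window, new):
    """Replace every basic residue in the last `window` residues with `new`.

    Simplification: the text allows substituting one or more basic residues;
    this helper substitutes all of them.
    """
    if window <= 0:
        return seq
    head, tail = seq[:-window], seq[-window:]
    return head + "".join(new if aa in BASIC else aa for aa in tail)


# D447E point mutation on the placeholder sequence.
d447e = point_mutation(TOY_L1, 447, "D", "E")
# Delete 13 N-terminal residues, put serine in their place, then delete 19 C-terminal residues.
combined = delete_c_terminal(delete_n_terminal(TOY_L1, 13, "S"), 19)
# Replace basic residues in the last 23 positions with glycine (polar uncharged).
neutral_tail = substitute_c_terminal_basic(TOY_L1, 23, "G")
print(d447e[446], len(combined), neutral_tail[-6:])
```

Applying the helpers to a real sequence would require the actual amino acid sequence from the sequence listing, which is not reproduced here.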
In a specific embodiment, the optimally engineered HPV52 L1 protein of the application is engineered based on the sequence shown in SEQ ID No.1 (corresponding to the amino acid sequence AEI61557.1 in the NCBI database).
In a specific embodiment, the engineered HPV52 L1 protein is selected from the group consisting of 52L1D447EΔC19, 52L1ΔN2, 52L1ΔN4, 52L1ΔN5, 52L1ΔN8, 52L1ΔN10, 52L1ΔN13, 52L1ΔN15, 52L1ΔN18, 52L1ΔN20, 52L1CS1, 52L1CS2, 52L1CS3, 52L1CS4, 52L1CS5, 52L1CS6, 52L1CS7, 52L1CS8, 52L1CS9, 52L1ΔN13CS1, 52L1ΔN13CS2, 52L1ΔN13CS3, 52L1NS1ΔC19, 52L1NS1ΔC25, 52L1NS2ΔC19, 52L1NS3ΔC19, 52L1NS4ΔC19 and 52L1ΔN14ΔC25, the amino acid sequences of which are set forth in SEQ ID No.2 to SEQ ID No.29.
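The designations in the preceding list follow a compositional pattern (a point mutation such as D447E, Δ followed by N or C and a residue count, and labels such as CS1 or NS1). As a rough sketch, they can be split into their components as below. The function `parse_designation` and the dictionary keys are hypothetical; the CS and NS labels are treated as opaque identifiers because their exact substitutions are defined by the sequence listing, not by this section. The parser ignores whitespace and letter case.

```python
import re

# One token per modification: a point mutation, a terminal deletion, or a
# substitution label. Both the Greek capital delta and the increment sign are accepted.
_TOKEN = re.compile(r"(D447E|[Δ∆][NC]\d+|[NC]S\d+)", re.IGNORECASE)


def parse_designation(name):
    """Split a designation such as '52L1ΔN13CS1' into its modifications."""
    compact = re.sub(r"\s+", "", name)
    if not compact.upper().startswith("52L1"):
        raise ValueError(f"not an HPV52 L1 designation: {name!r}")
    rest = compact[4:]
    tokens = _TOKEN.findall(rest)
    if "".join(tokens) != rest:
        raise ValueError(f"unrecognized component in {name!r}")

    result = {
        "point_mutations": [],
        "n_deletion": None,      # number of residues deleted at the N-terminus
        "c_deletion": None,      # number of residues deleted at the C-terminus
        "n_substitution": None,  # opaque label, e.g. 'NS1'
        "c_substitution": None,  # opaque label, e.g. 'CS1'
    }
    for token in tokens:
        t = token.upper()
        if t == "D447E":
            result["point_mutations"].append(t)
        elif t[0] in "Δ∆":
            key = "n_deletion" if t[1] == "N" else "c_deletion"
            result[key] = int(t[2:])
        else:
            key = "n_substitution" if t[0] == "N" else "c_substitution"
            result[key] = t
    return result


print(parse_designation("52L1D447E ΔC19"))
print(parse_designation("52L1ΔN13CS1"))
```

Treating each designation as a sequence of independent tokens keeps the parser small, at the cost of not checking whether a given combination actually appears in the list.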
The wild-type HPV52 L1 protein may also be derived from, but is not limited to, the L1 proteins of HPV52 variants such as ABU55797.1, AEI61589.1, AIF71344.1, APQ44868.1, AEI61581.1, AIF71350.1 and CAD1814034.1 in the NCBI database; the C-terminally altered L1 protein of the corresponding variant has the same modifications as the altered HPV52 L1 protein described above, as determined by sequence comparison.
According to some embodiments of the application, there is provided a polynucleotide encoding an optimally engineered HPV52 L1 protein of the application. Preferably, the polynucleotide is codon optimized for a commonly used expression system, such as an E. coli expression system, a yeast expression system or an insect cell expression system. Particularly preferably, the polynucleotide is codon optimized for an insect cell.
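Codon optimization in its simplest form can be sketched as back-translating a protein sequence with one preferred codon per amino acid. The table below is hypothetical: each codon is a valid codon for its amino acid under the standard genetic code, but the choice of which codon is "preferred" is made only for illustration and is not taken from insect cell codon-usage data or from the application. Practical codon optimization typically also weighs factors such as GC content and repeated sequences, which this sketch ignores.

```python
# Hypothetical preferred-codon table (one valid codon per amino acid).
# A real table would be derived from codon-usage data for the chosen host.
PREFERRED_CODON = {
    "M": "ATG", "S": "TCC", "E": "GAG", "R": "CGC",
    "P": "CCC", "A": "GCC", "T": "ACC", "D": "GAC",
    "K": "AAG", "G": "GGC", "L": "CTG", "V": "GTG",
}


def back_translate(protein, table=PREFERRED_CODON, stop_codon="TAA"):
    """Encode `protein` using one preferred codon per residue, then a stop codon."""
    try:
        codons = [table[aa] for aa in protein]
    except KeyError as exc:
        raise ValueError(f"no preferred codon defined for residue {exc.args[0]!r}") from None
    return "".join(codons) + stop_codon


# Short peptides taken from the N-terminal replacement motifs named in the text.
print(back_translate("MSER"))
print(back_translate("MPSEAT"))
```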
According to some embodiments of the application, there is provided a vector comprising the polynucleotide described above; preferably, the vector is selected from the group consisting of a plasmid, a recombinant bacmid and a recombinant baculovirus.
According to some embodiments of the application, there is provided a cell comprising the vector described above. Preferably, the cell is an E. coli cell, a yeast cell or an insect cell; particularly preferably, the cell is an insect cell.
According to some embodiments of the application there is provided an HPV52L1 multimer (e.g., pentamer) or virus-like particle comprising or formed from the engineered HPV52L1 protein described above.
According to some embodiments of the application there is provided a vaccine for preventing HPV infection or lesions associated with HPV infection, the vaccine comprising a multimer or virus-like particle as described above, wherein the multimer or virus-like particle is present in an amount effective to elicit a protective immune response. Preferably, the vaccine may further comprise at least one pentamer or virus-like particle of HPV selected from other mucophilic and/or dermatophilic groups, the pentamer or virus-like particle being present in an amount effective to induce a protective immune response, respectively. The above vaccine also typically comprises a vaccine excipient or carrier.
In particular embodiments, the vaccine comprises an HPV52L1 multimer (e.g., pentamer) or virus-like particle as described above, and at least one L1 virus-like particle selected from the group consisting of HPV2, -5, -6, -7, -8, -11, -16, -18, -26, -27, -28, -29, -30, -31, -32, -33, -34, -35, -38, -39, -40, -43, -44, -45, -51, -53, -56, -57, -58, -59, -61, -66, -67, -68, -69, -70, -73, -74, -77, -81, -82, -83, -85 and -91, each in an amount effective to induce a protective immune response.
In particular embodiments, the vaccine comprises an HPV52L1 multimer (e.g., pentamer) or virus-like particle as described above, and L1 virus-like particles of HPV6, -11, -16, -18, -26, -31, -33, -35, -39, -45, -51, -56, -58, -59, -68 and -73, each in an amount effective to induce a protective immune response.
In particular embodiments, the vaccine comprises an HPV52L1 multimer (e.g., pentamer) or virus-like particle as described above, and L1 virus-like particles of HPV6, -11, -16, -18, -31, -33, -35, -39, -45 and -58, each in an amount effective to elicit a protective immune response.
In particular embodiments, the vaccine comprises an HPV52L1 multimer (e.g., pentamer) or virus-like particle as described above, and L1 virus-like particles of HPV6, -11, -16, -18 and -58, each in an amount effective to elicit a protective immune response.
In particular embodiments, the vaccine comprises an HPV52L1 multimer (e.g., pentamer) or virus-like particle as described above, and L1 virus-like particles of HPV16, -18 and -58, each in an amount effective to elicit a protective immune response.
In particular embodiments, the vaccine comprises an HPV52L1 multimer (e.g., pentamer) or virus-like particle as described above, and L1 virus-like particles of HPV16 and -18, each in an amount effective to elicit a protective immune response.
The present application also relates to a novel vaccine with a further enhanced immune response, comprising the HPV52L1 multimer (e.g., pentamer) or virus-like particle described above and an adjuvant. Preferably, the adjuvant used is a human vaccine adjuvant.
According to some embodiments of the application there is provided the use of the above-described engineered HPV52L1 protein, multimer (e.g., pentamer), virus-like particle, vaccine in the prevention of HPV infection or diseases associated with HPV infection.
Description and interpretation of related terms
According to the present application, the term "insect cell expression system" includes insect cells, recombinant baculoviruses, recombinant Bacmids and expression vectors. The insect cells are commercially available cells, for example but not limited to: Sf9, Sf21, High Five.
According to the present application, examples of the term "wild-type HPV52L1 protein" include, but are not limited to, the L1 protein corresponding to the sequence number AEI61557.1 in the NCBI database.
According to the present application, the term "excipient or carrier" refers to one or more compounds selected from the group including, but not limited to: pH adjusters, surfactants and ionic strength enhancers. For example, pH adjusters include but are not limited to phosphate buffers; surfactants include cationic, anionic or nonionic surfactants, such as but not limited to polysorbate 80 (Tween-80); and ionic strength enhancers include but are not limited to sodium chloride.
According to the present application, the term "adjuvant" refers to adjuvants that are clinically applicable to the human body, including various adjuvants that have been approved currently and may be approved in the future.
According to the application, the vaccine of the application may take a patient acceptable form, including but not limited to oral or injection, preferably injection.
According to the application, the vaccine of the application is preferably used in unit dosage form, wherein the dose of the optimized modified HPV52L1 protein virus-like particles in the unit dosage form is 5 μg to 80 μg, preferably 20 μg to 40 μg.
Drawings
FIGS. 1A and 1B show the identification of the expression of wild-type HPV52L1 and its 28 mutants in insect cells in Example 4 of the present application. The results show that wild-type HPV52L1 and its 28 mutants were all expressed in insect cells. Lanes 1 to 15 of FIG. 1A represent wild-type HPV52L1, 52L1D447EΔC19, 52L1ΔN2, 52L1ΔN4, 52L1ΔN5, 52L1ΔN8, 52L1ΔN10, 52L1ΔN13, 52L1ΔN15, 52L1ΔN18, 52L1ΔN20, 52L1CS1, 52L1CS2, 52L1CS3 and 52L1CS4, respectively; lanes 1 to 14 of FIG. 1B represent 52L1CS5, 52L1CS6, 52L1CS7, 52L1CS8, 52L1CS9, 52L1ΔN13CS1, 52L1ΔN13CS2, 52L1ΔN13CS3, 52L1NS1ΔC19, 52L1NS1ΔC25, 52L1NS2ΔC19, 52L1NS3ΔC19, 52L1NS4ΔC19 and 52L1ΔN14ΔC25, respectively.
FIGS. 2A to 2K show the results of dynamic light scattering analysis of the wild-type HPV52L1, 52L1D447EΔC19, 52L1ΔN13, 52L1CS5, 52L1CS6, 52L1CS7, 52L1CS9, 52L1ΔN13CS1, 52L1ΔN13CS2, 52L1NS3ΔC19 and 52L1NS4ΔC19 proteins obtained after purification in Example 5 of the present application. The results show that the virus-like particles formed by the recombinant proteins of wild-type HPV52L1, 52L1D447EΔC19, 52L1ΔN13, 52L1CS5, 52L1CS6, 52L1CS7, 52L1CS9, 52L1ΔN13CS1, 52L1ΔN13CS2 and 52L1NS4ΔC19 had hydrodynamic diameters of 123.1 nm, 104.9 nm, 71.56 nm, 108.9 nm, 130.4 nm, 116 nm, 124 nm, 111.9 nm, 127.2 nm and 129.9 nm, respectively, with a particle assembly percentage of 100%; 52L1NS3ΔC19 did not assemble. FIG. 2A shows wild-type HPV52L1; FIG. 2B shows 52L1D447EΔC19; FIG. 2C shows 52L1ΔN13; FIG. 2D shows 52L1CS5; FIG. 2E shows 52L1CS6; FIG. 2F shows 52L1CS7; FIG. 2G shows 52L1CS9; FIG. 2H shows 52L1ΔN13CS1; FIG. 2I shows 52L1ΔN13CS2; FIG. 2J shows 52L1NS3ΔC19; FIG. 2K shows 52L1NS4ΔC19.
FIGS. 3A to 3I show transmission electron microscopy observations of the VLPs of wild-type HPV52L1, 52L1D447EΔC19, 52L1CS5, 52L1CS6, 52L1CS7, 52L1CS9, 52L1ΔN13CS1, 52L1ΔN13CS2 and 52L1NS4ΔC19 obtained after purification in Example 6 of the present application. A large number of virus-like particles with diameters of about 40-55 nm can be seen in the field of view; the particle sizes are consistent with the theoretical values and the uniformity is good. Bar = 100 nm. FIG. 3A shows wild-type HPV52L1; FIG. 3B shows 52L1D447EΔC19; FIG. 3C shows 52L1CS5; FIG. 3D shows 52L1CS6; FIG. 3E shows 52L1CS7; FIG. 3F shows 52L1CS9; FIG. 3G shows 52L1ΔN13CS1; FIG. 3H shows 52L1ΔN13CS2; FIG. 3I shows 52L1NS4ΔC19.
FIG. 4 shows the analysis of HPV52 neutralizing antibody titers in immune sera of mice vaccinated with wild-type HPV52L1, 52L1D447EΔC19, 52L1ΔN13, 52L1CS5, 52L1CS6, 52L1CS7, 52L1CS9, 52L1ΔN13CS1, 52L1ΔN13CS2, 52L1NS3ΔC19 and 52L1NS4ΔC19 VLPs in Example 7 of the present application. ***: p < 0.001.
Detailed Description
The application is further illustrated by the following non-limiting examples. As is well known to those skilled in the art, many modifications can be made to the application without departing from its spirit, and such modifications also fall within the scope of the application. The following examples merely illustrate the present application and should not be construed as limiting its scope, since embodiments necessarily vary. The terminology used in the description is for the purpose of describing particular embodiments only and is not intended to be limiting; the scope of the present application is defined in the appended claims.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. Preferred methods and materials of the application are described below, but any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the application. The following experimental methods are all methods described in conventional methods or product specifications unless otherwise specified, and the experimental materials used are readily available from commercial companies unless otherwise specified. All publications mentioned in this specification are herein incorporated by reference to disclose and describe the methods and/or materials in the publications.
Example 1: synthesis of mutant L1 protein gene and construction of expression vector
The 28 mutant L1 proteins are as follows:
1) 52L1D447EΔC19: the template is the full-length HPV52L1 gene (sequence shown in SEQ ID NO. 1), whose corresponding amino acid sequence is the sequence numbered AEI61557.1 in the NCBI database (sequence shown in SEQ ID NO. 30). The polynucleotide sequence encoding HPV52L1 D447EΔC19 was designed with insect cell codon optimization: nucleotides 1453-1509 of the insect cell codon-optimized HPV52L1 gene backbone were deleted and nucleotides 1339-1341 were mutated from GAC to GAG (amino acid sequence shown in SEQ ID NO. 2, nucleotide sequence shown in SEQ ID NO. 31). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
2) 52L1ΔN2: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 4-6 of HPV52L1 D447EΔC19 were deleted (amino acid sequence shown in SEQ ID NO. 3, nucleotide sequence shown in SEQ ID NO. 32). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
3) 52L1ΔN4: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 4-12 of HPV52L1 D447EΔC19 were deleted (amino acid sequence shown in SEQ ID NO. 4, nucleotide sequence shown in SEQ ID NO. 33). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
4) 52L1ΔN5: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 4-15 of HPV52L1 D447EΔC19 were deleted (amino acid sequence shown in SEQ ID NO. 5, nucleotide sequence shown in SEQ ID NO. 34). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
5) 52L1ΔN8: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 4-24 of HPV52L1 D447EΔC19 were deleted (amino acid sequence shown in SEQ ID NO. 6, nucleotide sequence shown in SEQ ID NO. 35). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
6) 52L1ΔN10: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 4-30 of HPV52L1 D447EΔC19 were deleted (amino acid sequence shown in SEQ ID NO. 7, nucleotide sequence shown in SEQ ID NO. 36). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
7) 52L1ΔN13: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 4-39 of HPV52L1 D447EΔC19 were deleted (amino acid sequence shown in SEQ ID NO. 8, nucleotide sequence shown in SEQ ID NO. 37). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
8) 52L1ΔN15: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 4-45 of HPV52L1 D447EΔC19 were deleted (amino acid sequence shown in SEQ ID NO. 9, nucleotide sequence shown in SEQ ID NO. 38). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
9) 52L1ΔN18: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 4-54 of HPV52L1 D447EΔC19 were deleted (amino acid sequence shown in SEQ ID NO. 10, nucleotide sequence shown in SEQ ID NO. 39). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
10) 52L1ΔN20: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 4-60 of HPV52L1 D447EΔC19 were deleted (amino acid sequence shown in SEQ ID NO. 11, nucleotide sequence shown in SEQ ID NO. 40). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
11) 52L1CS1: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 1447-1449 of HPV52L1 D447EΔC19 were mutated from AAA to GGA, and the nucleotide sequence AAAGGTCCTGCATCGAGCGCTCCTAGAACGTCGACGGACGGCTCGGGAGTGGGACGC was inserted after nucleotide 1452 (amino acid sequence shown in SEQ ID NO. 12, nucleotide sequence shown in SEQ ID NO. 41). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
12) 52L1CS2: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 1447-1449 of HPV52L1 D447EΔC19 were mutated from AAA to GGA, and the nucleotide sequence AAAGGTCCTGCATCGAGCGCTCCTAGAACGTCGACGGACGGCTCGGGAGTGGACGGC was inserted after nucleotide 1452 (amino acid sequence shown in SEQ ID NO. 13, nucleotide sequence shown in SEQ ID NO. 42). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
13) 52L1CS3: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 1447-1449 of HPV52L1 D447EΔC19 were mutated from AAA to GGA, and the nucleotide sequence GGATCGCCTGCATCGAGCGCTCCTAGAACGTCGACGGACGGCTCGGGAGTGAAACGC was inserted after nucleotide 1452 (amino acid sequence shown in SEQ ID NO. 14, nucleotide sequence shown in SEQ ID NO. 43). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
14) 52L1CS4: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 1447-1449 of HPV52L1 D447EΔC19 were mutated from AAA to GGA, and the nucleotide sequence GGATCGCCTGCATCGAGCGCTCCTAGAACGTCGACGGACGGCTCGGGAGTGGACCGC was inserted after nucleotide 1452 (amino acid sequence shown in SEQ ID NO. 15, nucleotide sequence shown in SEQ ID NO. 44). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
15) 52L1CS5: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); the nucleotide sequence GCTGGTCCTGCCTCTTCCGCACCCGCGACTTCAACCGCTGCCGGCGGAGTTGGGTCG was inserted after nucleotide 1452 of HPV52L1 D447EΔC19 (amino acid sequence shown in SEQ ID NO. 16, nucleotide sequence shown in SEQ ID NO. 45). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
16) 52L1CS6: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); the nucleotide sequence GAAGCTCCTGCCTCTTCCGCACCCGGTACTTCAACCGGCTCGAAAGCGGTTGCTGGA was inserted after nucleotide 1452 of HPV52L1 D447EΔC19 (amino acid sequence shown in SEQ ID NO. 17, nucleotide sequence shown in SEQ ID NO. 46). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
17) 52L1CS7: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); the nucleotide sequence GCTGGTCCTGCTTCCTCAGCTCCAGCTACCTCAACCGACGGTTCTGGTGTGAAGCGC was inserted after nucleotide 1452 of HPV52L1 D447EΔC19 (amino acid sequence shown in SEQ ID NO. 18, nucleotide sequence shown in SEQ ID NO. 47). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
18) 52L1CS8: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); the nucleotide sequence GCTGGTCCTGCTTCCTCAGCTCCACGTACCTCAACCGACGGTTCTGGTGTGAAGCGC was inserted after nucleotide 1452 of HPV52L1 D447EΔC19 (amino acid sequence shown in SEQ ID NO. 19, nucleotide sequence shown in SEQ ID NO. 48). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
19) 52L1CS9: the template is the HPV52L1 D447EΔC19 gene (sequence shown in SEQ ID NO. 30); nucleotides 1441-1443 of HPV52L1 D447EΔC19 were mutated from AGA to GGT, nucleotides 1447-1449 were mutated from AAA to GGC, and the nucleotide sequence TCGGGTCCTGCCTCGAGCGCCCCTAGAACGTCGACGGGTGGCTCGGCCGTGGGTAGC was inserted after nucleotide 1452 (amino acid sequence shown in SEQ ID NO. 20, nucleotide sequence shown in SEQ ID NO. 49). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
20) 52L1ΔN13CS1: the template is the HPV52L1ΔN13 gene (sequence shown in SEQ ID NO. 37); nucleotides 1411-1416 of HPV52L1ΔN13 were mutated from AAACTG to GGCTTG, and the nucleotide sequence TCGGGTCCTGCCTCGAGCGCCCCTAGAACGTCGACGGGTGGCTCGGCCGTGGGTAGC was inserted after nucleotide 1416 (amino acid sequence shown in SEQ ID NO. 21, nucleotide sequence shown in SEQ ID NO. 50). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
21) 52L1ΔN13CS2: the template is the HPV52L1ΔN13 gene (sequence shown in SEQ ID NO. 37); nucleotides 1405-1407 of HPV52L1ΔN13 were mutated from AGA to GGT, nucleotides 1411-1416 were mutated from AAACTG to GGCTTG, and the nucleotide sequence TCGGGTCCTGCCTCGAGCGCCCCTAGAACGTCGACGGGTGGCTCGGCCGTGGGTAGC was inserted after nucleotide 1416 (amino acid sequence shown in SEQ ID NO. 22, nucleotide sequence shown in SEQ ID NO. 51). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
22) 52L1ΔN13CS3: the template is the HPV52L1ΔN13 gene (sequence shown in SEQ ID NO. 37); the nucleotide sequence GCCGGTCCTGCCTCGAGCGCCCCTGCCACGTCGACGGCTGCGGGAGGCGTGGGTAGC was inserted after nucleotide 1416 of HPV52L1ΔN13 (amino acid sequence shown in SEQ ID NO. 23, nucleotide sequence shown in SEQ ID NO. 52). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
23) 52L1NS1ΔC19: the template is the HPV52L1ΔN13 gene (sequence shown in SEQ ID NO. 37); the nucleotide sequence CCTAGCGAGGCTACC was inserted between nucleotides 3 and 4 of HPV52L1ΔN13 (amino acid sequence shown in SEQ ID NO. 24, nucleotide sequence shown in SEQ ID NO. 53). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
24) 52L1NS1ΔC25: the template is the HPV52L1ΔN13 gene (sequence shown in SEQ ID NO. 37); the nucleotide sequence CCTAGCGAGGCTACC was inserted between nucleotides 3 and 4 of HPV52L1ΔN13, and nucleotides 1414-1431 were deleted (amino acid sequence shown in SEQ ID NO. 25, nucleotide sequence shown in SEQ ID NO. 54). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
25) 52L1NS2ΔC19: the template is the HPV52L1ΔN13 gene (sequence shown in SEQ ID NO. 37); the nucleotide sequence TCCGAGCGT was inserted between nucleotides 3 and 4 of HPV52L1ΔN13 (amino acid sequence shown in SEQ ID NO. 26, nucleotide sequence shown in SEQ ID NO. 55). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
26) 52L1NS3ΔC19: the template is the HPV52L1ΔN13 gene (sequence shown in SEQ ID NO. 37); the nucleotide sequence TCCGG was inserted between nucleotides 3 and 4 of HPV52L1ΔN13 (amino acid sequence shown in SEQ ID NO. 27, nucleotide sequence shown in SEQ ID NO. 56). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
27) 52L1NS4ΔC19: the template is the HPV52L1ΔN13 gene (sequence shown in SEQ ID NO. 37); the nucleotide sequence TCC was inserted between nucleotides 3 and 4 of HPV52L1ΔN13 (amino acid sequence shown in SEQ ID NO. 28, nucleotide sequence shown in SEQ ID NO. 57). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
28) 52L1ΔN14ΔC25: the template is the HPV52L1ΔN13 gene (sequence shown in SEQ ID NO. 37); nucleotides 4-6 and 1414-1431 of HPV52L1ΔN13 were deleted (amino acid sequence shown in SEQ ID NO. 29, nucleotide sequence shown in SEQ ID NO. 58). The gene was synthesized by Shanghai Biological Engineering Technology Service Co., Ltd.
The synthetic genes were digested with EcoRI and BamHI and inserted into the commercial expression vector pFastBac1 (Invitrogen) to obtain recombinant expression vectors containing the HPV52L1 mutant genes: pFastBac1-52L1D447EΔC19, pFastBac1-52L1ΔN2, pFastBac1-52L1ΔN4, pFastBac1-52L1ΔN5, pFastBac1-52L1ΔN8, pFastBac1-52L1ΔN10, pFastBac1-52L1ΔN13, pFastBac1-52L1ΔN15, pFastBac1-52L1ΔN18, pFastBac1-52L1ΔN20, pFastBac1-52L1CS1, pFastBac1-52L1CS2, pFastBac1-52L1CS3, pFastBac1-52L1CS4, pFastBac1-52L1CS5, pFastBac1-52L1CS6, pFastBac1-52L1CS7, pFastBac1-52L1CS8, pFastBac1-52L1CS9, pFastBac1-52L1ΔN13CS1, pFastBac1-52L1ΔN13CS2, pFastBac1-52L1ΔN13CS3, pFastBac1-52L1NS1ΔC19, pFastBac1-52L1NS1ΔC25, pFastBac1-52L1NS2ΔC19, pFastBac1-52L1NS3ΔC19, pFastBac1-52L1NS4ΔC19 and pFastBac1-52L1ΔN14ΔC25. The methods of restriction digestion, ligation and cloning are all well known, for example from patent CN101293918B.
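The nucleotide coordinates used for the constructs above can be related to amino acid positions with simple codon arithmetic. The following is a minimal sketch, not part of the described method: the helper names `codon_span` and `translate` are hypothetical, the standard genetic code is assumed, and the only sequence data used is the CS5 insert quoted in item 15.

```python
# Illustrative sketch only; helper names are hypothetical and the standard
# genetic code (translation table 1) is assumed.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_span(nt_start, nt_end):
    """Map a 1-based, inclusive nucleotide range to 1-based codon numbers."""
    if (nt_start - 1) % 3 != 0 or nt_end % 3 != 0:
        raise ValueError("range does not cover whole codons")
    return (nt_start - 1) // 3 + 1, nt_end // 3


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Item 1: nucleotides 1339-1341 (GAC -> GAG) and deletion of 1453-1509.
print(codon_span(1339, 1341))                    # codon 447
print(translate("GAC"), "->", translate("GAG"))  # D -> E
first, last = codon_span(1453, 1509)
print(last - first + 1)                          # 19 C-terminal codons removed

# Item 7: deletion of nucleotides 4-39 removes codons 2 to 13.
print(codon_span(4, 39))

# Item 15: the 57-nt CS5 insert encodes a 19-residue tail.
CS5_INSERT = "GCTGGTCCTGCCTCTTCCGCACCCGCGACTTCAACCGCTGCCGGCGGAGTTGGGTCG"
print(translate(CS5_INSERT))
```

The same two helpers can be applied to any of the other deletions or inserts listed above, provided the quoted range or insert covers whole codons; `codon_span` raises an error otherwise.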
Example 2: construction of recombinant Bacmids and recombinant baculoviruses carrying the HPV52L1 mutant genes
The recombinant expression vectors pFastBac1-52L1D447EΔC19, pFastBac1-52L1ΔN2, pFastBac1-52L1ΔN4, pFastBac1-52L1ΔN5, pFastBac1-52L1ΔN8, pFastBac1-52L1ΔN10, pFastBac1-52L1ΔN13, pFastBac1-52L1ΔN15, pFastBac1-52L1ΔN18, pFastBac1-52L1ΔN20, pFastBac1-52L1CS1, pFastBac1-52L1CS2, pFastBac1-52L1CS3, pFastBac1-52L1CS4, pFastBac1-52L1CS5, pFastBac1-52L1CS6, pFastBac1-52L1CS7, pFastBac1-52L1CS8, pFastBac1-52L1CS9, pFastBac1-52L1ΔN13CS1, pFastBac1-52L1ΔN13CS2, pFastBac1-52L1ΔN13CS3, pFastBac1-52L1NS1ΔC19, pFastBac1-52L1NS1ΔC25, pFastBac1-52L1NS2ΔC19, pFastBac1-52L1NS3ΔC19, pFastBac1-52L1NS4ΔC19 and pFastBac1-52L1ΔN14ΔC25 containing the HPV52L1 mutant genes were transformed into E. coli DH10Bac, and recombinant Bacmids were obtained by screening. Sf9 insect cells were then transfected with the recombinant Bacmids, and the recombinant baculoviruses were amplified in Sf9 cells. Methods for screening recombinant Bacmids and amplifying recombinant baculoviruses are well known, for example from patent CN101148661B.
Example 3: expression of HPV52L1 mutant genes in Sf9 cells
Sf9 cells were inoculated with the recombinant baculoviruses carrying the 28 HPV52L1 mutant genes to express the HPV52L1 mutant proteins. After culturing at 27°C for about 80 h, the fermentation broth was collected and centrifuged at 3000 rpm for 15 min; the supernatant was discarded and the cells were washed with PBS for expression identification and purification. Methods of infection and expression are disclosed, for example, in patent CN101148661B.
Example 4: expression identification and expression quantity comparison of HPV52L1 mutant protein
For each of the cells expressing the different HPV52L1 mutants and wild-type HPV52L1 described in Example 3, 1 × 10^6 cells were resuspended in 200 μl of PBS and disrupted by sonication (Ningbo Xinzhi sonicator, 2# probe, 100 W, 5 s on, 7 s off, total time 3 min), then centrifuged at 13000 rpm for 30 min, and the lysate supernatant was collected. The total protein concentration of each lysate supernatant was measured by the BCA method and uniformly diluted to 20 ng/μl with PBS. For each sample, 10 μl of the diluted supernatant (i.e. 200 ng) was mixed with 2 μl of 6× loading buffer and denatured at 75°C for 8 min, followed by SDS-PAGE and Western blot to identify and compare the L1 protein content (about 55 kDa) in each mutant lysate supernatant. The expression identification of each mutant L1 protein is shown in FIG. 1, and the comparison of the expression levels of the mutant L1 proteins is shown in Table 1. SDS-PAGE and Western blot methods are disclosed, for example, in patent CN101148661B.
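The sample normalization described above is plain dilution arithmetic, sketched below for illustration. The function names and the example BCA reading of 1500 ng/μl are hypothetical; only the 20 ng/μl target, the 10 μl load and the 2 μl of 6× buffer come from the text.

```python
# Illustrative arithmetic only; function names and the BCA reading are hypothetical.

def dilution_factor(measured_ng_per_ul, target_ng_per_ul=20.0):
    """Fold dilution needed to bring a lysate to the target concentration."""
    if measured_ng_per_ul < target_ng_per_ul:
        raise ValueError("sample is already below the target concentration")
    return measured_ng_per_ul / target_ng_per_ul


def loaded_mass_ng(conc_ng_per_ul, volume_ul):
    """Total protein mass loaded per lane."""
    return conc_ng_per_ul * volume_ul


def final_buffer_strength(stock_x, buffer_ul, sample_ul):
    """Loading buffer strength (in 'x') after mixing with the sample."""
    return stock_x * buffer_ul / (buffer_ul + sample_ul)


print(dilution_factor(1500.0))            # hypothetical BCA reading in ng/ul
print(loaded_mass_ng(20.0, 10.0))         # 200.0 ng per lane, as stated
print(final_buffer_strength(6, 2, 10))    # 1.0, i.e. 1x in the 12 ul mix
```

Loading an equal total protein mass per lane is what allows the L1 band intensities to be compared between mutants.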
An ELISA plate was coated with an HPV52L1 monoclonal antibody prepared by the inventors and incubated overnight at 4°C; the plate was blocked with 5% BSA-PBST for 2 h at room temperature and washed 3 times with PBST. The lysates were serially diluted 2-fold with PBS, and the HPV52L1 VLP standard was likewise serially diluted from 2 μg/ml to 0.0625 μg/ml; each was added to the ELISA plate at 100 μl per well and incubated at 37°C for 1 h. The plate was washed 3 times with PBST, and HPV52L1 rabbit polyclonal antibody diluted 1:3000 was added at 100 μl per well and incubated at 37°C for 1 h. The plate was washed 3 times with PBST, and HRP-labeled goat anti-mouse IgG (1:3000 dilution, Zhongshan Golden Bridge Co.) was added and incubated at 37°C for 45 min. The plate was washed 5 times with PBST, 100 μl of OPD substrate (Sigma) was added to each well, color was developed at 37°C for 5 min, the reaction was stopped with 50 μl of 2 M sulfuric acid, and the absorbance was measured at 490 nm. The concentrations of the modified HPV52L1 proteins and the wild-type HPV52L1 protein in the lysates were calculated from the standard curve.
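Reading a concentration off the standard curve can be sketched as follows. This is an illustration under stated assumptions, not the inventors' calculation: the absorbance values are hypothetical placeholders, the function names are invented for the example, and log-linear interpolation between neighbouring standards is one common choice among several (a four-parameter logistic fit is another).

```python
import math

# Illustrative sketch; absorbance values below are hypothetical placeholders.

def twofold_series(top, bottom):
    """Concentrations from `top` down to `bottom`, halving at each step."""
    series = []
    conc = top
    while conc >= bottom * (1 - 1e-9):
        series.append(conc)
        conc /= 2
    return series


def interpolate_concentration(od, standards):
    """Log-linear interpolation between the two standards bracketing `od`.

    `standards` is a list of (concentration, absorbance) pairs whose
    absorbances increase strictly with concentration.
    """
    pts = sorted(standards, key=lambda p: p[1])
    for (c_lo, od_lo), (c_hi, od_hi) in zip(pts, pts[1:]):
        if od_lo <= od <= od_hi:
            frac = (od - od_lo) / (od_hi - od_lo)
            log_c = math.log(c_lo) + frac * (math.log(c_hi) - math.log(c_lo))
            return math.exp(log_c)
    raise ValueError("absorbance outside the range of the standard curve")


STANDARD_CONCS = twofold_series(2.0, 0.0625)         # ug/ml, from the text
HYPOTHETICAL_ODS = [1.80, 1.10, 0.60, 0.30, 0.15, 0.08]
STANDARDS = list(zip(STANDARD_CONCS, HYPOTHETICAL_ODS))

sample_od = 0.45   # hypothetical reading of a diluted lysate
dilution = 8       # hypothetical fold dilution of that lysate
print(interpolate_concentration(sample_od, STANDARDS) * dilution)
```

Only lysate dilutions whose absorbance falls inside the range spanned by the standards should be used; the function raises an error for readings outside that range.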
As shown in Table 1, different modifications affect the expression level of the HPV52L1 protein differently. The expression levels of some of the modified HPV52L1 proteins are increased; in particular, 52L1ΔN13, 52L1CS7, 52L1CS9, 52L1ΔN13CS1, 52L1ΔN13CS2, 52L1ΔN13CS3, 52L1NS3ΔC19 and 52L1NS4ΔC19 are expressed at more than 50 mg/L, far higher than the wild-type HPV52L1 protein.
TABLE 1 analysis of protein expression levels of HPV52L1 mutants
Example 5: purification of L1 mutant protein and dynamic light scattering particle size analysis
An appropriate amount of cell fermentation broth of each L1 mutant was taken, and the cells were resuspended in PBS. PMSF was added to a final concentration of 1 mg/mL, and the cells were sonicated (Ningbo Xinzhi sonicator, 2# probe, 200 W, 5 s on, 7 s off, total time 10 min) and centrifuged at 13000 rpm for 30 min. The supernatant was collected and diluted with PBS to 3-4 mg/mL; saturated ammonium sulfate was added to 30% saturation and the mixture was kept at 4°C for 1-2 h, then centrifuged at 13000 rpm for 30 min. The pellet was resuspended in an appropriate amount of buffer (20 mM Na3PO4, 50 mM DTT, 300 mM NaCl, pH 6.8) and left on ice overnight. The chromatographic purification steps were carried out at room temperature; the sample was filtered through a 0.45 μm filter before chromatography and then purified sequentially by SP-FF cation exchange chromatography and Q-HP anion exchange chromatography (100 mM NaCl, 20 mM Na3PO4, 10 mM DTT, pH 6.8). The purified product was assembled into VLPs in assembly buffer (500 mM NaCl, 2 mM CaCl2, 2 mM MgCl2·6H2O, 20 mM HEPES, 0.01% Tween 80, pH 6.0) at 4°C; after 3 days of assembly it was transferred to stabilization buffer (500 mM NaCl, 10 mM histidine, 0.01% Tween 80, pH 7.2) and stabilized at 4°C for 2 days. The purification results show that the purification yields of the modified 52L1 proteins are improved compared with wild-type 52L1; in particular, the purification yields of 52L1ΔN13, 52L1CS7, 52L1CS9, 52L1ΔN13CS1, 52L1ΔN13CS2, 52L1ΔN13CS3, 52L1NS3ΔC19 and 52L1NS4ΔC19 are above 15 mg/L. The above purification methods are disclosed, for example, in patents CN101293918B and CN1976718A.
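For the ammonium sulfate step, the volume of saturated solution needed to reach a given saturation can be estimated as below. This sketch assumes that the sample contains no ammonium sulfate beforehand and that volumes are additive; neither assumption is stated in the text, and the function name is hypothetical.

```python
# Illustrative sketch; assumes a sample at 0% saturation and additive volumes.

def saturated_solution_volume(sample_ml, target_saturation):
    """Volume (ml) of saturated ammonium sulfate solution to add.

    Derived from target = V_add / (V_sample + V_add).
    """
    if not 0 <= target_saturation < 1:
        raise ValueError("target saturation must be in [0, 1)")
    return sample_ml * target_saturation / (1 - target_saturation)


# 30% saturation, as used in the purification step above.
print(round(saturated_solution_volume(100.0, 0.30), 2))  # 42.86 ml per 100 ml
```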
The purified protein solutions were subjected to DLS particle size analysis (Zetasizer Nano ZS dynamic light scattering instrument, Malvern); the results are shown in FIG. 2 and Table 2. Except for 52L1ΔN13, the hydrodynamic diameters of the mutants are all above 100 nm, close to that of HPV52L1; the hydrodynamic diameter of 52L1ΔN13 is only 71.56 nm, suggesting that its degree of assembly may be low.
TABLE 2 DLS analysis of HPV52L1 mutant proteins
Protein name | Hydrodynamic diameter (nm) | PDI |
HPV52L1 | 123.1 | 0.134 |
52L1D447EΔC19 | 104.9 | 0.142 |
52L1ΔN13 | 71.56 | 0.141 |
52L1CS5 | 108.9 | 0.126 |
52L1CS6 | 130.4 | 0.111 |
52L1CS7 | 116 | 0.135 |
52L1CS9 | 124 | 0.143 |
52L1ΔN13CS1 | 111.9 | 0.09 |
52L1ΔN13CS2 | 127.2 | 0.139 |
52L1NS3ΔC19 | 149.4 | 0.234 |
52L1NS4ΔC19 | 129.9 | 0.125 |
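The comparison drawn from Table 2 can be reproduced with a few lines of code. The sketch below simply re-encodes the table values; the 100 nm threshold is the one mentioned in the text, while the dictionary and function names are hypothetical.

```python
# Values copied from Table 2: name -> (hydrodynamic diameter in nm, PDI).
DLS = {
    "HPV52L1": (123.1, 0.134),
    "52L1D447EΔC19": (104.9, 0.142),
    "52L1ΔN13": (71.56, 0.141),
    "52L1CS5": (108.9, 0.126),
    "52L1CS6": (130.4, 0.111),
    "52L1CS7": (116.0, 0.135),
    "52L1CS9": (124.0, 0.143),
    "52L1ΔN13CS1": (111.9, 0.09),
    "52L1ΔN13CS2": (127.2, 0.139),
    "52L1NS3ΔC19": (149.4, 0.234),
    "52L1NS4ΔC19": (129.9, 0.125),
}


def below_diameter(table, min_diameter_nm=100.0):
    """Names of proteins whose hydrodynamic diameter is under the threshold."""
    return [name for name, (diameter, _pdi) in table.items()
            if diameter < min_diameter_nm]


def highest_pdi(table):
    """Name of the protein with the broadest size distribution (largest PDI)."""
    return max(table, key=lambda name: table[name][1])


print(below_diameter(DLS))   # ['52L1ΔN13']
print(highest_pdi(DLS))      # 52L1NS3ΔC19
```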
Example 6: transmission electron microscope observation of HPV52L1 mutant VLPs
HPV52L1 and its mutant proteins were purified and assembled according to the chromatographic purification method described in Example 5. Copper grids were prepared with the assembled VLPs, stained with 1% uranyl acetate, dried thoroughly and observed using a JM-1400 electron microscope (Olympus). Transmission electron microscopy images of HPV52L1, HPV52L1 D447EΔC19, HPV52L1CS5, HPV52L1CS6, HPV52L1CS7, HPV52L1CS9, HPV52L1ΔN13CS1, HPV52L1ΔN13CS2 and HPV52L1NS4ΔC19 VLPs are shown in FIGS. 3A to 3I, respectively; the particle diameters of the wild type and these mutants are all between 40 and 55 nm. Methods of copper grid preparation and electron microscopy are disclosed, for example, in patent CN101148661B.
Example 7: mouse immunization and neutralizing antibody titre assay of HPV52L1 mutant VLPs
Female BALB/c mice aged 4-6 weeks were randomly divided into groups of 5 mice each and immunized subcutaneously with 0.1 μg of VLPs at weeks 0, 2 and 4. Two weeks after the third immunization, tail blood was collected and serum was separated.
The neutralizing antibody titers of the immune sera were determined using HPV52 pseudovirus; the results are shown in Table 3 and FIG. 4. The neutralizing activity of immune sera raised against 52L1D447EΔC19, 52L1CS5, 52L1CS6, 52L1CS7, 52L1CS9, 52L1ΔN13CS1, 52L1ΔN13CS2 and 52L1NS4ΔC19 VLPs produced in the insect cell expression system is comparable to that of HPV52L1, whereas the 52L1ΔN13 immune serum has no neutralizing activity. Methods for pseudovirus preparation and pseudovirus neutralization experiments are disclosed, for example, in patent CN104418942A.
TABLE 3 neutralizing antibody titres against HPV52 pseudovirus induced by HPV52L1 mutant in mice
Antigen name | Average neutralizing antibody titre |
HPV52L1 | 8960 |
52L1D447EΔC19 | 10240 |
52L1ΔN13 | <25 |
52L1CS5 | 11520 |
52L1CS6 | 8320 |
52L1CS7 | 10880 |
52L1CS9 | 9600 |
52L1ΔN13CS1 | 11520 |
52L1ΔN13CS2 | 9600 |
52L1NS4ΔC19 | 10880 |
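To make the comparison with the wild type explicit, the titers in Table 3 can be expressed as a fold change relative to HPV52L1. The sketch below re-encodes the table; representing the "<25" entry as `None` (below the detection limit) is a choice made for this illustration, and the names are hypothetical.

```python
# Values copied from Table 3; None marks the "<25" (below detection) entry.
TITERS = {
    "HPV52L1": 8960,
    "52L1D447EΔC19": 10240,
    "52L1ΔN13": None,
    "52L1CS5": 11520,
    "52L1CS6": 8320,
    "52L1CS7": 10880,
    "52L1CS9": 9600,
    "52L1ΔN13CS1": 11520,
    "52L1ΔN13CS2": 9600,
    "52L1NS4ΔC19": 10880,
}


def fold_vs_wild_type(titers, reference="HPV52L1"):
    """Titer of each antigen divided by the wild-type titer (None if undetected)."""
    ref = titers[reference]
    return {name: (None if t is None else t / ref) for name, t in titers.items()}


for name, fold in fold_vs_wild_type(TITERS).items():
    label = "below detection limit" if fold is None else f"{fold:.2f}x wild type"
    print(f"{name}: {label}")
```

With these values, every mutant other than 52L1ΔN13 lies between roughly 0.9 and 1.3 times the wild-type titer.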
In summary, the inventors found that modifying the amino acid sequence of HPV52L1 yields mutants with different expression levels, and that the modifications can also affect the degree of assembly and the immune activity, with no obvious pattern. Obtaining an HPV52L1 mutant with a high expression level, effective assembly and good immune activity by amino acid sequence modification is therefore unpredictable. The optimized and modified HPV52L1 mutants obtained by screening can be used to prepare multivalent HPV prophylactic vaccines and to construct broad-spectrum HPV prophylactic vaccines, and have good research and development prospects.
DESCRIPTION OF THE SEQUENCES
SEQ ID NO.1:HPV52L1
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKDYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKLKRPASSAPRTSTKKKKVKR
SEQ ID NO.2:52L1D447EΔC19
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.3:52L1ΔN2
MVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.4:52L1ΔN4
MRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.5:52L1ΔN5
MPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.6:52L1ΔN8
MATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.7:52L1ΔN10
MVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.8:52L1ΔN13
MPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.9:52L1ΔN15
MVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.10:52L1ΔN18
MSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.11:52L1ΔN20
MVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.12:52L1CS1
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPGLKGPASSAPRTSTDGSGVGR
SEQ ID NO.13:52L1CS2
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPGLKGPASSAPRTSTDGSGVDG
SEQ ID NO.14:52L1CS3
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPGLGSPASSAPRTSTDGSGVKR
SEQ ID NO.15:52L1CS4
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPGLGSPASSAPRTSTDGSGVDR
SEQ ID NO.16:52L1CS5
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKLAGPASSAPATSTAAGGVGS
SEQ ID NO.17:52L1CS6
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKLEAPASSAPGTSTGSKAVAG
SEQ ID NO.18:52L1CS7
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKLAGPASSAPATSTDGSGVKR
SEQ ID NO.19:52L1CS8
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKLAGPASSAPRTSTDGSGVKR
SEQ ID NO.20:52L1CS9
MSVWRPSEATVYLPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQAGPGLSGPASSAPRTSTGGSAVGS
SEQ ID NO.21:52L1ΔN13CS1
MPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPGLSGPASSAPRTSTGGSAVGS
SEQ ID NO.22:52L1ΔN13CS2
MPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQAGPGLSGPASSAPRTSTGGSAVGS
SEQ ID NO.23:52L1ΔN13CS3
MPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKLAGPASSAPATSTAAGGVGS
SEQ ID NO.24:52L1NS1ΔC19
MPSEATPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.25:52L1NS1ΔC25
MPSEATPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGL
SEQ ID NO.26:52L1NS2ΔC19
MSERPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.27:52L1NS3ΔC19
MSEPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.28:52L1NS4ΔC19
MSPPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGLQARPKL
SEQ ID NO.29:52L1ΔN14ΔC25
MPVPVSKVVSTDEYVSRTSIYYYAGSSRLLTVGHPYFSIKNTSSGNGKKVLVPKVSGLQYRVFRIKLPDPNKFGFPDTSFYNPETQRLVWACTGLEIGRGQPLGVGISGHPLLNKFDDTETSNKYAGKPGIDNRECLSMDYKQTQLCILGCKPPIGEHWGKGTPCNNNSGNPGDCPPLQLINSVIQDGDMVDTGFGCMDFNTLQASKSDVPIDICSSVCKYPDYLQMASEPYGDSLFFFLRREQMFVRHFFNRAGTLGDPVPGDLYIQGSNSGNTATVQSSAFFPTPSGSMVTSESQLFNKPYWLQRAQGHNNGICWGNQLFVTVVDTTRSTNMTLCAEVKKESTYKNENFKEYLRHGEEFDLQFIFQLCKITLTADVMTYIHKMDATILEDWQFGLTPPPSASLEDTYRFVTSTAITCQKNTPPKGKEDPLKEYMFWEVDLKEKFSADLDQFPLGRKFLLQAGL
SEQ ID NO.30:HPV52L1nt
ATGTCCGTGTGGCGGCCTAGTGAGGCCACTGTGTACCTGCCTCCTGTACCTGTCTCTAAGGTTGTAAGCACTGATGAGTATGTGTCTCGCACAAGCATCTATTATTATGCAGGCAGTTCTCGATTACTAACAGTAGGACATCCCTATTTTTCTATTAAAAACACCAGTAGTGGTAATGGTAAAAAAGTTTTAGTTCCCAAGGTGTCTGGCCTGCAATACAGGGTATTTAGAATTAAATTGCCGGACCCTAATAAATTTGGTTTTCCGGATACATCTTTTTATAACCCAGAAACCCAAAGGTTGGTGTGGGCCTGTACAGGCTTGGAAATTGGTAGGGGACAGCCTTTAGGTGTGGGTATTAGTGGGCATCCTTTATTAAACAAGTTTGATGATACTGAAACCAGTAACAAATATGCTGGTAAACCTGGTATAGATAATAGAGAATGTTTATCTATGGATTATAAGCAGACTCAGTTATGCATTTTAGGATGCAAACCTCCTATAGGTGAACATTGGGGTAAGGGAACCCCTTGTAATAATAATTCAGGAAATCCTGGGGATTGTCCTCCCCTACAACTCATTAACAGTGTAATACAGGATGGGGACATGGTAGATACAGGATTTGGTTGCATGGATTTTAATACCTTGCAAGCTAGTAAAAGTGATGTGCCCATTGATATATGTAGCAGTGTATGTAAGTATCCAGATTATTTGCAAATGGCTAGCGAGCCATATGGTGACAGTTTGTTCTTTTTTCTTAGACGTGAGCAAATGTTTGTTAGACACTTTTTTAATAGGGCTGGTACCTTAGGTGACCCTGTGCCAGGTGATTTATATATACAAGGGTCTAACTCTGGCAATACTGCCACTGTACAAAGCAGTGCTTTTTTTCCTACTCCTAGTGGTTCTATGGTAACCTCAGAATCCCAATTATTTAATAAACCGTACTGGTTACAACGTGCGCAGGGCCACAATAATGGCATATGTTGGGGCAATCAGTTGTTTGTCACAGTTGTGGATACCACTCGTAGCACTAACATGACTTTATGTGCTGAAGTTAAAAAGGAAAGCACATATAAAAATGAAAATTTTAAGGAATACCTTCGTCATGGCGAGGAATTTGATTTACAATTTATTTTTCAATTGTGCAAAATTACATTAACAGCTGATGTTATGACATACATTCATAAGATGGATGCCACTATTTTAGAGGACTGGCAATTTGGCCTTACCCCACCACCGTCTGCATCTTTGGAGGACACATACAGATTTGTAACTTCTACTGCTATAACTTGTCAAAAAAACACACCACCTAAAGGAAAGGAAGATCCTTTAAAGGACTATATGTTTTGGGAGGTGGATTTAAAAGAAAAGTTTTCTGCAGATTTAGATCAGTTTCCTTTAGGTAGGAAGTTTTTGTTACAGGCAGGGCTACAGGCTAGGCCCAAACTAAAACGCCCTGCATCATCAGCCCCACGTACCTCCACAAAGAAGAAAAAGGTTAAAAGGTAA
SEQ ID NO.31:52L1D447EΔC19nt
ATGTCCGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.32:52L1ΔN2nt
ATGGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.33:52L1ΔN4nt
ATGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.34:52L1ΔN5nt
ATGCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.35:52L1ΔN8nt
ATGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.36:52L1ΔN10nt
ATGGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.37:52L1ΔN13nt
ATGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.38:52L1ΔN15nt
ATGGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.39:52L1ΔN18nt
ATGTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.40:52L1ΔN20nt
ATGGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.41:52L1CS1nt
ATGTCCGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTCGTCCTGGACTGAAAGGTCCTGCATCGAGCGCTCCTAGAACGTCGACGGACGGCTCGGGAGTGGGACGCTAA
SEQ ID NO.42:52L1CS2nt
ATGTCCGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTCGTCCTGGACTGAAAGGTCCTGCATCGAGCGCTCCTAGAACGTCGACGGACGGCTCGGGAGTGGACGGCTAA
SEQ ID NO.43:52L1CS3nt
ATGTCCGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTCGTCCTGGACTGGGATCGCCTGCATCGAGCGCTCCTAGAACGTCGACGGACGGCTCGGGAGTGAAACGCTAA
SEQ ID NO.44:52L1CS4nt
ATGTCCGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTCGTCCTGGACTGGGATCGCCTGCATCGAGCGCTCCTAGAACGTCGACGGACGGCTCGGGAGTGGACCGCTAA
SEQ ID NO.45:52L1CS5nt
ATGTCCGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGGCTGGTCCTGCCTCTTCCGCACCCGCGACTTCAACCGCTGCCGGCGGAGTTGGGTCGTAA
SEQ ID NO.46:52L1CS6nt
ATGTCCGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGGAAGCTCCTGCCTCTTCCGCACCCGGTACTTCAACCGGCTCGAAAGCGGTTGCTGGATAA
SEQ ID NO.47:52L1CS7nt
ATGTCCGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGGCTGGTCCTGCTTCCTCAGCTCCAGCTACCTCAACCGACGGTTCTGGTGTGAAGCGCTAA
SEQ ID NO.48:52L1CS8nt
ATGTCCGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGGCTGGTCCTGCTTCCTCAGCTCCACGTACCTCAACCGACGGTTCTGGTGTGAAGCGCTAA
SEQ ID NO.49:52L1CS9nt
ATGTCCGTGTGGCGTCCTTCCGAGGCTACTGTGTACTTGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCGGGTCCTGGCTTGTCGGGTCCTGCCTCGAGCGCCCCTAGAACGTCGACGGGTGGCTCGGCCGTGGGTAGCTAA
SEQ ID NO.50:52L1ΔN13CS1nt
ATGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCGAGACCTGGCTTGTCGGGTCCTGCCTCGAGCGCCCCTAGAACGTCGACGGGTGGCTCGGCCGTGGGTAGCTAA
SEQ ID NO.51:52L1ΔN13CS2nt
ATGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCGGGTCCTGGCTTGTCGGGTCCTGCCTCGAGCGCCCCTAGAACGTCGACGGGTGGCTCGGCCGTGGGTAGCTAA
SEQ ID NO.52:52L1ΔN13CS3nt
ATGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGGCCGGTCCTGCCTCGAGCGCCCCTGCCACGTCGACGGCTGCGGGAGGCGTGGGTAGCTAA
SEQ ID NO.53:52L1NS1ΔC19nt
ATGCCTAGCGAGGCTACCCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.54:52L1NS1ΔC25nt
ATGCCTAGCGAGGCTACCCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGTAA
SEQ ID NO.55:52L1NS2ΔC19nt
ATGTCCGAGCGTCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.56:52L1NS3ΔC19nt
ATGTCCGAGCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.57:52L1NS4ΔC19nt
ATGTCCCCTCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGCAAGCTAGACCTAAACTGTAA
SEQ ID NO.58:52L1ΔN14ΔC25nt
ATGCCAGTACCTGTTTCTAAAGTGGTCTCCACTGATGAATACGTCTCACGTACCTCGATTTACTATTACGCTGGTAGTTCAAGACTGTTGACAGTCGGCCACCCATACTTTTCTATCAAGAATACGTCCTCAGGAAACGGTAAGAAGGTCCTTGTGCCGAAAGTTTCGGGTCTCCAATACCGCGTCTTCCGTATCAAGCTGCCTGACCCCAACAAATTCGGCTTCCCAGATACTAGTTTCTATAACCCAGAGACCCAGAGACTGGTGTGGGCCTGCACAGGACTCGAAATTGGCAGGGGTCAACCTTTGGGCGTGGGAATCAGCGGTCACCCCCTTCTCAATAAGTTCGACGACACAGAGACTTCTAACAAATACGCTGGTAAGCCAGGCATCGACAACCGTGAATGCCTCTCCATGGATTACAAACAGACCCAACTGTGTATTCTGGGATGCAAGCCGCCTATCGGTGAGCATTGGGGTAAAGGCACACCTTGCAACAATAACTCAGGAAACCCAGGAGACTGCCCACCTTTGCAGCTTATCAACTCGGTTATTCAAGATGGTGACATGGTCGACACTGGCTTTGGATGTATGGACTTCAATACTCTCCAGGCTTCCAAGAGCGATGTCCCCATCGACATCTGCTCTTCCGTGTGTAAATACCCAGATTATCTGCAAATGGCTTCAGAACCTTACGGAGACTCTCTGTTCTTCTTCTTGCGCAGGGAGCAGATGTTCGTTCGTCACTTTTTCAACAGAGCCGGTACCTTGGGCGATCCTGTCCCCGGAGACCTTTATATTCAAGGTTCCAACAGCGGTAACACAGCCACCGTGCAGTCTTCCGCTTTCTTCCCAACTCCTTCAGGCAGCATGGTGACCAGTGAAAGCCAACTCTTTAATAAGCCTTACTGGTTGCAGAGGGCTCAAGGACACAACAATGGCATCTGCTGGGGTAACCAGCTGTTCGTTACAGTCGTCGATACCACTCGTTCTACCAATATGACACTGTGCGCCGAGGTGAAGAAGGAATCCACATACAAAAACGAGAATTTCAAGGAATACTTGCGTCACGGCGAGGAATTTGACCTTCAATTCATCTTCCAGCTCTGCAAGATTACTCTCACCGCTGATGTTATGACATATATCCATAAGATGGACGCTACCATCCTGGAGGATTGGCAATTTGGACTGACTCCCCCACCCTCAGCTTCGTTGGAAGACACCTACCGCTTCGTCACAAGTACTGCCATTACTTGTCAGAAGAACACTCCACCCAAGGGTAAGGAGGACCCACTTAAGGAGTACATGTTTTGGGAAGTGGATCTCAAAGAGAAGTTCAGCGCCGACCTGGATCAATTTCCTCTGGGTCGTAAGTTCCTCTTGCAAGCAGGACTGTAA。
Sequence listing
<110> Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences
<120> A modified human papillomavirus type 52 L1 protein and use thereof
<130> 300260CG
<140> 2020113513909
<141> 2020-11-26
<160> 58
<170> SIPOSequenceListing 1.0
<210> 1
<211> 503
<212> PRT
<213> Human papillomavirus type 52
<400> 1
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Asp Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Arg Pro Lys Leu Lys Arg Pro Ala Ser Ser Ala Pro Arg Thr Ser Thr
485 490 495
Lys Lys Lys Lys Val Lys Arg
500
<210> 2
<211> 484
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1D447EΔC19
<400> 2
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Arg Pro Lys Leu
<210> 3
<211> 483
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1ΔN2
<400> 3
Met Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val Pro
1 5 10 15
Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser Ile
20 25 30
Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro Tyr
35 40 45
Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu Val
50 55 60
Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu Pro
65 70 75 80
Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro Glu
85 90 95
Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg Gly
100 105 110
Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys Phe
115 120 125
Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile Asp
130 135 140
Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys Ile
145 150 155 160
Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr Pro
165 170 175
Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln Leu
180 185 190
Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe Gly
195 200 205
Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro Ile
210 215 220
Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met Ala
225 230 235 240
Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu Gln
245 250 255
Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp Pro
260 265 270
Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr Ala
275 280 285
Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met Val
290 295 300
Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg Ala
305 310 315 320
Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val Thr
325 330 335
Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu Val
340 345 350
Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu Arg
355 360 365
His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys Ile
370 375 380
Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala Thr
385 390 395 400
Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala Ser
405 410 415
Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys Gln
420 425 430
Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr Met
435 440 445
Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp Gln
450 455 460
Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala Arg
465 470 475 480
Pro Lys Leu
<210> 4
<211> 481
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1ΔN4
<400> 4
Met Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val Pro Val Ser
1 5 10 15
Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser Ile Tyr Tyr
20 25 30
Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro Tyr Phe Ser
35 40 45
Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu Val Pro Lys
50 55 60
Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu Pro Asp Pro
65 70 75 80
Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro Glu Thr Gln
85 90 95
Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg Gly Gln Pro
100 105 110
Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys Phe Asp Asp
115 120 125
Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile Asp Asn Arg
130 135 140
Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys Ile Leu Gly
145 150 155 160
Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr Pro Cys Asn
165 170 175
Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln Leu Ile Asn
180 185 190
Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe Gly Cys Met
195 200 205
Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro Ile Asp Ile
210 215 220
Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met Ala Ser Glu
225 230 235 240
Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu Gln Met Phe
245 250 255
Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp Pro Val Pro
260 265 270
Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr Ala Thr Val
275 280 285
Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met Val Thr Ser
290 295 300
Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg Ala Gln Gly
305 310 315 320
His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val Thr Val Val
325 330 335
Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu Val Lys Lys
340 345 350
Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu Arg His Gly
355 360 365
Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys Ile Thr Leu
370 375 380
Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala Thr Ile Leu
385 390 395 400
Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala Ser Leu Glu
405 410 415
Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys Gln Lys Asn
420 425 430
Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr Met Phe Trp
435 440 445
Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp Gln Phe Pro
450 455 460
Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala Arg Pro Lys
465 470 475 480
Leu
<210> 5
<211> 480
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N5
<400> 5
Met Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val Pro Val Ser Lys
1 5 10 15
Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser Ile Tyr Tyr Tyr
20 25 30
Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro Tyr Phe Ser Ile
35 40 45
Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu Val Pro Lys Val
50 55 60
Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu Pro Asp Pro Asn
65 70 75 80
Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro Glu Thr Gln Arg
85 90 95
Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg Gly Gln Pro Leu
100 105 110
Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys Phe Asp Asp Thr
115 120 125
Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile Asp Asn Arg Glu
130 135 140
Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys Ile Leu Gly Cys
145 150 155 160
Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr Pro Cys Asn Asn
165 170 175
Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln Leu Ile Asn Ser
180 185 190
Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe Gly Cys Met Asp
195 200 205
Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro Ile Asp Ile Cys
210 215 220
Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met Ala Ser Glu Pro
225 230 235 240
Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu Gln Met Phe Val
245 250 255
Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp Pro Val Pro Gly
260 265 270
Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr Ala Thr Val Gln
275 280 285
Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met Val Thr Ser Glu
290 295 300
Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg Ala Gln Gly His
305 310 315 320
Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val Thr Val Val Asp
325 330 335
Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu Val Lys Lys Glu
340 345 350
Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu Arg His Gly Glu
355 360 365
Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys Ile Thr Leu Thr
370 375 380
Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala Thr Ile Leu Glu
385 390 395 400
Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala Ser Leu Glu Asp
405 410 415
Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys Gln Lys Asn Thr
420 425 430
Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr Met Phe Trp Glu
435 440 445
Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp Gln Phe Pro Leu
450 455 460
Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala Arg Pro Lys Leu
465 470 475 480
<210> 6
<211> 477
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N8
<400> 6
Met Ala Thr Val Tyr Leu Pro Pro Val Pro Val Ser Lys Val Val Ser
1 5 10 15
Thr Asp Glu Tyr Val Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser
20 25 30
Ser Arg Leu Leu Thr Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr
35 40 45
Ser Ser Gly Asn Gly Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu
50 55 60
Gln Tyr Arg Val Phe Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly
65 70 75 80
Phe Pro Asp Thr Ser Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp
85 90 95
Ala Cys Thr Gly Leu Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly
100 105 110
Ile Ser Gly His Pro Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser
115 120 125
Asn Lys Tyr Ala Gly Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser
130 135 140
Met Asp Tyr Lys Gln Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro
145 150 155 160
Ile Gly Glu His Trp Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly
165 170 175
Asn Pro Gly Asp Cys Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln
180 185 190
Asp Gly Asp Met Val Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr
195 200 205
Leu Gln Ala Ser Lys Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val
210 215 220
Cys Lys Tyr Pro Asp Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp
225 230 235 240
Ser Leu Phe Phe Phe Leu Arg Arg Glu Gln Met Phe Val Arg His Phe
245 250 255
Phe Asn Arg Ala Gly Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr
260 265 270
Ile Gln Gly Ser Asn Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala
275 280 285
Phe Phe Pro Thr Pro Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu
290 295 300
Phe Asn Lys Pro Tyr Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly
305 310 315 320
Ile Cys Trp Gly Asn Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg
325 330 335
Ser Thr Asn Met Thr Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr
340 345 350
Lys Asn Glu Asn Phe Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp
355 360 365
Leu Gln Phe Ile Phe Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val
370 375 380
Met Thr Tyr Ile His Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln
385 390 395 400
Phe Gly Leu Thr Pro Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg
405 410 415
Phe Val Thr Ser Thr Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys
420 425 430
Gly Lys Glu Asp Pro Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu
435 440 445
Lys Glu Lys Phe Ser Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys
450 455 460
Phe Leu Leu Gln Ala Gly Leu Gln Ala Arg Pro Lys Leu
465 470 475
<210> 7
<211> 475
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N10
<400> 7
Met Val Tyr Leu Pro Pro Val Pro Val Ser Lys Val Val Ser Thr Asp
1 5 10 15
Glu Tyr Val Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg
20 25 30
Leu Leu Thr Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr Ser Ser
35 40 45
Gly Asn Gly Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu Gln Tyr
50 55 60
Arg Val Phe Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly Phe Pro
65 70 75 80
Asp Thr Ser Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp Ala Cys
85 90 95
Thr Gly Leu Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly Ile Ser
100 105 110
Gly His Pro Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser Asn Lys
115 120 125
Tyr Ala Gly Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser Met Asp
130 135 140
Tyr Lys Gln Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro Ile Gly
145 150 155 160
Glu His Trp Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly Asn Pro
165 170 175
Gly Asp Cys Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln Asp Gly
180 185 190
Asp Met Val Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr Leu Gln
195 200 205
Ala Ser Lys Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val Cys Lys
210 215 220
Tyr Pro Asp Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp Ser Leu
225 230 235 240
Phe Phe Phe Leu Arg Arg Glu Gln Met Phe Val Arg His Phe Phe Asn
245 250 255
Arg Ala Gly Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr Ile Gln
260 265 270
Gly Ser Asn Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala Phe Phe
275 280 285
Pro Thr Pro Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu Phe Asn
290 295 300
Lys Pro Tyr Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly Ile Cys
305 310 315 320
Trp Gly Asn Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg Ser Thr
325 330 335
Asn Met Thr Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr Lys Asn
340 345 350
Glu Asn Phe Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp Leu Gln
355 360 365
Phe Ile Phe Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val Met Thr
370 375 380
Tyr Ile His Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln Phe Gly
385 390 395 400
Leu Thr Pro Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg Phe Val
405 410 415
Thr Ser Thr Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys Gly Lys
420 425 430
Glu Asp Pro Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu Lys Glu
435 440 445
Lys Phe Ser Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys Phe Leu
450 455 460
Leu Gln Ala Gly Leu Gln Ala Arg Pro Lys Leu
465 470 475
<210> 8
<211> 472
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N13
<400> 8
Met Pro Pro Val Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val
1 5 10 15
Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr
20 25 30
Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly
35 40 45
Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe
50 55 60
Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser
65 70 75 80
Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu
85 90 95
Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro
100 105 110
Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly
115 120 125
Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln
130 135 140
Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp
145 150 155 160
Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys
165 170 175
Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val
180 185 190
Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys
195 200 205
Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp
210 215 220
Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe
225 230 235 240
Leu Arg Arg Glu Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly
245 250 255
Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn
260 265 270
Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro
275 280 285
Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr
290 295 300
Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn
305 310 315 320
Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr
325 330 335
Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe
340 345 350
Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe
355 360 365
Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His
370 375 380
Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro
385 390 395 400
Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr
405 410 415
Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro
420 425 430
Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser
435 440 445
Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala
450 455 460
Gly Leu Gln Ala Arg Pro Lys Leu
465 470
<210> 9
<211> 470
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N15
<400> 9
Met Val Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg
1 5 10 15
Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly
20 25 30
His Pro Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys
35 40 45
Val Leu Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile
50 55 60
Lys Leu Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr
65 70 75 80
Asn Pro Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile
85 90 95
Gly Arg Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu
100 105 110
Asn Lys Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro
115 120 125
Gly Ile Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln
130 135 140
Leu Cys Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys
145 150 155 160
Gly Thr Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro
165 170 175
Leu Gln Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr
180 185 190
Gly Phe Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp
195 200 205
Val Pro Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu
210 215 220
Gln Met Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg
225 230 235 240
Arg Glu Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu
245 250 255
Gly Asp Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly
260 265 270
Asn Thr Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly
275 280 285
Ser Met Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu
290 295 300
Gln Arg Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu
305 310 315 320
Phe Val Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys
325 330 335
Ala Glu Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu
340 345 350
Tyr Leu Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu
355 360 365
Cys Lys Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met
370 375 380
Asp Ala Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro
385 390 395 400
Ser Ala Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile
405 410 415
Thr Cys Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys
420 425 430
Glu Tyr Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp
435 440 445
Leu Asp Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu
450 455 460
Gln Ala Arg Pro Lys Leu
465 470
<210> 10
<211> 467
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N18
<400> 10
Met Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser Ile
1 5 10 15
Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro Tyr
20 25 30
Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu Val
35 40 45
Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu Pro
50 55 60
Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro Glu
65 70 75 80
Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg Gly
85 90 95
Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys Phe
100 105 110
Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile Asp
115 120 125
Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys Ile
130 135 140
Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr Pro
145 150 155 160
Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln Leu
165 170 175
Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe Gly
180 185 190
Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro Ile
195 200 205
Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met Ala
210 215 220
Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu Gln
225 230 235 240
Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp Pro
245 250 255
Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr Ala
260 265 270
Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met Val
275 280 285
Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg Ala
290 295 300
Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val Thr
305 310 315 320
Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu Val
325 330 335
Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu Arg
340 345 350
His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys Ile
355 360 365
Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala Thr
370 375 380
Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala Ser
385 390 395 400
Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys Gln
405 410 415
Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr Met
420 425 430
Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp Gln
435 440 445
Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala Arg
450 455 460
Pro Lys Leu
465
<210> 11
<211> 465
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N20
<400> 11
Met Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser Ile Tyr Tyr
1 5 10 15
Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro Tyr Phe Ser
20 25 30
Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu Val Pro Lys
35 40 45
Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu Pro Asp Pro
50 55 60
Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro Glu Thr Gln
65 70 75 80
Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg Gly Gln Pro
85 90 95
Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys Phe Asp Asp
100 105 110
Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile Asp Asn Arg
115 120 125
Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys Ile Leu Gly
130 135 140
Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr Pro Cys Asn
145 150 155 160
Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln Leu Ile Asn
165 170 175
Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe Gly Cys Met
180 185 190
Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro Ile Asp Ile
195 200 205
Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met Ala Ser Glu
210 215 220
Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu Gln Met Phe
225 230 235 240
Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp Pro Val Pro
245 250 255
Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr Ala Thr Val
260 265 270
Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met Val Thr Ser
275 280 285
Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg Ala Gln Gly
290 295 300
His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val Thr Val Val
305 310 315 320
Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu Val Lys Lys
325 330 335
Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu Arg His Gly
340 345 350
Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys Ile Thr Leu
355 360 365
Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala Thr Ile Leu
370 375 380
Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala Ser Leu Glu
385 390 395 400
Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys Gln Lys Asn
405 410 415
Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr Met Phe Trp
420 425 430
Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp Gln Phe Pro
435 440 445
Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala Arg Pro Lys
450 455 460
Leu
465
<210> 12
<211> 503
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1CS1
<400> 12
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Arg Pro Gly Leu Lys Gly Pro Ala Ser Ser Ala Pro Arg Thr Ser Thr
485 490 495
Asp Gly Ser Gly Val Gly Arg
500
<210> 13
<211> 503
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1CS2
<400> 13
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Arg Pro Gly Leu Lys Gly Pro Ala Ser Ser Ala Pro Arg Thr Ser Thr
485 490 495
Asp Gly Ser Gly Val Asp Gly
500
<210> 14
<211> 503
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1CS3
<400> 14
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Arg Pro Gly Leu Gly Ser Pro Ala Ser Ser Ala Pro Arg Thr Ser Thr
485 490 495
Asp Gly Ser Gly Val Lys Arg
500
<210> 15
<211> 503
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1CS4
<400> 15
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Arg Pro Gly Leu Gly Ser Pro Ala Ser Ser Ala Pro Arg Thr Ser Thr
485 490 495
Asp Gly Ser Gly Val Asp Arg
500
<210> 16
<211> 503
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1CS5
<400> 16
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Arg Pro Lys Leu Ala Gly Pro Ala Ser Ser Ala Pro Ala Thr Ser Thr
485 490 495
Ala Ala Gly Gly Val Gly Ser
500
<210> 17
<211> 503
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1CS6
<400> 17
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Arg Pro Lys Leu Glu Ala Pro Ala Ser Ser Ala Pro Gly Thr Ser Thr
485 490 495
Gly Ser Lys Ala Val Ala Gly
500
<210> 18
<211> 503
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1CS7
<400> 18
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Arg Pro Lys Leu Ala Gly Pro Ala Ser Ser Ala Pro Ala Thr Ser Thr
485 490 495
Asp Gly Ser Gly Val Lys Arg
500
<210> 19
<211> 503
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1CS8
<400> 19
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Arg Pro Lys Leu Ala Gly Pro Ala Ser Ser Ala Pro Arg Thr Ser Thr
485 490 495
Asp Gly Ser Gly Val Lys Arg
500
<210> 20
<211> 503
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1CS9
<400> 20
Met Ser Val Trp Arg Pro Ser Glu Ala Thr Val Tyr Leu Pro Pro Val
1 5 10 15
Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser Arg Thr Ser
20 25 30
Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val Gly His Pro
35 40 45
Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys Lys Val Leu
50 55 60
Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg Ile Lys Leu
65 70 75 80
Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe Tyr Asn Pro
85 90 95
Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu Ile Gly Arg
100 105 110
Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu Leu Asn Lys
115 120 125
Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys Pro Gly Ile
130 135 140
Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr Gln Leu Cys
145 150 155 160
Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly Lys Gly Thr
165 170 175
Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro Pro Leu Gln
180 185 190
Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp Thr Gly Phe
195 200 205
Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser Asp Val Pro
210 215 220
Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr Leu Gln Met
225 230 235 240
Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu Arg Arg Glu
245 250 255
Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr Leu Gly Asp
260 265 270
Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser Gly Asn Thr
275 280 285
Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser Gly Ser Met
290 295 300
Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp Leu Gln Arg
305 310 315 320
Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln Leu Phe Val
325 330 335
Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu Cys Ala Glu
340 345 350
Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys Glu Tyr Leu
355 360 365
Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln Leu Cys Lys
370 375 380
Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys Met Asp Ala
385 390 395 400
Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro Pro Ser Ala
405 410 415
Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala Ile Thr Cys
420 425 430
Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu Lys Glu Tyr
435 440 445
Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala Asp Leu Asp
450 455 460
Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly Leu Gln Ala
465 470 475 480
Gly Pro Gly Leu Ser Gly Pro Ala Ser Ser Ala Pro Arg Thr Ser Thr
485 490 495
Gly Gly Ser Ala Val Gly Ser
500
<210> 21
<211> 491
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N13CS1
<400> 21
Met Pro Pro Val Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val
1 5 10 15
Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr
20 25 30
Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly
35 40 45
Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe
50 55 60
Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser
65 70 75 80
Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu
85 90 95
Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro
100 105 110
Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly
115 120 125
Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln
130 135 140
Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp
145 150 155 160
Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys
165 170 175
Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val
180 185 190
Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys
195 200 205
Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp
210 215 220
Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe
225 230 235 240
Leu Arg Arg Glu Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly
245 250 255
Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn
260 265 270
Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro
275 280 285
Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr
290 295 300
Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn
305 310 315 320
Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr
325 330 335
Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe
340 345 350
Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe
355 360 365
Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His
370 375 380
Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro
385 390 395 400
Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr
405 410 415
Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro
420 425 430
Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser
435 440 445
Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala
450 455 460
Gly Leu Gln Ala Arg Pro Gly Leu Ser Gly Pro Ala Ser Ser Ala Pro
465 470 475 480
Arg Thr Ser Thr Gly Gly Ser Ala Val Gly Ser
485 490
<210> 22
<211> 491
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N13CS2
<400> 22
Met Pro Pro Val Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val
1 5 10 15
Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr
20 25 30
Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly
35 40 45
Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe
50 55 60
Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser
65 70 75 80
Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu
85 90 95
Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro
100 105 110
Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly
115 120 125
Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln
130 135 140
Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp
145 150 155 160
Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys
165 170 175
Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val
180 185 190
Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys
195 200 205
Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp
210 215 220
Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe
225 230 235 240
Leu Arg Arg Glu Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly
245 250 255
Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn
260 265 270
Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro
275 280 285
Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr
290 295 300
Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn
305 310 315 320
Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr
325 330 335
Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe
340 345 350
Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe
355 360 365
Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His
370 375 380
Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro
385 390 395 400
Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr
405 410 415
Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro
420 425 430
Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser
435 440 445
Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala
450 455 460
Gly Leu Gln Ala Gly Pro Gly Leu Ser Gly Pro Ala Ser Ser Ala Pro
465 470 475 480
Arg Thr Ser Thr Gly Gly Ser Ala Val Gly Ser
485 490
<210> 23
<211> 491
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N13CS3
<400> 23
Met Pro Pro Val Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val
1 5 10 15
Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr
20 25 30
Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly
35 40 45
Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe
50 55 60
Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser
65 70 75 80
Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu
85 90 95
Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro
100 105 110
Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly
115 120 125
Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln
130 135 140
Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp
145 150 155 160
Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys
165 170 175
Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val
180 185 190
Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys
195 200 205
Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp
210 215 220
Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe
225 230 235 240
Leu Arg Arg Glu Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly
245 250 255
Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn
260 265 270
Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro
275 280 285
Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr
290 295 300
Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn
305 310 315 320
Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr
325 330 335
Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe
340 345 350
Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe
355 360 365
Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His
370 375 380
Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro
385 390 395 400
Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr
405 410 415
Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro
420 425 430
Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser
435 440 445
Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala
450 455 460
Gly Leu Gln Ala Arg Pro Lys Leu Ala Gly Pro Ala Ser Ser Ala Pro
465 470 475 480
Ala Thr Ser Thr Ala Ala Gly Gly Val Gly Ser
485 490
<210> 24
<211> 477
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1NS1∆C19
<400> 24
Met Pro Ser Glu Ala Thr Pro Pro Val Pro Val Ser Lys Val Val Ser
1 5 10 15
Thr Asp Glu Tyr Val Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser
20 25 30
Ser Arg Leu Leu Thr Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr
35 40 45
Ser Ser Gly Asn Gly Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu
50 55 60
Gln Tyr Arg Val Phe Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly
65 70 75 80
Phe Pro Asp Thr Ser Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp
85 90 95
Ala Cys Thr Gly Leu Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly
100 105 110
Ile Ser Gly His Pro Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser
115 120 125
Asn Lys Tyr Ala Gly Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser
130 135 140
Met Asp Tyr Lys Gln Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro
145 150 155 160
Ile Gly Glu His Trp Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly
165 170 175
Asn Pro Gly Asp Cys Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln
180 185 190
Asp Gly Asp Met Val Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr
195 200 205
Leu Gln Ala Ser Lys Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val
210 215 220
Cys Lys Tyr Pro Asp Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp
225 230 235 240
Ser Leu Phe Phe Phe Leu Arg Arg Glu Gln Met Phe Val Arg His Phe
245 250 255
Phe Asn Arg Ala Gly Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr
260 265 270
Ile Gln Gly Ser Asn Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala
275 280 285
Phe Phe Pro Thr Pro Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu
290 295 300
Phe Asn Lys Pro Tyr Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly
305 310 315 320
Ile Cys Trp Gly Asn Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg
325 330 335
Ser Thr Asn Met Thr Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr
340 345 350
Lys Asn Glu Asn Phe Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp
355 360 365
Leu Gln Phe Ile Phe Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val
370 375 380
Met Thr Tyr Ile His Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln
385 390 395 400
Phe Gly Leu Thr Pro Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg
405 410 415
Phe Val Thr Ser Thr Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys
420 425 430
Gly Lys Glu Asp Pro Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu
435 440 445
Lys Glu Lys Phe Ser Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys
450 455 460
Phe Leu Leu Gln Ala Gly Leu Gln Ala Arg Pro Lys Leu
465 470 475
<210> 25
<211> 471
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1NS1∆C25
<400> 25
Met Pro Ser Glu Ala Thr Pro Pro Val Pro Val Ser Lys Val Val Ser
1 5 10 15
Thr Asp Glu Tyr Val Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser
20 25 30
Ser Arg Leu Leu Thr Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr
35 40 45
Ser Ser Gly Asn Gly Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu
50 55 60
Gln Tyr Arg Val Phe Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly
65 70 75 80
Phe Pro Asp Thr Ser Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp
85 90 95
Ala Cys Thr Gly Leu Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly
100 105 110
Ile Ser Gly His Pro Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser
115 120 125
Asn Lys Tyr Ala Gly Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser
130 135 140
Met Asp Tyr Lys Gln Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro
145 150 155 160
Ile Gly Glu His Trp Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly
165 170 175
Asn Pro Gly Asp Cys Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln
180 185 190
Asp Gly Asp Met Val Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr
195 200 205
Leu Gln Ala Ser Lys Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val
210 215 220
Cys Lys Tyr Pro Asp Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp
225 230 235 240
Ser Leu Phe Phe Phe Leu Arg Arg Glu Gln Met Phe Val Arg His Phe
245 250 255
Phe Asn Arg Ala Gly Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr
260 265 270
Ile Gln Gly Ser Asn Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala
275 280 285
Phe Phe Pro Thr Pro Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu
290 295 300
Phe Asn Lys Pro Tyr Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly
305 310 315 320
Ile Cys Trp Gly Asn Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg
325 330 335
Ser Thr Asn Met Thr Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr
340 345 350
Lys Asn Glu Asn Phe Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp
355 360 365
Leu Gln Phe Ile Phe Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val
370 375 380
Met Thr Tyr Ile His Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln
385 390 395 400
Phe Gly Leu Thr Pro Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg
405 410 415
Phe Val Thr Ser Thr Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys
420 425 430
Gly Lys Glu Asp Pro Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu
435 440 445
Lys Glu Lys Phe Ser Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys
450 455 460
Phe Leu Leu Gln Ala Gly Leu
465 470
<210> 26
<211> 475
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1NS2∆C19
<400> 26
Met Ser Glu Arg Pro Pro Val Pro Val Ser Lys Val Val Ser Thr Asp
1 5 10 15
Glu Tyr Val Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg
20 25 30
Leu Leu Thr Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr Ser Ser
35 40 45
Gly Asn Gly Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu Gln Tyr
50 55 60
Arg Val Phe Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly Phe Pro
65 70 75 80
Asp Thr Ser Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp Ala Cys
85 90 95
Thr Gly Leu Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly Ile Ser
100 105 110
Gly His Pro Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser Asn Lys
115 120 125
Tyr Ala Gly Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser Met Asp
130 135 140
Tyr Lys Gln Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro Ile Gly
145 150 155 160
Glu His Trp Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly Asn Pro
165 170 175
Gly Asp Cys Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln Asp Gly
180 185 190
Asp Met Val Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr Leu Gln
195 200 205
Ala Ser Lys Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val Cys Lys
210 215 220
Tyr Pro Asp Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp Ser Leu
225 230 235 240
Phe Phe Phe Leu Arg Arg Glu Gln Met Phe Val Arg His Phe Phe Asn
245 250 255
Arg Ala Gly Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr Ile Gln
260 265 270
Gly Ser Asn Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala Phe Phe
275 280 285
Pro Thr Pro Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu Phe Asn
290 295 300
Lys Pro Tyr Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly Ile Cys
305 310 315 320
Trp Gly Asn Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg Ser Thr
325 330 335
Asn Met Thr Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr Lys Asn
340 345 350
Glu Asn Phe Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp Leu Gln
355 360 365
Phe Ile Phe Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val Met Thr
370 375 380
Tyr Ile His Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln Phe Gly
385 390 395 400
Leu Thr Pro Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg Phe Val
405 410 415
Thr Ser Thr Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys Gly Lys
420 425 430
Glu Asp Pro Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu Lys Glu
435 440 445
Lys Phe Ser Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys Phe Leu
450 455 460
Leu Gln Ala Gly Leu Gln Ala Arg Pro Lys Leu
465 470 475
<210> 27
<211> 474
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1NS3∆C19
<400> 27
Met Ser Glu Pro Pro Val Pro Val Ser Lys Val Val Ser Thr Asp Glu
1 5 10 15
Tyr Val Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu
20 25 30
Leu Thr Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly
35 40 45
Asn Gly Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu Gln Tyr Arg
50 55 60
Val Phe Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp
65 70 75 80
Thr Ser Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp Ala Cys Thr
85 90 95
Gly Leu Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly Ile Ser Gly
100 105 110
His Pro Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr
115 120 125
Ala Gly Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr
130 135 140
Lys Gln Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu
145 150 155 160
His Trp Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly
165 170 175
Asp Cys Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln Asp Gly Asp
180 185 190
Met Val Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr Leu Gln Ala
195 200 205
Ser Lys Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr
210 215 220
Pro Asp Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe
225 230 235 240
Phe Phe Leu Arg Arg Glu Gln Met Phe Val Arg His Phe Phe Asn Arg
245 250 255
Ala Gly Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly
260 265 270
Ser Asn Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala Phe Phe Pro
275 280 285
Thr Pro Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu Phe Asn Lys
290 295 300
Pro Tyr Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly Ile Cys Trp
305 310 315 320
Gly Asn Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg Ser Thr Asn
325 330 335
Met Thr Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu
340 345 350
Asn Phe Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp Leu Gln Phe
355 360 365
Ile Phe Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val Met Thr Tyr
370 375 380
Ile His Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu
385 390 395 400
Thr Pro Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr
405 410 415
Ser Thr Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu
420 425 430
Asp Pro Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu Lys Glu Lys
435 440 445
Phe Ser Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu
450 455 460
Gln Ala Gly Leu Gln Ala Arg Pro Lys Leu
465 470
<210> 28
<211> 473
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1NS4∆C19
<400> 28
Met Ser Pro Pro Val Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr
1 5 10 15
Val Ser Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu
20 25 30
Thr Val Gly His Pro Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn
35 40 45
Gly Lys Lys Val Leu Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val
50 55 60
Phe Arg Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr
65 70 75 80
Ser Phe Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly
85 90 95
Leu Glu Ile Gly Arg Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His
100 105 110
Pro Leu Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala
115 120 125
Gly Lys Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys
130 135 140
Gln Thr Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His
145 150 155 160
Trp Gly Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp
165 170 175
Cys Pro Pro Leu Gln Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met
180 185 190
Val Asp Thr Gly Phe Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser
195 200 205
Lys Ser Asp Val Pro Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro
210 215 220
Asp Tyr Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe
225 230 235 240
Phe Leu Arg Arg Glu Gln Met Phe Val Arg His Phe Phe Asn Arg Ala
245 250 255
Gly Thr Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser
260 265 270
Asn Ser Gly Asn Thr Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr
275 280 285
Pro Ser Gly Ser Met Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro
290 295 300
Tyr Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly
305 310 315 320
Asn Gln Leu Phe Val Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met
325 330 335
Thr Leu Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn
340 345 350
Phe Lys Glu Tyr Leu Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile
355 360 365
Phe Gln Leu Cys Lys Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile
370 375 380
His Lys Met Asp Ala Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr
385 390 395 400
Pro Pro Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser
405 410 415
Thr Ala Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp
420 425 430
Pro Leu Lys Glu Tyr Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe
435 440 445
Ser Ala Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln
450 455 460
Ala Gly Leu Gln Ala Arg Pro Lys Leu
465 470
<210> 29
<211> 465
<212> PRT
<213> Artificial Sequence
<220>
<223> 52L1∆N14∆C25
<400> 29
Met Pro Val Pro Val Ser Lys Val Val Ser Thr Asp Glu Tyr Val Ser
1 5 10 15
Arg Thr Ser Ile Tyr Tyr Tyr Ala Gly Ser Ser Arg Leu Leu Thr Val
20 25 30
Gly His Pro Tyr Phe Ser Ile Lys Asn Thr Ser Ser Gly Asn Gly Lys
35 40 45
Lys Val Leu Val Pro Lys Val Ser Gly Leu Gln Tyr Arg Val Phe Arg
50 55 60
Ile Lys Leu Pro Asp Pro Asn Lys Phe Gly Phe Pro Asp Thr Ser Phe
65 70 75 80
Tyr Asn Pro Glu Thr Gln Arg Leu Val Trp Ala Cys Thr Gly Leu Glu
85 90 95
Ile Gly Arg Gly Gln Pro Leu Gly Val Gly Ile Ser Gly His Pro Leu
100 105 110
Leu Asn Lys Phe Asp Asp Thr Glu Thr Ser Asn Lys Tyr Ala Gly Lys
115 120 125
Pro Gly Ile Asp Asn Arg Glu Cys Leu Ser Met Asp Tyr Lys Gln Thr
130 135 140
Gln Leu Cys Ile Leu Gly Cys Lys Pro Pro Ile Gly Glu His Trp Gly
145 150 155 160
Lys Gly Thr Pro Cys Asn Asn Asn Ser Gly Asn Pro Gly Asp Cys Pro
165 170 175
Pro Leu Gln Leu Ile Asn Ser Val Ile Gln Asp Gly Asp Met Val Asp
180 185 190
Thr Gly Phe Gly Cys Met Asp Phe Asn Thr Leu Gln Ala Ser Lys Ser
195 200 205
Asp Val Pro Ile Asp Ile Cys Ser Ser Val Cys Lys Tyr Pro Asp Tyr
210 215 220
Leu Gln Met Ala Ser Glu Pro Tyr Gly Asp Ser Leu Phe Phe Phe Leu
225 230 235 240
Arg Arg Glu Gln Met Phe Val Arg His Phe Phe Asn Arg Ala Gly Thr
245 250 255
Leu Gly Asp Pro Val Pro Gly Asp Leu Tyr Ile Gln Gly Ser Asn Ser
260 265 270
Gly Asn Thr Ala Thr Val Gln Ser Ser Ala Phe Phe Pro Thr Pro Ser
275 280 285
Gly Ser Met Val Thr Ser Glu Ser Gln Leu Phe Asn Lys Pro Tyr Trp
290 295 300
Leu Gln Arg Ala Gln Gly His Asn Asn Gly Ile Cys Trp Gly Asn Gln
305 310 315 320
Leu Phe Val Thr Val Val Asp Thr Thr Arg Ser Thr Asn Met Thr Leu
325 330 335
Cys Ala Glu Val Lys Lys Glu Ser Thr Tyr Lys Asn Glu Asn Phe Lys
340 345 350
Glu Tyr Leu Arg His Gly Glu Glu Phe Asp Leu Gln Phe Ile Phe Gln
355 360 365
Leu Cys Lys Ile Thr Leu Thr Ala Asp Val Met Thr Tyr Ile His Lys
370 375 380
Met Asp Ala Thr Ile Leu Glu Asp Trp Gln Phe Gly Leu Thr Pro Pro
385 390 395 400
Pro Ser Ala Ser Leu Glu Asp Thr Tyr Arg Phe Val Thr Ser Thr Ala
405 410 415
Ile Thr Cys Gln Lys Asn Thr Pro Pro Lys Gly Lys Glu Asp Pro Leu
420 425 430
Lys Glu Tyr Met Phe Trp Glu Val Asp Leu Lys Glu Lys Phe Ser Ala
435 440 445
Asp Leu Asp Gln Phe Pro Leu Gly Arg Lys Phe Leu Leu Gln Ala Gly
450 455 460
Leu
465
<210> 30
<211> 1512
<212> DNA
<213> Human papillomavirus type 52
<400> 30
atgtccgtgt ggcggcctag tgaggccact gtgtacctgc ctcctgtacc tgtctctaag 60
gttgtaagca ctgatgagta tgtgtctcgc acaagcatct attattatgc aggcagttct 120
cgattactaa cagtaggaca tccctatttt tctattaaaa acaccagtag tggtaatggt 180
aaaaaagttt tagttcccaa ggtgtctggc ctgcaataca gggtatttag aattaaattg 240
ccggacccta ataaatttgg ttttccggat acatcttttt ataacccaga aacccaaagg 300
ttggtgtggg cctgtacagg cttggaaatt ggtaggggac agcctttagg tgtgggtatt 360
agtgggcatc ctttattaaa caagtttgat gatactgaaa ccagtaacaa atatgctggt 420
aaacctggta tagataatag agaatgttta tctatggatt ataagcagac tcagttatgc 480
attttaggat gcaaacctcc tataggtgaa cattggggta agggaacccc ttgtaataat 540
aattcaggaa atcctgggga ttgtcctccc ctacaactca ttaacagtgt aatacaggat 600
ggggacatgg tagatacagg atttggttgc atggatttta ataccttgca agctagtaaa 660
agtgatgtgc ccattgatat atgtagcagt gtatgtaagt atccagatta tttgcaaatg 720
gctagcgagc catatggtga cagtttgttc ttttttctta gacgtgagca aatgtttgtt 780
agacactttt ttaatagggc tggtacctta ggtgaccctg tgccaggtga tttatatata 840
caagggtcta actctggcaa tactgccact gtacaaagca gtgctttttt tcctactcct 900
agtggttcta tggtaacctc agaatcccaa ttatttaata aaccgtactg gttacaacgt 960
gcgcagggcc acaataatgg catatgttgg ggcaatcagt tgtttgtcac agttgtggat 1020
accactcgta gcactaacat gactttatgt gctgaagtta aaaaggaaag cacatataaa 1080
aatgaaaatt ttaaggaata ccttcgtcat ggcgaggaat ttgatttaca atttattttt 1140
caattgtgca aaattacatt aacagctgat gttatgacat acattcataa gatggatgcc 1200
actattttag aggactggca atttggcctt accccaccac cgtctgcatc tttggaggac 1260
acatacagat ttgtaacttc tactgctata acttgtcaaa aaaacacacc acctaaagga 1320
aaggaagatc ctttaaagga ctatatgttt tgggaggtgg atttaaaaga aaagttttct 1380
gcagatttag atcagtttcc tttaggtagg aagtttttgt tacaggcagg gctacaggct 1440
aggcccaaac taaaacgccc tgcatcatca gccccacgta cctccacaaa gaagaaaaag 1500
gttaaaaggt aa 1512
<210> 31
<211> 1455
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1D447EΔC19nt
<400> 31
atgtccgtgt ggcgtccttc cgaggctact gtgtacttgc ctccagtacc tgtttctaaa 60
gtggtctcca ctgatgaata cgtctcacgt acctcgattt actattacgc tggtagttca 120
agactgttga cagtcggcca cccatacttt tctatcaaga atacgtcctc aggaaacggt 180
aagaaggtcc ttgtgccgaa agtttcgggt ctccaatacc gcgtcttccg tatcaagctg 240
cctgacccca acaaattcgg cttcccagat actagtttct ataacccaga gacccagaga 300
ctggtgtggg cctgcacagg actcgaaatt ggcaggggtc aacctttggg cgtgggaatc 360
agcggtcacc cccttctcaa taagttcgac gacacagaga cttctaacaa atacgctggt 420
aagccaggca tcgacaaccg tgaatgcctc tccatggatt acaaacagac ccaactgtgt 480
attctgggat gcaagccgcc tatcggtgag cattggggta aaggcacacc ttgcaacaat 540
aactcaggaa acccaggaga ctgcccacct ttgcagctta tcaactcggt tattcaagat 600
ggtgacatgg tcgacactgg ctttggatgt atggacttca atactctcca ggcttccaag 660
agcgatgtcc ccatcgacat ctgctcttcc gtgtgtaaat acccagatta tctgcaaatg 720
gcttcagaac cttacggaga ctctctgttc ttcttcttgc gcagggagca gatgttcgtt 780
cgtcactttt tcaacagagc cggtaccttg ggcgatcctg tccccggaga cctttatatt 840
caaggttcca acagcggtaa cacagccacc gtgcagtctt ccgctttctt cccaactcct 900
tcaggcagca tggtgaccag tgaaagccaa ctctttaata agccttactg gttgcagagg 960
gctcaaggac acaacaatgg catctgctgg ggtaaccagc tgttcgttac agtcgtcgat 1020
accactcgtt ctaccaatat gacactgtgc gccgaggtga agaaggaatc cacatacaaa 1080
aacgagaatt tcaaggaata cttgcgtcac ggcgaggaat ttgaccttca attcatcttc 1140
cagctctgca agattactct caccgctgat gttatgacat atatccataa gatggacgct 1200
accatcctgg aggattggca atttggactg actcccccac cctcagcttc gttggaagac 1260
acctaccgct tcgtcacaag tactgccatt acttgtcaga agaacactcc acccaagggt 1320
aaggaggacc cacttaagga gtacatgttt tgggaagtgg atctcaaaga gaagttcagc 1380
gccgacctgg atcaatttcc tctgggtcgt aagttcctct tgcaagcagg actgcaagct 1440
agacctaaac tgtaa 1455
<210> 32
<211> 1452
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1ΔN2nt
<400> 32
atggtgtggc gtccttccga ggctactgtg tacttgcctc cagtacctgt ttctaaagtg 60
gtctccactg atgaatacgt ctcacgtacc tcgatttact attacgctgg tagttcaaga 120
ctgttgacag tcggccaccc atacttttct atcaagaata cgtcctcagg aaacggtaag 180
aaggtccttg tgccgaaagt ttcgggtctc caataccgcg tcttccgtat caagctgcct 240
gaccccaaca aattcggctt cccagatact agtttctata acccagagac ccagagactg 300
gtgtgggcct gcacaggact cgaaattggc aggggtcaac ctttgggcgt gggaatcagc 360
ggtcaccccc ttctcaataa gttcgacgac acagagactt ctaacaaata cgctggtaag 420
ccaggcatcg acaaccgtga atgcctctcc atggattaca aacagaccca actgtgtatt 480
ctgggatgca agccgcctat cggtgagcat tggggtaaag gcacaccttg caacaataac 540
tcaggaaacc caggagactg cccacctttg cagcttatca actcggttat tcaagatggt 600
gacatggtcg acactggctt tggatgtatg gacttcaata ctctccaggc ttccaagagc 660
gatgtcccca tcgacatctg ctcttccgtg tgtaaatacc cagattatct gcaaatggct 720
tcagaacctt acggagactc tctgttcttc ttcttgcgca gggagcagat gttcgttcgt 780
cactttttca acagagccgg taccttgggc gatcctgtcc ccggagacct ttatattcaa 840
ggttccaaca gcggtaacac agccaccgtg cagtcttccg ctttcttccc aactccttca 900
ggcagcatgg tgaccagtga aagccaactc tttaataagc cttactggtt gcagagggct 960
caaggacaca acaatggcat ctgctggggt aaccagctgt tcgttacagt cgtcgatacc 1020
actcgttcta ccaatatgac actgtgcgcc gaggtgaaga aggaatccac atacaaaaac 1080
gagaatttca aggaatactt gcgtcacggc gaggaatttg accttcaatt catcttccag 1140
ctctgcaaga ttactctcac cgctgatgtt atgacatata tccataagat ggacgctacc 1200
atcctggagg attggcaatt tggactgact cccccaccct cagcttcgtt ggaagacacc 1260
taccgcttcg tcacaagtac tgccattact tgtcagaaga acactccacc caagggtaag 1320
gaggacccac ttaaggagta catgttttgg gaagtggatc tcaaagagaa gttcagcgcc 1380
gacctggatc aatttcctct gggtcgtaag ttcctcttgc aagcaggact gcaagctaga 1440
cctaaactgt aa 1452
<210> 33
<211> 1446
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1ΔN4nt
<400> 33
atgcgtcctt ccgaggctac tgtgtacttg cctccagtac ctgtttctaa agtggtctcc 60
actgatgaat acgtctcacg tacctcgatt tactattacg ctggtagttc aagactgttg 120
acagtcggcc acccatactt ttctatcaag aatacgtcct caggaaacgg taagaaggtc 180
cttgtgccga aagtttcggg tctccaatac cgcgtcttcc gtatcaagct gcctgacccc 240
aacaaattcg gcttcccaga tactagtttc tataacccag agacccagag actggtgtgg 300
gcctgcacag gactcgaaat tggcaggggt caacctttgg gcgtgggaat cagcggtcac 360
ccccttctca ataagttcga cgacacagag acttctaaca aatacgctgg taagccaggc 420
atcgacaacc gtgaatgcct ctccatggat tacaaacaga cccaactgtg tattctggga 480
tgcaagccgc ctatcggtga gcattggggt aaaggcacac cttgcaacaa taactcagga 540
aacccaggag actgcccacc tttgcagctt atcaactcgg ttattcaaga tggtgacatg 600
gtcgacactg gctttggatg tatggacttc aatactctcc aggcttccaa gagcgatgtc 660
cccatcgaca tctgctcttc cgtgtgtaaa tacccagatt atctgcaaat ggcttcagaa 720
ccttacggag actctctgtt cttcttcttg cgcagggagc agatgttcgt tcgtcacttt 780
ttcaacagag ccggtacctt gggcgatcct gtccccggag acctttatat tcaaggttcc 840
aacagcggta acacagccac cgtgcagtct tccgctttct tcccaactcc ttcaggcagc 900
atggtgacca gtgaaagcca actctttaat aagccttact ggttgcagag ggctcaagga 960
cacaacaatg gcatctgctg gggtaaccag ctgttcgtta cagtcgtcga taccactcgt 1020
tctaccaata tgacactgtg cgccgaggtg aagaaggaat ccacatacaa aaacgagaat 1080
ttcaaggaat acttgcgtca cggcgaggaa tttgaccttc aattcatctt ccagctctgc 1140
aagattactc tcaccgctga tgttatgaca tatatccata agatggacgc taccatcctg 1200
gaggattggc aatttggact gactccccca ccctcagctt cgttggaaga cacctaccgc 1260
ttcgtcacaa gtactgccat tacttgtcag aagaacactc cacccaaggg taaggaggac 1320
ccacttaagg agtacatgtt ttgggaagtg gatctcaaag agaagttcag cgccgacctg 1380
gatcaatttc ctctgggtcg taagttcctc ttgcaagcag gactgcaagc tagacctaaa 1440
ctgtaa 1446
<210> 34
<211> 1443
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1ΔN5nt
<400> 34
atgccttccg aggctactgt gtacttgcct ccagtacctg tttctaaagt ggtctccact 60
gatgaatacg tctcacgtac ctcgatttac tattacgctg gtagttcaag actgttgaca 120
gtcggccacc catacttttc tatcaagaat acgtcctcag gaaacggtaa gaaggtcctt 180
gtgccgaaag tttcgggtct ccaataccgc gtcttccgta tcaagctgcc tgaccccaac 240
aaattcggct tcccagatac tagtttctat aacccagaga cccagagact ggtgtgggcc 300
tgcacaggac tcgaaattgg caggggtcaa cctttgggcg tgggaatcag cggtcacccc 360
cttctcaata agttcgacga cacagagact tctaacaaat acgctggtaa gccaggcatc 420
gacaaccgtg aatgcctctc catggattac aaacagaccc aactgtgtat tctgggatgc 480
aagccgccta tcggtgagca ttggggtaaa ggcacacctt gcaacaataa ctcaggaaac 540
ccaggagact gcccaccttt gcagcttatc aactcggtta ttcaagatgg tgacatggtc 600
gacactggct ttggatgtat ggacttcaat actctccagg cttccaagag cgatgtcccc 660
atcgacatct gctcttccgt gtgtaaatac ccagattatc tgcaaatggc ttcagaacct 720
tacggagact ctctgttctt cttcttgcgc agggagcaga tgttcgttcg tcactttttc 780
aacagagccg gtaccttggg cgatcctgtc cccggagacc tttatattca aggttccaac 840
agcggtaaca cagccaccgt gcagtcttcc gctttcttcc caactccttc aggcagcatg 900
gtgaccagtg aaagccaact ctttaataag ccttactggt tgcagagggc tcaaggacac 960
aacaatggca tctgctgggg taaccagctg ttcgttacag tcgtcgatac cactcgttct 1020
accaatatga cactgtgcgc cgaggtgaag aaggaatcca catacaaaaa cgagaatttc 1080
aaggaatact tgcgtcacgg cgaggaattt gaccttcaat tcatcttcca gctctgcaag 1140
attactctca ccgctgatgt tatgacatat atccataaga tggacgctac catcctggag 1200
gattggcaat ttggactgac tcccccaccc tcagcttcgt tggaagacac ctaccgcttc 1260
gtcacaagta ctgccattac ttgtcagaag aacactccac ccaagggtaa ggaggaccca 1320
cttaaggagt acatgttttg ggaagtggat ctcaaagaga agttcagcgc cgacctggat 1380
caatttcctc tgggtcgtaa gttcctcttg caagcaggac tgcaagctag acctaaactg 1440
taa 1443
<210> 35
<211> 1434
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1ΔN8nt
<400> 35
atggctactg tgtacttgcc tccagtacct gtttctaaag tggtctccac tgatgaatac 60
gtctcacgta cctcgattta ctattacgct ggtagttcaa gactgttgac agtcggccac 120
ccatactttt ctatcaagaa tacgtcctca ggaaacggta agaaggtcct tgtgccgaaa 180
gtttcgggtc tccaataccg cgtcttccgt atcaagctgc ctgaccccaa caaattcggc 240
ttcccagata ctagtttcta taacccagag acccagagac tggtgtgggc ctgcacagga 300
ctcgaaattg gcaggggtca acctttgggc gtgggaatca gcggtcaccc ccttctcaat 360
aagttcgacg acacagagac ttctaacaaa tacgctggta agccaggcat cgacaaccgt 420
gaatgcctct ccatggatta caaacagacc caactgtgta ttctgggatg caagccgcct 480
atcggtgagc attggggtaa aggcacacct tgcaacaata actcaggaaa cccaggagac 540
tgcccacctt tgcagcttat caactcggtt attcaagatg gtgacatggt cgacactggc 600
tttggatgta tggacttcaa tactctccag gcttccaaga gcgatgtccc catcgacatc 660
tgctcttccg tgtgtaaata cccagattat ctgcaaatgg cttcagaacc ttacggagac 720
tctctgttct tcttcttgcg cagggagcag atgttcgttc gtcacttttt caacagagcc 780
ggtaccttgg gcgatcctgt ccccggagac ctttatattc aaggttccaa cagcggtaac 840
acagccaccg tgcagtcttc cgctttcttc ccaactcctt caggcagcat ggtgaccagt 900
gaaagccaac tctttaataa gccttactgg ttgcagaggg ctcaaggaca caacaatggc 960
atctgctggg gtaaccagct gttcgttaca gtcgtcgata ccactcgttc taccaatatg 1020
acactgtgcg ccgaggtgaa gaaggaatcc acatacaaaa acgagaattt caaggaatac 1080
ttgcgtcacg gcgaggaatt tgaccttcaa ttcatcttcc agctctgcaa gattactctc 1140
accgctgatg ttatgacata tatccataag atggacgcta ccatcctgga ggattggcaa 1200
tttggactga ctcccccacc ctcagcttcg ttggaagaca cctaccgctt cgtcacaagt 1260
actgccatta cttgtcagaa gaacactcca cccaagggta aggaggaccc acttaaggag 1320
tacatgtttt gggaagtgga tctcaaagag aagttcagcg ccgacctgga tcaatttcct 1380
ctgggtcgta agttcctctt gcaagcagga ctgcaagcta gacctaaact gtaa 1434
<210> 36
<211> 1428
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1ΔN10nt
<400> 36
atggtgtact tgcctccagt acctgtttct aaagtggtct ccactgatga atacgtctca 60
cgtacctcga tttactatta cgctggtagt tcaagactgt tgacagtcgg ccacccatac 120
ttttctatca agaatacgtc ctcaggaaac ggtaagaagg tccttgtgcc gaaagtttcg 180
ggtctccaat accgcgtctt ccgtatcaag ctgcctgacc ccaacaaatt cggcttccca 240
gatactagtt tctataaccc agagacccag agactggtgt gggcctgcac aggactcgaa 300
attggcaggg gtcaaccttt gggcgtggga atcagcggtc acccccttct caataagttc 360
gacgacacag agacttctaa caaatacgct ggtaagccag gcatcgacaa ccgtgaatgc 420
ctctccatgg attacaaaca gacccaactg tgtattctgg gatgcaagcc gcctatcggt 480
gagcattggg gtaaaggcac accttgcaac aataactcag gaaacccagg agactgccca 540
cctttgcagc ttatcaactc ggttattcaa gatggtgaca tggtcgacac tggctttgga 600
tgtatggact tcaatactct ccaggcttcc aagagcgatg tccccatcga catctgctct 660
tccgtgtgta aatacccaga ttatctgcaa atggcttcag aaccttacgg agactctctg 720
ttcttcttct tgcgcaggga gcagatgttc gttcgtcact ttttcaacag agccggtacc 780
ttgggcgatc ctgtccccgg agacctttat attcaaggtt ccaacagcgg taacacagcc 840
accgtgcagt cttccgcttt cttcccaact ccttcaggca gcatggtgac cagtgaaagc 900
caactcttta ataagcctta ctggttgcag agggctcaag gacacaacaa tggcatctgc 960
tggggtaacc agctgttcgt tacagtcgtc gataccactc gttctaccaa tatgacactg 1020
tgcgccgagg tgaagaagga atccacatac aaaaacgaga atttcaagga atacttgcgt 1080
cacggcgagg aatttgacct tcaattcatc ttccagctct gcaagattac tctcaccgct 1140
gatgttatga catatatcca taagatggac gctaccatcc tggaggattg gcaatttgga 1200
ctgactcccc caccctcagc ttcgttggaa gacacctacc gcttcgtcac aagtactgcc 1260
attacttgtc agaagaacac tccacccaag ggtaaggagg acccacttaa ggagtacatg 1320
ttttgggaag tggatctcaa agagaagttc agcgccgacc tggatcaatt tcctctgggt 1380
cgtaagttcc tcttgcaagc aggactgcaa gctagaccta aactgtaa 1428
<210> 37
<211> 1419
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1ΔN13nt
<400> 37
atgcctccag tacctgtttc taaagtggtc tccactgatg aatacgtctc acgtacctcg 60
atttactatt acgctggtag ttcaagactg ttgacagtcg gccacccata cttttctatc 120
aagaatacgt cctcaggaaa cggtaagaag gtccttgtgc cgaaagtttc gggtctccaa 180
taccgcgtct tccgtatcaa gctgcctgac cccaacaaat tcggcttccc agatactagt 240
ttctataacc cagagaccca gagactggtg tgggcctgca caggactcga aattggcagg 300
ggtcaacctt tgggcgtggg aatcagcggt cacccccttc tcaataagtt cgacgacaca 360
gagacttcta acaaatacgc tggtaagcca ggcatcgaca accgtgaatg cctctccatg 420
gattacaaac agacccaact gtgtattctg ggatgcaagc cgcctatcgg tgagcattgg 480
ggtaaaggca caccttgcaa caataactca ggaaacccag gagactgccc acctttgcag 540
cttatcaact cggttattca agatggtgac atggtcgaca ctggctttgg atgtatggac 600
ttcaatactc tccaggcttc caagagcgat gtccccatcg acatctgctc ttccgtgtgt 660
aaatacccag attatctgca aatggcttca gaaccttacg gagactctct gttcttcttc 720
ttgcgcaggg agcagatgtt cgttcgtcac tttttcaaca gagccggtac cttgggcgat 780
cctgtccccg gagaccttta tattcaaggt tccaacagcg gtaacacagc caccgtgcag 840
tcttccgctt tcttcccaac tccttcaggc agcatggtga ccagtgaaag ccaactcttt 900
aataagcctt actggttgca gagggctcaa ggacacaaca atggcatctg ctggggtaac 960
cagctgttcg ttacagtcgt cgataccact cgttctacca atatgacact gtgcgccgag 1020
gtgaagaagg aatccacata caaaaacgag aatttcaagg aatacttgcg tcacggcgag 1080
gaatttgacc ttcaattcat cttccagctc tgcaagatta ctctcaccgc tgatgttatg 1140
acatatatcc ataagatgga cgctaccatc ctggaggatt ggcaatttgg actgactccc 1200
ccaccctcag cttcgttgga agacacctac cgcttcgtca caagtactgc cattacttgt 1260
cagaagaaca ctccacccaa gggtaaggag gacccactta aggagtacat gttttgggaa 1320
gtggatctca aagagaagtt cagcgccgac ctggatcaat ttcctctggg tcgtaagttc 1380
ctcttgcaag caggactgca agctagacct aaactgtaa 1419
<210> 38
<211> 1413
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1ΔN15nt
<400> 38
atggtacctg tttctaaagt ggtctccact gatgaatacg tctcacgtac ctcgatttac 60
tattacgctg gtagttcaag actgttgaca gtcggccacc catacttttc tatcaagaat 120
acgtcctcag gaaacggtaa gaaggtcctt gtgccgaaag tttcgggtct ccaataccgc 180
gtcttccgta tcaagctgcc tgaccccaac aaattcggct tcccagatac tagtttctat 240
aacccagaga cccagagact ggtgtgggcc tgcacaggac tcgaaattgg caggggtcaa 300
cctttgggcg tgggaatcag cggtcacccc cttctcaata agttcgacga cacagagact 360
tctaacaaat acgctggtaa gccaggcatc gacaaccgtg aatgcctctc catggattac 420
aaacagaccc aactgtgtat tctgggatgc aagccgccta tcggtgagca ttggggtaaa 480
ggcacacctt gcaacaataa ctcaggaaac ccaggagact gcccaccttt gcagcttatc 540
aactcggtta ttcaagatgg tgacatggtc gacactggct ttggatgtat ggacttcaat 600
actctccagg cttccaagag cgatgtcccc atcgacatct gctcttccgt gtgtaaatac 660
ccagattatc tgcaaatggc ttcagaacct tacggagact ctctgttctt cttcttgcgc 720
agggagcaga tgttcgttcg tcactttttc aacagagccg gtaccttggg cgatcctgtc 780
cccggagacc tttatattca aggttccaac agcggtaaca cagccaccgt gcagtcttcc 840
gctttcttcc caactccttc aggcagcatg gtgaccagtg aaagccaact ctttaataag 900
ccttactggt tgcagagggc tcaaggacac aacaatggca tctgctgggg taaccagctg 960
ttcgttacag tcgtcgatac cactcgttct accaatatga cactgtgcgc cgaggtgaag 1020
aaggaatcca catacaaaaa cgagaatttc aaggaatact tgcgtcacgg cgaggaattt 1080
gaccttcaat tcatcttcca gctctgcaag attactctca ccgctgatgt tatgacatat 1140
atccataaga tggacgctac catcctggag gattggcaat ttggactgac tcccccaccc 1200
tcagcttcgt tggaagacac ctaccgcttc gtcacaagta ctgccattac ttgtcagaag 1260
aacactccac ccaagggtaa ggaggaccca cttaaggagt acatgttttg ggaagtggat 1320
ctcaaagaga agttcagcgc cgacctggat caatttcctc tgggtcgtaa gttcctcttg 1380
caagcaggac tgcaagctag acctaaactg taa 1413
<210> 39
<211> 1404
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1ΔN18nt
<400> 39
atgtctaaag tggtctccac tgatgaatac gtctcacgta cctcgattta ctattacgct 60
ggtagttcaa gactgttgac agtcggccac ccatactttt ctatcaagaa tacgtcctca 120
ggaaacggta agaaggtcct tgtgccgaaa gtttcgggtc tccaataccg cgtcttccgt 180
atcaagctgc ctgaccccaa caaattcggc ttcccagata ctagtttcta taacccagag 240
acccagagac tggtgtgggc ctgcacagga ctcgaaattg gcaggggtca acctttgggc 300
gtgggaatca gcggtcaccc ccttctcaat aagttcgacg acacagagac ttctaacaaa 360
tacgctggta agccaggcat cgacaaccgt gaatgcctct ccatggatta caaacagacc 420
caactgtgta ttctgggatg caagccgcct atcggtgagc attggggtaa aggcacacct 480
tgcaacaata actcaggaaa cccaggagac tgcccacctt tgcagcttat caactcggtt 540
attcaagatg gtgacatggt cgacactggc tttggatgta tggacttcaa tactctccag 600
gcttccaaga gcgatgtccc catcgacatc tgctcttccg tgtgtaaata cccagattat 660
ctgcaaatgg cttcagaacc ttacggagac tctctgttct tcttcttgcg cagggagcag 720
atgttcgttc gtcacttttt caacagagcc ggtaccttgg gcgatcctgt ccccggagac 780
ctttatattc aaggttccaa cagcggtaac acagccaccg tgcagtcttc cgctttcttc 840
ccaactcctt caggcagcat ggtgaccagt gaaagccaac tctttaataa gccttactgg 900
ttgcagaggg ctcaaggaca caacaatggc atctgctggg gtaaccagct gttcgttaca 960
gtcgtcgata ccactcgttc taccaatatg acactgtgcg ccgaggtgaa gaaggaatcc 1020
acatacaaaa acgagaattt caaggaatac ttgcgtcacg gcgaggaatt tgaccttcaa 1080
ttcatcttcc agctctgcaa gattactctc accgctgatg ttatgacata tatccataag 1140
atggacgcta ccatcctgga ggattggcaa tttggactga ctcccccacc ctcagcttcg 1200
ttggaagaca cctaccgctt cgtcacaagt actgccatta cttgtcagaa gaacactcca 1260
cccaagggta aggaggaccc acttaaggag tacatgtttt gggaagtgga tctcaaagag 1320
aagttcagcg ccgacctgga tcaatttcct ctgggtcgta agttcctctt gcaagcagga 1380
ctgcaagcta gacctaaact gtaa 1404
<210> 40
<211> 1398
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1ΔN20nt
<400> 40
atggtggtct ccactgatga atacgtctca cgtacctcga tttactatta cgctggtagt 60
tcaagactgt tgacagtcgg ccacccatac ttttctatca agaatacgtc ctcaggaaac 120
ggtaagaagg tccttgtgcc gaaagtttcg ggtctccaat accgcgtctt ccgtatcaag 180
ctgcctgacc ccaacaaatt cggcttccca gatactagtt tctataaccc agagacccag 240
agactggtgt gggcctgcac aggactcgaa attggcaggg gtcaaccttt gggcgtggga 300
atcagcggtc acccccttct caataagttc gacgacacag agacttctaa caaatacgct 360
ggtaagccag gcatcgacaa ccgtgaatgc ctctccatgg attacaaaca gacccaactg 420
tgtattctgg gatgcaagcc gcctatcggt gagcattggg gtaaaggcac accttgcaac 480
aataactcag gaaacccagg agactgccca cctttgcagc ttatcaactc ggttattcaa 540
gatggtgaca tggtcgacac tggctttgga tgtatggact tcaatactct ccaggcttcc 600
aagagcgatg tccccatcga catctgctct tccgtgtgta aatacccaga ttatctgcaa 660
atggcttcag aaccttacgg agactctctg ttcttcttct tgcgcaggga gcagatgttc 720
gttcgtcact ttttcaacag agccggtacc ttgggcgatc ctgtccccgg agacctttat 780
attcaaggtt ccaacagcgg taacacagcc accgtgcagt cttccgcttt cttcccaact 840
ccttcaggca gcatggtgac cagtgaaagc caactcttta ataagcctta ctggttgcag 900
agggctcaag gacacaacaa tggcatctgc tggggtaacc agctgttcgt tacagtcgtc 960
gataccactc gttctaccaa tatgacactg tgcgccgagg tgaagaagga atccacatac 1020
aaaaacgaga atttcaagga atacttgcgt cacggcgagg aatttgacct tcaattcatc 1080
ttccagctct gcaagattac tctcaccgct gatgttatga catatatcca taagatggac 1140
gctaccatcc tggaggattg gcaatttgga ctgactcccc caccctcagc ttcgttggaa 1200
gacacctacc gcttcgtcac aagtactgcc attacttgtc agaagaacac tccacccaag 1260
ggtaaggagg acccacttaa ggagtacatg ttttgggaag tggatctcaa agagaagttc 1320
agcgccgacc tggatcaatt tcctctgggt cgtaagttcc tcttgcaagc aggactgcaa 1380
gctagaccta aactgtaa 1398
<210> 41
<211> 1512
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1CS1nt
<400> 41
atgtccgtgt ggcgtccttc cgaggctact gtgtacttgc ctccagtacc tgtttctaaa 60
gtggtctcca ctgatgaata cgtctcacgt acctcgattt actattacgc tggtagttca 120
agactgttga cagtcggcca cccatacttt tctatcaaga atacgtcctc aggaaacggt 180
aagaaggtcc ttgtgccgaa agtttcgggt ctccaatacc gcgtcttccg tatcaagctg 240
cctgacccca acaaattcgg cttcccagat actagtttct ataacccaga gacccagaga 300
ctggtgtggg cctgcacagg actcgaaatt ggcaggggtc aacctttggg cgtgggaatc 360
agcggtcacc cccttctcaa taagttcgac gacacagaga cttctaacaa atacgctggt 420
aagccaggca tcgacaaccg tgaatgcctc tccatggatt acaaacagac ccaactgtgt 480
attctgggat gcaagccgcc tatcggtgag cattggggta aaggcacacc ttgcaacaat 540
aactcaggaa acccaggaga ctgcccacct ttgcagctta tcaactcggt tattcaagat 600
ggtgacatgg tcgacactgg ctttggatgt atggacttca atactctcca ggcttccaag 660
agcgatgtcc ccatcgacat ctgctcttcc gtgtgtaaat acccagatta tctgcaaatg 720
gcttcagaac cttacggaga ctctctgttc ttcttcttgc gcagggagca gatgttcgtt 780
cgtcactttt tcaacagagc cggtaccttg ggcgatcctg tccccggaga cctttatatt 840
caaggttcca acagcggtaa cacagccacc gtgcagtctt ccgctttctt cccaactcct 900
tcaggcagca tggtgaccag tgaaagccaa ctctttaata agccttactg gttgcagagg 960
gctcaaggac acaacaatgg catctgctgg ggtaaccagc tgttcgttac agtcgtcgat 1020
accactcgtt ctaccaatat gacactgtgc gccgaggtga agaaggaatc cacatacaaa 1080
aacgagaatt tcaaggaata cttgcgtcac ggcgaggaat ttgaccttca attcatcttc 1140
cagctctgca agattactct caccgctgat gttatgacat atatccataa gatggacgct 1200
accatcctgg aggattggca atttggactg actcccccac cctcagcttc gttggaagac 1260
acctaccgct tcgtcacaag tactgccatt acttgtcaga agaacactcc acccaagggt 1320
aaggaggacc cacttaagga gtacatgttt tgggaagtgg atctcaaaga gaagttcagc 1380
gccgacctgg atcaatttcc tctgggtcgt aagttcctct tgcaagcagg actgcaagct 1440
cgtcctggac tgaaaggtcc tgcatcgagc gctcctagaa cgtcgacgga cggctcggga 1500
gtgggacgct aa 1512
<210> 42
<211> 1512
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1CS2nt
<400> 42
atgtccgtgt ggcgtccttc cgaggctact gtgtacttgc ctccagtacc tgtttctaaa 60
gtggtctcca ctgatgaata cgtctcacgt acctcgattt actattacgc tggtagttca 120
agactgttga cagtcggcca cccatacttt tctatcaaga atacgtcctc aggaaacggt 180
aagaaggtcc ttgtgccgaa agtttcgggt ctccaatacc gcgtcttccg tatcaagctg 240
cctgacccca acaaattcgg cttcccagat actagtttct ataacccaga gacccagaga 300
ctggtgtggg cctgcacagg actcgaaatt ggcaggggtc aacctttggg cgtgggaatc 360
agcggtcacc cccttctcaa taagttcgac gacacagaga cttctaacaa atacgctggt 420
aagccaggca tcgacaaccg tgaatgcctc tccatggatt acaaacagac ccaactgtgt 480
attctgggat gcaagccgcc tatcggtgag cattggggta aaggcacacc ttgcaacaat 540
aactcaggaa acccaggaga ctgcccacct ttgcagctta tcaactcggt tattcaagat 600
ggtgacatgg tcgacactgg ctttggatgt atggacttca atactctcca ggcttccaag 660
agcgatgtcc ccatcgacat ctgctcttcc gtgtgtaaat acccagatta tctgcaaatg 720
gcttcagaac cttacggaga ctctctgttc ttcttcttgc gcagggagca gatgttcgtt 780
cgtcactttt tcaacagagc cggtaccttg ggcgatcctg tccccggaga cctttatatt 840
caaggttcca acagcggtaa cacagccacc gtgcagtctt ccgctttctt cccaactcct 900
tcaggcagca tggtgaccag tgaaagccaa ctctttaata agccttactg gttgcagagg 960
gctcaaggac acaacaatgg catctgctgg ggtaaccagc tgttcgttac agtcgtcgat 1020
accactcgtt ctaccaatat gacactgtgc gccgaggtga agaaggaatc cacatacaaa 1080
aacgagaatt tcaaggaata cttgcgtcac ggcgaggaat ttgaccttca attcatcttc 1140
cagctctgca agattactct caccgctgat gttatgacat atatccataa gatggacgct 1200
accatcctgg aggattggca atttggactg actcccccac cctcagcttc gttggaagac 1260
acctaccgct tcgtcacaag tactgccatt acttgtcaga agaacactcc acccaagggt 1320
aaggaggacc cacttaagga gtacatgttt tgggaagtgg atctcaaaga gaagttcagc 1380
gccgacctgg atcaatttcc tctgggtcgt aagttcctct tgcaagcagg actgcaagct 1440
cgtcctggac tgaaaggtcc tgcatcgagc gctcctagaa cgtcgacgga cggctcggga 1500
gtggacggct aa 1512
<210> 43
<211> 1512
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1CS3nt
<400> 43
atgtccgtgt ggcgtccttc cgaggctact gtgtacttgc ctccagtacc tgtttctaaa 60
gtggtctcca ctgatgaata cgtctcacgt acctcgattt actattacgc tggtagttca 120
agactgttga cagtcggcca cccatacttt tctatcaaga atacgtcctc aggaaacggt 180
aagaaggtcc ttgtgccgaa agtttcgggt ctccaatacc gcgtcttccg tatcaagctg 240
cctgacccca acaaattcgg cttcccagat actagtttct ataacccaga gacccagaga 300
ctggtgtggg cctgcacagg actcgaaatt ggcaggggtc aacctttggg cgtgggaatc 360
agcggtcacc cccttctcaa taagttcgac gacacagaga cttctaacaa atacgctggt 420
aagccaggca tcgacaaccg tgaatgcctc tccatggatt acaaacagac ccaactgtgt 480
attctgggat gcaagccgcc tatcggtgag cattggggta aaggcacacc ttgcaacaat 540
aactcaggaa acccaggaga ctgcccacct ttgcagctta tcaactcggt tattcaagat 600
ggtgacatgg tcgacactgg ctttggatgt atggacttca atactctcca ggcttccaag 660
agcgatgtcc ccatcgacat ctgctcttcc gtgtgtaaat acccagatta tctgcaaatg 720
gcttcagaac cttacggaga ctctctgttc ttcttcttgc gcagggagca gatgttcgtt 780
cgtcactttt tcaacagagc cggtaccttg ggcgatcctg tccccggaga cctttatatt 840
caaggttcca acagcggtaa cacagccacc gtgcagtctt ccgctttctt cccaactcct 900
tcaggcagca tggtgaccag tgaaagccaa ctctttaata agccttactg gttgcagagg 960
gctcaaggac acaacaatgg catctgctgg ggtaaccagc tgttcgttac agtcgtcgat 1020
accactcgtt ctaccaatat gacactgtgc gccgaggtga agaaggaatc cacatacaaa 1080
aacgagaatt tcaaggaata cttgcgtcac ggcgaggaat ttgaccttca attcatcttc 1140
cagctctgca agattactct caccgctgat gttatgacat atatccataa gatggacgct 1200
accatcctgg aggattggca atttggactg actcccccac cctcagcttc gttggaagac 1260
acctaccgct tcgtcacaag tactgccatt acttgtcaga agaacactcc acccaagggt 1320
aaggaggacc cacttaagga gtacatgttt tgggaagtgg atctcaaaga gaagttcagc 1380
gccgacctgg atcaatttcc tctgggtcgt aagttcctct tgcaagcagg actgcaagct 1440
cgtcctggac tgggatcgcc tgcatcgagc gctcctagaa cgtcgacgga cggctcggga 1500
gtgaaacgct aa 1512
<210> 44
<211> 1512
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1CS4nt
<400> 44
atgtccgtgt ggcgtccttc cgaggctact gtgtacttgc ctccagtacc tgtttctaaa 60
gtggtctcca ctgatgaata cgtctcacgt acctcgattt actattacgc tggtagttca 120
agactgttga cagtcggcca cccatacttt tctatcaaga atacgtcctc aggaaacggt 180
aagaaggtcc ttgtgccgaa agtttcgggt ctccaatacc gcgtcttccg tatcaagctg 240
cctgacccca acaaattcgg cttcccagat actagtttct ataacccaga gacccagaga 300
ctggtgtggg cctgcacagg actcgaaatt ggcaggggtc aacctttggg cgtgggaatc 360
agcggtcacc cccttctcaa taagttcgac gacacagaga cttctaacaa atacgctggt 420
aagccaggca tcgacaaccg tgaatgcctc tccatggatt acaaacagac ccaactgtgt 480
attctgggat gcaagccgcc tatcggtgag cattggggta aaggcacacc ttgcaacaat 540
aactcaggaa acccaggaga ctgcccacct ttgcagctta tcaactcggt tattcaagat 600
ggtgacatgg tcgacactgg ctttggatgt atggacttca atactctcca ggcttccaag 660
agcgatgtcc ccatcgacat ctgctcttcc gtgtgtaaat acccagatta tctgcaaatg 720
gcttcagaac cttacggaga ctctctgttc ttcttcttgc gcagggagca gatgttcgtt 780
cgtcactttt tcaacagagc cggtaccttg ggcgatcctg tccccggaga cctttatatt 840
caaggttcca acagcggtaa cacagccacc gtgcagtctt ccgctttctt cccaactcct 900
tcaggcagca tggtgaccag tgaaagccaa ctctttaata agccttactg gttgcagagg 960
gctcaaggac acaacaatgg catctgctgg ggtaaccagc tgttcgttac agtcgtcgat 1020
accactcgtt ctaccaatat gacactgtgc gccgaggtga agaaggaatc cacatacaaa 1080
aacgagaatt tcaaggaata cttgcgtcac ggcgaggaat ttgaccttca attcatcttc 1140
cagctctgca agattactct caccgctgat gttatgacat atatccataa gatggacgct 1200
accatcctgg aggattggca atttggactg actcccccac cctcagcttc gttggaagac 1260
acctaccgct tcgtcacaag tactgccatt acttgtcaga agaacactcc acccaagggt 1320
aaggaggacc cacttaagga gtacatgttt tgggaagtgg atctcaaaga gaagttcagc 1380
gccgacctgg atcaatttcc tctgggtcgt aagttcctct tgcaagcagg actgcaagct 1440
cgtcctggac tgggatcgcc tgcatcgagc gctcctagaa cgtcgacgga cggctcggga 1500
gtggaccgct aa 1512
<210> 45
<211> 1512
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1CS5nt
<400> 45
atgtccgtgt ggcgtccttc cgaggctact gtgtacttgc ctccagtacc tgtttctaaa 60
gtggtctcca ctgatgaata cgtctcacgt acctcgattt actattacgc tggtagttca 120
agactgttga cagtcggcca cccatacttt tctatcaaga atacgtcctc aggaaacggt 180
aagaaggtcc ttgtgccgaa agtttcgggt ctccaatacc gcgtcttccg tatcaagctg 240
cctgacccca acaaattcgg cttcccagat actagtttct ataacccaga gacccagaga 300
ctggtgtggg cctgcacagg actcgaaatt ggcaggggtc aacctttggg cgtgggaatc 360
agcggtcacc cccttctcaa taagttcgac gacacagaga cttctaacaa atacgctggt 420
aagccaggca tcgacaaccg tgaatgcctc tccatggatt acaaacagac ccaactgtgt 480
attctgggat gcaagccgcc tatcggtgag cattggggta aaggcacacc ttgcaacaat 540
aactcaggaa acccaggaga ctgcccacct ttgcagctta tcaactcggt tattcaagat 600
ggtgacatgg tcgacactgg ctttggatgt atggacttca atactctcca ggcttccaag 660
agcgatgtcc ccatcgacat ctgctcttcc gtgtgtaaat acccagatta tctgcaaatg 720
gcttcagaac cttacggaga ctctctgttc ttcttcttgc gcagggagca gatgttcgtt 780
cgtcactttt tcaacagagc cggtaccttg ggcgatcctg tccccggaga cctttatatt 840
caaggttcca acagcggtaa cacagccacc gtgcagtctt ccgctttctt cccaactcct 900
tcaggcagca tggtgaccag tgaaagccaa ctctttaata agccttactg gttgcagagg 960
gctcaaggac acaacaatgg catctgctgg ggtaaccagc tgttcgttac agtcgtcgat 1020
accactcgtt ctaccaatat gacactgtgc gccgaggtga agaaggaatc cacatacaaa 1080
aacgagaatt tcaaggaata cttgcgtcac ggcgaggaat ttgaccttca attcatcttc 1140
cagctctgca agattactct caccgctgat gttatgacat atatccataa gatggacgct 1200
accatcctgg aggattggca atttggactg actcccccac cctcagcttc gttggaagac 1260
acctaccgct tcgtcacaag tactgccatt acttgtcaga agaacactcc acccaagggt 1320
aaggaggacc cacttaagga gtacatgttt tgggaagtgg atctcaaaga gaagttcagc 1380
gccgacctgg atcaatttcc tctgggtcgt aagttcctct tgcaagcagg actgcaagct 1440
agacctaaac tggctggtcc tgcctcttcc gcacccgcga cttcaaccgc tgccggcgga 1500
gttgggtcgt aa 1512
<210> 46
<211> 1512
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1CS6nt
<400> 46
atgtccgtgt ggcgtccttc cgaggctact gtgtacttgc ctccagtacc tgtttctaaa 60
gtggtctcca ctgatgaata cgtctcacgt acctcgattt actattacgc tggtagttca 120
agactgttga cagtcggcca cccatacttt tctatcaaga atacgtcctc aggaaacggt 180
aagaaggtcc ttgtgccgaa agtttcgggt ctccaatacc gcgtcttccg tatcaagctg 240
cctgacccca acaaattcgg cttcccagat actagtttct ataacccaga gacccagaga 300
ctggtgtggg cctgcacagg actcgaaatt ggcaggggtc aacctttggg cgtgggaatc 360
agcggtcacc cccttctcaa taagttcgac gacacagaga cttctaacaa atacgctggt 420
aagccaggca tcgacaaccg tgaatgcctc tccatggatt acaaacagac ccaactgtgt 480
attctgggat gcaagccgcc tatcggtgag cattggggta aaggcacacc ttgcaacaat 540
aactcaggaa acccaggaga ctgcccacct ttgcagctta tcaactcggt tattcaagat 600
ggtgacatgg tcgacactgg ctttggatgt atggacttca atactctcca ggcttccaag 660
agcgatgtcc ccatcgacat ctgctcttcc gtgtgtaaat acccagatta tctgcaaatg 720
gcttcagaac cttacggaga ctctctgttc ttcttcttgc gcagggagca gatgttcgtt 780
cgtcactttt tcaacagagc cggtaccttg ggcgatcctg tccccggaga cctttatatt 840
caaggttcca acagcggtaa cacagccacc gtgcagtctt ccgctttctt cccaactcct 900
tcaggcagca tggtgaccag tgaaagccaa ctctttaata agccttactg gttgcagagg 960
gctcaaggac acaacaatgg catctgctgg ggtaaccagc tgttcgttac agtcgtcgat 1020
accactcgtt ctaccaatat gacactgtgc gccgaggtga agaaggaatc cacatacaaa 1080
aacgagaatt tcaaggaata cttgcgtcac ggcgaggaat ttgaccttca attcatcttc 1140
cagctctgca agattactct caccgctgat gttatgacat atatccataa gatggacgct 1200
accatcctgg aggattggca atttggactg actcccccac cctcagcttc gttggaagac 1260
acctaccgct tcgtcacaag tactgccatt acttgtcaga agaacactcc acccaagggt 1320
aaggaggacc cacttaagga gtacatgttt tgggaagtgg atctcaaaga gaagttcagc 1380
gccgacctgg atcaatttcc tctgggtcgt aagttcctct tgcaagcagg actgcaagct 1440
agacctaaac tggaagctcc tgcctcttcc gcacccggta cttcaaccgg ctcgaaagcg 1500
gttgctggat aa 1512
<210> 47
<211> 1512
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1CS7nt
<400> 47
atgtccgtgt ggcgtccttc cgaggctact gtgtacttgc ctccagtacc tgtttctaaa 60
gtggtctcca ctgatgaata cgtctcacgt acctcgattt actattacgc tggtagttca 120
agactgttga cagtcggcca cccatacttt tctatcaaga atacgtcctc aggaaacggt 180
aagaaggtcc ttgtgccgaa agtttcgggt ctccaatacc gcgtcttccg tatcaagctg 240
cctgacccca acaaattcgg cttcccagat actagtttct ataacccaga gacccagaga 300
ctggtgtggg cctgcacagg actcgaaatt ggcaggggtc aacctttggg cgtgggaatc 360
agcggtcacc cccttctcaa taagttcgac gacacagaga cttctaacaa atacgctggt 420
aagccaggca tcgacaaccg tgaatgcctc tccatggatt acaaacagac ccaactgtgt 480
attctgggat gcaagccgcc tatcggtgag cattggggta aaggcacacc ttgcaacaat 540
aactcaggaa acccaggaga ctgcccacct ttgcagctta tcaactcggt tattcaagat 600
ggtgacatgg tcgacactgg ctttggatgt atggacttca atactctcca ggcttccaag 660
agcgatgtcc ccatcgacat ctgctcttcc gtgtgtaaat acccagatta tctgcaaatg 720
gcttcagaac cttacggaga ctctctgttc ttcttcttgc gcagggagca gatgttcgtt 780
cgtcactttt tcaacagagc cggtaccttg ggcgatcctg tccccggaga cctttatatt 840
caaggttcca acagcggtaa cacagccacc gtgcagtctt ccgctttctt cccaactcct 900
tcaggcagca tggtgaccag tgaaagccaa ctctttaata agccttactg gttgcagagg 960
gctcaaggac acaacaatgg catctgctgg ggtaaccagc tgttcgttac agtcgtcgat 1020
accactcgtt ctaccaatat gacactgtgc gccgaggtga agaaggaatc cacatacaaa 1080
aacgagaatt tcaaggaata cttgcgtcac ggcgaggaat ttgaccttca attcatcttc 1140
cagctctgca agattactct caccgctgat gttatgacat atatccataa gatggacgct 1200
accatcctgg aggattggca atttggactg actcccccac cctcagcttc gttggaagac 1260
acctaccgct tcgtcacaag tactgccatt acttgtcaga agaacactcc acccaagggt 1320
aaggaggacc cacttaagga gtacatgttt tgggaagtgg atctcaaaga gaagttcagc 1380
gccgacctgg atcaatttcc tctgggtcgt aagttcctct tgcaagcagg actgcaagct 1440
agacctaaac tggctggtcc tgcttcctca gctccagcta cctcaaccga cggttctggt 1500
gtgaagcgct aa 1512
<210> 48
<211> 1512
<212> DNA
<213> Artificial Sequence
<220>
<223> 52L1CS8nt
<400> 48
atgtccgtgt ggcgtccttc cgaggctact gtgtacttgc ctccagtacc tgtttctaaa 60
gtggtctcca ctgatgaata cgtctcacgt acctcgattt actattacgc tggtagttca 120
agactgttga cagtcggcca cccatacttt tctatcaaga atacgtcctc aggaaacggt 180
aagaaggtcc ttgtgccgaa agtttcgggt ctccaatacc gcgtcttccg tatcaagctg 240
cctgacccca acaaattcgg cttcccagat actagtttct ataacccaga gacccagaga 300
ctggtgtggg cctgcacagg actcgaaatt ggcaggggtc aacctttggg cgtgggaatc 360
agcggtcacc cccttctcaa taagttcgac gacacagaga cttctaacaa atacgctggt 420
aagccaggca tcgacaaccg tgaatgcctc tccatggatt acaaacagac ccaactgtgt 480
attctgggat gcaagccgcc tatcggtgag cattggggta aaggcacacc ttgcaacaat 540
aactcaggaa acccaggaga ctgcccacct ttgcagctta tcaactcggt tattcaagat 600
ggtgacatgg tcgacactgg ctttggatgt atggacttca atactctcca ggcttccaag 660
agcgatgtcc ccatcgacat ctgctcttcc gtgtgtaaat acccagatta tctgcaaatg 720
gcttcagaac cttacggaga ctctctgttc ttcttcttgc gcagggagca gatgttcgtt 780
cgtcactttt tcaacagagc cggtaccttg ggcgatcctg tccccggaga cctttatatt 840
caaggttcca acagcggtaa cacagccacc gtgcagtctt ccgctttctt cccaactcct 900
tcaggcagca tggtgaccag tgaaagccaa ctctttaata agccttactg gttgcagagg 960
gctcaaggac acaacaatgg catctgctgg ggtaaccagc tgttcgttac agtcgtcgat 1020
accactcgtt ctaccaatat gacactgtgc gccgaggtga agaaggaatc cacatacaaa 1080
aacgagaatt tcaaggaata cttgcgtcac ggcgaggaat ttgaccttca attcatcttc 1140
cagctctgca agattactct caccgctgat gttatgacat atatccataa gatggacgct 1200
accatcctgg aggattggca atttggactg actcccccac cctcagcttc gttggaagac 1260
acctaccgct tcgtcacaag tactgccatt acttgtcaga agaacactcc acccaagggt 1320
aaggaggacc cacttaagga gtacatgttt tgggaagtgg atctcaaaga gaagttcagc 1380
gccgacctgg atcaatttcc tctgggtcgt aagttcctct tgcaagcagg actgcaagct 1440
agacctaaac tggctggtcc tgcttcctca gctccacgta cctcaaccga cggttctggt 1500
gtgaagcgct aa 1512
<210> 49
<211> 1512
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<220>
<223> 52L1CS9nt
<400> 49
atgtccgtgt ggcgtccttc cgaggctact gtgtacttgc ctccagtacc tgtttctaaa 60
gtggtctcca ctgatgaata cgtctcacgt acctcgattt actattacgc tggtagttca 120
agactgttga cagtcggcca cccatacttt tctatcaaga atacgtcctc aggaaacggt 180
aagaaggtcc ttgtgccgaa agtttcgggt ctccaatacc gcgtcttccg tatcaagctg 240
cctgacccca acaaattcgg cttcccagat actagtttct ataacccaga gacccagaga 300
ctggtgtggg cctgcacagg actcgaaatt ggcaggggtc aacctttggg cgtgggaatc 360
agcggtcacc cccttctcaa taagttcgac gacacagaga cttctaacaa atacgctggt 420
aagccaggca tcgacaaccg tgaatgcctc tccatggatt acaaacagac ccaactgtgt 480
attctgggat gcaagccgcc tatcggtgag cattggggta aaggcacacc ttgcaacaat 540
aactcaggaa acccaggaga ctgcccacct ttgcagctta tcaactcggt tattcaagat 600
ggtgacatgg tcgacactgg ctttggatgt atggacttca atactctcca ggcttccaag 660
agcgatgtcc ccatcgacat ctgctcttcc gtgtgtaaat acccagatta tctgcaaatg 720
gcttcagaac cttacggaga ctctctgttc ttcttcttgc gcagggagca gatgttcgtt 780
cgtcactttt tcaacagagc cggtaccttg ggcgatcctg tccccggaga cctttatatt 840
caaggttcca acagcggtaa cacagccacc gtgcagtctt ccgctttctt cccaactcct 900
tcaggcagca tggtgaccag tgaaagccaa ctctttaata agccttactg gttgcagagg 960
gctcaaggac acaacaatgg catctgctgg ggtaaccagc tgttcgttac agtcgtcgat 1020
accactcgtt ctaccaatat gacactgtgc gccgaggtga agaaggaatc cacatacaaa 1080
aacgagaatt tcaaggaata cttgcgtcac ggcgaggaat ttgaccttca attcatcttc 1140
cagctctgca agattactct caccgctgat gttatgacat atatccataa gatggacgct 1200
accatcctgg aggattggca atttggactg actcccccac cctcagcttc gttggaagac 1260
acctaccgct tcgtcacaag tactgccatt acttgtcaga agaacactcc acccaagggt 1320
aaggaggacc cacttaagga gtacatgttt tgggaagtgg atctcaaaga gaagttcagc 1380
gccgacctgg atcaatttcc tctgggtcgt aagttcctct tgcaagcagg actgcaagcg 1440
ggtcctggct tgtcgggtcc tgcctcgagc gcccctagaa cgtcgacggg tggctcggcc 1500
gtgggtagct aa 1512
<210> 50
<211> 1476
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<220>
<223> 52L1∆N13CS1nt
<400> 50
atgcctccag tacctgtttc taaagtggtc tccactgatg aatacgtctc acgtacctcg 60
atttactatt acgctggtag ttcaagactg ttgacagtcg gccacccata cttttctatc 120
aagaatacgt cctcaggaaa cggtaagaag gtccttgtgc cgaaagtttc gggtctccaa 180
taccgcgtct tccgtatcaa gctgcctgac cccaacaaat tcggcttccc agatactagt 240
ttctataacc cagagaccca gagactggtg tgggcctgca caggactcga aattggcagg 300
ggtcaacctt tgggcgtggg aatcagcggt cacccccttc tcaataagtt cgacgacaca 360
gagacttcta acaaatacgc tggtaagcca ggcatcgaca accgtgaatg cctctccatg 420
gattacaaac agacccaact gtgtattctg ggatgcaagc cgcctatcgg tgagcattgg 480
ggtaaaggca caccttgcaa caataactca ggaaacccag gagactgccc acctttgcag 540
cttatcaact cggttattca agatggtgac atggtcgaca ctggctttgg atgtatggac 600
ttcaatactc tccaggcttc caagagcgat gtccccatcg acatctgctc ttccgtgtgt 660
aaatacccag attatctgca aatggcttca gaaccttacg gagactctct gttcttcttc 720
ttgcgcaggg agcagatgtt cgttcgtcac tttttcaaca gagccggtac cttgggcgat 780
cctgtccccg gagaccttta tattcaaggt tccaacagcg gtaacacagc caccgtgcag 840
tcttccgctt tcttcccaac tccttcaggc agcatggtga ccagtgaaag ccaactcttt 900
aataagcctt actggttgca gagggctcaa ggacacaaca atggcatctg ctggggtaac 960
cagctgttcg ttacagtcgt cgataccact cgttctacca atatgacact gtgcgccgag 1020
gtgaagaagg aatccacata caaaaacgag aatttcaagg aatacttgcg tcacggcgag 1080
gaatttgacc ttcaattcat cttccagctc tgcaagatta ctctcaccgc tgatgttatg 1140
acatatatcc ataagatgga cgctaccatc ctggaggatt ggcaatttgg actgactccc 1200
ccaccctcag cttcgttgga agacacctac cgcttcgtca caagtactgc cattacttgt 1260
cagaagaaca ctccacccaa gggtaaggag gacccactta aggagtacat gttttgggaa 1320
gtggatctca aagagaagtt cagcgccgac ctggatcaat ttcctctggg tcgtaagttc 1380
ctcttgcaag caggactgca agcgagacct ggcttgtcgg gtcctgcctc gagcgcccct 1440
agaacgtcga cgggtggctc ggccgtgggt agctaa 1476
<210> 51
<211> 1476
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<220>
<223> 52L1∆N13CS2nt
<400> 51
atgcctccag tacctgtttc taaagtggtc tccactgatg aatacgtctc acgtacctcg 60
atttactatt acgctggtag ttcaagactg ttgacagtcg gccacccata cttttctatc 120
aagaatacgt cctcaggaaa cggtaagaag gtccttgtgc cgaaagtttc gggtctccaa 180
taccgcgtct tccgtatcaa gctgcctgac cccaacaaat tcggcttccc agatactagt 240
ttctataacc cagagaccca gagactggtg tgggcctgca caggactcga aattggcagg 300
ggtcaacctt tgggcgtggg aatcagcggt cacccccttc tcaataagtt cgacgacaca 360
gagacttcta acaaatacgc tggtaagcca ggcatcgaca accgtgaatg cctctccatg 420
gattacaaac agacccaact gtgtattctg ggatgcaagc cgcctatcgg tgagcattgg 480
ggtaaaggca caccttgcaa caataactca ggaaacccag gagactgccc acctttgcag 540
cttatcaact cggttattca agatggtgac atggtcgaca ctggctttgg atgtatggac 600
ttcaatactc tccaggcttc caagagcgat gtccccatcg acatctgctc ttccgtgtgt 660
aaatacccag attatctgca aatggcttca gaaccttacg gagactctct gttcttcttc 720
ttgcgcaggg agcagatgtt cgttcgtcac tttttcaaca gagccggtac cttgggcgat 780
cctgtccccg gagaccttta tattcaaggt tccaacagcg gtaacacagc caccgtgcag 840
tcttccgctt tcttcccaac tccttcaggc agcatggtga ccagtgaaag ccaactcttt 900
aataagcctt actggttgca gagggctcaa ggacacaaca atggcatctg ctggggtaac 960
cagctgttcg ttacagtcgt cgataccact cgttctacca atatgacact gtgcgccgag 1020
gtgaagaagg aatccacata caaaaacgag aatttcaagg aatacttgcg tcacggcgag 1080
gaatttgacc ttcaattcat cttccagctc tgcaagatta ctctcaccgc tgatgttatg 1140
acatatatcc ataagatgga cgctaccatc ctggaggatt ggcaatttgg actgactccc 1200
ccaccctcag cttcgttgga agacacctac cgcttcgtca caagtactgc cattacttgt 1260
cagaagaaca ctccacccaa gggtaaggag gacccactta aggagtacat gttttgggaa 1320
gtggatctca aagagaagtt cagcgccgac ctggatcaat ttcctctggg tcgtaagttc 1380
ctcttgcaag caggactgca agcgggtcct ggcttgtcgg gtcctgcctc gagcgcccct 1440
agaacgtcga cgggtggctc ggccgtgggt agctaa 1476
<210> 52
<211> 1476
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<220>
<223> 52L1∆N13CS3nt
<400> 52
atgcctccag tacctgtttc taaagtggtc tccactgatg aatacgtctc acgtacctcg 60
atttactatt acgctggtag ttcaagactg ttgacagtcg gccacccata cttttctatc 120
aagaatacgt cctcaggaaa cggtaagaag gtccttgtgc cgaaagtttc gggtctccaa 180
taccgcgtct tccgtatcaa gctgcctgac cccaacaaat tcggcttccc agatactagt 240
ttctataacc cagagaccca gagactggtg tgggcctgca caggactcga aattggcagg 300
ggtcaacctt tgggcgtggg aatcagcggt cacccccttc tcaataagtt cgacgacaca 360
gagacttcta acaaatacgc tggtaagcca ggcatcgaca accgtgaatg cctctccatg 420
gattacaaac agacccaact gtgtattctg ggatgcaagc cgcctatcgg tgagcattgg 480
ggtaaaggca caccttgcaa caataactca ggaaacccag gagactgccc acctttgcag 540
cttatcaact cggttattca agatggtgac atggtcgaca ctggctttgg atgtatggac 600
ttcaatactc tccaggcttc caagagcgat gtccccatcg acatctgctc ttccgtgtgt 660
aaatacccag attatctgca aatggcttca gaaccttacg gagactctct gttcttcttc 720
ttgcgcaggg agcagatgtt cgttcgtcac tttttcaaca gagccggtac cttgggcgat 780
cctgtccccg gagaccttta tattcaaggt tccaacagcg gtaacacagc caccgtgcag 840
tcttccgctt tcttcccaac tccttcaggc agcatggtga ccagtgaaag ccaactcttt 900
aataagcctt actggttgca gagggctcaa ggacacaaca atggcatctg ctggggtaac 960
cagctgttcg ttacagtcgt cgataccact cgttctacca atatgacact gtgcgccgag 1020
gtgaagaagg aatccacata caaaaacgag aatttcaagg aatacttgcg tcacggcgag 1080
gaatttgacc ttcaattcat cttccagctc tgcaagatta ctctcaccgc tgatgttatg 1140
acatatatcc ataagatgga cgctaccatc ctggaggatt ggcaatttgg actgactccc 1200
ccaccctcag cttcgttgga agacacctac cgcttcgtca caagtactgc cattacttgt 1260
cagaagaaca ctccacccaa gggtaaggag gacccactta aggagtacat gttttgggaa 1320
gtggatctca aagagaagtt cagcgccgac ctggatcaat ttcctctggg tcgtaagttc 1380
ctcttgcaag caggactgca agctagacct aaactggccg gtcctgcctc gagcgcccct 1440
gccacgtcga cggctgcggg aggcgtgggt agctaa 1476
<210> 53
<211> 1434
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<220>
<223> 52L1NS1∆C19nt
<400> 53
atgcctagcg aggctacccc tccagtacct gtttctaaag tggtctccac tgatgaatac 60
gtctcacgta cctcgattta ctattacgct ggtagttcaa gactgttgac agtcggccac 120
ccatactttt ctatcaagaa tacgtcctca ggaaacggta agaaggtcct tgtgccgaaa 180
gtttcgggtc tccaataccg cgtcttccgt atcaagctgc ctgaccccaa caaattcggc 240
ttcccagata ctagtttcta taacccagag acccagagac tggtgtgggc ctgcacagga 300
ctcgaaattg gcaggggtca acctttgggc gtgggaatca gcggtcaccc ccttctcaat 360
aagttcgacg acacagagac ttctaacaaa tacgctggta agccaggcat cgacaaccgt 420
gaatgcctct ccatggatta caaacagacc caactgtgta ttctgggatg caagccgcct 480
atcggtgagc attggggtaa aggcacacct tgcaacaata actcaggaaa cccaggagac 540
tgcccacctt tgcagcttat caactcggtt attcaagatg gtgacatggt cgacactggc 600
tttggatgta tggacttcaa tactctccag gcttccaaga gcgatgtccc catcgacatc 660
tgctcttccg tgtgtaaata cccagattat ctgcaaatgg cttcagaacc ttacggagac 720
tctctgttct tcttcttgcg cagggagcag atgttcgttc gtcacttttt caacagagcc 780
ggtaccttgg gcgatcctgt ccccggagac ctttatattc aaggttccaa cagcggtaac 840
acagccaccg tgcagtcttc cgctttcttc ccaactcctt caggcagcat ggtgaccagt 900
gaaagccaac tctttaataa gccttactgg ttgcagaggg ctcaaggaca caacaatggc 960
atctgctggg gtaaccagct gttcgttaca gtcgtcgata ccactcgttc taccaatatg 1020
acactgtgcg ccgaggtgaa gaaggaatcc acatacaaaa acgagaattt caaggaatac 1080
ttgcgtcacg gcgaggaatt tgaccttcaa ttcatcttcc agctctgcaa gattactctc 1140
accgctgatg ttatgacata tatccataag atggacgcta ccatcctgga ggattggcaa 1200
tttggactga ctcccccacc ctcagcttcg ttggaagaca cctaccgctt cgtcacaagt 1260
actgccatta cttgtcagaa gaacactcca cccaagggta aggaggaccc acttaaggag 1320
tacatgtttt gggaagtgga tctcaaagag aagttcagcg ccgacctgga tcaatttcct 1380
ctgggtcgta agttcctctt gcaagcagga ctgcaagcta gacctaaact gtaa 1434
<210> 54
<211> 1416
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<220>
<223> 52L1NS1∆C25
<400> 54
atgcctagcg aggctacccc tccagtacct gtttctaaag tggtctccac tgatgaatac 60
gtctcacgta cctcgattta ctattacgct ggtagttcaa gactgttgac agtcggccac 120
ccatactttt ctatcaagaa tacgtcctca ggaaacggta agaaggtcct tgtgccgaaa 180
gtttcgggtc tccaataccg cgtcttccgt atcaagctgc ctgaccccaa caaattcggc 240
ttcccagata ctagtttcta taacccagag acccagagac tggtgtgggc ctgcacagga 300
ctcgaaattg gcaggggtca acctttgggc gtgggaatca gcggtcaccc ccttctcaat 360
aagttcgacg acacagagac ttctaacaaa tacgctggta agccaggcat cgacaaccgt 420
gaatgcctct ccatggatta caaacagacc caactgtgta ttctgggatg caagccgcct 480
atcggtgagc attggggtaa aggcacacct tgcaacaata actcaggaaa cccaggagac 540
tgcccacctt tgcagcttat caactcggtt attcaagatg gtgacatggt cgacactggc 600
tttggatgta tggacttcaa tactctccag gcttccaaga gcgatgtccc catcgacatc 660
tgctcttccg tgtgtaaata cccagattat ctgcaaatgg cttcagaacc ttacggagac 720
tctctgttct tcttcttgcg cagggagcag atgttcgttc gtcacttttt caacagagcc 780
ggtaccttgg gcgatcctgt ccccggagac ctttatattc aaggttccaa cagcggtaac 840
acagccaccg tgcagtcttc cgctttcttc ccaactcctt caggcagcat ggtgaccagt 900
gaaagccaac tctttaataa gccttactgg ttgcagaggg ctcaaggaca caacaatggc 960
atctgctggg gtaaccagct gttcgttaca gtcgtcgata ccactcgttc taccaatatg 1020
acactgtgcg ccgaggtgaa gaaggaatcc acatacaaaa acgagaattt caaggaatac 1080
ttgcgtcacg gcgaggaatt tgaccttcaa ttcatcttcc agctctgcaa gattactctc 1140
accgctgatg ttatgacata tatccataag atggacgcta ccatcctgga ggattggcaa 1200
tttggactga ctcccccacc ctcagcttcg ttggaagaca cctaccgctt cgtcacaagt 1260
actgccatta cttgtcagaa gaacactcca cccaagggta aggaggaccc acttaaggag 1320
tacatgtttt gggaagtgga tctcaaagag aagttcagcg ccgacctgga tcaatttcct 1380
ctgggtcgta agttcctctt gcaagcagga ctgtaa 1416
<210> 55
<211> 1428
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<220>
<223> 52L1NS2∆C19nt
<400> 55
atgtccgagc gtcctccagt acctgtttct aaagtggtct ccactgatga atacgtctca 60
cgtacctcga tttactatta cgctggtagt tcaagactgt tgacagtcgg ccacccatac 120
ttttctatca agaatacgtc ctcaggaaac ggtaagaagg tccttgtgcc gaaagtttcg 180
ggtctccaat accgcgtctt ccgtatcaag ctgcctgacc ccaacaaatt cggcttccca 240
gatactagtt tctataaccc agagacccag agactggtgt gggcctgcac aggactcgaa 300
attggcaggg gtcaaccttt gggcgtggga atcagcggtc acccccttct caataagttc 360
gacgacacag agacttctaa caaatacgct ggtaagccag gcatcgacaa ccgtgaatgc 420
ctctccatgg attacaaaca gacccaactg tgtattctgg gatgcaagcc gcctatcggt 480
gagcattggg gtaaaggcac accttgcaac aataactcag gaaacccagg agactgccca 540
cctttgcagc ttatcaactc ggttattcaa gatggtgaca tggtcgacac tggctttgga 600
tgtatggact tcaatactct ccaggcttcc aagagcgatg tccccatcga catctgctct 660
tccgtgtgta aatacccaga ttatctgcaa atggcttcag aaccttacgg agactctctg 720
ttcttcttct tgcgcaggga gcagatgttc gttcgtcact ttttcaacag agccggtacc 780
ttgggcgatc ctgtccccgg agacctttat attcaaggtt ccaacagcgg taacacagcc 840
accgtgcagt cttccgcttt cttcccaact ccttcaggca gcatggtgac cagtgaaagc 900
caactcttta ataagcctta ctggttgcag agggctcaag gacacaacaa tggcatctgc 960
tggggtaacc agctgttcgt tacagtcgtc gataccactc gttctaccaa tatgacactg 1020
tgcgccgagg tgaagaagga atccacatac aaaaacgaga atttcaagga atacttgcgt 1080
cacggcgagg aatttgacct tcaattcatc ttccagctct gcaagattac tctcaccgct 1140
gatgttatga catatatcca taagatggac gctaccatcc tggaggattg gcaatttgga 1200
ctgactcccc caccctcagc ttcgttggaa gacacctacc gcttcgtcac aagtactgcc 1260
attacttgtc agaagaacac tccacccaag ggtaaggagg acccacttaa ggagtacatg 1320
ttttgggaag tggatctcaa agagaagttc agcgccgacc tggatcaatt tcctctgggt 1380
cgtaagttcc tcttgcaagc aggactgcaa gctagaccta aactgtaa 1428
<210> 56
<211> 1425
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<220>
<223> 52L1NS3∆C19nt
<400> 56
atgtccgagc ctccagtacc tgtttctaaa gtggtctcca ctgatgaata cgtctcacgt 60
acctcgattt actattacgc tggtagttca agactgttga cagtcggcca cccatacttt 120
tctatcaaga atacgtcctc aggaaacggt aagaaggtcc ttgtgccgaa agtttcgggt 180
ctccaatacc gcgtcttccg tatcaagctg cctgacccca acaaattcgg cttcccagat 240
actagtttct ataacccaga gacccagaga ctggtgtggg cctgcacagg actcgaaatt 300
ggcaggggtc aacctttggg cgtgggaatc agcggtcacc cccttctcaa taagttcgac 360
gacacagaga cttctaacaa atacgctggt aagccaggca tcgacaaccg tgaatgcctc 420
tccatggatt acaaacagac ccaactgtgt attctgggat gcaagccgcc tatcggtgag 480
cattggggta aaggcacacc ttgcaacaat aactcaggaa acccaggaga ctgcccacct 540
ttgcagctta tcaactcggt tattcaagat ggtgacatgg tcgacactgg ctttggatgt 600
atggacttca atactctcca ggcttccaag agcgatgtcc ccatcgacat ctgctcttcc 660
gtgtgtaaat acccagatta tctgcaaatg gcttcagaac cttacggaga ctctctgttc 720
ttcttcttgc gcagggagca gatgttcgtt cgtcactttt tcaacagagc cggtaccttg 780
ggcgatcctg tccccggaga cctttatatt caaggttcca acagcggtaa cacagccacc 840
gtgcagtctt ccgctttctt cccaactcct tcaggcagca tggtgaccag tgaaagccaa 900
ctctttaata agccttactg gttgcagagg gctcaaggac acaacaatgg catctgctgg 960
ggtaaccagc tgttcgttac agtcgtcgat accactcgtt ctaccaatat gacactgtgc 1020
gccgaggtga agaaggaatc cacatacaaa aacgagaatt tcaaggaata cttgcgtcac 1080
ggcgaggaat ttgaccttca attcatcttc cagctctgca agattactct caccgctgat 1140
gttatgacat atatccataa gatggacgct accatcctgg aggattggca atttggactg 1200
actcccccac cctcagcttc gttggaagac acctaccgct tcgtcacaag tactgccatt 1260
acttgtcaga agaacactcc acccaagggt aaggaggacc cacttaagga gtacatgttt 1320
tgggaagtgg atctcaaaga gaagttcagc gccgacctgg atcaatttcc tctgggtcgt 1380
aagttcctct tgcaagcagg actgcaagct agacctaaac tgtaa 1425
<210> 57
<211> 1422
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<220>
<223> 52L1NS4∆C19nt
<400> 57
atgtcccctc cagtacctgt ttctaaagtg gtctccactg atgaatacgt ctcacgtacc 60
tcgatttact attacgctgg tagttcaaga ctgttgacag tcggccaccc atacttttct 120
atcaagaata cgtcctcagg aaacggtaag aaggtccttg tgccgaaagt ttcgggtctc 180
caataccgcg tcttccgtat caagctgcct gaccccaaca aattcggctt cccagatact 240
agtttctata acccagagac ccagagactg gtgtgggcct gcacaggact cgaaattggc 300
aggggtcaac ctttgggcgt gggaatcagc ggtcaccccc ttctcaataa gttcgacgac 360
acagagactt ctaacaaata cgctggtaag ccaggcatcg acaaccgtga atgcctctcc 420
atggattaca aacagaccca actgtgtatt ctgggatgca agccgcctat cggtgagcat 480
tggggtaaag gcacaccttg caacaataac tcaggaaacc caggagactg cccacctttg 540
cagcttatca actcggttat tcaagatggt gacatggtcg acactggctt tggatgtatg 600
gacttcaata ctctccaggc ttccaagagc gatgtcccca tcgacatctg ctcttccgtg 660
tgtaaatacc cagattatct gcaaatggct tcagaacctt acggagactc tctgttcttc 720
ttcttgcgca gggagcagat gttcgttcgt cactttttca acagagccgg taccttgggc 780
gatcctgtcc ccggagacct ttatattcaa ggttccaaca gcggtaacac agccaccgtg 840
cagtcttccg ctttcttccc aactccttca ggcagcatgg tgaccagtga aagccaactc 900
tttaataagc cttactggtt gcagagggct caaggacaca acaatggcat ctgctggggt 960
aaccagctgt tcgttacagt cgtcgatacc actcgttcta ccaatatgac actgtgcgcc 1020
gaggtgaaga aggaatccac atacaaaaac gagaatttca aggaatactt gcgtcacggc 1080
gaggaatttg accttcaatt catcttccag ctctgcaaga ttactctcac cgctgatgtt 1140
atgacatata tccataagat ggacgctacc atcctggagg attggcaatt tggactgact 1200
cccccaccct cagcttcgtt ggaagacacc taccgcttcg tcacaagtac tgccattact 1260
tgtcagaaga acactccacc caagggtaag gaggacccac ttaaggagta catgttttgg 1320
gaagtggatc tcaaagagaa gttcagcgcc gacctggatc aatttcctct gggtcgtaag 1380
ttcctcttgc aagcaggact gcaagctaga cctaaactgt aa 1422
<210> 58
<211> 1398
<212> DNA
<213> Artificial sequence (Artificial Sequence)
<220>
<223> 52L1∆N14∆C25nt
<400> 58
atgccagtac ctgtttctaa agtggtctcc actgatgaat acgtctcacg tacctcgatt 60
tactattacg ctggtagttc aagactgttg acagtcggcc acccatactt ttctatcaag 120
aatacgtcct caggaaacgg taagaaggtc cttgtgccga aagtttcggg tctccaatac 180
cgcgtcttcc gtatcaagct gcctgacccc aacaaattcg gcttcccaga tactagtttc 240
tataacccag agacccagag actggtgtgg gcctgcacag gactcgaaat tggcaggggt 300
caacctttgg gcgtgggaat cagcggtcac ccccttctca ataagttcga cgacacagag 360
acttctaaca aatacgctgg taagccaggc atcgacaacc gtgaatgcct ctccatggat 420
tacaaacaga cccaactgtg tattctggga tgcaagccgc ctatcggtga gcattggggt 480
aaaggcacac cttgcaacaa taactcagga aacccaggag actgcccacc tttgcagctt 540
atcaactcgg ttattcaaga tggtgacatg gtcgacactg gctttggatg tatggacttc 600
aatactctcc aggcttccaa gagcgatgtc cccatcgaca tctgctcttc cgtgtgtaaa 660
tacccagatt atctgcaaat ggcttcagaa ccttacggag actctctgtt cttcttcttg 720
cgcagggagc agatgttcgt tcgtcacttt ttcaacagag ccggtacctt gggcgatcct 780
gtccccggag acctttatat tcaaggttcc aacagcggta acacagccac cgtgcagtct 840
tccgctttct tcccaactcc ttcaggcagc atggtgacca gtgaaagcca actctttaat 900
aagccttact ggttgcagag ggctcaagga cacaacaatg gcatctgctg gggtaaccag 960
ctgttcgtta cagtcgtcga taccactcgt tctaccaata tgacactgtg cgccgaggtg 1020
aagaaggaat ccacatacaa aaacgagaat ttcaaggaat acttgcgtca cggcgaggaa 1080
tttgaccttc aattcatctt ccagctctgc aagattactc tcaccgctga tgttatgaca 1140
tatatccata agatggacgc taccatcctg gaggattggc aatttggact gactccccca 1200
ccctcagctt cgttggaaga cacctaccgc ttcgtcacaa gtactgccat tacttgtcag 1260
aagaacactc cacccaaggg taaggaggac ccacttaagg agtacatgtt ttgggaagtg 1320
gatctcaaag agaagttcag cgccgacctg gatcaatttc ctctgggtcg taagttcctc 1380
ttgcaagcag gactgtaa 1398
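The records above share one layout: a `<211>` field giving the declared length, then `<400>` sequence lines written in ten-base groups, each line ending in a running position counter. As a minimal sketch of how such a record could be checked, the Python below strips the counters, compares the recovered length with `<211>`, and translates the reading frame with the standard genetic code. The names `parse_record`, `translate`, and `TOY_RECORD` are illustrative assumptions and do not come from the patent; the toy record is a shortened stand-in, not one of the listed sequences.

```python
import re

# Standard genetic code, built in the conventional t/c/a/g codon order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def parse_record(text):
    """Return (declared_length, sequence) for one <210>...<400> record."""
    declared = int(re.search(r"<211>\s*(\d+)", text).group(1))
    body = text.split("<400>", 1)[1]
    # The first line after <400> holds only the SEQ ID number, so skip it.
    seq_lines = body.splitlines()[1:]
    # Keep only bases; this drops spaces and the trailing position counter.
    seq = "".join(re.sub(r"[^acgt]", "", line.lower()) for line in seq_lines)
    return declared, seq


def translate(seq):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical shortened record, used only to exercise the functions.
TOY_RECORD = """<210> 1
<211> 27
<212> DNA
<400> 1
atgcctccag tacctgtttc taaataa 27
"""

if __name__ == "__main__":
    declared, seq = parse_record(TOY_RECORD)
    print(declared == len(seq), translate(seq))
```

Applied to a full record from the listing, the same length check would flag any bases lost or duplicated during text extraction before the sequence is used further.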
Claims (14)
1. An engineered HPV52L1 protein comprising, as compared to a wild-type HPV52L1 protein, a modification selected from the group consisting of the following, or a combination thereof:
mutation of the amino acid at position 447 from aspartic acid to glutamic acid;
deletion of 1 to 20 consecutive or non-consecutive amino acids from the N-terminus;
deletion of 1 to 25 consecutive or non-consecutive amino acids from the C-terminus;
substitution of one or more amino acids at positions 1 to 20 from the N-terminus;
substitution of one or more amino acids at positions 1 to 25 from the C-terminus;
wherein the wild-type HPV52L1 protein is shown in SEQ ID No. 1; and
the engineered HPV52L1 protein is represented by a sequence selected from the group consisting of: SEQ ID Nos. 2, 16 to 18, 20 to 22, and 28.
2. A polynucleotide encoding the engineered HPV52L1 protein of claim 1.
3. The polynucleotide of claim 2, wherein the entire gene sequence is codon-optimized for insect cells.
4. The polynucleotide according to claim 2, which is represented by a sequence selected from the group consisting of: SEQ ID Nos. 31, 45 to 47, 49 to 51, and 57.
5. A vector comprising the polynucleotide of any one of claims 2 to 4.
6. The vector of claim 5, selected from the group consisting of: plasmids, recombinant bacmids, and recombinant baculoviruses.
7. A host cell comprising the vector of claim 5 or 6.
8. The host cell of claim 7, selected from the group consisting of: Escherichia coli cells, yeast cells, and insect cells.
9. A multimer, wherein:
the multimer is a pentamer or a virus-like particle;
the multimer is formed from the engineered HPV52L1 protein of claim 1.
10. A vaccine for preventing papillomavirus infection or a disease associated therewith, comprising:
the multimer according to claim 9,
an adjuvant, and
an excipient or carrier for a vaccine.
11. The vaccine of claim 10, wherein the adjuvant is a human adjuvant.
12. The vaccine of claim 10, further comprising one or a combination selected from the group consisting of: virus-like particles or chimeric virus-like particles of mucosal-tropic group HPV, and virus-like particles or chimeric virus-like particles of cutaneous-tropic group HPV.
13. Use of the engineered HPV52L1 protein of claim 1 in the manufacture of a vaccine, wherein the vaccine is for the prevention of a papillomavirus infection or a disease associated therewith.
14. Use of the multimer of claim 9 in the preparation of a vaccine, wherein the vaccine is for preventing papillomavirus infection or a disease associated therewith.
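The modification types recited in claim 1 (a point mutation at a numbered position, N-terminal deletion, and C-terminal deletion) can be illustrated on a protein string. The Python below is only a sketch under stated assumptions: the helper names are hypothetical, `NTERM20` is the 20-residue fragment obtained by translating the first 60 bases of the full-length sequences in the listing with the standard genetic code, and `PLACEHOLDER` is an artificial poly-alanine string, not SEQ ID No. 1. Restoring an initiator methionine after N-terminal deletion is also an assumption, suggested by the listed ∆N13 sequences, which begin `atg cct cca`.

```python
def substitute(protein, position, expected, replacement):
    """Replace the residue at a 1-based position after checking the original."""
    idx = position - 1
    if protein[idx] != expected:
        raise ValueError(
            f"position {position} holds {protein[idx]!r}, not {expected!r}"
        )
    return protein[:idx] + replacement + protein[idx + 1:]


def truncate_n(protein, n, restore_met=True):
    """Delete the first n residues; optionally put back an initiator Met.

    Restoring Met is an assumption of this sketch, not a statement of the
    patent's construction method.
    """
    rest = protein[n:]
    return "M" + rest if restore_met else rest


def truncate_c(protein, n):
    """Delete the last n residues (n = 0 returns the input unchanged)."""
    return protein[:len(protein) - n] if n else protein


# First 20 residues encoded by the full-length listed sequences
# (translation of atgtccgtgt ggcgtccttc ... tgtttctaaa).
NTERM20 = "MSVWRPSEATVYLPPVPVSK"

# Artificial 450-residue placeholder with Asp at position 447.
PLACEHOLDER = "A" * 446 + "D" + "AAA"

if __name__ == "__main__":
    print(truncate_n(NTERM20, 13))                    # N-terminal deletion of 13
    print(truncate_c(NTERM20, 5))                     # C-terminal deletion of 5
    print(substitute(PLACEHOLDER, 447, "D", "E")[446])  # Asp-to-Glu at 447
```

Checking the expected residue before substituting guards against off-by-one numbering errors, which matter here because the claims number positions from 1.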
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011351390.9A CN114539365B (en) | 2020-11-26 | 2020-11-26 | Modified human papilloma virus 52 type L1 protein and application thereof |
PCT/CN2021/120518 WO2022111022A1 (en) | 2020-11-26 | 2021-09-26 | Modified human papillomavirus type 52 l1 protein and use thereof |
US18/254,576 US20240002447A1 (en) | 2020-11-26 | 2021-09-26 | Modified human papillomavirus type 52 l1 protein and use thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011351390.9A CN114539365B (en) | 2020-11-26 | 2020-11-26 | Modified human papilloma virus 52 type L1 protein and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114539365A (en) | 2022-05-27 |
CN114539365B (en) | 2023-12-01 |
Family
ID=81667882
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011351390.9A Active CN114539365B (en) | 2020-11-26 | 2020-11-26 | Modified human papilloma virus 52 type L1 protein and application thereof |
Country Status (3)
Country | Link |
---|---|
US (1) | US20240002447A1 (en) |
CN (1) | CN114539365B (en) |
WO (1) | WO2022111022A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117054647A (en) * | 2023-07-17 | 2023-11-14 | 广东省一鼎生物技术有限公司 | Kit for detecting HPV IgG antibody and preparation method thereof |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102268076A (en) * | 2010-07-02 | 2011-12-07 | 厦门大学 | Truncated human papillomavirus (HPV) type 52 L1 protein |
CN102552897A (en) * | 2012-01-18 | 2012-07-11 | 广东华南联合疫苗开发院有限公司 | Prophylactic VLP (Virus-like Particle) vaccine for cervical carcinoma |
CN102747047A (en) * | 2012-02-28 | 2012-10-24 | 厦门大学 | Human papillomaviruse type hybrid virus-like particles and preparation method thereof |
CN106701796A (en) * | 2015-08-12 | 2017-05-24 | 北京康乐卫士生物技术股份有限公司 | Recombinant human papilloma virus 52 virus-like particle and preparation method thereof |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE59511047D1 (en) * | 1994-10-07 | 2006-06-14 | Univ Loyola Chicago | PAPILLOMA-like particles, fusion proteins, and methods of making same |
NZ549898A (en) * | 2004-03-24 | 2009-06-26 | Merck & Co Inc | Optimized expression of HPV 52 L1 in yeast |
GB0413510D0 (en) * | 2004-06-16 | 2004-07-21 | Glaxosmithkline Biolog Sa | Vaccine |
US7758866B2 (en) * | 2004-06-16 | 2010-07-20 | Glaxosmithkline Biologicals, S.A. | Vaccine against HPV16 and HPV18 and at least another HPV type selected from HPV 31, 45 or 52 |
CN101245099A (en) * | 2007-02-14 | 2008-08-20 | 马润林 | Amino acid sequence of recombined human papilloma virus L1 capsid protein and uses thereof |
CN101481408A (en) * | 2008-01-07 | 2009-07-15 | 马润林 | Modification sequence of recombinant human mammilla tumor virus L1 capsid protein for preventing high polymerization |
- 2020
  - 2020-11-26: CN application CN202011351390.9A, publication CN114539365B (status: Active)
- 2021
  - 2021-09-26: WO application PCT/CN2021/120518, publication WO2022111022A1 (status: Application Filing)
  - 2021-09-26: US application US18/254,576, publication US20240002447A1 (status: Pending)
Non-Patent Citations (4)
Title |
---|
Impact of naturally occurring variation in the human papillomavirus 52 capsid proteins on recognition by type-specific neutralising antibodies; Godi et al.; Journal of General Virology; Vol. 100 (No. 02); 237-245 * |
N-terminal truncations on L1 proteins of human papillomaviruses promote their soluble expression in Escherichia coli and self-assembly in vitro; Wei et al.; Emerging Microbes & Infections; Vol. 07 (No. 01); 1-12 * |
Preparation and biological activity of human papillomavirus type 52 virus-like particles; Guo Jing et al.; Biotechnology; Vol. 29 (No. 02); 127-132, 139 * |
Progress in clinical research on prophylactic human papillomavirus vaccines; Yu Bangwei et al.; Chinese Journal of New Drugs; Vol. 27 (No. 21); 2527-2533 * |
Also Published As
Publication number | Publication date |
---|---|
WO2022111022A1 (en) | 2022-06-02 |
CN114539365A (en) | 2022-05-27 |
US20240002447A1 (en) | 2024-01-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107188966B (en) | Papilloma virus chimeric protein and application thereof | |
PL195332B1 (en) | Pappiloma virus capsomer vaccine compositions and methods of applying them | |
CN107188967B (en) | Papilloma virus chimeric protein and application thereof | |
CN107188932B (en) | Truncated human papilloma virus 16 type L1 protein and application thereof | |
EP1305039B1 (en) | Stable (fixed) forms of viral l1 capsid proteins, fusion proteins and uses thereof | |
US20120087936A1 (en) | Therapeutic and prophylactic vaccine for the treatment and prevention of papillomavirus infection | |
US20210198322A1 (en) | Mutant of L1 Protein of Human Papillomavirus Type 39 | |
US20240000915A1 (en) | C-terminally modified human papillomavirus type 11 l1 protein and use thereof | |
CN114539365B (en) | Modified human papilloma virus 52 type L1 protein and application thereof | |
WO2022142525A1 (en) | Human papillomavirus type 58 chimeric protein and use thereof | |
CN114127092A (en) | Multivalent immunogenic compositions of human papillomavirus | |
CN107188931B (en) | Truncated human papilloma virus 58 type L1 protein and application thereof | |
CN114539364B (en) | C-terminal modified human papilloma virus type 6L1 protein and application thereof | |
US7182947B2 (en) | Papillomavirus truncated L1 protein and fusion protein constructs | |
CN114716561B (en) | Human papilloma virus 31 chimeric protein and application thereof | |
US11213580B2 (en) | Mutant of L1 protein of human papillomavirus type 16 | |
CN114716560B (en) | Human papilloma virus 18 chimeric protein and application thereof | |
WO2019233412A1 (en) | Mutant of human papillomavirus 18 l1 protein |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |