CN108570464B - Compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside - Google Patents
- Publication number
- CN108570464B (application number CN201810330216.2A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P7/00—Preparation of oxygen-containing organic compounds
- C12P7/24—Preparation of oxygen-containing organic compounds containing a carbonyl group
-
- A—HUMAN NECESSITIES
- A23—FOODS OR FOODSTUFFS; TREATMENT THEREOF, NOT COVERED BY OTHER CLASSES
- A23L—FOODS, FOODSTUFFS, OR NON-ALCOHOLIC BEVERAGES, NOT COVERED BY SUBCLASSES A21D OR A23B-A23J; THEIR PREPARATION OR TREATMENT, e.g. COOKING, MODIFICATION OF NUTRITIVE QUALITIES, PHYSICAL TREATMENT; PRESERVATION OF FOODS OR FOODSTUFFS, IN GENERAL
- A23L33/00—Modifying nutritive qualities of foods; Dietetic products; Preparation or treatment thereof
- A23L33/10—Modifying nutritive qualities of foods; Dietetic products; Preparation or treatment thereof using additives
- A23L33/105—Plant extracts, their artificial duplicates or their derivatives
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/0004—Oxidoreductases (1.)
- C12N9/0006—Oxidoreductases (1.) acting on CH-OH groups as donors (1.1)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/1003—Transferases (2.) transferring one-carbon groups (2.1)
- C12N9/1007—Methyltransferases (general) (2.1.1.)
- C12N9/1011—Catechol O-methyltransferase (2.1.1.6)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/88—Lyases (4.)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/44—Preparation of O-glycosides, e.g. glucosides
- C12P19/46—Preparation of O-glycosides, e.g. glucosides having an oxygen atom of the saccharide radical bound to a cyclohexyl radical, e.g. kasugamycin
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y201/00—Transferases transferring one-carbon groups (2.1)
- C12Y201/01—Methyltransferases (2.1.1)
- C12Y201/01006—Catechol O-methyltransferase (2.1.1.6)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y402/00—Carbon-oxygen lyases (4.2)
- C12Y402/01—Hydro-lyases (4.2.1)
- C12Y402/01118—3-Dehydroshikimate dehydratase (4.2.1.118)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/37—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi
- C07K14/39—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi from yeasts
- C07K14/395—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi from yeasts from Saccharomyces
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
- C12N15/8221—Transit peptides
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y101/00—Oxidoreductases acting on the CH-OH group of donors (1.1)
- C12Y101/01—Oxidoreductases acting on the CH-OH group of donors (1.1) with NAD+ or NADP+ as acceptor (1.1.1)
- C12Y101/01025—Shikimate dehydrogenase (1.1.1.25)
Abstract
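The present invention relates to compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside. Specifically, the invention discloses recombinant microorganisms, plants and plant cells that have been engineered to express a mutant AROM polypeptide and/or a mutant catechol-O-methyltransferase polypeptide, alone or in combination with one or more vanillin biosynthetic enzymes or UDP-glycosyltransferases (UGTs). Such microorganisms, plants or plant cells can produce vanillin or vanillin β-D-glucoside.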
Description
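This application is a divisional of the application with international filing date 7 August 2012 and international application number PCT/US2012/049842, which entered the Chinese national phase on 31 March 2014 under application number 201280048260.5 and is entitled "Compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside".
Introduction
This application claims the benefit of priority of U.S. Provisional Application Nos. 61/521,090 and 61/522,096, filed 8 August 2011 and 10 August 2011, respectively, the contents of which are incorporated herein by reference in their entirety.
Background
Vanillin is one of the world's most important flavor compounds, with a global market of USD 180 million. Natural vanillin is derived from the cured seed pods of the vanilla orchid (Vanilla planifolia), but most of the world's vanillin is synthesized from petrochemicals or wood pulp lignin. Producing natural vanillin from vanilla pods is a laborious and slow process that requires hand pollination of the flowers and curing of the harvested green pods for 1-6 months (Ramachandra & Ravishankar (2000) J. Sci. Food Agric. 80:289-304). Production of 1 kilogram (kg) of vanillin requires approximately 500 kg of vanilla pods, corresponding to the pollination of approximately 40,000 flowers. At present, only about 0.25% (40 tons out of 16,000 tons) of the vanillin sold annually originates from vanilla pods, while most of the remainder is synthesized chemically from lignin or fossil hydrocarbons, in particular guaiacol. Synthetically produced vanillin sells for approximately $15 per kg, compared with $1200-4000 per kg for natural vanillin (Walton et al. (2003) Phytochemistry 63:505-515).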
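Summary of the Invention
The invention provides methods for producing vanillin and/or vanillin β-D-glucoside. The method comprises the steps of: (a) providing a recombinant host capable of producing vanillin, wherein the recombinant host harbors a heterologous nucleic acid encoding a mutant Arom multifunctional enzyme (AROM) polypeptide and/or a mutant catechol-O-methyltransferase (COMT) polypeptide; (b) cultivating the recombinant host for a time sufficient for it to produce vanillin and/or vanillin glucoside; and (c) isolating vanillin and/or vanillin glucoside from the recombinant host or from the cultivation supernatant, thereby producing vanillin and/or vanillin β-D-glucoside.
In one embodiment, a mutant AROM polypeptide is provided, wherein the mutant has decreased shikimate dehydrogenase activity relative to a corresponding AROM polypeptide lacking the mutation. According to this embodiment, the mutant AROM polypeptide can have one or more mutations in domain 5, a deletion of at least a portion of domain 5, or can lack domain 5.
In another embodiment, the mutant AROM polypeptide is a fusion polypeptide comprising (i) an AROM polypeptide described herein, for example an AROM polypeptide comprising a deletion of at least a portion of domain 5 or an AROM polypeptide lacking domain 5; and (ii) a polypeptide having 3-dehydroshikimate dehydratase (3DSD) activity.
In other embodiments, a mutant COMT polypeptide is provided, wherein the mutant has one or more improved properties. In particular, the invention provides mutant COMT polypeptides that preferentially catalyze methylation at the meta position rather than the para position of protocatechuic acid and/or protocatechuic aldehyde and/or protocatechuic alcohol.
Any of the polypeptides described herein can further include a purification tag, a chloroplast transit peptide, a mitochondrial transit peptide, an amyloplast peptide, a signal peptide or a secretion tag at the N-terminus or C-terminus of the polypeptide.
In certain embodiments, the host can be a microorganism, for example a yeast such as Saccharomyces cerevisiae or Schizosaccharomyces pombe, or the bacterium Escherichia coli. Alternatively, the host can be a plant or plant cell (for example a Physcomitrella plant or plant cell, or a tobacco plant or plant cell).
Any of the hosts described herein can further comprise a gene encoding an aromatic carboxylic acid reductase (ACAR), a 3-dehydroshikimate dehydratase (3DSD), a uridine 5'-diphosphoglucosyl transferase (UGT), a phosphopantetheine transferase (PPTase), a wild-type AROM and/or a wild-type O-methyltransferase (OMT).
Isolated nucleic acids encoding mutant COMT polypeptides or mutant AROM polypeptides are also provided.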
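Brief Description of the Drawings
Figure 1 is a schematic of the de novo biosynthesis of vanillin (4), with an overview of the different vanillin catabolites and metabolic side products found in an organism expressing 3DSD, ACAR, OMT and UGT together with a phosphopantetheine transferase (PPTase), namely dehydroshikimic acid (1), protocatechuic acid (2), protocatechuic aldehyde (3), vanillic acid (5), protocatechuic alcohol (6), vanillyl alcohol (7) and vanillin β-D-glucoside (8). Open arrows show primary metabolic reactions in yeast; black arrows show enzyme reactions introduced by metabolic engineering; diagonally striped arrows show undesired metabolic reactions inherent to yeast.
Figure 2 is a schematic of the cellular biosynthesis of the aromatic amino acids tyrosine, tryptophan and phenylalanine using a wild-type AROM polypeptide (left) and a mutant AROM polypeptide lacking domain 5 (right).
Figure 3A shows production of vanillin (VG) and isovanillin (IsoVG) in yeast strains expressing various mutant COMT polypeptides.
Figures 3B and 3C show production of vanillin (VG) and isovanillin (IsoVG) by COMT polypeptides isolated from various sources, compared with human OMT (Hs-OMT) and the human L198Y OMT mutant (Hs-OMT L198Y). The sources are Phytophthora infestans (PI), Catharanthus roseus (CR), Yarrowia lipolytica (YL), Ciona intestinalis GENBANK Accession Nos. XP_002121420 and XP_002131313 (CI-1 and CI-2), Capsaspora owczarzaki (CO), Chaetomium thermophilum (CT), Clavispora lusitaniae (CL), Paracoccidioides sp. 'lutzii' Pb01 (PL), Vanilla planifolia (VP), Coffea arabica (CA), Rattus norvegicus (RN), Mus musculus (MM), Crenarchaeote (CREN), Mycobacterium vanbaalenii (MV) and Schizosaccharomyces pombe (SP).
Figure 4 shows levels of vanillin glucoside, vanillin, vanillyl alcohol glucoside and vanillyl alcohol in yeast strains expressing Penicillium simplicissimum (PS) or Rhodococcus jostii (RJ) vanillyl alcohol oxidase (VAO) and supplemented with 3 mM vanillyl alcohol.
Figure 5 shows levels of vanillic acid, vanillin and vanillin glucoside in yeast strains expressing N. iowensis ACAR or N. crassa ACAR together with E. coli phosphopantetheine transferase (PPTase) or S. pombe PPTase, grown in medium supplemented with 3 mM vanillic acid.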
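Detailed Description of the Invention
The invention is based on the discovery that mutant AROM polypeptides can be used to increase 3-dehydroshikimate (3-DHS) accumulation in recombinant hosts. 3-DHS is a precursor for vanillin biosynthesis, and if more intracellular 3-DHS is available, more protocatechuic acid can be made in the first committed step of the vanillin biosynthetic pathway. See Figure 1. In yeast, AROM is a penta-functional enzyme complex encoded by the ARO1 gene. The gene is 4764 bp long and encodes a corresponding polypeptide 1588 amino acids in length. AROM performs five consecutive enzymatic conversions: DAHP (3-deoxy-D-arabino-heptulosonic acid-7-phosphate) is converted into 3-DHQ (3-dehydroquinate), which is converted into 3-DHS (3-dehydroshikimate), which is converted into shikimate, which is converted into shikimate-3-P (shikimate 3-phosphate), which is converted into EPSP (5-enolpyruvylshikimate 3-phosphate). All of these conversions lie in the cellular biosynthetic pathway of the aromatic amino acids tyrosine, tryptophan and phenylalanine (see Figure 2).
The five catalytic functions of AROM reside in five distinct domains of the ARO1-encoded polypeptide. The order of the functional domains in the AROM polypeptide differs from the order in which they are needed in the five-step conversion of DAHP to EPSP (see Figure 2). Thus, domain 1 corresponds to catalytic activity 1, domain 2 to catalytic activity 5, domain 3 to catalytic activity 4, domain 4 to catalytic activity 2, and finally domain 5 to catalytic activity 3 (see Figure 1).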
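The relationship between domain order and pathway order can be illustrated with a short Python sketch. The step list and the domain-to-activity mapping restate the text above; the function names and the all-or-nothing treatment of an inactive domain are illustrative assumptions only, and the sketch ignores residual activity and any other enzymes present in the host.

```python
# Five consecutive AROM conversions in pathway order (as listed in the text).
PATHWAY_STEPS = [
    ("DAHP", "3-DHQ"),
    ("3-DHQ", "3-DHS"),
    ("3-DHS", "shikimate"),
    ("shikimate", "shikimate-3-P"),
    ("shikimate-3-P", "EPSP"),
]

# Domain number -> catalytic activity number (position in the pathway).
DOMAIN_TO_ACTIVITY = {1: 1, 2: 5, 3: 4, 4: 2, 5: 3}


def conversion_for_domain(domain):
    """Return the (substrate, product) pair catalyzed by a given domain."""
    activity = DOMAIN_TO_ACTIVITY[domain]
    return PATHWAY_STEPS[activity - 1]


def accumulating_intermediate(inactive_domains):
    """Return the last intermediate reached from DAHP when the given domains
    are treated as fully inactive (a deliberate simplification)."""
    blocked = {DOMAIN_TO_ACTIVITY[d] for d in inactive_domains}
    current = "DAHP"
    for step_number, (_, product) in enumerate(PATHWAY_STEPS, start=1):
        if step_number in blocked:
            break
        current = product
    return current


if __name__ == "__main__":
    print(conversion_for_domain(5))
    print(accumulating_intermediate({5}))
```

Under these assumptions, removing domain 5 stops the chain at 3-DHS, which is the intermediate the text identifies as the vanillin precursor.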
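A recombinant host containing a domain 5 variant of the AROM polypeptide (for example an AROM polypeptide with one or more mutations in domain 5, or an AROM polypeptide in which a portion of domain 5 is deleted) can have increased levels of 3-DHS compared with a corresponding host expressing a wild-type AROM polypeptide. Such a host can also have increased protocatechuic acid production.
Alternatively or in addition, mutant COMT polypeptides can be used to improve the biosynthesis of vanillin β-D-glucoside. For example, the mutant COMT polypeptides described herein can have one or more of the following properties: increased turnover; preferential methylation at the meta (3') position rather than the para (4') position, so that production of vanillin is favored over isovanillin; or better specificity for the vanillin pathway substrates protocatechuic acid and protocatechuic aldehyde.
Accordingly, the invention provides mutant AROM and mutant COMT polypeptides, nucleic acids encoding such polypeptides, and their use in the biosynthesis of vanillin and/or vanillin glucoside. The method includes the steps of: providing a recombinant host capable of producing vanillin in the presence of a carbon source, wherein the recombinant host harbors a heterologous nucleic acid encoding a mutant COMT polypeptide and/or a mutant AROM polypeptide; cultivating the recombinant host in the presence of the carbon source; and purifying vanillin and/or vanillin glucoside from the recombinant host or from the cultivation supernatant.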
本文描述的重组宿主可用于生产香草醛或香草醛葡萄糖苷的方法。例如,如果重组宿主是微生物,方法可以包括将重组微生物在培养基中,在表达香草醛和/或香草醛葡萄糖苷生物合成基因的条件下生长。重组微生物可以在分批、补料分批或连续过程或其组合中生长。通常,重组微生物在处于确定温度下的发酵罐中,在适合的营养源例如碳源存在下生长所需时间长度,以生产所需量的香草醛和/或香草醛葡萄糖苷。
在本方法中使用的碳源包括可以被重组宿主细胞代谢以促进生长和/或香草醛和/或香草醛葡萄糖苷生产的任何分子。适合的碳源的实例包括但不限于蔗糖(例如在糖蜜中存在的)、果糖、木糖、乙醇、甘油、葡萄糖、纤维素、淀粉、纤维二糖或其他含葡萄糖聚合物。例如,在使用酵母作为宿主的实施方式中,诸如蔗糖、果糖、木糖、乙醇、甘油和葡萄糖的碳源是适合的。碳源可以在整个培养时间段中提供给宿主生物体,或者生物体可以在另一种能量源例如蛋白质存在下生长一段时间,然后仅在分批补料阶段中提供碳源。
在一种实施方式中,这种方法的微生物带有编码突变体AROM多肽和任选地野生型COMT多肽的核酸。在另一种实施方式中,这种方法的微生物带有编码突变体COMT多肽和任选地野生型AROM多肽的核酸。在又一种实施方式中,这种方法的微生物带有编码突变体AROM多肽和任选地突变体COMT多肽的核酸。取决于方法中使用的具体微生物,也可以存在并表达编码3DSD、ACAR、UGT或PPTase的其他重组基因。因此,在一种实施方式中,重组宿主可以是带有编码3DSD、ACAR、UGT或PPTase的一种或多种异源核酸或上述任一者的与其具有至少80%、例如至少90%、例如至少95%序列同一性的功能同源物的微生物。产物、底物和中间体例如脱氢莽草酸、原儿茶酸、原儿茶醛、香草醛和香草醛β-D-葡萄糖苷的水平,可以通过从培养基提取样品并按照已发表的方法进行分析来确定。
在将重组微生物在培养基中生长所需时间长度后,可以使用本领域已知的各种技术从培养基回收香草醛和/或香草醛β-D-葡萄糖苷,例如通过萃取、真空蒸馏和从水性溶液多步重结晶以及超滤进行分离和纯化(等,(1997)J.Membrane Sci.137:155-158;Borges da Silva等,(2009)Chem.Eng.Des.87:1276-1292)。使用巯基化合物例如二硫苏糖醇、二硫赤藓糖醇、谷胱甘肽或L-半胱氨酸(美国专利号5,128,253)或碱性KOH溶液(WO94/13614)的两相萃取方法,已被用于香草醛的回收及其与其他芳香物质的分离。使用聚醚-聚酰胺共聚物膜从生物转化的培养基吸附和渗透蒸发香草醛,也已被描述(等,(1997)同上;Zucchi等,(1998)J.Microbiol.Biotechnol.8:719-722)。具有交联聚苯乙烯骨架的大孔吸附树脂,也已被用于从水性溶液回收溶解的香草醛(Zhang等,(2008)Eur.Food Res.Technol.226:377-383)。也已评估了超滤和膜接触器(MC)技术用于回收香草醛(Zabkova等,(2007)J.Membr.Sci.301:221-237;Scuibba等,(2009)Desalination241:357-364)。或者,可以使用常规技术例如渗滤或超临界二氧化碳萃取和反渗透进行浓缩。如果重组宿主是植物或植物细胞,可以使用本领域已知的各种技术从植物组织提取香草醛或香草醛葡萄糖苷。
在某些实施方式中,香草醛或香草醛β-D-葡萄糖苷可以使用用含前体分子的原材料喂养的完整细胞来生产。原材料可以在细胞生长期间或细胞生长之后进料。完整细胞可以是悬浮的或固定化的。完整细胞可以在发酵液中或在反应缓冲液中。在某些实施方式中,可能需要渗透剂将底物有效转移到细胞中。
在某些实施方式中,香草醛或香草醛β-D-葡萄糖苷被分离和纯化至均质(例如纯度至少90%、92%、94%、96%或98%)。在其他实施方式中,香草醛或香草醛β-D-葡萄糖苷被分离为重组宿主的提取物。在这种情况下,香草醛或香草醛β-D-葡萄糖苷可以被分离但不必纯化至均质。理想情况下,生产的香草醛或香草醛β-D-葡萄糖苷的量可以为约1mg/l至约20,000mg/L或更高。例如,可以生产约1至约100mg/L、约30至约100mg/L、约50至约200mg/L、约100至约500mg/L、约100至约1,000mg/L、约250至约5,000mg/L、约1,000至约15,000mg/L或约2,000至约10,000mg/L的香草醛或香草醛β-D-葡萄糖苷。一般来说,更长的培养时间产生更大量的产物。因此,可以将重组微生物培养1天至7天、1天至5天、3天至5天、约3天、约4天或约5天。
应该认识到,本文中讨论的各个基因和模块可以存在于两种或更多种重组微生物而不是单种微生物中。当使用多种重组微生物时,它们可以在混合培养中生长以生产香草醛和/或香草醛葡萄糖苷。
分离和任选地纯化的香草醛或香草醛β-D-葡萄糖苷的提取物可用于香料消费品例如食品、饮食增补剂、营养品、药物组合物、牙齿卫生组合物和化妆品中。
当在本文中使用时,短语“食品”包括但不限于水果、蔬菜、果汁、肉制品例如火腿、培根和香肠,蛋制品、水果浓缩物、明胶和明胶状产品例如果酱、果冻、蜜饯等,乳制品例如冰激凌、酸奶油和果子露,糖霜、糖浆包括糖蜜,玉米、小麦、黑麦、大豆、燕麦、大米和大麦制品,坚果仁和坚果制品,蛋糕、曲奇、糖食例如糖果、橡皮糖、水果味糖豆和巧克力,口香糖、薄荷糖、奶油、糖霜、冰激凌、派和面包,饮料例如咖啡、茶、碳酸软饮料例如可口可乐和百事可乐、非碳酸软饮料、果汁和其他水果饮料、运动饮料例如GATORADE、咖啡、茶、冰茶、可乐,含酒精饮料例如啤酒、葡萄酒和白酒以及KOOL-AID。
食品还包括调味品例如药草、香料和佐料、增味剂。食品还包括制备的包装产品,例如饮食甜味剂、液体甜味剂、在用水重构后提供非碳酸饮料的颗粒调味混合物、即溶布丁混合物、速溶咖啡和茶、咖啡伴侣、麦乳精混合物、宠物食品、牲畜饲料、烟草和用于烘烤应用的材料,例如用于制备面包、曲奇、蛋糕、薄煎饼、甜甜圈等的粉末烘烤混合物。食品还包括含有很少或不含蔗糖的饮食或低卡路里食物和饮料。本发明设想的食品的其他实例描述在下面以及整个本说明书中。
在另一种实施方式中,食品是水果、蔬菜、果汁、肉制品例如火腿、培根和香肠,蛋制品、水果浓缩物、明胶和明胶状产品例如果酱、果冻、蜜饯等,乳制品例如冰激凌、酸奶油和果子露,糖霜、糖浆包括糖蜜,玉米、小麦、黑麦、大豆、燕麦、大米和大麦制品,坚果仁和坚果制品,蛋糕、曲奇、糖食例如糖果、橡皮糖、水果味糖豆和巧克力,奶油、糖霜、冰激凌、派和面包。
在另一种实施方式中,消费品是药物组合物。优选的组合物是含有香草醛和/或香草醛β-D-葡萄糖苷和一种或多种可药用赋形剂的药物组合物。这些药物组合物可用于配制含有发挥生物效应的一种或多种活性剂的药品。就此而言,药物组合物优选地还包括发挥生物效应的一种或多种活性剂。这样的活性剂包括具有活性的药剂和生物剂。这样的活性剂在本领域中是公知的。参见例如《医生桌面参考》(The Physician's Desk Reference)。这样的组合物可以按照本领域中已知的程序来制备,例如在《Remington制药学》(Remington's Pharmaceutical Sciences),Mack Publishing Co.,Easton,PA,USA中所描述的。在一种实施方式中,这样的活性剂包括支气管扩张药、减食欲药、抗组胺药、营养增补剂、缓泻药、镇痛剂、麻醉剂、解酸剂、H2-受体拮抗剂、抗胆碱能类药物、止泻药、缓和剂、止咳药、止恶心药、抗微生物剂、抗细菌剂、抗真菌剂、抗病毒剂、祛痰药、消炎药、退热剂及其混合物。在一种实施方式中,活性剂是退热剂或镇痛剂,例如布洛芬、对乙酰氨基酚或阿司匹林;缓泻药例如酚酞二辛基磺基琥珀酸钠;食欲抑制剂例如安非他命、苯丙醇胺、苯丙醇胺盐酸盐或咖啡因;解酸剂例如碳酸钙;平喘药例如茶碱;抗利尿剂例如盐酸地芬诺酯;有抗胃肠气胀活性的药剂例如西甲硅油;偏头痛药例如酒石酸麦角胺;精神病药理学药剂例如氟呱啶醇;解痉剂或镇静药例如苯巴比妥;抗运动功能亢进药例如甲基多巴或哌醋甲酯;镇定剂例如苯并二氮卓类、hydroxinmeprobramates或吩噻嗪类;抗组胺药例如阿司咪唑、马来酸氯苯那敏、马来酸吡拉明、琥珀酸多西拉敏、马来酸溴苯那敏、柠檬酸苯托沙敏、盐酸氯环嗪、马来酸非尼拉敏和酒石酸苯茚胺;解充血药例如盐酸苯丙醇胺、盐酸苯肾上腺素、盐酸伪麻黄碱、硫酸伪麻黄碱、重酒石酸苯丙醇胺和麻黄碱;β-受体阻断剂例如普萘洛尔;用于戒酒的药剂例如双硫仑;止咳药例如苯佐卡因、氢溴酸右美沙芬、诺斯卡品、柠檬酸咳必清和盐酸氯苯达诺;氟增补剂例如氟化钠;局部抗生素例如四环素或氯洁霉素;皮质甾类增补剂例如泼尼松或泼尼松龙;抗甲状腺肿形成药剂例如秋水仙素或别嘌呤醇;抗癫痫药例如苯妥英钠;抗脱水药剂例如电解质增补剂;防腐剂例如氯化鲸蜡基吡啶;NSAID例如对乙酰氨基酚、布洛芬、萘普生或其盐;胃肠活性剂例如洛哌丁胺和法莫替丁;各种生物碱类例如磷酸可待因、硫酸可待因或吗啡;微量元素增补剂例如氯化钠、氯化锌、碳酸钙、氧化镁和其他碱金属盐和碱土金属盐;维生素;离子交换树脂例如消胆胺;胆甾醇抑制药和降脂物质;抗心律不齐药例如N-乙酰普鲁卡因酰胺;以及祛痰药例如愈创木酚甘油醚。
具有特别令人不快的味道的活性物质包括抗细菌剂例如环丙沙星、氧氟沙星和培氟沙星;抗癫痫药例如唑尼沙胺;大环内酯类抗生素例如红霉素;β-内酰胺类抗生素例如青霉素类和头孢菌素类;治疗精神病的活性物质例如氯丙嗪;活性物质例如安乃近;以及具有抗溃疡活性的药剂例如西咪替丁。
本发明的药物组合物以适合于实现它们所打算的目的的任何形式给药于对象。然而,优选地,组合物是可以通过颊或口部给药的组合物。或者,药物组合物可以是口或鼻喷剂。对象是任何动物例如人类,尽管本发明不打算仅限于此。其他适合的动物包括犬科动物、猫科动物、狗、猫、牲畜、马、牛、绵羊等。当在本文中使用时,兽用组合物是指适用于非人类动物的药物组合物。这样的兽用组合物在本领域中是已知的。
在另一种实施方式中,药物组合物是用于口服给药的液体剂型,包括可药用乳液、溶液、悬液、糖浆和酏剂。除了活性化合物之外,液体剂型可以含有本领域中常用的惰性稀释剂例如水或其他溶剂,增溶剂和乳化剂例如乙醇、异丙醇、碳酸乙酯、乙酸乙酯、苯甲醇、苯甲酸苯甲酯、丙二醇、1,3-丁二醇、二甲基甲酰胺、油类(尤其是棉籽油、花生油、玉米油、胚芽油、橄榄油、蓖麻油和芝麻油)、甘油、四氢糠醇、聚乙二醇和失水山梨糖醇的脂肪酸酯及其混合物。除了活性化合物之外,悬液还可以包含悬浮剂例如乙氧基化异硬脂醇、聚氧乙烯山梨糖醇和失水山梨糖醇的酯类、微晶纤维素、偏氢氧化铝、膨润土、琼脂和黄耆树胶及其混合物。
本发明的药物组合物可以采取可咀嚼片剂的形式。可咀嚼片剂在本领域中是已知的。参见例如美国专利号4,684,534和6,060,078,其每个以其全部内容通过参考并入本文。可以将任何类型的药物包含在可咀嚼片剂中,优选为具有苦味的药物、天然植物提取物或其他有机化合物。更优选地,可以在核心中包含维生素例如维生素A、维生素B、维生素B1、维生素B2、维生素B6、维生素C、维生素E和维生素K;天然植物提取物例如Sohgunjung-tang提取物、Sipchundaebo-tang提取物和刺五加片(Eleutherococcus senticosus)提取物;有机化合物例如茶苯海明、氯苯甲嗪、对乙酰氨基酚、阿司匹林、苯丙醇胺和氯化鲸蜡基吡啶;或胃肠药剂例如干燥的氢氧化铝凝胶、多潘立酮、可溶性甘菊环烃、L-谷氨酰胺和水滑石。
本发明的药物组合物可以是口服崩解组合物。口服崩解片剂在本领域中是已知的。参见例如美国专利号6,368,625和6,316,029,其每个以其全部内容通过参考并入本文。
本发明的药物组合物可以是固体剂型,包括香草醛或香草醛β-D-葡萄糖苷和水和/或唾液激活的泡腾颗粒,例如具有可控泡腾速率的泡腾颗粒。泡腾组合物还可以包含药物活性化合物。泡腾药物组合物在本领域中是已知的。参见例如美国专利号6,649,186,其全部内容通过参考并入本文。泡腾组合物可用于以下应用:制药、兽医、园艺、家用、食品、烹饪、杀虫、农业、化妆品、除草、工业、清洁、甜食和调味。含有包含香草醛或香草醛β-D-葡萄糖苷的泡腾组合物的制剂还可以包括一种或多种其他佐剂和/或活性成分,其选自本领域中已知的成分,包括调味剂、稀释剂、着色剂、粘合剂、填充剂、表面活性剂、崩解剂、稳定剂、压实介质和非泡腾崩解剂。
药物组合物可以是膜状或薄片状药物组合物。这样的膜状或薄片状药物组合物可以被配置成例如快速崩解的给药形式,例如在1秒直至3分钟的时间段内崩解的给药形式,或配置成缓慢崩解的给药形式,例如在3至15分钟的时间段内崩解的给药形式。通过使用例如具有不同的崩解或溶解特性的形成基质的聚合物,可以将所示崩解时间设定到上述范围。因此,通过混合相应的聚合物组分,可以调节崩解时间。此外,将水“抽取”到基质内并引起基质从内部迸裂的崩解剂是已知的。因此,本发明的某些实施方式包括这样的崩解剂以便调整崩解时间。
用于膜状或薄片状药物组合物的适合聚合物包括纤维素衍生物、聚乙烯醇(例如MOWIOL)、聚丙烯酸酯、聚乙烯吡咯烷酮、纤维素醚例如乙基纤维素,以及聚乙烯醇、聚氨酯、聚甲基丙烯酸酯、聚甲基丙烯酸甲酯以及上述聚合物的衍生物和共聚物。
在某些实施方式中,本发明的膜状或薄片状药物组合物的总厚度优选地为5μm直到10mm,优选地30μm至2mm,特别优选地0.1mm至1mm。药物制剂可以是圆形、卵形、椭圆形、三角形、四边形或多边形,但它们也可以具有任何圆滑的形状。
本发明的药物组合物可以采取气溶胶形式。气溶胶组合物还可以包含药物活性剂。气溶胶组合物在本领域中是已知的。参见例如美国专利号5,011,678,其全部内容通过参考并入本文。作为非限制性实例,本发明的气溶胶组合物可以包括医学有效量的药物活性物质、香草醛或香草醛β-D-葡萄糖苷和生物相容的推进剂,例如烃/氟碳化合物推进剂。
在本发明的一种实施方式中,药物组合物是营养组合物。具有不想要的味道的营养组合物的实例包括但不必定限于用于治疗营养缺陷、创伤、手术、克罗恩病、肾病、高血压、肥胖症等的肠营养品,用于促进运动性能、肌肉增强或综合健康状况或先天性代谢障碍例如苯丙酮尿症的肠营养品。具体来说,这样的营养制剂可以含有具有苦味或金属味或回味的一种或多种氨基酸。这样的氨基酸包括但不限于必需氨基酸例如亮氨酸、异亮氨酸、组氨酸、赖氨酸、甲硫氨酸、苯丙氨酸、苏氨酸、色氨酸、酪氨酸或缬氨酸的L-异构体。
在一种实施方式中,本发明的消费品是含有香草醛和/或香草醛β-D-葡萄糖苷的牙齿卫生组合物。牙齿卫生组合物在本领域中是已知的,包括但不必定限于牙膏、漱口液、牙斑冲洗液、牙线、牙痛缓解剂(例如ANBESOL)等。
在另一种实施方式中,本发明的消费品是含有香草醛和/或香草醛β-D-葡萄糖苷的化妆品。例如但不是限制性地,化妆品可以是面霜、口红、唇彩等。本发明的其他适合的组合物包括润唇膏例如CHAPSTICK或BURT'S BEESWAX润唇膏,其还包含香草醛和/或香草醛β-D-葡萄糖苷。
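Arom multifunctional enzyme (AROM) polypeptides
Non-limiting examples of AROM polypeptides include the Saccharomyces cerevisiae polypeptide having the amino acid sequence set forth in SEQ ID NO:4 (GENBANK Accession No. X06077), the Schizosaccharomyces pombe polypeptide (GENBANK Accession No. NP_594681.1), the Schizosaccharomyces japonicus polypeptide (GENBANK Accession No. XP_002171624), the Neurospora crassa polypeptide (GENBANK Accession No. XP_956000) and the Yarrowia lipolytica polypeptide (GENBANK Accession No. XP_505337).
As used herein, the term "AROM polypeptide" refers to any amino acid sequence that is at least 80% (for example at least 85, 90, 95, 96, 97, 98, 99 or 100%) identical to the sequence set forth in SEQ ID NO:4 and that has at least four of the five enzymatic activities of the S. cerevisiae AROM polypeptide, namely 3-dehydroquinate dehydratase activity, 3-dehydroquinate synthase activity, 3-phosphoshikimate 1-carboxyvinyltransferase activity, shikimate 3-dehydrogenase (NADP+) activity and shikimate kinase activity.
According to one embodiment of the invention, the AROM polypeptide is a mutant AROM polypeptide with decreased shikimate dehydrogenase activity. When expressed in a recombinant host, the mutant AROM polypeptide redirects metabolic flux from aromatic amino acid production to vanillin precursor production (Figure 2). Decreased shikimate dehydrogenase activity can be inferred from the accumulation of dehydroshikimic acid, measured by LC-MS, in a recombinant host expressing the mutant AROM polypeptide.
The mutant AROM polypeptides described herein can have one or more mutations in domain 5 (for example a substitution of one or more amino acids, a deletion of one or more amino acids, an insertion of one or more amino acids, or a combination of substitutions, deletions and insertions). For example, a mutant AROM polypeptide can have a deletion in at least a portion of domain 5 (for example a deletion of the entire domain 5, i.e. amino acids 1305 to 1588 of the amino acid sequence in SEQ ID NO:4), or can have one or more amino acid substitutions in domain 5, such that the mutant AROM polypeptide has decreased shikimate dehydrogenase activity. An exemplary mutant AROM polypeptide lacking domain 5 is provided in SEQ ID NO:2. Exemplary mutant AROM polypeptides with at least one amino acid substitution in domain 5 include the AROM polypeptides A1533P, P1500K, R1458W, V1349G, T1366G, I1387H, W1571V, T1392K, K1370L and A1441P, set forth in SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22 and SEQ ID NO:24, respectively.
Particularly useful amino acid substitutions can be found, for example, at one or more positions aligning with position 1349, 1366, 1370, 1387, 1392, 1441, 1458, 1500, 1533 or 1571 of the amino acid sequence set forth in SEQ ID NO:4. For example, a modified AROM polypeptide can have a substitution at a position aligning with position 1370 or position 1392 of the amino acid sequence set forth in SEQ ID NO:4.
For example, a modified AROM polypeptide can have one or more of the following amino acids, each at a position aligning with the stated position of the amino acid sequence set forth in SEQ ID NO:4: an amino acid other than valine (for example glycine) at position 1349; an amino acid other than threonine (for example glycine) at position 1366; an amino acid other than lysine (for example leucine) at position 1370; an amino acid other than isoleucine (for example histidine) at position 1387; an amino acid other than threonine (for example lysine) at position 1392; an amino acid other than alanine (for example proline) at position 1441; an amino acid other than arginine (for example tryptophan) at position 1458; an amino acid other than proline (for example lysine) at position 1500; an amino acid other than alanine (for example proline) at position 1533; or an amino acid other than tryptophan (for example valine) at position 1571.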
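The substitution labels used above (for example V1349G: valine at position 1349 replaced by glycine) and the domain 5 deletion lend themselves to a small Python sketch. The domain 5 boundaries (1305 to 1588) and the ten labels come from the text; the function names are hypothetical, and the sequence passed in is expected to be numbered like SEQ ID NO:4, which is not reproduced here.

```python
import re

# Domain 5 boundaries in SEQ ID NO:4 numbering, as stated in the text.
DOMAIN5_START, DOMAIN5_END = 1305, 1588

# Exemplary domain 5 substitutions listed in the text.
DOMAIN5_SUBSTITUTIONS = [
    "A1533P", "P1500K", "R1458W", "V1349G", "T1366G",
    "I1387H", "W1571V", "T1392K", "K1370L", "A1441P",
]

_LABEL = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitution(label):
    """Split a label such as 'V1349G' into (wild_type, position, replacement)."""
    match = _LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def in_domain5(label):
    """True if the substituted position lies within domain 5."""
    _, position, _ = parse_substitution(label)
    return DOMAIN5_START <= position <= DOMAIN5_END


def apply_substitution(sequence, label):
    """Return a copy of the sequence carrying the substitution (1-based position)."""
    wild_type, position, replacement = parse_substitution(label)
    if position > len(sequence) or sequence[position - 1] != wild_type:
        raise ValueError(f"{label}: residue {position} is not {wild_type}")
    return sequence[:position - 1] + replacement + sequence[position:]


def delete_domain5(sequence):
    """Return the sequence truncated before the first residue of domain 5."""
    return sequence[:DOMAIN5_START - 1]
```

Checking the wild-type residue before substituting guards against applying a label to a sequence with different numbering, which matters when the target is a homolog rather than SEQ ID NO:4 itself.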
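In certain embodiments, the modified AROM polypeptide is fused to a polypeptide that catalyzes the first committed step of vanillin biosynthesis, 3-dehydroshikimate dehydratase (3DSD). Polypeptides having 3DSD activity that are suitable for use in a fusion polypeptide include 3DSD polypeptides from Podospora pauciseta, Ustilago maydis, Rhodococcus jostii, Acinetobacter sp., Aspergillus niger or Neurospora crassa. See GENBANK Accession Nos. CAD60599, XP_001905369.1, XP_761560.1, ABG93191.1, AAC37159.1 and XM_001392464.
For example, a modified AROM polypeptide lacking domain 5 can be fused to a polypeptide having 3DSD activity (for example the Podospora pauciseta 3DSD). SEQ ID NO:26 sets forth the amino acid sequence of such a protein, and SEQ ID NO:27 sets forth the nucleic acid sequence encoding the protein.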
Catechol-O-methyltransferase (COMT) polypeptides
In certain embodiments, the COMT polypeptide of the invention can be a caffeoyl-O-methyltransferase. In other embodiments, the COMT polypeptide is preferably a catechol-O-methyltransferase. More preferably, the COMT polypeptide of the invention is a mutant COMT polypeptide having improved meta-hydroxyl methylation of protocatechuic aldehyde, protocatechuic acid and/or protocatechuic alcohol relative to the Homo sapiens COMT having the amino acid sequence set forth in SEQ ID NO:27.
Non-limiting examples of COMT polypeptides that can be mutated in accordance with the invention include COMT polypeptides in the family classified under EC number 2.1.1.6, such as the Homo sapiens (Hs) polypeptide having the amino acid sequence set forth in SEQ ID NO:27 (see also GENBANK Accession No. NM_000754), the Arabidopsis thaliana polypeptide having the amino acid sequence set forth in SEQ ID NO:53 (GENBANK Accession No. AY062837), or the strawberry (Fragaria x ananassa) polypeptide having the amino acid sequence set forth in SEQ ID NO:54 (GENBANK Accession No. AF220491). Several variants of the human COMT polypeptide exist, and the COMT polypeptide can be any of these variants; in a preferred embodiment, however, the human COMT polypeptide is SEQ ID NO:27 or SEQ ID NO:55. In the lists that follow, GENBANK accession numbers are given in parentheses. Other mammalian COMT polypeptides suitable for use in the invention include, but are not limited to, those isolated from Pan troglodytes (XP_514984), Macaca mulatta (AFJ70145), Equus caballus (NP_001075303), Canis lupus familiaris (AAR20324), Cricetulus griseus (EGV97595), Sus scrofa (NP_001182259) and Bos taurus (NP_001095787). Other exemplary COMT polypeptides from plant and microbial sources include, but are not limited to, those isolated from Rosa chinensis (CAD29457), Prunus dulcis (CAA58218), Gossypium hirsutum (ACT32028), Jatropha curcas (ACT87981), Eucalyptus camaldulensis (ADB82906), Candida orthopsilosis (CCG25047), Pichia stipitis (ABN67921) and Spathaspora passalidarum (EGW29958). In certain embodiments, the COMT polypeptide of the invention is obtained from Phytophthora infestans (XP_002899214), Catharanthus roseus (EGS21863), Yarrowia lipolytica (XP_500451), Ciona intestinalis (XP_002121420 or XP_002131313), Capsaspora owczarzaki (EFW46044), Chaetomium thermophilum (EGS21863), Clavispora lusitaniae (XP_002899214), Paracoccidioides sp. 'lutzii' Pb01 (XP_002793380), Vanilla planifolia (SEQ ID NO:56), Coffea arabica (AAN03726), Rattus norvegicus (NP_036663), Mus musculus (NP_031770), Crenarchaeote (ABZ07345), Mycobacterium vanbaalenii (ABM14078) or Schizosaccharomyces pombe (NP_001018770), which have been shown to exhibit the desired COMT activity (Figures 3B and 3C).
As used herein, the term "COMT polypeptide" refers to any amino acid sequence that is at least 80% (for example at least 85, 90, 95, 96, 97, 98, 99 or 100%) identical to the Hs COMT sequence set forth in SEQ ID NO:27 and that has the catechol-O-methyltransferase activity of the wild-type Hs COMT polypeptide.
在一种实施方式中,当在本文中使用时,术语“突变体COMT多肽”是指其具有的氨基酸序列与SEQ ID NO:27中显示的Hs COMT序列具有至少80%、例如至少85%、例如至少90%、例如至少95%、例如至少96%、例如至少97%、例如至少98%、例如至少99%的同一性,并且能够催化原儿茶酸和/或原儿茶醛的间位处的–OH基团的甲基化的任意多肽,其中所述突变体COMT多肽的氨基酸序列与SEQ ID NO:27的差别为至少一个氨基酸。此外,突变体COMT多肽的氨基酸序列应该与SEQ ID NO:53、SEQ ID NO:54和SEQ ID NO:55有至少一个氨基酸的差别。优选地,突变体COMT多肽与任意野生型COMT多肽的任意序列有至少一个氨基酸的差别。
在本发明的另一种实施方式中,术语“突变体COMT多肽”是指其具有的氨基酸序列与SEQ ID NO:53或SEQ ID NO:54具有至少80%、例如至少85%、例如至少90%、例如至少95%、例如至少96%、例如至少97%、例如至少98%、例如至少99%的同一性,并且能够催化原儿茶酸和/或原儿茶醛的间位处的–OH基团的甲基化的多肽,其中所述突变体COMT多肽的氨基酸序列与SEQ ID NO:53和SEQ ID NO:54中的每一个具有至少一个氨基酸的差别。
本文描述的突变体COMT多肽在例如底物结合位点中可以具有一个或多个突变(例如一个或多个氨基酸的置换,一个或多个氨基酸的缺失,一个或多个氨基酸的插入,或置换、缺失和插入的组合)。例如,突变体COMT多肽可以在人类COMT的底物结合位点中具有一个或多个氨基酸置换。
在某些实施方式中,本发明的“突变体COMT多肽”与SEQ ID NO:27、SEQ ID NO:53、SEQ ID NO:54或SEQ ID NO:55仅有一个或两个氨基酸残基的差别,其中所述突变体与野生型蛋白质之间的差别在底物结合位点中。
正如本文所描述的,突变体COMT多肽可用于提高香草醛葡萄糖苷的生物合成。例如,突变体COMT多肽可以具有下述性质中的一个或多个:提高的周转;倾向于在间位(3’)而不是对位(4’)处甲基化,以便相对于异香草醛更有利于香草醛的生产;或对香草醛途径底物原儿茶酸和原儿茶醛的更好的特异性。突变体COMT多肽可以在体外使用甲基化分析法表征,或者在重组宿主体内基于香草酸、香草醛或香草醛葡萄糖苷的生产来表征。
异香草醛、香草醛、异香草酸和香草酸的结构如下所示。
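Wild-type Hs COMT lacks regioselective O-methylation of protocatechuic aldehyde and protocatechuic acid, indicating that the binding site of Hs COMT does not bind these substrates in an orientation that allows the desired regioselective methylation. Without being bound by a particular mechanism, the active site of Hs COMT is composed of the coenzyme S-adenosylmethionine (SAM), which serves as the methyl donor, and the catechol substrate, whose hydroxyl to be methylated is coordinated to Mg2+ and proximal to Lys144. The O-methylation proceeds via an SN2 mechanism in which Lys144 serves as a catalytic base that deprotonates the proximal hydroxyl to form an oxyanion, which attacks the methyl group of the sulfonium of SAM. See, for example, Zheng & Bruice (1997) J. Am. Chem. Soc. 119(35):8137-8145; Kuhn & Kollman (2000) J. Am. Chem. Soc. 122(11):2586-2596; Roca et al. (2003) J. Am. Chem. Soc. 125(25):7726-37.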
在本发明的一种实施方式中,本发明提供了突变体COMT多肽,其能够催化原儿茶酸的–OH基团的甲基化,其中所述甲基化导致产生与异香草酸相比至少4倍多的香草酸,优选地与异香草酸相比至少5 倍多的香草酸,例如与异香草酸相比至少10倍多的香草酸,例如与异香草酸相比至少15倍多的香草酸,例如与异香草酸相比至少20倍多的香草酸,例如与异香草酸相比至少25倍多的香草酸,例如与异香草酸相比至少30倍多的香草酸;并且其具有的氨基酸序列与SEQ ID NO:27具有至少一个氨基酸的差别。
除了上面提到的性质之外,优选地突变体COMT多肽还能够催化原儿茶醛的–OH基团的甲基化,其中所述甲基化导致产生与异香草醛相比至少4、5、10、15、20、25或30倍多的香草醛;和/或能够催化原儿茶醇的–OH基团的甲基化,其中所述甲基化导致产生与异香草醇相比至少4、5、10、15、20、25或30倍多的香草醇。
为了确定给定的突变体COMT多肽是否能够催化原儿茶酸的–OH基团的甲基化,其中所述甲基化导致产生与异香草酸相比至少X倍多的香草酸,可以执行体外测定法。在这样的测定法中,将原儿茶酸与突变体COMT多肽在甲基供体的存在下温育,然后测定产生的异香草酸和香草酸的水平。所述甲基供体可以是例如S-腺苷甲硫氨酸。更优选地,这可以通过产生带有编码待测试的突变体COMT多肽的异源核酸的重组宿主来测定,其中所述重组宿主还能够生产原儿茶酸。在培养重组宿主后,可以测定产生的异香草酸和香草酸的水平。与这种方法相关,编码待测试的突变体COMT多肽的所述异源核酸优选地可操作地连接到允许在所述重组宿主中表达的调控区。此外,优选地,重组宿主表达至少一种3DSD和至少一种ACAR,其优选地可以是本文中描述的3DSD和ACAR之一。在重组宿主表达能够催化香草酸向香草醛的转化的ACAR的实施方式中,则所述方法也可以包括测定产生的香草醛和异香草醛的水平。重组宿主也可以表达能够催化香草醛和异香草醛的葡萄糖基化的至少一种UGT,在这种情况下,代替香草醛和异香草醛的水平,可以分别测定香草醛-葡萄糖苷和异香草醛-葡萄糖苷的水平。或者,这可以通过产生带有编码待测试的突变体COMT多肽的异源核酸的重组宿主,并将原儿茶酸进料到所述重组宿主,然后测定产生的异香草酸和香草酸的水平,来测定。
同样地,可以使用体外测定法或重组宿主细胞来确定突变体COMT多肽是否能够催化原儿茶醛的–OH基团的甲基化,其中所述甲基化导致产生与异香草醛相比至少X倍多的香草醛。然而,在这种测定法中,使用原儿茶醛作为起始原料,并测定香草醛和异香草醛的水平。
同样地,可以使用体外测定法或重组宿主细胞来确定给定的突变体COMT多肽是否能够催化原儿茶醇的–OH基团的甲基化,其中所述甲基化导致产生与异香草醇相比至少X倍多的香草醇。然而,在这种测定法中,使用原儿茶醇作为起始原料,并测定香草醇和异香草醇的水平。
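Levels of isovanillin and vanillin can be determined by any suitable method useful for detecting these compounds, provided the method can distinguish between isovanillin and vanillin. Such methods include, for example, HPLC. Likewise, levels of isovanillic acid, vanillic acid, isovanillyl alcohol and vanillyl alcohol can be determined by any suitable method useful for detecting these compounds, provided the method can distinguish each iso form from its counterpart (isovanillic acid from vanillic acid, and isovanillyl alcohol from vanillyl alcohol). Such methods also include, for example, HPLC.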
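Once the meta and para products have been quantified separately, the "at least X times more" criterion reduces to a ratio check. The following Python sketch is illustrative only: the threshold series (4, 5, 10, 15, 20, 25, 30) comes from the text, whereas the function names and the handling of an undetectable para product are assumptions.

```python
# Fold thresholds named in the text for meta over para product.
FOLD_THRESHOLDS = (4, 5, 10, 15, 20, 25, 30)


def meets_fold(meta_level, para_level, fold):
    """True if the meta product (e.g. vanillic acid) is at least `fold` times
    the para product (e.g. isovanillic acid). Both levels must use the same unit."""
    if meta_level < 0 or para_level < 0:
        raise ValueError("levels must be non-negative")
    # Comparing products avoids dividing by zero when no para product is detected.
    return meta_level > 0 and meta_level >= fold * para_level


def highest_fold_met(meta_level, para_level):
    """Return the largest listed threshold satisfied, or None if none is."""
    met = [f for f in FOLD_THRESHOLDS if meets_fold(meta_level, para_level, f)]
    return max(met) if met else None
```

The same check applies unchanged to the vanillin/isovanillin and vanillyl alcohol/isovanillyl alcohol pairs, or to the corresponding glucosides when a UGT is co-expressed.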
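For substrates the size of protocatechuic aldehyde and protocatechuic acid, the boundary of the substrate binding site of Hs COMT was found to be formed by the following hydrophobic residues: Trp38, Met40, Cys173, Pro174, Trp143 and Leu198. A further, hydrophilic residue that may influence binding is Arg201. The catechol hydroxyl that is not methylated forms a hydrogen bond with Glu199.
According to this mechanism, for meta methylation of protocatechuic aldehyde and protocatechuic acid to occur, these substrates must bind in an orientation that places the meta hydroxyl so that it is coordinated to Mg2+ and proximal to Lys144, with the para hydroxyl proximal to Glu199. The lack of this desired regioselectivity observed in wild-type Hs COMT indicates that this binding orientation does not occur preferentially. One or more of the binding-site residues of Hs COMT can be substituted to allow a substrate binding orientation that promotes the desired meta O-methylation of protocatechuic aldehyde and protocatechuic acid.
Accordingly, the invention also includes a method for identifying COMT polypeptides with improved substrate specificity. In particular, the method provides a computational approach to identifying residue mutations that give a COMT polypeptide improved meta O-methylation. The method comprises several distinct steps. Specifically, one method for identifying optimum mutations comprises the steps of: (a) selecting a protein structure of a COMT polypeptide; (b) docking the substrates protocatechuic acid and protocatechuic aldehyde to the protein structure selected in (a) to derive the different conformations that promote regioselective meta or para O-methylation; (c) identifying binding-site residues proximal to the protocatechuic acid and protocatechuic aldehyde substrates as candidates for mutation; (d) performing in silico mutational analysis on each residue identified in (c); (e) ranking the candidate residue mutations for each position from (d) according to the predicted substrate conformations for meta or para O-methylation; and (f) selecting the best-scoring mutation for each candidate residue identified in (c).
As discussed above, the COMT polypeptide can be in the COMT family classified under EC number 2.1.1.6, catechol O-methyltransferase. In this respect, those skilled in the art will recognize that, in addition to Hs COMT, the method can be applied to any species of COMT within this classification, provided such proteins have similar binding-site residues. In certain embodiments, the method is applied to the Arabidopsis or strawberry COMT (SEQ ID NO:53 and SEQ ID NO:54, respectively). Although each method step is described in more detail with respect to Hs COMT, it should be appreciated that similar method steps can be carried out with other COMT polypeptides.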
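In step (a), a COMT protein structure is selected. Protein structures of Hs COMT are publicly available from the Protein Data Bank and can be assessed for usefulness on the basis of resolution, the inclusion of mutations and other sequence changes that may have been introduced to achieve crystallization. Other factors that can be considered in selecting a structure include whether the structure contains a substrate bound to the protein, and the nature of that bound substrate. The crystal structure with code 3BWM (RCSB Protein Data Bank) is a particularly useful structure for Hs COMT and was used in the methods described herein. The skilled person will recognize that other Hs COMT crystal structures can be used as input to the modeling procedure.
In step (b), the substrates of interest, for example protocatechuic acid and protocatechuic aldehyde, are docked to the protein structure. Docking is a term describing computational methods, implemented in a variety of algorithms, that predict the likely conformation of a selected small molecule in a protein binding site. The technique typically calculates a binding score that provides a basis for assessing goodness of fit and predicted interaction binding energy. The docking program used in the present procedure to derive substrate conformations is ProtoScreen (Haydon et al. (2008) Science 321:1673-1675). The method generates binding scores according to the ProtoScore algorithm (Bhurruth-Alcor et al. (2011) Org. Biomol. Chem. 9:1169-1188). Those skilled in the art will recognize that other docking programs can be used in this method step to identify suitable binding conformations of the protocatechuic acid and protocatechuic aldehyde substrates in Hs COMT (or other COMT family members).
In step (c), a mutational analysis is performed. This can include several substeps, for example: (i) identifying a first residue to be mutated; (ii) identifying, from a list of suitable amino acids, the residues to mutate to; (iii) for each residue in (ii), searching a rotamer library for conformational candidates; (iv) mutating the residue from (i) with each new residue side-chain rotamer choice from (iii); (v) minimizing the protein complex conformation; (vi) scoring each rotamer candidate from (iv), in the different protocatechuic acid or aldehyde conformations, using various calculations of mutant-substrate, mutant-protein, mutant-solvent and substrate-solvent energies, which allows an energetic comparison between the cases in which the substrate is modified at the meta or the para position; and (vii) ranking the rotamer choices from (iii) and then finding the highest score for the amino acid mutations from (ii).
In substep (i), each residue to be mutated is analyzed in turn.
In substep (ii), the residue to mutate to is selected from a defined list of available amino acids. By default, this list is the 20 standard amino acids found in nature, but it can include other, non-natural amino acids.
In substep (iii), all rotamers matching the identity of the mutated amino acid are identified in a rotamer library. A rotamer library is a set of precomputed preferred conformations of the standard amino acid side chains in a usable 3D format. Such libraries are commonly used in protein structure analysis and are included in most commercially available molecular modeling packages. The set of rotamers matching the identity of the residue being mutated is selected for use in substep (iv).
In substep (iv), the selected protein residue is exchanged for each rotamer identified in (iii). This involves modifying the computational representation of the protein residue atoms so that the side-chain atoms of the starting residue are deleted before the incoming rotamer selected from (iii) is attached to the alpha-carbon position using vector mathematics. This is repeated for each candidate rotamer to obtain a list of 3D representations of the protein, each differing only in the rotamer conformation at a single residue position.
In substep (v), the protein-substrate complexes derived in (iv) are subjected to force-field minimization. In a particularly useful embodiment, the force field is AMBER99 with AM1-BCC charges applied to the substrate. In another aspect of the method, a Born solvation term is used. The protein backbone can be restrained using wall restraints while the side chains remain unrestrained. This reduces overall protein movement while allowing the effect of an individual residue mutation on neighboring residues to be explored. Those skilled in the art will recognize that various commercially available molecular modeling packages can perform these tasks.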
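In substep (vi), energetics calculations are performed on the resulting protein-complex conformations to determine the viability of the individual mutant conformations. This involves calculating individually the interaction energies of the wild-type residue with the substrate, the protein environment and the solvent, and then performing the same calculations for the mutated residue conformations. The values are determined for both conformations of the substrate (meta-reactive and para-reactive). This method step therefore identifies mutations that have a favorable binding energy for the substrate in the meta-reactive pose compared with the para-reactive pose. Energies are calculated using force-field-based terms. In one embodiment, the force field is AMBER99, and the interaction energy between the amino acid and the protein-substrate environment is described by Equation 1.
$E_{\text{nonbonded}} = E_{\text{vdW}} + E_{\text{electrostatic}}$ (Equation 1)
where $E_{\text{vdW}}$ and $E_{\text{electrostatic}}$ are the van der Waals and electrostatic contributions, respectively.
Those skilled in the art will recognize that other, similar equations can be used to derive interaction energies suitable for this differential energy analysis of the different substrate and mutated amino acid poses.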
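For orientation, the two contributions to Equation 1 are conventionally written in AMBER-type force fields as a pairwise 12-6 Lennard-Jones term and a Coulomb term. The expressions below are this conventional form, given as an assumption; the text as extracted here does not spell out the definitions it uses. The sums run over atoms $i$ of the residue of interest and atoms $j$ of its environment (substrate, protein or solvent), with $r_{ij}$ the interatomic distance, $\varepsilon_{ij}$ and $R_{ij}$ the Lennard-Jones well depth and minimum-energy distance, $q_i$ and $q_j$ the partial charges, and $\varepsilon_0$ the vacuum permittivity.

```latex
E_{\text{nonbonded}} = E_{\text{vdW}} + E_{\text{electrostatic}},
\qquad
E_{\text{vdW}} = \sum_{i}\sum_{j} \varepsilon_{ij}
  \left[ \left(\frac{R_{ij}}{r_{ij}}\right)^{12}
       - 2\left(\frac{R_{ij}}{r_{ij}}\right)^{6} \right],
\qquad
E_{\text{electrostatic}} = \sum_{i}\sum_{j} \frac{q_i\, q_j}{4\pi\varepsilon_0\, r_{ij}}
```

In this form each pair contributes a minimum van der Waals energy of $-\varepsilon_{ij}$ at $r_{ij} = R_{ij}$, so a bulkier mutant side chain that crowds the substrate in one pose raises $E_{\text{vdW}}$ for that pose only.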
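In substep (vii), amino acids are selected that provide a favorable binding energy with the substrate in the pose predicted to react at the meta position, compared with the substrate in the pose predicted to react at the para position. These calculations thus determine which amino acid mutations are likely to promote meta-regioselective O-methylation.
In step (iv), step (iii) is repeated until a list of energy values has been output for each binding-site residue to be mutated.
In step (v), the list of energy values from step (iv) is ranked by comparing, for each mutation, the energy difference between the substrate in the meta-reactive position and in the para-reactive position. The top-ranked entries in this list represent mutations that favor meta O-methylation over para O-methylation.
In step (vi), a limited number of such candidates from step (v) is selected on the basis of the energy values. Mutations are not selected if (1) the mutation is energetically unfavorable, or (2) the mutation is not predicted to change regioselectivity.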
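The ranking and selection logic described above can be sketched in Python. Everything concrete here is assumed for illustration: the class and function names, the two cutoffs (`max_energy`, `min_gap`), and the energy values used in the usage example, which are invented and are not results from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MutationScore:
    residue: str   # wild-type residue and position, e.g. "L198"
    new_aa: str    # one-letter code of the replacement
    e_meta: float  # interaction energy, substrate in the meta-reactive pose
    e_para: float  # interaction energy, substrate in the para-reactive pose

    @property
    def delta(self):
        """Meta minus para energy; more negative favors meta O-methylation."""
        return self.e_meta - self.e_para


def rank_mutations(scores):
    """Order candidates from most to least meta-favoring."""
    return sorted(scores, key=lambda s: s.delta)


def select_candidates(scores, max_energy=0.0, min_gap=1.0, top_n=3):
    """Keep the top-ranked candidates, dropping (1) mutations whose meta-pose
    energy is unfavorable (above max_energy) and (2) mutations whose meta/para
    gap is too small to suggest a change in regioselectivity."""
    kept = [
        s for s in rank_mutations(scores)
        if s.e_meta <= max_energy and s.delta <= -min_gap
    ]
    return kept[:top_n]


if __name__ == "__main__":
    # Invented energies (arbitrary units), for illustration only.
    demo = [
        MutationScore("L198", "Y", -12.0, -7.0),
        MutationScore("L198", "W", -10.0, -9.5),
        MutationScore("E199", "A", -8.0, -5.0),
        MutationScore("M40", "R", 2.0, 6.0),
    ]
    for s in select_candidates(demo):
        print(s.residue + s.new_aa, s.delta)
```

Keeping the two exclusion criteria as explicit parameters makes it easy to see how sensitive the short list is to where the cutoffs are placed.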
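As described herein, application of the above method to Hs COMT led to the identification of a set of mutations designed to improve the regioselective O-methylation of the enzyme. The mutations can be used individually or in combination.
In the membrane-bound isoform of Hs COMT, the equivalent residue number is the residue number of soluble Hs COMT plus 50. Thus, a residue described or substituted in soluble Hs COMT is also taken to be described or substituted in membrane-bound Hs COMT at the residue number plus 50.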
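The numbering rule above is a fixed offset and can be written as a one-line conversion. In this Python sketch the offset of 50 comes from the text; the function name and the accepted label formats (a residue such as "E199" or a substitution such as "L198Y") are illustrative assumptions.

```python
import re

MEMBRANE_OFFSET = 50  # membrane-bound numbering = soluble numbering + 50

_LABEL = re.compile(r"([A-Z])(\d+)([A-Z]?)")


def to_membrane_bound(label):
    """Convert a soluble Hs COMT residue or substitution label to the
    membrane-bound isoform numbering, e.g. 'L198Y' -> 'L248Y'."""
    match = _LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized label: {label!r}")
    wild_type, position, replacement = match.groups()
    return f"{wild_type}{int(position) + MEMBRANE_OFFSET}{replacement}"


if __name__ == "__main__":
    for soluble in ("L198Y", "E199", "M40R"):
        print(soluble, "->", to_membrane_bound(soluble))
```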
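In one embodiment, the invention provides a mutant COMT polypeptide which (1) has an amino acid sequence sharing at least 80%, such as at least 85%, 90%, 95%, 96%, 97%, 98% or 99%, sequence identity with SEQ ID NO:27, determined over the entire length of SEQ ID NO:27; (2) has at least one amino acid substitution at a position aligning with positions 198 to 199 of SEQ ID NO:27, which can be any of the amino acid substitutions described below; and (3) is capable of catalyzing methylation of an -OH group of protocatechuic acid, wherein the methylation results in at least 4, 5, 10, 15, 20, 25 or 30 times more vanillic acid than isovanillic acid. In addition to these characteristics, the mutant COMT polypeptide may also be capable of catalyzing methylation of an -OH group of protocatechuic aldehyde, wherein the methylation results in at least 4, 5, 10, 15, 20, 25 or 30 times more vanillin than isovanillin; and/or of catalyzing methylation of an -OH group of protocatechuic alcohol, wherein the methylation results in at least 4, 5, 10, 15, 20, 25 or 30 times more vanillyl alcohol than isovanillyl alcohol.
Thus, in one preferred embodiment, the mutant COMT polypeptide can have an amino acid substitution at the position aligning with position 198 of SEQ ID NO:27. The mutant COMT polypeptide can therefore be a mutant COMT polypeptide with the characteristics outlined above, wherein the substitution replaces the leucine at the position aligning with position 198 of SEQ ID NO:27 with another amino acid having a lower hydropathy index. For example, the substitution can replace that leucine with another amino acid having a hydropathy index lower than 2. Accordingly, the substitution can replace the leucine at the position aligning with position 198 of SEQ ID NO:27 with Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, His, Lys, Met, Phe, Pro, Ser, Thr, Trp or Tyr, for example with Ala, Arg, Asn, Asp, Glu, Gln, Gly, His, Lys, Met, Pro, Ser, Thr, Trp or Tyr. Preferably, however, the substitution replaces the leucine at the position aligning with position 198 of SEQ ID NO:27 with tyrosine. Substitution of the leucine aligning with position 198 of SEQ ID NO:27 with methionine increased the regioselectivity of meta > para O-methylation for protocatechuic aldehyde.
In another preferred embodiment, the mutant COMT polypeptide can have an amino acid substitution at the position aligning with position 199 of SEQ ID NO:27. The mutant COMT polypeptide can therefore be a mutant COMT polypeptide with the characteristics outlined above, wherein the substitution replaces the glutamic acid at the position aligning with position 199 of SEQ ID NO:27 with another amino acid having a neutral or positive side-chain charge at pH 7.4. Accordingly, the substitution can replace that glutamic acid with Ala, Arg, Asn, Cys, Gln, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr or Val. Preferably, however, the substitution replaces the glutamic acid at the position aligning with position 199 of SEQ ID NO:27 with alanine or glutamine. Substitution of the glutamic acid aligning with position 199 of SEQ ID NO:27 with alanine or glutamine increased the regioselectivity of meta > para O-methylation for protocatechuic aldehyde.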
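The two selection rules above (a hydropathy index below 2 at position 198, and a neutral or positive side chain at pH 7.4 at position 199) can be reproduced with a short Python sketch. It assumes that the hydropathy index meant is the Kyte-Doolittle scale, which the text does not name, and it treats only Asp and Glu as negatively charged at pH 7.4; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed to be the scale intended by the text).
KYTE_DOOLITTLE = {
    "Ala": 1.8, "Arg": -4.5, "Asn": -3.5, "Asp": -3.5, "Cys": 2.5,
    "Gln": -3.5, "Glu": -3.5, "Gly": -0.4, "His": -3.2, "Ile": 4.5,
    "Leu": 3.8, "Lys": -3.9, "Met": 1.9, "Phe": 2.8, "Pro": -1.6,
    "Ser": -0.8, "Thr": -0.7, "Trp": -0.9, "Tyr": -1.3, "Val": 4.2,
}

# Side chains carrying a net negative charge at pH 7.4.
NEGATIVE_AT_PH_7_4 = {"Asp", "Glu"}


def candidates_for_198(cutoff=2.0):
    """Replacements for Leu198 with a hydropathy index below the cutoff."""
    return sorted(aa for aa, h in KYTE_DOOLITTLE.items() if h < cutoff)


def candidates_for_199():
    """Replacements for Glu199 that are neutral or positive at pH 7.4."""
    return sorted(aa for aa in KYTE_DOOLITTLE if aa not in NEGATIVE_AT_PH_7_4)


if __name__ == "__main__":
    print(candidates_for_198())
    print(candidates_for_199())
```

With a cutoff of 2 this yields the fifteen residues of the narrower list given for position 198, and excluding Asp and Glu yields the eighteen residues listed for position 199.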
例如,突变体COMT多肽可以具有下列突变中的一个或多个:用色氨酸、酪氨酸、苯丙氨酸、谷氨酸或精氨酸置换在与SEQ ID NO:27中显示的氨基酸序列的198位置对齐的位置处的亮氨酸;用精氨酸、赖氨酸或丙氨酸置换在与SEQ ID NO:27中显示的氨基酸序列的40位置对齐的位置处的甲硫氨酸;用酪氨酸、赖氨酸、组氨酸或精氨酸置换在与SEQ ID NO:27中显示的氨基酸序列的143位置对齐的位置处的色氨酸;用异亮氨酸、精氨酸或酪氨酸置换在与SEQ ID NO:27中显示的氨基酸序列的174位置对齐的位置处的脯氨酸;用精氨酸或赖氨酸置换在与SEQ ID NO:27中显示的氨基酸序列的38位置对齐的位置处的色氨酸;用苯丙氨酸、酪氨酸、谷氨酸、色氨酸或甲硫氨酸置换在与SEQ ID NO:27中显示的氨基酸序列的173位置对齐的位置处的半胱氨酸;和/或用丝氨酸、谷氨酸或天冬氨酸置换在与SEQ IDNO:27中显示的氨基酸序列的201位置对齐的位置处的精氨酸。
在一种实施方式中,突变体COMT多肽含有在与198位置对齐的位置处的亮氨酸被色氨酸的置换。这一突变可以提高对原儿茶酸来说间位>对位O-甲基化的区域选择性。含有L198W突变的COMT多肽的蛋白质结合位点的模型化,表明在突变的残基与底物之间可以发生空间位阻。在间位反应构象中不发生这种空间位阻,因为底物的羧酸远离该残基。
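In another embodiment of the invention, the mutant COMT polypeptide is the polypeptide of SEQ ID NO:27 in which the amino acid at position 198 has been substituted with an amino acid having a lower hydropathy index than leucine. For example, the mutant COMT polypeptide can be the polypeptide of SEQ ID NO:27 in which the leucine at position 198 has been substituted with an amino acid having a hydropathy index lower than 2. Thus, the mutant COMT polypeptide can be the polypeptide of SEQ ID NO:27 in which the leucine at position 198 has been substituted with Ala, Arg, Asn, Asp, Glu, Gln, Gly, His, Lys, Met, Pro, Ser, Thr, Trp or Tyr, preferably with Met or Tyr.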
在另一种优选实施方式中,突变体COMT多肽可以是SEQ ID NO:27的多肽,其中199位置处的氨基酸已被在pH 7.4下具有中性或正的侧链电荷的另一种氨基酸置换。因此,突变体COMT多肽可以是SEQ ID NO:27的多肽,其中199位置处的谷氨酸已被Ala、Arg、Asn、Cys、Gln、Gly、His、Ile、Leu、Lys、Met、Phe、Pro、Ser、Thr、Trp、Tyr或Val置换,优选地被Ala或Gln置换。
在某些实施方式中,突变体COMT多肽具有两个或更多个突变。例如,底物结合位点中的2、3、4、5、6或7个残基可以被突变。例如,在一种实施方式中,突变体COMT多肽可以具有在与SEQ ID NO:27的氨基酸序列的40位置对齐的位置处的甲硫氨酸被精氨酸或赖氨酸的置换;在与SEQ ID NO:27的氨基酸序列的143位置对齐的位置处的色氨酸被酪氨酸或组氨酸的置换;在与SEQ ID NO:27的氨基酸序列的174位置对齐的位置处的脯氨酸被异亮氨酸的置换;以及在38位置处的色氨酸被精氨酸或赖氨酸的置换。突变体COMT多肽还可以具有在与SEQ ID NO:27的氨基酸序列的143位置对齐的位置处的色氨酸被赖氨酸或精氨酸的置换,以及在SEQ ID NO:27的174位置处的脯氨酸被精氨酸或酪氨酸的置换。突变体COMT多肽还可以具有在与SEQ ID NO:27中显示的氨基酸序列的173位置对齐的位置处的半胱氨酸被苯丙氨酸、酪氨酸、谷氨酸、色氨酸或甲硫氨酸的置换,在与SEQ ID NO:27中显示的氨基酸序列的40位置对齐的位置处的甲硫氨酸被丙氨酸的置换,以及在与SEQ ID NO:27中显示的氨基酸序列的201位置对齐的位置处的精氨酸被丝氨酸、谷氨酸或天冬氨酸的置换。突变体COMT多肽还可能具有在与SEQ ID NO:27的氨基酸序列的198位置对齐的位置处的亮氨酸的置换,以及在与SEQ ID NO:27的氨基酸序列的199位置对齐的位置处的谷氨酸的置换。所述置换可以是本部分中上面描述的置换中的任一种。突变体COMT多肽还可能具有在与SEQ IDNO:27的198位置对齐的位置处的亮氨酸的置换,以及在与SEQ ID NO:27的201位置对齐的位置处的精氨酸的置换。所述置换可以是本部分中上面描述的置换中的任一种。
同一性百分数
本文中给出的序列同一性优选地是在参比序列的整个长度上的序列同一性。因此,与本文中作为SEQ ID NO:4或SEQ ID NO:27提供的氨基酸序列的序列同一性分别是在SEQ ID NO:4或SEQ ID NO:27的整个长度上的序列同一性。
同一性百分数可以如下确定。使用计算机程序ClustalW(1.83版,缺省参数)将参比序列(例如SEQ ID NO:4或SEQ ID NO:27中显示的核酸序列或氨基酸序列)与一个或多个候选序列对齐,所述程序允许在跨过核酸或多肽序列的整个长度上对它们进行比对(全面比对)。Chenna等,(2003)Nucleic Acids Res.31(13):3497-500。
ClustalW计算参比序列与一个或多个候选序列之间的最佳匹配并对它们进行比对,以便可以确定同一性、相似性和差异。可以在参比序列、候选序列或两者中插入一个或多个残基的空位,以最大化序列比对。对于核酸序列的快速成对比对来说,使用下面的缺省参数:字符大小:2;窗口大小:4;评分方法:百分率;顶部对角线数目:4;空位罚分:5。对于核酸序列的多重比对来说,使用下列参数:空位开放罚分:10.0;空位拓展罚分:5.0;权重转移:是。对于蛋白质序列的快速成对比对来说,使用下列参数:字符大小:1;窗口大小:5;评分方法:百分率;顶部对角线数目:5;空位罚分:3。对于蛋白质序列的多重比对来说,使用下面的参数:权重矩阵:blosum;空位开放罚分:10.0;空位拓展罚分:0.05;亲水性空位:打开;亲水性残基:Gly,Pro,Ser,Asn,Asp,Gln,Glu,Arg和Lys;残基特异性空位罚分:打开。ClustalW输出是反映出序列之间的关系的序列排列。
为了确定候选核酸或氨基酸序列与参比序列的同一性百分数,使用ClustalW将序列对齐,用排列中一致的匹配数目除以参比序列的长度,并将结果乘以100。应该指出,同一性百分数值可以四舍五入到小数点后一位。例如,78.11、78.12、78.13和78.14四舍五入到78.1,而78.15、78.16、78.17、78.18和78.19四舍五入到78.2。
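The percent identity calculation described above (matches divided by the length of the reference sequence, times 100, rounded to one decimal place) can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned (for example by ClustalW) and are supplied as equal-length strings with `-` marking gaps; the function names are hypothetical. `Decimal` with `ROUND_HALF_UP` is used so that 78.15 rounds to 78.2 as stated in the text, which Python's built-in `round` does not guarantee.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_identity(value):
    """Round to one decimal place, halves rounding up (78.15 -> 78.2)."""
    return Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def percent_identity(aligned_ref, aligned_cand, gap="-"):
    """Percent identity of a candidate against a reference.

    Both arguments are rows of an existing alignment (equal length).
    Identical, non-gap columns are counted as matches and divided by
    the ungapped length of the reference sequence.
    """
    if len(aligned_ref) != len(aligned_cand):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1 for r, c in zip(aligned_ref, aligned_cand) if r == c and r != gap
    )
    ref_length = sum(1 for r in aligned_ref if r != gap)
    if ref_length == 0:
        raise ValueError("reference sequence is empty")
    raw = Decimal(matches) * 100 / Decimal(ref_length)
    return raw.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Toy alignment: 9 of 10 reference residues match.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(round_identity("78.14"), round_identity("78.15"))
```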
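Amino acid substitutions
Amino acid substitutions can be conservative or non-conservative. A conservative amino acid substitution replaces an amino acid with an amino acid of the same class, whereas a non-conservative amino acid substitution replaces an amino acid with an amino acid of a different class. Examples of conservative substitutions include amino acid substitutions within the following groups: (1) glycine and alanine; (2) valine, isoleucine, and leucine; (3) aspartic acid and glutamic acid; (4) asparagine, glutamine, serine, and threonine; (5) lysine, histidine, and arginine; and (6) phenylalanine and tyrosine.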
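The six groups listed above can be turned into a simple lookup. The following sketch classifies a substitution as conservative when both residues fall in the same group; residues that the text does not assign to any group (for example Trp, Cys, Met, Pro) are treated here as never conservative, which is an assumption of this sketch rather than a statement in the text.

```python
# The six conservative groups named in the text, in one-letter code.
CONSERVATIVE_GROUPS = [
    {"G", "A"},            # (1) glycine, alanine
    {"V", "I", "L"},       # (2) valine, isoleucine, leucine
    {"D", "E"},            # (3) aspartic acid, glutamic acid
    {"N", "Q", "S", "T"},  # (4) asparagine, glutamine, serine, threonine
    {"K", "H", "R"},       # (5) lysine, histidine, arginine
    {"F", "Y"},            # (6) phenylalanine, tyrosine
]


def is_conservative(original, replacement):
    """True if both residues belong to the same group listed above."""
    if original == replacement:
        raise ValueError("not a substitution: residues are identical")
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    for wt, new in [("D", "E"), ("I", "A"), ("L", "W"), ("K", "R")]:
        kind = "conservative" if is_conservative(wt, new) else "non-conservative"
        print(f"{wt}->{new}: {kind}")
```

Under these groups, the isoleucine-to-alanine example given in the text comes out as non-conservative.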
非保守氨基酸置换可以将一个类别的氨基酸用不同类别的氨基酸置换。非保守置换可以使基因产物的电荷或疏水性显著变化。非保守氨基酸置换还可以使残基侧链的体积显著变化,例如用丙氨酸残基置换异亮氨酸残基。非保守置换的实例包括用碱性氨基酸置换非极性氨基酸或用极性氨基酸置换酸性氨基酸。本领域普通技术人员将会认识到,对于本文描述的突变体来说,可以用类似的氨基酸置换。
核酸
本文还提供了编码突变的AROM和COMT多肽的分离的核酸。“分离的核酸”是指与基因组中存在的其他核酸分子,包括在基因组中通常位于所述核酸的一侧或两侧的核酸分离开的核酸。当在本文中使用时,术语“分离的”对于核酸来说还包括任何非天然存在的核酸序列,因为这样的非天然存在的序列在自然界中不存在,并且不具有在天然存在的基因组中紧邻的序列。
分离的核酸可以是例如DNA分子,只要在天然存在的基因组中通常紧邻该DNA分子侧翼存在的核酸序列之一被移除或不存在即可。因此,分离的核酸包括但不限于作为独立于其他序列的分开的分子存在的DNA分子(例如化学合成的核酸或通过PCR或限制性内切酶处理产生的cDNA或基因组DNA片段),以及合并到载体、自主复制的质粒、病毒(例如任何副粘病毒、反转录病毒、慢病毒、腺病毒或疱疹病毒)中或原核或真核生物的基因组DNA内的DNA。此外,分离的核酸可以包括工程化改造的核酸,例如作为杂合或融合核酸的一部分的DNA分子。存在于例如cDNA文库或基因组文库内的数百至数百万其他核酸中的核酸,或含有基因组DNA限制性消化物的凝胶切片,不被当作分离的核酸。
编码AROM或COMT多肽的核酸可以使用常见分子克隆技术(例如定点突变)进行修饰,以在编码的多肽的特定位置处(例如与SEQ ID NO:4中显示的AROM氨基酸序列的1349、1366、1370、1387、1392、1441、1458、1500、1533或1571位置对齐的位置;或者与SEQ ID NO:27中显示的可溶形式的人类COMT氨基酸序列的38、40、143、173、174、198或201位置对齐的位置)产生突变。核酸分子可以包括单核苷酸突变或超过一个突变或超过一种类型的突变。聚合酶链反应(PCR)和核酸杂交技术可用于鉴定编码具有改变的氨基酸序列的AROM多肽的核酸。
在某些实施方式中,编码本发明的突变体多肽的核酸分子可以含有标签序列,其编码被设计以便于被编码多肽的后续操作(例如便于纯化或检测)、分泌或定位的“标签”。标签序列可以被插入到编码AROM或COMT多肽的核酸序列中,使得被编码的标签位于AROM或COMT多肽的羧基或氨基末端。编码的标签的非限制性实例包括绿色荧光蛋白(GFP)、谷胱甘肽S转移酶(GST)、HIS标签和FLAG标签(Kodak,New Haven,CT)。标签的其他实例包括叶绿体转运肽、线粒体转运肽、造粉体肽、信号肽或分泌标签。
本发明的多肽(突变体或野生型)可以使用任何方法来生产。例如,多肽可以通过化学合成来生产。或者,本文描述的多肽可以使用编码多肽的异源表达载体,通过标准的重组技术来生产。可以将表达载体导入到宿主细胞中(例如通过转化或转染),以表达编码的多肽,然后可以纯化所述多肽。可用于多肽的小规模或大规模生产的表达系统包括但不限于微生物例如用含有本文描述的核酸分子的重组噬菌体DNA、质粒DNA或粘粒DNA表达载体转化的细菌(例如大肠杆菌和枯草芽孢杆菌(B.subtilis)),以及用含有本文描述的核酸分子的重组酵母表达载体转化的酵母(例如酿酒酵母或粟酒裂殖酵母)。有用的表达系统还包括用含有本文描述的核酸分子的重组病毒表达载体(例如杆状病毒)感染的昆虫细胞系统,以及用含有本文描述的核酸分子的重组病毒表达载体(例如烟草花叶病毒)感染或用重组质粒表达载体(例如Ti质粒)转化的植物细胞系统。本发明的多肽还可以使用带有重组表达构建物的哺乳动物表达系统来生产,所述表达构建物含有源自于哺乳动物细胞基因组的启动子(例如金属硫蛋白启动子)或源自于哺乳动物病毒的启动子(例如腺病毒晚期启动子和细胞肥大病毒启动子)以及本文描述的核酸。本发明的多肽可以具有如上讨论的N-端或C-端标签。
重组宿主
本发明还描述了重组宿主。当在本文中使用时,术语重组宿主打算是指其基因组已增强(augment)至少一个并入的DNA序列的宿主。这样的DNA序列包括但不限于非天然存在的基因,正常情况下不被转录成RNA或翻译成蛋白质(“表达”)的基因,以及人们希望导入到非重组宿主中的其他基因或DNA序列。应该认识到,本文描述的重组宿主的基因组通常通过稳定地导入一个或多个重组基因来增强。然而,在本发明的范围内也可以使用自主或复制型质粒或载体。此外,本发明可以使用低拷贝数例如单拷贝或高拷贝数(正如本文中示例的)质粒或载体来实践。
一般来说,导入的DNA原本不存在于作为DNA受体的宿主中,但是从给定宿主分离DNA区段,然后将该DNA的一个或多个附加拷贝导入到同一宿主中,以例如提高基因产物的生产或改变基因的表达模式,也在本发明的范围之内。在某些情况下,导入的DNA将通过例如同源重组或定点突变而修饰或甚至替换内源基因或DNA序列。适合的重组宿主包括微生物、植物细胞和植物。
术语“重组基因”是指导入到受体宿主中的基因或DNA序列,不论这样的宿主中是否可能已存在相同或相似的基因或DNA序列。在这种情形中,“导入”或“增强”在本领域中已知是指通过人工导入或增强。因此,重组基因可以是来自于另一个物种的DNA序列,或者可以是源自于或存在于同一物种中,但已通过重组方法被合并到宿主中以形成重组宿主的DNA序列。应该认识到,被导入到宿主中的重组基因可以与待转化宿主中正常存在的DNA序列一致,但是为了提供所述DNA的一个或多个附加拷贝而被导入,从而允许该DNA的基因产物的过表达或改变的表达。
编码本文描述的多肽的重组基因包括所述多肽的编码序列,其在正义方向上可操作地连接到适合于表达所述多肽的一个或多个调控区。由于许多微生物能够从多顺反子mRNA表达多个基因产物,因此如果需要,多个多肽可以在用于那些微生物的单一调控区的控制之下表达。当调控区和编码序列被放置成使得调控区有效地调控序列的转录或翻译时,所述编码序列和调控区被认为是可操作连接的。对于单顺反子基因来说,编码序列的翻译阅读框的翻译起始位点通常位于调控区下游1至约50个核苷酸之间。
在许多情况下,本文描述的多肽的编码序列在重组宿主之外的物种中被鉴定,即是异源核酸。当在本文中使用时,术语“异源核酸”是指被导入到重组宿主中的核酸,其中所述核酸在所述宿主中不是天然存在的。因此,如果重组宿主是微生物,编码序列可以来自于其他原核或真核微生物、来自于植物或来自于动物。然而,在某些情况下,编码序列是宿主本源的序列,但是被重新导入到该生物体中。本源序列通常可以通过与外源核酸相连的非天然序列、例如在重组核酸构建物中本源序列侧翼的非本源调控序列的存在,与天然存在的序列区分开。此外,稳定转化的外源核酸通常被整合在与存在本源序列的位置不同的位置处。
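A "regulatory region" refers to a nucleic acid having nucleotide sequences that influence transcription or translation initiation and rate, and the stability and/or mobility of a transcription or translation product. Regulatory regions include, but are not limited to, promoter sequences, enhancer sequences, response elements, protein recognition sites, inducible elements, protein binding sequences, 5' and 3' untranslated regions (UTRs), transcriptional start sites, termination sequences, polyadenylation sequences, introns, and combinations thereof. A regulatory region typically includes at least a core (basal) promoter. A regulatory region can also include at least one control element, such as an enhancer sequence, an upstream element, or an upstream activation region (UAR). A regulatory region is operably linked to a coding sequence by positioning the regulatory region and the coding sequence so that the regulatory region is effective for regulating transcription and translation of the sequence. For example, to operably link a coding sequence and a promoter sequence, the translation initiation site of the translational reading frame of the coding sequence is typically positioned between 1 and about 50 nucleotides downstream of the promoter. A regulatory region can, however, be positioned as much as about 5,000 nucleotides upstream of the translation initiation site, or about 2,000 nucleotides upstream of the transcription start site.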
待包含的调控区的选择取决于几种因素,包括但不限于效率、选择能力、可诱导性、所需表达水平和某些培养阶段中的偏好性表达。对于本领域技术人员来说,通过适当地选择调控区并将其相对于编码序列适当地放置以调节编码序列的表达,是常规工作。应该理解,可以存在超过一个调控区,例如内含子、增强子、上游激活区、转录终止子和可诱导元件。
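One or more genes, for example one or more heterologous nucleic acids, can be combined in a recombinant nucleic acid construct as "modules" that are useful for discrete aspects of vanillin and/or vanillin glucoside production. Combining a plurality of genes or heterologous nucleic acids in a module facilitates use of the module in a variety of species. For example, a vanillin gene cluster can be combined so that each coding sequence is operably linked to a separate regulatory region, forming a vanillin module for production in eukaryotic organisms. Alternatively, the module can express a polycistronic message for production of vanillin and/or vanillin glucoside in prokaryotic hosts such as species of Rhodobacter, Escherichia coli, Bacillus, or Lactobacillus. In addition to genes useful for vanillin or vanillin glucoside production, a recombinant construct typically also contains an origin of replication and one or more selectable markers for maintenance of the construct in an appropriate species.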
应该认识到,由于遗传密码的简并性,许多核酸可以编码特定多肽;即对于许多氨基酸来说,存在超过一个用作所述氨基酸的密码子的核苷酸三联体。因此,对于给定多肽来说,可以使用对特定宿主(例如微生物)适合的密码子偏倚表来修改编码序列中的密码子,以便获得在所述宿主中的优化表达。作为分离的核酸,这些修改的序列可以作为纯化的分子存在,或者可以合并到载体或病毒中,用于构建重组核酸构建物的模块。
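Because the genetic code is degenerate, a coding sequence can be rewritten codon by codon without changing the encoded polypeptide. The sketch below back-translates a peptide using one preferred codon per amino acid. The `PREFERRED_CODON` table is an illustrative placeholder, not a measured codon bias table for any host; each entry is a valid standard-code codon for its amino acid, but the choice among synonymous codons is arbitrary here and would in practice come from a host-specific table.

```python
# ILLUSTRATIVE placeholder: one standard-code codon per amino acid.
# The choice among synonymous codons is arbitrary, not measured host data.
PREFERRED_CODON = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAT", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATT",
    "L": "TTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def back_translate(peptide, stop_codon="TAA"):
    """Return a DNA coding sequence for `peptide` (one-letter code),
    using the preferred codon for each residue and a trailing stop."""
    try:
        codons = [PREFERRED_CODON[aa] for aa in peptide.upper()]
    except KeyError as err:
        raise ValueError(f"unknown amino acid: {err.args[0]}") from None
    return "".join(codons) + stop_codon


if __name__ == "__main__":
    # First five residues of SEQ ID NO:2 (Met Val Gln Leu Ala).
    print(back_translate("MVQLA"))
```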
本文描述的重组宿主表达突变体AROM多肽和/或突变体COMT多肽。因此,在一种情况下,本发明涉及带有编码突变体AROM多肽和/或突变体COMT多肽的异源核酸的重组宿主,所述突变体多肽可以是本文描述的任何突变体多肽。具体来说,本发明涉及带有编码所述突变体AROM多肽和/或突变体COMT多肽的异源核酸的重组宿主,其中所述核酸可操作连接到允许在所述重组宿主中表达的调控区。
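Such hosts can also include other genes or biosynthesis modules for producing vanillin or vanillin glucoside, for improving the efficiency with which energy and carbon sources are converted to vanillin and its glucoside, and/or for enhancing the productivity of the cell culture or plant. Such additional biosynthesis modules can include one or more of the following genes: a gene encoding a 3DSD polypeptide, a gene encoding a phosphopantetheinyl transferase (PPTase), and a gene encoding a UGT polypeptide. See Figure 1. These genes can be endogenous genes or recombinant genes. In addition, when the host cell carries a heterologous nucleic acid encoding a mutant AROM polypeptide, the host cell can also include a wild-type OMT gene. Likewise, when the host cell carries a heterologous nucleic acid encoding a mutant COMT polypeptide, the host cell can also include a wild-type AROM gene. Alternatively, the host cell can carry heterologous nucleic acids encoding both a mutant COMT polypeptide and a mutant AROM polypeptide as described herein. Furthermore, the host can also express a vanillyl alcohol oxidase (VAO).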
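Suitable 3DSD polypeptides are known. A 3DSD polypeptide of the invention can be any enzyme with 3-dehydroshikimate dehydratase activity. Preferably, the 3DSD polypeptide is an enzyme capable of catalyzing the conversion of 3-dehydroshikimate to protocatechuate and H2O. A 3DSD polypeptide of the invention is preferably an enzyme classified under EC 4.2.1.118. For example, suitable polypeptides having 3DSD activity include the 3DSD polypeptides made by Podospora pauciseta, Ustilago maydis, Rhodococcus jostii, Acinetobacter sp., Aspergillus niger, or Neurospora crassa. See GENBANK Accession Nos. CAD60599, XP_001905369.1, XP_761560.1, ABG93191.1, AAC37159.1, and XM_001392464. Thus, the recombinant host can include a heterologous nucleic acid encoding the 3DSD polypeptide of Podospora pauciseta, Ustilago maydis, Rhodococcus jostii, Acinetobacter sp., Aspergillus niger, or Neurospora crassa, or a heterologous nucleic acid encoding a functional homolog of any of the aforementioned polypeptides sharing at least 80%, such as at least 85%, at least 90%, at least 95%, or at least 98% sequence identity therewith.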
正如本文中讨论的,适合的野生型OMT多肽是已知的。例如,适合的野生型OMT多肽包括由智人、拟南芥或草莓制造的OMT(参见GENBANK登记号NM_000754、AY062837和AF220491),以及从各种其他哺乳动物、植物或微生物分离的OMT多肽。
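Likewise, suitable wild-type AROM polypeptides are known. For example, suitable wild-type AROM polypeptides include the AROM made by Saccharomyces cerevisiae, Schizosaccharomyces pombe, Schizosaccharomyces japonicus (S. japonicus), Neurospora crassa, and Yarrowia lipolytica. See GENBANK Accession Nos. X06077, NP_594681.1, XP_002171624, and XP_956000.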
适合的ACAR多肽是已知的。本发明的ACAR多肽可以是具有芳香族羧酸还原酶活性的任何酶。优选地,ACAR多肽是能够催化原儿茶酸向原儿茶醛的转化和/或香草酸向香草醛的转化的酶。本发明的ACAR多肽优选地是分类在EC 1.2.1.30下的酶。例如,适合的ACAR多肽由诺卡氏菌物种(Nocardia sp)制造。参见例如GENBANK登记号AY495697。因此,重组宿主可以包括编码诺卡氏菌物种的ACAR多肽或与其具有至少80%、例如至少85%、例如至少90%、例如至少95%、例如至少98%的序列同一性的功能性同源物的异源核酸。
适合的PPTase多肽是已知的。本发明的PPTase多肽可以是能够催化磷酸泛酰巯基乙胺化的任何酶。优选地,PPTase多肽是能够催化ACAR的磷酸泛酰巯基乙胺化的酶。例如,适合的PPTase多肽由大肠杆菌、谷氨酸棒杆菌(Corynebacterium glutamicum)或皮疽诺卡氏菌(Nocardia farcinica)制造。参见GENBANK登记号NP_601186、BAA35224和YP_120266。因此,重组宿主可以包括编码大肠杆菌、谷氨酸棒杆菌或皮疽诺卡氏菌的PPTase多肽或任何上述多肽的与其具有至少80%、例如至少85%、例如至少90%、例如至少95%、例如至少98%的序列同一性的功能性同源物的异源核酸。
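Glucosylation of vanillin is particularly useful. Vanillin-β-D-glucoside is the storage form of vanillin found in the vanilla pod. It is non-toxic to most organisms, including yeast, and has a higher solubility in water than vanillin. Moreover, formation of vanillin-β-D-glucoside most likely pulls the biosynthesis toward vanillin production. UGT72E2 (Hansen et al. (2009) Appl. Environ. Microbiol. 75:2765-2774) exhibits high substrate specificity toward vanillin. In line with this finding, its expression in a vanillin-producing S. cerevisiae strain results in almost all of the vanillin being converted to vanillin-β-D-glucoside. The ability to convert vanillin to vanillin-β-D-glucoside in vivo is important, because microbial production of non-glucosylated vanillin beyond the 0.5-1 g/liter scale would be hampered by the toxicity of free vanillin. Glucosylation serves to overcome this inhibitory effect.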
因此,本发明的重组宿主还表达UGT多肽。UGT多肽可以是任何UDP-葡萄糖:糖苷配基-葡萄糖基转移酶。优选地,UGT多肽可以催化香草醛的葡萄糖基化(即生产香草醛β-D-葡萄糖苷)。因此,UGT多肽可以是家族1的葡萄糖基转移酶。本发明的优选UGT多肽被分类在EC2.4.1下。适合的UGT多肽包括UGT71C2、UGT72B1、UGT72E2、UGT84A2、UGT89B1、UGT85B1和熊果苷合成酶多肽。参见例如GENBANK登记号AC0005496、NM_116337和NM_126067。拟南芥UGT72E2是特别有用的(参见例如Hansen等,(2009)同上)。因此,重组宿主可以包括编码UGT71C2、UGT72B1、UGT72E2、UGT84A2、UGT89B1、UGT85B1或熊果苷合成酶或任何上述多肽的与其具有至少80%、例如至少85%、例如至少90%、例如至少95%、例如至少98%的序列同一性的功能同源物的异源核酸。其他有用的UGT描述在WO 01/40491中。
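As a further embodiment of the invention, the host cell can also express a VAO enzyme (EC 1.1.3.38) to oxidize any vanillyl alcohol formed to vanillin. VAO enzymes are known in the art and include, but are not limited to, enzymes from filamentous fungi such as Fusarium moniliformis (GENBANK Accession No. AFJ11909) and Penicillium simplicissimum (GENBANK Accession No. P56216; Benen et al. (1998) J. Biol. Chem. 273:7865-72), and from bacteria such as Modestobacter marinus (GENBANK Accession No. YP_006366868), Rhodococcus jostii (GENBANK Accession No. YP_703243.1), and R. opacus (GENBANK Accession No. EHI39392).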
在某些情况下,为了使代谢中间体转向香草醛或香草醛葡萄糖苷生物合成,需要抑制一种或多种内源多肽的功能。例如,可以降低丙酮酸脱羧酶(PDC1)和/或谷氨酸脱氢酶活性。在这样的情况下,可以将抑制所述多肽或基因产物的表达的核酸包含在转化到菌株中的重组构建物中。或者,可以使用诱变在需要抑制功能的基因中产生突变。
许多原核和真核生物适用于构建本文描述的重组微生物,例如革兰氏阴性细菌、革兰氏阳性细菌、酵母或其他真菌。首先对被选择用作香草醛或香草醛葡萄糖苷生产菌株的物种和菌株进行分析,以确定哪些生产基因对于菌株来说是内源的,哪些基因不存在。将在所述菌株中不存在内源对应物的基因组装在一个或多个重组构建物中,然后将所述构建物转化到菌株中以提供缺失的功能。
Exemplary prokaryotic and eukaryotic species are described in more detail below. It will be appreciated, however, that other species may be suitable. For example, suitable species can be from the genera Agaricus, Aspergillus, Bacillus, Candida, Corynebacterium, Escherichia, Fusarium/Gibberella, Kluyveromyces, Laetiporus, Lentinus, Phaffia, Phanerochaete, Pichia, Physcomitrella, Rhodotorula, Saccharomyces, Schizosaccharomyces, Sphaceloma, Xanthophyllomyces, Yarrowia, and Lactobacillus. Exemplary species from these genera include Lentinus tigrinus, Laetiporus sulphureus, Phanerochaete chrysosporium, Pichia pastoris, Physcomitrella patens, Rhodotorula glutinis 32, Rhodotorula mucilaginosa, Phaffia rhodozyma UBV-AX, Xanthophyllomyces dendrorhous, Fusarium fujikuroi/Gibberella fujikuroi, Candida utilis, and Yarrowia lipolytica. In certain embodiments, the microorganism can be an Ascomycete such as Gibberella fujikuroi, Kluyveromyces lactis, Schizosaccharomyces pombe, Aspergillus niger, or Saccharomyces cerevisiae. In certain embodiments, the microorganism can be a prokaryote such as Escherichia coli, Rhodobacter sphaeroides, or Rhodobacter capsulatus. It will be appreciated that certain microorganisms can be used to screen and test genes of interest in a high-throughput manner, while other microorganisms with desired productivity or growth characteristics can be used for large-scale production of vanillin β-D-glucoside.
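Specific, non-limiting examples of useful recombinant hosts are described in WO 01/40491, as well as in Hansen et al. (2009) Appl. Environ. Microbiol. 75:2765-2774 and Brochado et al. (2010) Microbial Cell Factories 9:84, wherein the recombinant host of the invention contains a heterologous nucleic acid encoding a mutant COMT polypeptide and/or a mutant AROM polypeptide in place of the OMT gene described in WO 01/40491.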
与本发明一起使用的一种优选的重组宿主是酿酒酵母,其可能如本文中所述被重组工程化改造。酿酒酵母是合成生物学中广泛使用的底盘生物体,并可以用作重组微生物平台。对于酿酒酵母有突变体文库、质粒、详细的代谢计算机模型和其他信息可用,允许合理地设计各种模块以提高产物得率。制造重组微生物的方法是已知的。酿酒酵母VG4株(Brochado等,(2010)Microb.Cell Fact.9:84)是特别有用的。VG4具有基因型pdc1Δgdh1Δ↑GDH2。
曲霉属(Aspergillus)物种例如米曲霉(A.oryzae)、黑曲霉(A.niger)和酱油曲霉(A.sojae)是在食品生产中广泛使用的微生物,并且也可用作重组微生物平台。因此,重组宿主可以是曲霉属物种(Aspergillus spp)。对于构巢曲霉(A.nidulans)、烟曲霉(A.fumigatus)、米曲霉(A.oryzae)、棒曲霉(A.clavatus)、黄曲霉(A.flavus)、黑曲霉(A.niger)和土曲霉(A.terreus)的基因组来说,核苷酸序列是可以获得的,允许对内源途径进行合理设计和修改以增加流量并提高产物得率。对曲霉来说已开发了代谢模型并进行了转录组学研究和蛋白质组学研究。黑曲霉培养被用于工业化生产大量食品成分例如柠檬酸和葡萄糖酸,因此诸如黑曲霉的物种总的来说适用于生产食品成分例如香草醛和香草醛葡萄糖苷。
合成生物学中的另一种广泛使用的平台生物大肠杆菌,也可用作重组微生物平台。因此,重组宿主可以是大肠杆菌。与酵母菌类似,对于大肠杆菌也存在突变体文库、质粒、详细的代谢计算机模型和其他信息,允许合理地设计各种模块以提高产物得率。与上面对酵母菌属所描述的方法类似的方法,可用于制造重组大肠杆菌微生物。
红细菌属(Rhodobacter)物种可以用作重组微生物平台。因此,宿主可以是红细菌属物种(Rhodobacter spp)。与大肠杆菌类似,存在着可用的突变体文库以及适合的质粒载体,允许合理地设计各种模块以提高产物得率。与上面对大肠杆菌所描述的方法类似的方法,可用于制造重组红细菌属微生物。
小立碗藓属(Physcomitrella)苔藓,当在悬浮培养中生长时,具有与酵母或其他真菌培养物类似的特性。这个属正变成在其他类型的细胞中难以生产的植物次生代谢物生产的重要的细胞类型。因此,重组宿主可以是小立碗藓属物种(Physcomitrella spp)。
在某些实施方式中,本文描述的核酸和多肽被导入到植物或植物细胞中以提高总体香草醛或香草醛葡萄糖苷生产。因此,重组宿主可以是包括本文描述的至少一种异源核酸的植物或植物细胞。植物或植物细胞可以通过使异源核酸整合在其基因组中来转化,即可以稳定地转化。稳定转化的细胞通常随着每次细胞分裂保留导入的核酸。植物或植物细胞也可以被瞬时转化,使得异源核酸未整合在其基因组中。瞬时转化的细胞通常随着每次细胞分裂失去所有或一部分导入的核酸,使得在足够次数的细胞分裂后,在子代细胞中不能检测到导入的核酸。瞬时转化和稳定转化的转基因植物和植物细胞两者在本文描述的方法中都可能是有用的。
在本文描述的方法中使用的转基因植物细胞可以构成完整植物的一部分或全部。这样的植物可以以适合于所考虑的物种的方式,在生长箱、温室中或田间生长。可以根据特定目的的需要,例如为了将异源核酸例如重组核酸构建物导入到其他株系中、将异源核酸转移到其他物种、或进一步选择其他所需性状,对转基因植物进行繁育。或者,对于适合于营养繁殖技术的物种来说,可以将转基因植物营养繁殖。当在本文中使用时,转基因植物也指初始转基因植物的后代,只要所述后代继承所述转入基因即可。由转基因植物产生的种子可以被生长,然后进行自交(或远交和自交),以获得对核酸构建物纯合的种子。
转基因植物可以生长在悬浮培养或组织或器官培养中。出于本发明的目的,可以使用固体和/或液体组织培养技术。当使用固体培养基时,转基因植物细胞可以被直接放置在培养基上,或者可以放置在滤纸上,然后将滤纸放置成与培养基相接触。当使用液体培养基时,可以将转基因植物细胞放置在与液体培养基相接触的漂浮装置例如多孔膜上。
当使用瞬时转染的植物细胞时,可以将编码具有报告物活性的报告多肽的报告序列包含在转化程序中,并且可以在转化后的适合时间进行报告物活性或表达的测定。用于进行测定的适合时间通常为转化后约1-21天,例如约1-14天、约1-7天或约1-3天。对于不同物种中的快速分析来说,或者为了证实在特定受体细胞中表达尚未得到证实的异源多肽的表达,使用瞬时测定法是特别方便的。
用于将核酸导入单子叶或双子叶植物中的技术,在本领域中是已知的,并且包括但不限于土壤杆菌(Agrobacterium)介导的转化、病毒载体介导的转化、电穿孔和粒子枪转化,参见美国专利号5,538,880、5,204,253、6,329,571和6,013,863。如果细胞或培养的组织被用作转化的受体组织,如果需要,植物可以从转化的培养物,通过本领域技术人员已知的技术来再生。
可以从转基因植物的群体筛选和/或选择具有由转入基因的表达赋予的性状或表型的群体成员。例如,可以从单一转化事件的后代群体筛选具有本文描述的多肽或核酸的所需表达水平的植物。可以使用物理和生物化学方法来鉴定表达水平。这些方法包括用于多核苷酸检测的Southern分析或PCR扩增,用于检测RNA转录本的northern印迹、S1RNase保护、引物延伸或RT-PCR扩增,用于检测多肽和多核苷酸的酶或核酶活性的酶测定法,以及用于检测多肽的蛋白质凝胶电泳、western印迹、免疫沉淀和酶联免疫测定法。其他技术例如原位杂交、酶染色和免疫染色,也可用于检测多肽和/或核酸的存在或表达。用于执行所有指称技术的方法是已知的。
作为可选方案,可以从具有独立转化事件的植物群体筛选具有所需性状例如香草醛葡萄糖苷生产的植物。选择和/或筛选可以在一代或更多代中和/或在超过一个地理位置中进行。在某些情况下,转基因植物可以在诱导所需表型或者不然就是对转基因植物中产生所需表型是必需的条件下进行生长和选择。此外,可以在预期植物将表现出表型的特定发育阶段期间进行选择和/或筛选。可以进行选择和/或筛选以选择与缺少转入基因的对照植物相比具有香草醛或香草醛β-D-葡萄糖苷水平的统计学显著差异的转基因植物。
功能性同源物
本文描述的多肽的功能性同源物也适合用于在重组宿主中生产香草醛或香草醛葡萄糖苷。因此,重组宿主可以包括编码上述多肽的功能性同源物的一种或多种异源核酸和/或编码本文中描述的突变的COMT或AROM多肽的异源核酸。功能性同源物是与参比多肽具有序列相似性,并执行参比多肽的生物化学或生理功能中的一种或多种的多肽。功能性同源物和参比多肽可以是天然存在的多肽,并且序列相似性可能是由趋同或趋异进化事件造成的。就此而言,功能性同源物有时在文献中被称为同源物或直向同源物或横向同源物。天然存在的功能性同源物的变体例如由野生型编码序列的突变体编码的多肽,自身可能是功能性同源物。功能性同源物还可以通过多肽的编码序列的定点突变产生,或通过将来自于不同的天然存在的多肽的编码序列的结构域进行组合(“结构域交换”)来产生。用于改变编码本文描述的功能性AROM和/或COMT多肽的基因的技术是已知的,尤其是包括定向进化技术、定点突变技术和随机突变技术,并且可用于提高多肽的比活性、改变底物特异性、改变表达水平、改变亚细胞定位或以所需方式改变多肽:多肽相互作用。这样的修改的多肽被认为是功能性同源物。术语“功能性同源物”有时应用于编码功能同源的多肽的核酸。
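Functional homologs can be identified by analysis of nucleotide and polypeptide sequence alignments. For example, performing a query on a database of nucleotide or polypeptide sequences can identify homologs of AROM or COMT polypeptides, or of 3DSD, ACAR, PPTase, or UGT polypeptides. Sequence analysis can involve BLAST, reciprocal BLAST, or PSI-BLAST analysis of non-redundant databases using an AROM or COMT, 3DSD, ACAR, PPTase, or UGT amino acid sequence as the reference sequence. In some cases, the amino acid sequence is deduced from the nucleotide sequence. Polypeptides in the database that have greater than 40% sequence identity are candidates for further evaluation of their suitability as vanillin or vanillin glucoside biosynthesis polypeptides. Amino acid sequence similarity allows for conservative amino acid substitutions, such as substitution of one hydrophobic residue for another or of one polar residue for another. If desired, such candidates can be inspected manually to narrow down the number of candidates to be evaluated further. Manual inspection can be performed by selecting those candidates that appear to have domains present in AROM or COMT polypeptides or in vanillin biosynthesis polypeptides, for example conserved functional domains.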
可以通过在多肽的一级氨基酸序列内定位是重复序列、形成一些二级结构(例如螺旋和β片层)、建立起带正电荷或负电荷的结构域或代表蛋白质基序或结构域的区域,来鉴定保守区。参见例如描述了各种蛋白质基序和结构域的共有序列的Pfam数据库。Pfam数据库所包括的信息描述在Sonnhammer等,(1998)Nucl.Acids Res.26:320-322;Sonnhammer等,(1997)Proteins 28:405-420和Bateman等,(1999)Nucl.Acids Res.27:260-262中。保守区还可以通过将来自于密切相关物种的相同或相关多肽的序列进行比对来确定。密切相关的物种优选地来自于相同家族。在某些实施方式中,来自于两种不同物种的序列的比对是足够的。
通常,表现出至少约40%氨基酸序列同一性的多肽可用于鉴定保守区。相关多肽的保守区表现出至少45%的氨基酸序列同一性(例如至少50%、至少60%、至少70%、至少80%或至少90%的氨基酸序列同一性)。在某些实施方式中,保守区表现出至少92%、94%、96%、98%或99%的氨基酸序列同一性。序列同一性可以如上所提出地进行确定。
在下面的实施例中对本发明进行进一步描述,所述实施例不限制权利要求书中描述的本发明的范围。
实施例1:用于从葡萄糖生产香草醛葡萄糖苷的酵母报告菌株
按照Brochado等,(2010)Microb.Cell Fact.9:84中所述,产生从葡萄糖生产香草醛葡萄糖苷的遗传稳定的酵母菌株,即具有PDC1(丙酮酸脱羧酶)和GDH1(谷氨酸脱氢酶)基因缺失并过表达GDH2的菌株VG4。此外,所述菌株带有表达构建物,其含有整合到酵母基因组的ECM3基因座间区域中的PPTase。谷氨酸棒杆菌(Corynebacterium glutamicum)PPTase编码序列的表达受到酵母TPI1启动子的控制(Hansen等,(2009)Appl.Environ.Microbiol.75(9):2765-74.Epub 2009,3月13日)。得到的菌株被命名为V12。
实施例2:缺少结构域5的AROM的构建
使用校对性PCR聚合酶,通过PCR扩增,从制备自酿酒酵母菌株S288C的基因组DNA分离酵母ARO1基因的最靠近5’的3912bp,其包括除了结构域5(具有莽草酸脱氢酶活性)之外的所有功能结构域。将得到的DNA片段亚克隆到pTOPO载体中,测序以证实DNA序列。核酸序列和相应的氨基酸序列分别显示在SEQ ID NO:1和SEQ ID NO:2中。将该片段用SpeI和SalI进行限制性消化,并克隆在高拷贝数酵母表达载体p426-GPD(基于2μ的载体)中的相应限制性位点内,从所述载体可以通过强的组成型酵母GPD1启动子表达插入的基因。得到的质粒被命名为pVAN133。
实施例3:在结构域5中具有单个氨基酸置换的酵母AROM
在本实施例中描述的所有突变体AROM多肽是其中一个氨基酸已被另一个氨基酸置换的SEQ ID NO:4的多肽。突变体AROM多肽如下命名:XnnnY,其中nnn指示被置换的氨基酸在SEQ ID NO:4中的位置,X是在SEQ ID NO:4中的nnn位置中的氨基酸的单字母代码,Y是置换X的氨基酸的单字母代码。例如,A1533P是指其中1533位置处的丙氨酸被脯氨酸置换的SEQ ID NO:4的突变体AROM多肽。
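The XnnnY naming convention described above lends itself to a small parser. The sketch below splits a name such as A1533P into its wild-type residue, 1-based position, and replacement residue, and applies it to a sequence after checking that the wild-type residue is actually present at that position. The function names are hypothetical, and the short test peptide is invented for illustration only.

```python
import re

# One-letter codes of the 20 standard amino acids.
_AA = "ACDEFGHIKLMNPQRSTVWY"
_MUTATION = re.compile(rf"^([{_AA}])(\d+)([{_AA}])$")


def parse_mutation(name):
    """Split 'XnnnY' into (wild_type, position, replacement).

    The position is 1-based, as in the sequence listing.
    """
    match = _MUTATION.match(name)
    if match is None:
        raise ValueError(f"not a valid XnnnY mutation name: {name!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutation(sequence, name):
    """Return `sequence` with the named substitution applied.

    Raises ValueError if the residue at the position is not the
    wild-type residue stated in the name.
    """
    wild_type, position, replacement = parse_mutation(name)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    found = sequence[position - 1]
    if found != wild_type:
        raise ValueError(
            f"{name}: expected {wild_type} at position {position}, found {found}"
        )
    return sequence[: position - 1] + replacement + sequence[position:]


if __name__ == "__main__":
    print(parse_mutation("A1533P"))
    print(apply_mutation("MVQLAK", "Q3K"))  # invented toy peptide
```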
使用校对性PCR聚合酶,通过PCR扩增,从制备自酿酒酵母菌株S288C的基因组DNA分离全部4764bp的酵母ARO1基因。将得到的DNA片段亚克隆在pTOPO载体中,测序以证实DNA序列。核酸序列和相应的氨基酸序列分别显示在SEQ ID NO:3和SEQ ID NO:4中。将该片段用SpeI和SalI进行限制性消化,并克隆在低拷贝数酵母表达载体p416-TEF(基于CEN-ARS的载体)中的相应限制性位点内,从所述载体可以从强的TEF启动子表达基因。得到的质粒被命名为pVAN183。
使用QUICKCHANGE II定点突变试剂盒(Agilent Technologies),使用质粒pVAN183来制造ARO1的10个不同的结构域5突变体。参考SEQ ID NO:4,突变体含有下列氨基酸置换:A1533P,P1500K,R1458W,V1349G,T1366G,I1387H,W1571V,T1392K,K1370L和A1441P。
在证实这些突变体AROM基因的序列后,将含有A1533P、P1500K、R1458W、V1349G、T1366G、I1387H、W1571V、T1392K、K1370L和A1441P置换的表达质粒分别命名为pVAN368-pVAN377。AROM突变体的核酸序列和相应的氨基酸序列列于表1中。
表1
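Example 4: Yeast AROM and 3DHS Dehydratase Fusion Protein
Using a proofreading PCR polymerase, the 5'-most 3951 bp of the yeast ARO1 gene was isolated by PCR amplification from genomic DNA prepared from S. cerevisiae strain S288C. The resulting DNA fragment was subcloned into the pTOPO vector and sequenced to confirm the DNA sequence. To fuse this fragment to the 3-dehydroshikimate dehydratase (3DSD) gene from the vanillin pathway, the 3DSD gene from Podospora pauciseta (Hansen et al. (2009) supra) was inserted into the XmaI-EcoRI sites of the yeast expression vector p426-GPD, and the cloned ARO1 fragment was then excised and inserted into the SpeI-XmaI sites of the resulting construct. The final fusion gene is expressed from the strong constitutive GPD1 promoter. The resulting plasmid was designated pVAN132. The nucleic acid sequence and corresponding amino acid sequence of this fusion protein are shown in SEQ ID NO:25 and SEQ ID NO:26, respectively.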
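Example 5: Expression of Mutant or Fusion AROM Enzymes in Yeast Already Biosynthesizing Vanillin Glucoside
Each of the plasmids described in Examples 2, 3, and 4 was introduced into yeast strain V12 by transformation, using the lithium acetate transformation protocol, yielding the following yeast strains: V12-Aro1-1 (containing plasmid pVAN133), V12-Aro1-2 (containing plasmid pVAN132), V12-Aro1-3 (containing plasmid pVAN183), V12-Aro1-4 (containing plasmid pVAN368), V12-Aro1-5 (containing plasmid pVAN369), V12-Aro1-6 (containing plasmid pVAN370), V12-Aro1-7 (containing plasmid pVAN371), V12-Aro1-8 (containing plasmid pVAN372), V12-Aro1-9 (containing plasmid pVAN373), V12-Aro1-10 (containing plasmid pVAN374), V12-Aro1-11 (containing plasmid pVAN375), V12-Aro1-12 (containing plasmid pVAN376), and V12-Aro1-13 (containing plasmid pVAN377).
Yeast strains V12-Aro1-1, V12-Aro1-2, V12-Aro1-3, V12-Aro1-4, V12-Aro1-5, V12-Aro1-6, V12-Aro1-7, V12-Aro1-8, V12-Aro1-9, V12-Aro1-10, V12-Aro1-11, V12-Aro1-12, and V12-Aro1-13 were grown as 200 ml cultures in 500 ml Erlenmeyer shake flasks, using SC (synthetic complete) growth medium without aromatic amino acids, at 30°C with moderate agitation (150 rpm) for 72 hours. Samples were taken at 48 hours and the vanillin glucoside content was determined. Yeast strain V12 containing the empty vector p416-TEF or p426-GPD was included as a control. Vanillin glucoside (VG) production in the control strains (containing empty plasmid p416-TEF or p426-GPD) is typically around 250 mg/L. Expression of the AROM truncated of domain 5 is expected to increase VG production (strain V12-Aro1-1), and further physical fusion of this truncated AROM to the first committed enzyme of the heterologous vanillin pathway (Podospora pauciseta 3DSD) is expected to result in a further increase in vanillin glucoside production (strain V12-Aro1-2).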
在结构域5的单个氨基酸被改变的AROM的突变体中,T1392K(菌株V12-Aro1-9)和K1370L(菌株V12-Aro1-10)可用于提高VG生产。例如,对于T1392K(菌株V12-Aro1-9)和K1370L(菌株V12-Aro1-10),可以观察到VG生产的约30-35%的提高。
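This example demonstrates that, by overexpressing a mutant AROM polypeptide with reduced shikimate dehydrogenase activity, the cellular concentration of 3-DHS can be raised enough to make a difference in the final titer obtained from the heterologous pathway. This example also demonstrates that fusing the first enzyme of the heterologous pathway, i.e., 3DSD, to a truncated AROM enzyme results in increased flux into the heterologous pathway, achieving a substrate channeling effect. Finally, the experiments described here show that the amount of 3-DHS available for vanillin biosynthesis can be increased by changing individual amino acids in the AROM domain that naturally metabolizes 3-DHS, a compound required for vanillin production.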
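Example 6: COMT Mutants
All mutant COMT polypeptides described in this example are polypeptides of SEQ ID NO:27 in which one amino acid has been substituted with another amino acid. The mutant COMT polypeptides are named as follows: XnnnY, where nnn indicates the position in SEQ ID NO:27 of the amino acid that is substituted, X is the one-letter code for the amino acid at position nnn in SEQ ID NO:27, and Y is the one-letter code for the amino acid substituting X. For example, L198Y refers to a mutant COMT polypeptide of SEQ ID NO:27 in which the leucine at position 198 is substituted with tyrosine.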
使用含有所需密码子的引物,通过PCR来构建编码突变体COMT多肽的核酸。PCR作为单一PCR或使用序列重叠延伸PCR(SOE),通过标准程序来进行。该步骤可以例如由商业化提供者例如Life Technologies来进行。使用的引物列于表2中。
表2
在上面提供的引物序列中:M可能是A或C;R可能是A或G;W可能是A或T;S可能是G或C;Y可能是C或T;K可能是G或T;V可能是A、G或C;H可能是A、C或T;D可能是A、G或T;B可能是G、C或T;并且N可能是A、G、C或T。
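The degenerate-base codes listed above are the standard IUPAC nucleotide codes, so a degenerate primer stands for a pool of concrete sequences. The sketch below expands such a primer into every sequence it represents. The function name and the example primers are illustrative and are not taken from Table 2.

```python
from itertools import product

# Degenerate-base codes exactly as listed in the text (IUPAC nucleotide codes).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "GC", "Y": "CT", "K": "GT",
    "V": "AGC", "H": "ACT", "D": "AGT", "B": "GCT", "N": "AGCT",
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    try:
        choices = [IUPAC[base] for base in primer.upper()]
    except KeyError as err:
        raise ValueError(f"unknown nucleotide code: {err.args[0]}") from None
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(expand_primer("AY"))        # two sequences
    print(len(expand_primer("NNK")))  # number of codons covered by NNK
```

The size of the pool grows multiplicatively with each degenerate position, which is worth checking before ordering heavily degenerate primers.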
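Restriction sites for EcoRI/XbaI were included in the primers to facilitate cloning of the PCR products into the centromeric yeast expression vector p416-TEF (Mumberg et al. (1995) Gene 156(1):119-22). The resulting plasmids were transformed into yeast strain EFSC2055 (genotype: Mata his3D1 leu2D0 met15D0 ura3D0 adh6::LEU2 bgl1::KanMX4 PTPI1::3DSD[AurC]::(HsOMT::MET15[NatMX])::ACAR[HphMX]::UGT72E2[HIS3] ECM3::(CorPPTase-ScHAP4)). This yeast strain is based on the VG4 strain (Brochado et al. (2010) Microbial Cell Factories 9:84); it additionally includes HsOMT disrupted with the MET15 marker and has two further integrated genes, namely the Corynebacterium glutamicum PPTase (NCBI database accession No. NP_601186) and S. cerevisiae HAP4 (NCBI database accession No. Z28109).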
HsOMT如下所述进行破坏:使用分别具有与HsOMT的前部和后部末端同源的70bp尾部的引物,通过PCR扩增酿酒酵母的MET15(甲硫氨酸营养缺陷型选择标志物)。然后用PCR产物转化酵母菌株,产生不具有HsOMT活性并且能够在不添加甲硫氨酸的平板上生长的转化体。得到的菌株被命名为EFSC2055。
使用标准的酵母转化乙酸锂/PEG流程,将表达CorPPTase和ScHAP4的质粒转化到EFSC2055前身菌株中,并在SC-尿嘧啶平板上选择转化体。通过在增补有8%甜菜糖蜜的Delft培养基中在3ml培养物中生长72小时,来测试转化体。
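The cultures were analyzed by HPLC-UV to quantify vanillin glucoside/isovanillin glucoside and related products. HPLC analysis was carried out on an AGILENT 1100 series system with a binary pump and a Phenomenex Synergi Polar-RP 2.5 µm, 100 × 2.00 mm column, which separates the precursors from isovanillin and vanillin. A flat gradient was run with water/acetonitrile + 0.1% trifluoroacetic acid. An 8.9 minute program followed by a 1.1 minute post-run was performed, as shown in Table 3.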
Table 3
Time (min) | Acetonitrile (%) | Flow rate (ml/min) |
0 | 5 | 0.5 |
0.7 | 5 | 0.5 |
5.7 | 27 | 0.5 |
6.2 | 100 | 0.5 |
6.6 | 100 | 0.7 |
7.8 | 100 | 1.0 |
8.1 | 100 | 1.0 |
8.6 | 5 | 0.8 |
8.9 | 5 | 0.6 |
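The gradient program in Table 3 can be held as a list of time points, which makes it easy to look up the solvent composition at any moment of the run. The sketch below assumes that the pump changes the acetonitrile percentage linearly between consecutive table rows; the table itself does not state the interpolation mode, and the names used are illustrative.

```python
# Rows of Table 3: (time in min, % acetonitrile, flow rate in ml/min).
GRADIENT = [
    (0.0, 5, 0.5),
    (0.7, 5, 0.5),
    (5.7, 27, 0.5),
    (6.2, 100, 0.5),
    (6.6, 100, 0.7),
    (7.8, 100, 1.0),
    (8.1, 100, 1.0),
    (8.6, 5, 0.8),
    (8.9, 5, 0.6),
]


def acetonitrile_percent(time_min):
    """Percent acetonitrile at `time_min`, ASSUMING linear ramps
    between consecutive rows of the table."""
    if not GRADIENT[0][0] <= time_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-8.9 min program")
    for (t0, p0, _), (t1, p1, _) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= time_min <= t1:
            fraction = (time_min - t0) / (t1 - t0)
            return p0 + (p1 - p0) * fraction
    raise AssertionError("unreachable: table covers the whole range")


if __name__ == "__main__":
    for t in (0.0, 3.2, 5.7, 6.4, 8.9):
        print(f"{t:4.1f} min: {acetonitrile_percent(t):6.1f} % acetonitrile")
```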
通过对HPLC峰的面积进行积分并将其与标准曲线进行比较,来定量香草醛葡萄糖苷和异香草醛葡萄糖苷。该分析的结果示出在图3A中。表达SEQ ID NO:27的野生型Hs-OMT(被称为Hs-COMT wt)的酵母细胞以约1:3的比率生产异香草醛和香草醛,而突变体L198Y以约1:125的比率生产异香草醛和香草醛。由于异香草醛的生产处于检测极限处,准确的比率难以确定。
除了图3A中的突变分析之外,在突变体L198C、L198N、L198D、L198F和L198E中还具有良好的特异性和低的异香草醛生产。
实施例7:香草醇的减少
作为说明,将简青霉(GENBANK登记号P56216)和R.jostii(GENBANK登记号YP_703243.1)的VAO基因分离并克隆到酵母表达载体中。随后将表达载体转化到表达葡萄糖基转移酶的酵母菌株中。通过将酵母在增补有3mM香草醇的培养基中生长48小时,测试转化的菌株的VAO活性。该分析的结果显示在图4中。来自于简青霉和R.jostii两者的VAO酶在酵母中表现出活性。当在能够生产香草醛葡萄糖苷的菌株中分析VAO酶时,在香草醛葡萄糖苷发酵期间香草醇的积累减少。
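Example 8: ACAR Gene from Neurospora crassa
As an alternative to the ACAR protein (EC 1.2.1.30) from Nocardia iowensis (Hansen et al. (2009) Appl. Environ. Microbiol. 75:2765-74), use of the Neurospora crassa ACAR enzyme in yeast was investigated (Gross & Zenk (1969) Eur. J. Biochem. 8:413-9; US 6,372,461), because Neurospora (bread mold) is a GRAS organism. A Neurospora crassa gene (GENBANK XP_955820) with homology to the Nocardia iowensis ACAR was isolated and cloned into a yeast expression vector. The expression vector was transformed into a yeast strain expressing a PPTase, strains were selected for the presence of the ACAR gene, and the selected yeast was cultured for 72 hours in medium supplemented with 3 mM vanillic acid to demonstrate ACAR activity. The results of this analysis are shown in Figure 5. The Neurospora crassa ACAR enzyme was found to exhibit higher activity in yeast than the N. iowensis ACAR. Thus, in certain embodiments of the methods disclosed herein, the Neurospora crassa ACAR enzyme is used in the production of vanillin or vanillin glucoside.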
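In addition to the ACAR proteins of N. iowensis or Neurospora crassa, it is contemplated that other ACAR proteins can be used, including but not limited to ACAR proteins isolated from Nocardia brasiliensis (GENBANK Accession No. EHY26728), Nocardia farcinica (GENBANK Accession No. BAD56861), Podospora anserina (GENBANK Accession No. CAP62295), or Sordaria macrospora (GENBANK Accession No. CCC14931) that share significant sequence identity with the ACAR proteins of N. iowensis or Neurospora crassa.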
序列表
<110> 国际香料香精公司(International Flavors & Fragrances Inc.)
伊沃华公司(Evolva SA)
<120> COMPOSITIONS AND METHODS FOR THE BIOSYNTHESIS OF VANILLIN OR VANILLIN BETA-D-GLUCOSIDE
<130> SPI182113-91
<150> US 61/521,090
<151> 2011-08-08
<150> US 61/522,096
<151> 2011-08-10
<160> 56
<170> PatentIn version 3.5
<210> 1
<211> 3912
<212> DNA
<213> Saccharomyces cerevisiae
<400> 1
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggtt ga 3912
<210> 2
<211> 1303
<212> PRT
<213> Saccharomyces cerevisiae
<400> 2
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly
1295 1300
<210> 3
<211> 4767
<212> DNA
<213> Saccharomyces cerevisiae
<400> 3
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attggtgaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcacaat tcctctgaaa ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttat tggtgctgta aacacagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
gccctttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa caggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtacca 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccgcat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag tggacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 4
<211> 1588
<212> PRT
<213> Saccharomyces cerevisiae
<400> 4
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Val Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Thr Ile Pro
1355 1360 1365
Leu Lys Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val Ile Gly Ala Val Asn Thr Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Ala Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Arg
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Pro Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Ala
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Trp Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 5
<211> 4767
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 5
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attggtgaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcacaat tcctctgaaa ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttat tggtgctgta aacacagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
gccctttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa caggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtacca 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccccat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag tggacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 6
<211> 1588
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 6
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Val Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Thr Ile Pro
1355 1360 1365
Leu Lys Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val Ile Gly Ala Val Asn Thr Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Ala Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Arg
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Pro Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Pro
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Trp Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 7
<211> 4767
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 7
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attggtgaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcacaat tcctctgaaa ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttat tggtgctgta aacacagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
gccctttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa caggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtaaaa 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccgcat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag tggacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 8
<211> 1588
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 8
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Val Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Thr Ile Pro
1355 1360 1365
Leu Lys Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val Ile Gly Ala Val Asn Thr Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Ala Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Arg
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Lys Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Ala
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Trp Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 9
<211> 4767
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 9
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attggtgaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcacaat tcctctgaaa ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttat tggtgctgta aacacagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
gccctttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa ctggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtacca 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccgcat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag tggacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 10
<211> 1588
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 10
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Val Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Thr Ile Pro
1355 1360 1365
Leu Lys Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val Ile Gly Ala Val Asn Thr Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Ala Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Trp
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Pro Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Ala
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Trp Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 11
<211> 4767
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 11
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attgggaaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcacaat tcctctgaaa ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttat tggtgctgta aacacagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
gccctttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa caggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtacca 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccgcat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag tggacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 12
<211> 1588
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 12
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Gly Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Thr Ile Pro
1355 1360 1365
Leu Lys Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val Ile Gly Ala Val Asn Thr Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Ala Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Arg
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Pro Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Ala
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Trp Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 13
<211> 4767
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 13
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attggtgaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcaggat tcctctgaaa ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttat tggtgctgta aacacagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
gccctttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa caggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtacca 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccgcat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag tggacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 14
<211> 1588
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 14
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Val Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Arg Ile Pro
1355 1360 1365
Leu Lys Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val Ile Gly Ala Val Asn Thr Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Ala Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Arg
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Pro Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Ala
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Trp Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 15
<211> 4767
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 15
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attggtgaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcacaat tcctctgaaa ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttca tggtgctgta aacacagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
gccctttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa caggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtacca 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccgcat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag tggacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 16
<211> 1588
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 16
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Val Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Thr Ile Pro
1355 1360 1365
Leu Lys Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val His Gly Ala Val Asn Thr Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Ala Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Arg
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Pro Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Ala
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Trp Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 17
<211> 4767
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 17
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attggtgaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcacaat tcctctgaaa ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttat tggtgctgta aacacagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
gccctttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa caggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtacca 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccgcat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag gttacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 18
<211> 1588
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 18
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Val Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Thr Ile Pro
1355 1360 1365
Leu Lys Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val Ile Gly Ala Val Asn Thr Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Ala Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Arg
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Pro Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Ala
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Val Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 19
<211> 4767
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 19
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attggtgaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcacaat tcctctgaaa ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttat tggtgctgta aacaaagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
gccctttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa caggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtacca 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccgcat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag tggacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 20
<211> 1588
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 20
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Val Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Thr Ile Pro
1355 1360 1365
Leu Lys Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val Ile Gly Ala Val Asn Lys Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Ala Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Arg
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Pro Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Ala
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Trp Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 21
<211> 4767
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 21
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attggtgaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcacaat tcctctgtta ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttat tggtgctgta aacacagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
gccctttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa caggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtacca 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccgcat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag tggacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 22
<211> 1588
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 22
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Val Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Thr Ile Pro
1355 1360 1365
Leu Leu Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val Ile Gly Ala Val Asn Thr Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Ala Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Arg
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Pro Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Ala
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Trp Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 23
<211> 4767
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 23
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccactctaga 3960
tcgccaattt tacataacac tggctatgaa attttaggtt tacctcacaa gttcgataaa 4020
tttgaaactg aatccgcaca attggtgaaa gaaaaacttt tggacggaaa caagaacttt 4080
ggcggtgctg cagtcacaat tcctctgaaa ttagatataa tgcagtacat ggatgaattg 4140
actgatgctg ctaaagttat tggtgctgta aacacagtta taccattggg taacaagaag 4200
tttaagggtg ataataccga ctggttaggt atccgtaatg ccttaattaa caatggcgtt 4260
cccgaatatg ttggtcatac cgctggtttg gttatcggtg caggtggcac ttctagagcc 4320
ccactttacg ccttgcacag tttaggttgc aaaaagatct tcataatcaa caggacaact 4380
tcgaaattga agccattaat agagtcactt ccatctgaat tcaacattat tggaatagag 4440
tccactaaat ctatagaaga gattaaggaa cacgttggcg ttgctgtcag ctgtgtacca 4500
gccgacaaac cattagatga cgaactttta agtaagctgg agagattcct tgtgaaaggt 4560
gcccatgctg cttttgtacc aaccttattg gaagccgcat acaaaccaag cgttactccc 4620
gttatgacaa tttcacaaga caaatatcaa tggcacgttg tccctggatc acaaatgtta 4680
gtacaccaag gtgtagctca gtttgaaaag tggacaggat tcaagggccc tttcaaggcc 4740
atttttgatg ccgttacgaa agagtag 4767
<210> 24
<211> 1588
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 24
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly His Ser Arg Ser Pro Ile
1310 1315 1320
Leu His Asn Thr Gly Tyr Glu Ile Leu Gly Leu Pro His Lys Phe
1325 1330 1335
Asp Lys Phe Glu Thr Glu Ser Ala Gln Leu Val Lys Glu Lys Leu
1340 1345 1350
Leu Asp Gly Asn Lys Asn Phe Gly Gly Ala Ala Val Thr Ile Pro
1355 1360 1365
Leu Lys Leu Asp Ile Met Gln Tyr Met Asp Glu Leu Thr Asp Ala
1370 1375 1380
Ala Lys Val Ile Gly Ala Val Asn Thr Val Ile Pro Leu Gly Asn
1385 1390 1395
Lys Lys Phe Lys Gly Asp Asn Thr Asp Trp Leu Gly Ile Arg Asn
1400 1405 1410
Ala Leu Ile Asn Asn Gly Val Pro Glu Tyr Val Gly His Thr Ala
1415 1420 1425
Gly Leu Val Ile Gly Ala Gly Gly Thr Ser Arg Ala Pro Leu Tyr
1430 1435 1440
Ala Leu His Ser Leu Gly Cys Lys Lys Ile Phe Ile Ile Asn Arg
1445 1450 1455
Thr Thr Ser Lys Leu Lys Pro Leu Ile Glu Ser Leu Pro Ser Glu
1460 1465 1470
Phe Asn Ile Ile Gly Ile Glu Ser Thr Lys Ser Ile Glu Glu Ile
1475 1480 1485
Lys Glu His Val Gly Val Ala Val Ser Cys Val Pro Ala Asp Lys
1490 1495 1500
Pro Leu Asp Asp Glu Leu Leu Ser Lys Leu Glu Arg Phe Leu Val
1505 1510 1515
Lys Gly Ala His Ala Ala Phe Val Pro Thr Leu Leu Glu Ala Ala
1520 1525 1530
Tyr Lys Pro Ser Val Thr Pro Val Met Thr Ile Ser Gln Asp Lys
1535 1540 1545
Tyr Gln Trp His Val Val Pro Gly Ser Gln Met Leu Val His Gln
1550 1555 1560
Gly Val Ala Gln Phe Glu Lys Trp Thr Gly Phe Lys Gly Pro Phe
1565 1570 1575
Lys Ala Ile Phe Asp Ala Val Thr Lys Glu
1580 1585
<210> 25
<211> 5064
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic nucleic acid
<400> 25
atggtgcagt tagccaaagt cccaattcta ggaaatgata ttatccacgt tgggtataac 60
attcatgacc atttggttga aaccataatt aaacattgtc cttcttcgac atacgttatt 120
tgcaatgata cgaacttgag taaagttcca tactaccagc aattagtcct ggaattcaag 180
gcttctttgc cagaaggctc tcgtttactt acttatgttg ttaaaccagg tgagacaagt 240
aaaagtagag aaaccaaagc gcagctagaa gattatcttt tagtggaagg atgtactcgt 300
gatacggtta tggtagcgat cggtggtggt gttattggtg acatgattgg gttcgttgca 360
tctacattta tgagaggtgt tcgtgttgtc caagtaccaa catccttatt ggcaatggtc 420
gattcctcca ttggtggtaa aactgctatt gacactcctc taggtaaaaa ctttattggt 480
gcattttggc aaccaaaatt tgtccttgta gatattaaat ggctagaaac gttagccaag 540
agagagttta tcaatgggat ggcagaagtt atcaagactg cttgtatttg gaacgctgac 600
gaatttacta gattagaatc aaacgcttcg ttgttcttaa atgttgttaa tggggcaaaa 660
aatgtcaagg ttaccaatca attgacaaac gagattgacg agatatcgaa tacagatatt 720
gaagctatgt tggatcatac atataagtta gttcttgaga gtattaaggt caaagcggaa 780
gttgtctctt cggatgaacg tgaatccagt ctaagaaacc ttttgaactt cggacattct 840
attggtcatg cttatgaagc tatactaacc ccacaagcat tacatggtga atgtgtgtcc 900
attggtatgg ttaaagaggc ggaattatcc cgttatttcg gtattctctc ccctacccaa 960
gttgcacgtc tatccaagat tttggttgcc tacgggttgc ctgtttcgcc tgatgagaaa 1020
tggtttaaag agctaacctt acataagaaa acaccattgg atatcttatt gaagaaaatg 1080
agtattgaca agaaaaacga gggttccaaa aagaaggtgg tcattttaga aagtattggt 1140
aagtgctatg gtgactccgc tcaatttgtt agcgatgaag acctgagatt tattctaaca 1200
gatgaaaccc tcgtttaccc cttcaaggac atccctgctg atcaacagaa agttgttatc 1260
ccccctggtt ctaagtccat ctccaatcgt gctttaattc ttgctgccct cggtgaaggt 1320
caatgtaaaa tcaagaactt attacattct gatgatacta aacatatgtt aaccgctgtt 1380
catgaattga aaggtgctac gatatcatgg gaagataatg gtgagacggt agtggtggaa 1440
ggacatggtg gttccacatt gtcagcttgt gctgacccct tatatctagg taatgcaggt 1500
actgcatcta gatttttgac ttccttggct gccttggtca attctacttc aagccaaaag 1560
tatatcgttt taactggtaa cgcaagaatg caacaaagac caattgctcc tttggtcgat 1620
tctttgcgtg ctaatggtac taaaattgag tacttgaata atgaaggttc cctgccaatc 1680
aaagtttata ctgattcggt attcaaaggt ggtagaattg aattagctgc tacagtttct 1740
tctcagtacg tatcctctat cttgatgtgt gccccatacg ctgaagaacc tgtaactttg 1800
gctcttgttg gtggtaagcc aatctctaaa ttgtacgtcg atatgacaat aaaaatgatg 1860
gaaaaattcg gtatcaatgt tgaaacttct actacagaac cttacactta ttatattcca 1920
aagggacatt atattaaccc atcagaatac gtcattgaaa gtgatgcctc aagtgctaca 1980
tacccattgg ccttcgccgc aatgactggt actaccgtaa cggttccaaa cattggtttt 2040
gagtcgttac aaggtgatgc cagatttgca agagatgtct tgaaacctat gggttgtaaa 2100
ataactcaaa cggcaacttc aactactgtt tcgggtcctc ctgtaggtac tttaaagcca 2160
ttaaaacatg ttgatatgga gccaatgact gatgcgttct taactgcatg tgttgttgcc 2220
gctatttcgc acgacagtga tccaaattct gcaaatacaa ccaccattga aggtattgca 2280
aaccagcgtg tcaaagagtg taacagaatt ttggccatgg ctacagagct cgccaaattt 2340
ggcgtcaaaa ctacagaatt accagatggt attcaagtcc atggtttaaa ctcgataaaa 2400
gatttgaagg ttccttccga ctcttctgga cctgtcggtg tatgcacata tgatgatcat 2460
cgtgtggcca tgagtttctc gcttcttgca ggaatggtaa attctcaaaa tgaacgtgac 2520
gaagttgcta atcctgtaag aatacttgaa agacattgta ctggtaaaac ctggcctggc 2580
tggtgggatg tgttacattc cgaactaggt gccaaattag atggtgcaga acctttagag 2640
tgcacatcca aaaagaactc aaagaaaagc gttgtcatta ttggcatgag agcagctggc 2700
aaaactacta taagtaaatg gtgcgcatcc gctctgggtt acaaattagt tgacctagac 2760
gagctgtttg agcaacagca taacaatcaa agtgttaaac aatttgttgt ggagaacggt 2820
tgggagaagt tccgtgagga agaaacaaga attttcaagg aagttattca aaattacggc 2880
gatgatggat atgttttctc aacaggtggc ggtattgttg aaagcgctga gtctagaaaa 2940
gccttaaaag attttgcctc atcaggtgga tacgttttac acttacatag ggatattgag 3000
gagacaattg tctttttaca aagtgatcct tcaagacctg cctatgtgga agaaattcgt 3060
gaagtttgga acagaaggga ggggtggtat aaagaatgct caaatttctc tttctttgct 3120
cctcattgct ccgcagaagc tgagttccaa gctctaagaa gatcgtttag taagtacatt 3180
gcaaccatta caggtgtcag agaaatagaa attccaagcg gaagatctgc ctttgtgtgt 3240
ttaacctttg atgacttaac tgaacaaact gagaatttga ctccaatctg ttatggttgt 3300
gaggctgtag aggtcagagt agaccatttg gctaattact ctgctgattt cgtgagtaaa 3360
cagttatcta tattgcgtaa agccactgac agtattccta tcatttttac tgtgcgaacc 3420
atgaagcaag gtggcaactt tcctgatgaa gagttcaaaa ccttgagaga gctatacgat 3480
attgccttga agaatggtgt tgaattcctt gacttagaac taactttacc tactgatatc 3540
caatatgagg ttattaacaa aaggggcaac accaagatca ttggttccca tcatgacttc 3600
caaggattat actcctggga cgacgctgaa tgggaaaaca gattcaatca agcgttaact 3660
cttgatgtgg atgttgtaaa atttgtgggt acggctgtta atttcgaaga taatttgaga 3720
ctggaacact ttagggatac acacaagaat aagcctttaa ttgcagttaa tatgacttct 3780
aaaggtagca tttctcgtgt tttgaataat gttttaacac ctgtgacatc agatttattg 3840
cctaactccg ctgcccctgg ccaattgaca gtagcacaaa ttaacaagat gtatacatct 3900
atgggaggta tcgagcctaa ggaactgttt gttgttggaa agccaattgg ccccgggaaa 3960
atgccttcca aactcgccat cacttccatg tcacttggcc ggtgttatgc cggccactcc 4020
ttcaccacta agctcgatat ggcccggaaa tatggctatc aaggcctaga gctcttccac 4080
gaggacttgg ctgatgtagc ctatcgtctc tccggagaga ccccttcccc atgtggcccg 4140
tccccagcag cccagctctc ggctgcccgt caaatcctcc gcatgtgcca agtcagaaac 4200
attgaaatcg tctgcctcca gcccttcagc cagtacgacg gcctactcga ccgcgaggag 4260
cacgagcgcc gtctggagca gctcgagttc tggatcgagc tcgcccacga gcttgacaca 4320
gacattatcc aaatccccgc caactttctc cccgccgagg aagtaactga ggacatttcg 4380
ctcatcgtct cggaccttca agaagtggcc gacatgggcc tgcaggccaa cccacccatc 4440
cgctttgtct acgaggctct gtgctggagc actcgtgtcg acacttggga gcgtagctgg 4500
gaggtggtgc agagggtgaa caggcccaac tttggcgtgt gcctggacac tttcaacatt 4560
gcggggcggg tatatgctga tccgacggtt gcctctggcc gcacccccaa cgcggaggaa 4620
gcgatacgga agtcgattgc gcgtctcgtt gaaagggtcg atgtcagcaa ggtcttttat 4680
gtgcaggttg tggacgctga gaagttgaag aagccgctgg tgccgggtca tcggttttat 4740
gacccggagc agccggcgag gatgagctgg tcaaggaact gcaggttatt ctacggggag 4800
aaggacagag gggcgtattt gcccgtcaag gagattgcct gggccttctt caacgggctc 4860
ggattcgagg gttgggtcag tctggagctc ttcaacagaa gaatgtcgga cacaggcttt 4920
ggggtgcccg aggagctggc caggagaggg gccgtgtcgt gggcaaagct ggtgagggac 4980
atgaagatca ctgttgattc accaacacaa caacaagcca cacagcagcc catcaggatg 5040
ctgtcgctgt cagcggcttt gtaa 5064
<210> 26
<211> 1687
<212> PRT
<213> Artificial sequence
<220>
<223> Synthetic polypeptide
<400> 26
Met Val Gln Leu Ala Lys Val Pro Ile Leu Gly Asn Asp Ile Ile His
1 5 10 15
Val Gly Tyr Asn Ile His Asp His Leu Val Glu Thr Ile Ile Lys His
20 25 30
Cys Pro Ser Ser Thr Tyr Val Ile Cys Asn Asp Thr Asn Leu Ser Lys
35 40 45
Val Pro Tyr Tyr Gln Gln Leu Val Leu Glu Phe Lys Ala Ser Leu Pro
50 55 60
Glu Gly Ser Arg Leu Leu Thr Tyr Val Val Lys Pro Gly Glu Thr Ser
65 70 75 80
Lys Ser Arg Glu Thr Lys Ala Gln Leu Glu Asp Tyr Leu Leu Val Glu
85 90 95
Gly Cys Thr Arg Asp Thr Val Met Val Ala Ile Gly Gly Gly Val Ile
100 105 110
Gly Asp Met Ile Gly Phe Val Ala Ser Thr Phe Met Arg Gly Val Arg
115 120 125
Val Val Gln Val Pro Thr Ser Leu Leu Ala Met Val Asp Ser Ser Ile
130 135 140
Gly Gly Lys Thr Ala Ile Asp Thr Pro Leu Gly Lys Asn Phe Ile Gly
145 150 155 160
Ala Phe Trp Gln Pro Lys Phe Val Leu Val Asp Ile Lys Trp Leu Glu
165 170 175
Thr Leu Ala Lys Arg Glu Phe Ile Asn Gly Met Ala Glu Val Ile Lys
180 185 190
Thr Ala Cys Ile Trp Asn Ala Asp Glu Phe Thr Arg Leu Glu Ser Asn
195 200 205
Ala Ser Leu Phe Leu Asn Val Val Asn Gly Ala Lys Asn Val Lys Val
210 215 220
Thr Asn Gln Leu Thr Asn Glu Ile Asp Glu Ile Ser Asn Thr Asp Ile
225 230 235 240
Glu Ala Met Leu Asp His Thr Tyr Lys Leu Val Leu Glu Ser Ile Lys
245 250 255
Val Lys Ala Glu Val Val Ser Ser Asp Glu Arg Glu Ser Ser Leu Arg
260 265 270
Asn Leu Leu Asn Phe Gly His Ser Ile Gly His Ala Tyr Glu Ala Ile
275 280 285
Leu Thr Pro Gln Ala Leu His Gly Glu Cys Val Ser Ile Gly Met Val
290 295 300
Lys Glu Ala Glu Leu Ser Arg Tyr Phe Gly Ile Leu Ser Pro Thr Gln
305 310 315 320
Val Ala Arg Leu Ser Lys Ile Leu Val Ala Tyr Gly Leu Pro Val Ser
325 330 335
Pro Asp Glu Lys Trp Phe Lys Glu Leu Thr Leu His Lys Lys Thr Pro
340 345 350
Leu Asp Ile Leu Leu Lys Lys Met Ser Ile Asp Lys Lys Asn Glu Gly
355 360 365
Ser Lys Lys Lys Val Val Ile Leu Glu Ser Ile Gly Lys Cys Tyr Gly
370 375 380
Asp Ser Ala Gln Phe Val Ser Asp Glu Asp Leu Arg Phe Ile Leu Thr
385 390 395 400
Asp Glu Thr Leu Val Tyr Pro Phe Lys Asp Ile Pro Ala Asp Gln Gln
405 410 415
Lys Val Val Ile Pro Pro Gly Ser Lys Ser Ile Ser Asn Arg Ala Leu
420 425 430
Ile Leu Ala Ala Leu Gly Glu Gly Gln Cys Lys Ile Lys Asn Leu Leu
435 440 445
His Ser Asp Asp Thr Lys His Met Leu Thr Ala Val His Glu Leu Lys
450 455 460
Gly Ala Thr Ile Ser Trp Glu Asp Asn Gly Glu Thr Val Val Val Glu
465 470 475 480
Gly His Gly Gly Ser Thr Leu Ser Ala Cys Ala Asp Pro Leu Tyr Leu
485 490 495
Gly Asn Ala Gly Thr Ala Ser Arg Phe Leu Thr Ser Leu Ala Ala Leu
500 505 510
Val Asn Ser Thr Ser Ser Gln Lys Tyr Ile Val Leu Thr Gly Asn Ala
515 520 525
Arg Met Gln Gln Arg Pro Ile Ala Pro Leu Val Asp Ser Leu Arg Ala
530 535 540
Asn Gly Thr Lys Ile Glu Tyr Leu Asn Asn Glu Gly Ser Leu Pro Ile
545 550 555 560
Lys Val Tyr Thr Asp Ser Val Phe Lys Gly Gly Arg Ile Glu Leu Ala
565 570 575
Ala Thr Val Ser Ser Gln Tyr Val Ser Ser Ile Leu Met Cys Ala Pro
580 585 590
Tyr Ala Glu Glu Pro Val Thr Leu Ala Leu Val Gly Gly Lys Pro Ile
595 600 605
Ser Lys Leu Tyr Val Asp Met Thr Ile Lys Met Met Glu Lys Phe Gly
610 615 620
Ile Asn Val Glu Thr Ser Thr Thr Glu Pro Tyr Thr Tyr Tyr Ile Pro
625 630 635 640
Lys Gly His Tyr Ile Asn Pro Ser Glu Tyr Val Ile Glu Ser Asp Ala
645 650 655
Ser Ser Ala Thr Tyr Pro Leu Ala Phe Ala Ala Met Thr Gly Thr Thr
660 665 670
Val Thr Val Pro Asn Ile Gly Phe Glu Ser Leu Gln Gly Asp Ala Arg
675 680 685
Phe Ala Arg Asp Val Leu Lys Pro Met Gly Cys Lys Ile Thr Gln Thr
690 695 700
Ala Thr Ser Thr Thr Val Ser Gly Pro Pro Val Gly Thr Leu Lys Pro
705 710 715 720
Leu Lys His Val Asp Met Glu Pro Met Thr Asp Ala Phe Leu Thr Ala
725 730 735
Cys Val Val Ala Ala Ile Ser His Asp Ser Asp Pro Asn Ser Ala Asn
740 745 750
Thr Thr Thr Ile Glu Gly Ile Ala Asn Gln Arg Val Lys Glu Cys Asn
755 760 765
Arg Ile Leu Ala Met Ala Thr Glu Leu Ala Lys Phe Gly Val Lys Thr
770 775 780
Thr Glu Leu Pro Asp Gly Ile Gln Val His Gly Leu Asn Ser Ile Lys
785 790 795 800
Asp Leu Lys Val Pro Ser Asp Ser Ser Gly Pro Val Gly Val Cys Thr
805 810 815
Tyr Asp Asp His Arg Val Ala Met Ser Phe Ser Leu Leu Ala Gly Met
820 825 830
Val Asn Ser Gln Asn Glu Arg Asp Glu Val Ala Asn Pro Val Arg Ile
835 840 845
Leu Glu Arg His Cys Thr Gly Lys Thr Trp Pro Gly Trp Trp Asp Val
850 855 860
Leu His Ser Glu Leu Gly Ala Lys Leu Asp Gly Ala Glu Pro Leu Glu
865 870 875 880
Cys Thr Ser Lys Lys Asn Ser Lys Lys Ser Val Val Ile Ile Gly Met
885 890 895
Arg Ala Ala Gly Lys Thr Thr Ile Ser Lys Trp Cys Ala Ser Ala Leu
900 905 910
Gly Tyr Lys Leu Val Asp Leu Asp Glu Leu Phe Glu Gln Gln His Asn
915 920 925
Asn Gln Ser Val Lys Gln Phe Val Val Glu Asn Gly Trp Glu Lys Phe
930 935 940
Arg Glu Glu Glu Thr Arg Ile Phe Lys Glu Val Ile Gln Asn Tyr Gly
945 950 955 960
Asp Asp Gly Tyr Val Phe Ser Thr Gly Gly Gly Ile Val Glu Ser Ala
965 970 975
Glu Ser Arg Lys Ala Leu Lys Asp Phe Ala Ser Ser Gly Gly Tyr Val
980 985 990
Leu His Leu His Arg Asp Ile Glu Glu Thr Ile Val Phe Leu Gln Ser
995 1000 1005
Asp Pro Ser Arg Pro Ala Tyr Val Glu Glu Ile Arg Glu Val Trp
1010 1015 1020
Asn Arg Arg Glu Gly Trp Tyr Lys Glu Cys Ser Asn Phe Ser Phe
1025 1030 1035
Phe Ala Pro His Cys Ser Ala Glu Ala Glu Phe Gln Ala Leu Arg
1040 1045 1050
Arg Ser Phe Ser Lys Tyr Ile Ala Thr Ile Thr Gly Val Arg Glu
1055 1060 1065
Ile Glu Ile Pro Ser Gly Arg Ser Ala Phe Val Cys Leu Thr Phe
1070 1075 1080
Asp Asp Leu Thr Glu Gln Thr Glu Asn Leu Thr Pro Ile Cys Tyr
1085 1090 1095
Gly Cys Glu Ala Val Glu Val Arg Val Asp His Leu Ala Asn Tyr
1100 1105 1110
Ser Ala Asp Phe Val Ser Lys Gln Leu Ser Ile Leu Arg Lys Ala
1115 1120 1125
Thr Asp Ser Ile Pro Ile Ile Phe Thr Val Arg Thr Met Lys Gln
1130 1135 1140
Gly Gly Asn Phe Pro Asp Glu Glu Phe Lys Thr Leu Arg Glu Leu
1145 1150 1155
Tyr Asp Ile Ala Leu Lys Asn Gly Val Glu Phe Leu Asp Leu Glu
1160 1165 1170
Leu Thr Leu Pro Thr Asp Ile Gln Tyr Glu Val Ile Asn Lys Arg
1175 1180 1185
Gly Asn Thr Lys Ile Ile Gly Ser His His Asp Phe Gln Gly Leu
1190 1195 1200
Tyr Ser Trp Asp Asp Ala Glu Trp Glu Asn Arg Phe Asn Gln Ala
1205 1210 1215
Leu Thr Leu Asp Val Asp Val Val Lys Phe Val Gly Thr Ala Val
1220 1225 1230
Asn Phe Glu Asp Asn Leu Arg Leu Glu His Phe Arg Asp Thr His
1235 1240 1245
Lys Asn Lys Pro Leu Ile Ala Val Asn Met Thr Ser Lys Gly Ser
1250 1255 1260
Ile Ser Arg Val Leu Asn Asn Val Leu Thr Pro Val Thr Ser Asp
1265 1270 1275
Leu Leu Pro Asn Ser Ala Ala Pro Gly Gln Leu Thr Val Ala Gln
1280 1285 1290
Ile Asn Lys Met Tyr Thr Ser Met Gly Gly Ile Glu Pro Lys Glu
1295 1300 1305
Leu Phe Val Val Gly Lys Pro Ile Gly Pro Gly Lys Met Pro Ser
1310 1315 1320
Lys Leu Ala Ile Thr Ser Met Ser Leu Gly Arg Cys Tyr Ala Gly
1325 1330 1335
His Ser Phe Thr Thr Lys Leu Asp Met Ala Arg Lys Tyr Gly Tyr
1340 1345 1350
Gln Gly Leu Glu Leu Phe His Glu Asp Leu Ala Asp Val Ala Tyr
1355 1360 1365
Arg Leu Ser Gly Glu Thr Pro Ser Pro Cys Gly Pro Ser Pro Ala
1370 1375 1380
Ala Gln Leu Ser Ala Ala Arg Gln Ile Leu Arg Met Cys Gln Val
1385 1390 1395
Arg Asn Ile Glu Ile Val Cys Leu Gln Pro Phe Ser Gln Tyr Asp
1400 1405 1410
Gly Leu Leu Asp Arg Glu Glu His Glu Arg Arg Leu Glu Gln Leu
1415 1420 1425
Glu Phe Trp Ile Glu Leu Ala His Glu Leu Asp Thr Asp Ile Ile
1430 1435 1440
Gln Ile Pro Ala Asn Phe Leu Pro Ala Glu Glu Val Thr Glu Asp
1445 1450 1455
Ile Ser Leu Ile Val Ser Asp Leu Gln Glu Val Ala Asp Met Gly
1460 1465 1470
Leu Gln Ala Asn Pro Pro Ile Arg Phe Val Tyr Glu Ala Leu Cys
1475 1480 1485
Trp Ser Thr Arg Val Asp Thr Trp Glu Arg Ser Trp Glu Val Val
1490 1495 1500
Gln Arg Val Asn Arg Pro Asn Phe Gly Val Cys Leu Asp Thr Phe
1505 1510 1515
Asn Ile Ala Gly Arg Val Tyr Ala Asp Pro Thr Val Ala Ser Gly
1520 1525 1530
Arg Thr Pro Asn Ala Glu Glu Ala Ile Arg Lys Ser Ile Ala Arg
1535 1540 1545
Leu Val Glu Arg Val Asp Val Ser Lys Val Phe Tyr Val Gln Val
1550 1555 1560
Val Asp Ala Glu Lys Leu Lys Lys Pro Leu Val Pro Gly His Arg
1565 1570 1575
Phe Tyr Asp Pro Glu Gln Pro Ala Arg Met Ser Trp Ser Arg Asn
1580 1585 1590
Cys Arg Leu Phe Tyr Gly Glu Lys Asp Arg Gly Ala Tyr Leu Pro
1595 1600 1605
Val Lys Glu Ile Ala Trp Ala Phe Phe Asn Gly Leu Gly Phe Glu
1610 1615 1620
Gly Trp Val Ser Leu Glu Leu Phe Asn Arg Arg Met Ser Asp Thr
1625 1630 1635
Gly Phe Gly Val Pro Glu Glu Leu Ala Arg Arg Gly Ala Val Ser
1640 1645 1650
Trp Ala Lys Leu Val Arg Asp Met Lys Ile Thr Val Asp Ser Pro
1655 1660 1665
Thr Gln Gln Gln Ala Thr Gln Gln Pro Ile Arg Met Leu Ser Leu
1670 1675 1680
Ser Ala Ala Leu
1685
<210> 27
<211> 221
<212> PRT
<213> Homo sapiens
<400> 27
Met Gly Asp Thr Lys Glu Gln Arg Ile Leu Asn His Val Leu Gln His
1 5 10 15
Ala Glu Pro Gly Asn Ala Gln Ser Val Leu Glu Ala Ile Asp Thr Tyr
20 25 30
Cys Glu Gln Lys Glu Trp Ala Met Asn Val Gly Asp Lys Lys Gly Lys
35 40 45
Ile Val Asp Ala Val Ile Gln Glu His Gln Pro Ser Val Leu Leu Glu
50 55 60
Leu Gly Ala Tyr Cys Gly Tyr Ser Ala Val Arg Met Ala Arg Leu Leu
65 70 75 80
Ser Pro Gly Ala Arg Leu Ile Thr Ile Glu Ile Asn Pro Asp Cys Ala
85 90 95
Ala Ile Thr Gln Arg Met Val Asp Phe Ala Gly Val Lys Asp Lys Val
100 105 110
Thr Leu Val Val Gly Ala Ser Gln Asp Ile Ile Pro Gln Leu Lys Lys
115 120 125
Lys Tyr Asp Val Asp Thr Leu Asp Met Val Phe Leu Asp His Trp Lys
130 135 140
Asp Arg Tyr Leu Pro Asp Thr Leu Leu Leu Glu Glu Cys Gly Leu Leu
145 150 155 160
Arg Lys Gly Thr Val Leu Leu Ala Asp Asn Val Ile Cys Pro Gly Ala
165 170 175
Pro Asp Phe Leu Ala His Val Arg Gly Ser Ser Cys Phe Glu Cys Thr
180 185 190
His Tyr Gln Ser Phe Leu Glu Tyr Arg Glu Val Val Asp Gly Leu Glu
195 200 205
Lys Ala Ile Tyr Lys Gly Pro Gly Ser Glu Ala Gly Pro
210 215 220
<210> 28
<211> 40
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<400> 28
cgtagcatgc agtctagaaa aatgggtgac actaaggagc 40
<210> 29
<211> 45
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<400> 29
gacgacgtta gtgacagaat tcttatggac cagcttcaga acctg 45
<210> 30
<211> 25
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<400> 30
cgtagcatgc agtctagaaa aatgg 25
<210> 31
<211> 22
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<400> 31
gacgacgtta gtgacagaat tc 22
<210> 32
<211> 45
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (30)..(30)
<223> n is a, c, g, or t
<400> 32
ctattgacac ttattgtgag caaaaggagn rkgctatgaa cgttg 45
<210> 33
<211> 45
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (30)..(30)
<223> n is a, c, g, or t
<400> 33
ctattgacac ttattgtgag caaaaggagn ykgctatgaa cgttg 45
<210> 34
<211> 29
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<400> 34
ctccttttgc tcacaataag tgtcaatag 29
<210> 35
<211> 45
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (31)..(31)
<223> n is a, c, g, or t
<400> 35
gacacttatt gtgagcaaaa ggagtgggct nrkaacgttg gtgac 45
<210> 36
<211> 43
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (29)..(29)
<223> n is a, c, g, or t
<400> 36
cacttattgt gagcaaaagg agtgggctny kaacgttggt gac 43
<210> 37
<211> 24
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<400> 37
ctccttttgc tcacaataag tgtc 24
<210> 38
<211> 46
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (27)..(27)
<223> n is a, c, g, or t
<400> 38
ctttggacat ggttttcttg gaccatnrka aggacagata tttgcc 46
<210> 39
<211> 46
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (27)..(27)
<223> n is a, c, g, or t
<400> 39
ctttggacat ggttttcttg gaccatnyka aggacagata tttgcc 46
<210> 40
<211> 26
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<400> 40
atggtccaag aaaaccatgt ccaaag 26
<210> 41
<211> 53
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (30)..(30)
<223> n is a, c, g, or t
<400> 41
gtactgtttt gttagctgac aacgttattn rkccaggtgc tccagacttc ttg 53
<210> 42
<211> 53
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (30)..(30)
<223> n is a, c, g, or t
<400> 42
gtactgtttt gttagctgac aacgttattn ykccaggtgc tccagacttc ttg 53
<210> 43
<211> 29
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<400> 43
aataacgttg tcagctaaca aaacagtac 29
<210> 44
<211> 47
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (30)..(30)
<223> n is a, c, g, or t
<400> 44
ctgttttgtt agctgacaac gttatttgtn rkggtgctcc agacttc 47
<210> 45
<211> 47
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (30)..(30)
<223> n is a, c, g, or t
<400> 45
ctgttttgtt agctgacaac gttatttgtn ykggtgctcc agacttc 47
<210> 46
<211> 29
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<400> 46
acaaataacg ttgtcagcta acaaaacag 29
<210> 47
<211> 103
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (86)..(86)
<223> n is a, c, g, or t
<400> 47
atttagaatt cttatggacc agcttcagaa cctggaccct tatatatagc cttctccaaa 60
ccgtcaacaa cctctctata ttcmyngaaa gattgataat gag 103
<210> 48
<211> 102
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (85)..(85)
<223> n is a, c, g, or t
<400> 48
atttagaatt ctatggacca gcttcagaac ctggaccctt atatatagcc ttctccaaac 60
cgtcaacaac ctctctatat tcmrngaaag attgataatg ag 102
<210> 49
<211> 101
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (83)..(83)
<223> n is a, c, g, or t
<400> 49
atttagaatt cttatggacc agcttcagaa cctggaccct tatatatagc cttctccaaa 60
ccgtcaacaa cctctctata myncaagaaa gattgataat g 101
<210> 50
<211> 101
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (83)..(83)
<223> n is a, c, g, or t
<400> 50
atttagaatt cttatggacc agcttcagaa cctggaccct tatatatagc cttctccaaa 60
ccgtcaacaa cctctctata mrncaagaaa gattgataat g 101
<210> 51
<211> 95
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (77)..(77)
<223> n is a, c, g, or t
<400> 51
atttagaatt cttatggacc agcttcagaa cctggaccct tatatatagc cttctccaaa 60
ccgtcaacaa cctcmynata ttccaagaaa gattg 95
<210> 52
<211> 95
<212> DNA
<213> Artificial sequence
<220>
<223> Synthetic oligonucleotide
<220>
<221> misc_feature
<222> (77)..(77)
<223> n is a, c, g, or t
<400> 52
atttagaatt cttatggacc agcttcagaa cctggaccct tatatatagc cttctccaaa 60
ccgtcaacaa cctcmrnata ttccaagaaa gattg 95
<210> 53
<211> 363
<212> PRT
<213> Arabidopsis thaliana
<400> 53
Met Gly Ser Thr Ala Glu Thr Gln Leu Thr Pro Val Gln Val Thr Asp
1 5 10 15
Asp Glu Ala Ala Leu Phe Ala Met Gln Leu Ala Ser Ala Ser Val Leu
20 25 30
Pro Met Ala Leu Lys Ser Ala Leu Glu Leu Asp Leu Leu Glu Ile Met
35 40 45
Ala Lys Asn Gly Ser Pro Met Ser Pro Thr Glu Ile Ala Ser Lys Leu
50 55 60
Pro Thr Lys Asn Pro Glu Ala Pro Val Met Leu Asp Arg Ile Leu Arg
65 70 75 80
Leu Leu Thr Ser Tyr Ser Val Leu Thr Cys Ser Asn Arg Lys Leu Ser
85 90 95
Gly Asp Gly Val Glu Arg Ile Tyr Gly Leu Gly Pro Val Cys Lys Tyr
100 105 110
Leu Thr Lys Asn Glu Asp Gly Val Ser Ile Ala Ala Leu Cys Leu Met
115 120 125
Asn Gln Asp Lys Val Leu Met Glu Ser Trp Tyr His Leu Lys Asp Ala
130 135 140
Ile Leu Asp Gly Gly Ile Pro Phe Asn Lys Ala Tyr Gly Met Ser Ala
145 150 155 160
Phe Glu Tyr His Gly Thr Asp Pro Arg Phe Asn Lys Val Phe Asn Asn
165 170 175
Gly Met Ser Asn His Ser Thr Ile Thr Met Lys Lys Ile Leu Glu Thr
180 185 190
Tyr Lys Gly Phe Glu Gly Leu Thr Ser Leu Val Asp Val Gly Gly Gly
195 200 205
Ile Gly Ala Thr Leu Lys Met Ile Val Ser Lys Tyr Pro Asn Leu Lys
210 215 220
Gly Ile Asn Phe Asp Leu Pro His Val Ile Glu Asp Ala Pro Ser His
225 230 235 240
Pro Gly Ile Glu His Val Gly Gly Asp Met Phe Val Ser Val Pro Lys
245 250 255
Gly Asp Ala Ile Phe Met Lys Trp Ile Cys His Asp Trp Ser Asp Glu
260 265 270
His Cys Val Lys Phe Leu Lys Asn Cys Tyr Glu Ser Leu Pro Glu Asp
275 280 285
Gly Lys Val Ile Leu Ala Glu Cys Ile Leu Pro Glu Thr Pro Asp Ser
290 295 300
Ser Leu Ser Thr Lys Gln Val Val His Val Asp Cys Ile Met Leu Ala
305 310 315 320
His Asn Pro Gly Gly Lys Glu Arg Thr Glu Lys Glu Phe Glu Ala Leu
325 330 335
Ala Lys Ala Ser Gly Phe Lys Gly Ile Lys Val Val Cys Asp Ala Phe
340 345 350
Gly Val Asn Leu Ile Glu Leu Leu Lys Lys Leu
355 360
<210> 54
<211> 365
<212> PRT
<213> Fragaria x ananassa
<400> 54
Met Gly Ser Thr Gly Glu Thr Gln Met Thr Pro Thr His Val Ser Asp
1 5 10 15
Glu Glu Ala Asn Leu Phe Ala Met Gln Leu Ala Ser Ala Ser Val Leu
20 25 30
Pro Met Val Leu Lys Ala Ala Ile Glu Leu Asp Leu Leu Glu Ile Met
35 40 45
Ala Lys Ala Gly Pro Gly Ser Phe Leu Ser Pro Ser Asp Leu Ala Ser
50 55 60
Gln Leu Pro Thr Lys Asn Pro Glu Ala Pro Val Met Leu Asp Arg Met
65 70 75 80
Leu Arg Leu Leu Ala Ser Tyr Ser Ile Leu Thr Cys Ser Leu Arg Thr
85 90 95
Leu Pro Asp Gly Lys Val Glu Arg Leu Tyr Cys Leu Gly Pro Val Cys
100 105 110
Lys Phe Leu Thr Lys Asn Glu Asp Gly Val Ser Ile Ala Ala Leu Cys
115 120 125
Leu Met Asn Gln Asp Lys Val Leu Val Glu Ser Trp Tyr His Leu Lys
130 135 140
Asp Ala Val Leu Asp Gly Gly Ile Pro Phe Asn Lys Ala Tyr Gly Met
145 150 155 160
Thr Ala Phe Asp Tyr His Gly Thr Asp Pro Arg Phe Asn Lys Val Phe
165 170 175
Asn Lys Gly Met Ala Asp His Ser Thr Ile Thr Met Lys Lys Ile Leu
180 185 190
Glu Thr Tyr Lys Gly Phe Glu Gly Leu Lys Ser Ile Val Asp Val Gly
195 200 205
Gly Gly Thr Gly Ala Val Val Asn Met Ile Val Ser Lys Tyr Pro Ser
210 215 220
Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile Glu Asp Ala Pro
225 230 235 240
Gln Tyr Pro Gly Val Gln His Val Gly Gly Asp Met Phe Val Ser Val
245 250 255
Pro Lys Gly Asn Ala Ile Phe Met Lys Trp Ile Cys His Asp Trp Ser
260 265 270
Asp Glu His Cys Ile Lys Phe Leu Lys Asn Cys Tyr Ala Ala Leu Pro
275 280 285
Asp Asp Gly Lys Val Ile Leu Ala Glu Cys Ile Leu Pro Val Ala Pro
290 295 300
Asp Thr Ser Leu Ala Thr Lys Gly Val Val His Met Asp Val Ile Met
305 310 315 320
Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu Gln Glu Phe Glu
325 330 335
Ala Leu Ala Lys Gly Ser Gly Phe Gln Gly Ile Arg Val Cys Cys Asp
340 345 350
Ala Phe Asn Thr Tyr Val Ile Glu Phe Leu Lys Lys Ile
355 360 365
<210> 55
<211> 271
<212> PRT
<213> Homo sapiens
<400> 55
Met Pro Glu Ala Pro Pro Leu Leu Leu Ala Ala Val Leu Leu Gly Leu
1 5 10 15
Val Leu Leu Val Val Leu Leu Leu Leu Leu Arg His Trp Gly Trp Gly
20 25 30
Leu Cys Leu Ile Gly Trp Asn Glu Phe Ile Leu Gln Pro Ile His Asn
35 40 45
Leu Leu Met Gly Asp Thr Lys Glu Gln Arg Ile Leu Asn His Val Leu
50 55 60
Gln His Ala Glu Pro Gly Asn Ala Gln Ser Val Leu Glu Ala Ile Asp
65 70 75 80
Thr Tyr Cys Glu Gln Lys Glu Trp Ala Met Asn Val Gly Asp Lys Lys
85 90 95
Gly Lys Ile Val Asp Ala Val Ile Gln Glu His Gln Pro Ser Val Leu
100 105 110
Leu Glu Leu Gly Ala Tyr Cys Gly Tyr Ser Ala Val Arg Met Ala Arg
115 120 125
Leu Leu Ser Pro Gly Ala Arg Leu Ile Thr Ile Glu Ile Asn Pro Asp
130 135 140
Cys Ala Ala Ile Thr Gln Arg Met Val Asp Phe Ala Gly Val Lys Asp
145 150 155 160
Lys Val Thr Leu Val Val Gly Ala Ser Gln Asp Ile Ile Pro Gln Leu
165 170 175
Lys Lys Lys Tyr Asp Val Asp Thr Leu Asp Met Val Phe Leu Asp His
180 185 190
Trp Lys Asp Arg Tyr Leu Pro Asp Thr Leu Leu Leu Glu Glu Cys Gly
195 200 205
Leu Leu Arg Lys Gly Thr Val Leu Leu Ala Asp Asn Val Ile Cys Pro
210 215 220
Gly Ala Pro Asp Phe Leu Ala His Val Arg Gly Ser Ser Cys Phe Glu
225 230 235 240
Cys Thr His Tyr Gln Ser Phe Leu Glu Tyr Arg Glu Val Val Asp Gly
245 250 255
Leu Glu Lys Ala Ile Tyr Lys Gly Pro Gly Ser Glu Ala Gly Pro
260 265 270
<210> 56
<211> 263
<212> PRT
<213> Vanilla planifolia
<400> 56
Met Ala Thr Thr Val Ala Thr Ala Thr Arg Ala Thr Glu Asn Lys Thr
1 5 10 15
Gln Thr Glu Glu Asn Ser Gln Asn Gly Gly Gln Gln Thr Gly His Gln
20 25 30
Glu Ile Gly His Lys Ser Leu Leu Lys Ser Asp Ala Leu Tyr Gln Tyr
35 40 45
Ile Leu Glu Thr Ser Val Tyr Pro Arg Glu Pro Glu Cys Leu Lys Glu
50 55 60
Leu Arg Glu Ile Thr Ala Lys His Pro Trp Asn Leu Met Thr Thr Ser
65 70 75 80
Ala Asp Glu Gly Gln Phe Leu Gly Met Leu Leu Lys Leu Ile Asn Ala
85 90 95
Lys Asn Thr Met Glu Ile Gly Val Phe Thr Gly Tyr Ser Leu Leu Ala
100 105 110
Thr Ala Leu Ala Leu Pro Asp Asp Gly Lys Ile Leu Ala Met Asp Ile
115 120 125
Asn Arg Glu Asn Tyr Glu Leu Gly Leu Pro Leu Ile Gln Lys Ala Gly
130 135 140
Val Ala His Lys Ile Asp Phe Arg Glu Gly Pro Ala Leu Pro Val Leu
145 150 155 160
Asp Glu Leu Met Lys Asp Glu Ser Lys His Gly Ser Phe Asp Phe Ile
165 170 175
Phe Val Asp Ala Asp Lys Asp Asn Tyr Leu Asn Tyr His Gln Arg Ile
180 185 190
Ile Asp Leu Val Lys Val Gly Gly Val Ile Gly Tyr Asp Asn Thr Leu
195 200 205
Trp Asn Gly Ala Val Val Leu Pro Pro Asp Ala Pro Met Arg Lys Tyr
210 215 220
Ile Arg Tyr Tyr Arg Asp Phe Val Ile Glu Leu Asn Lys Glu Leu Ala
225 230 235 240
Ala Asp Pro Arg Ile Glu Ile Cys Gln Leu Pro Val Gly Asp Gly Ile
245 250 255
Thr Leu Cys Arg Arg Val Lys
260
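Several of the synthetic oligonucleotides listed above (for example SEQ ID NOs: 32, 33, 35 and 36) carry the degenerate codons nrk and nyk, and the longer oligonucleotides SEQ ID NOs: 47-52 carry myn and mrn, which read as nrk and nyk on the complementary strand. As a minimal sketch, assuming the standard IUPAC nucleotide ambiguity codes and the standard genetic code (neither is spelled out in this listing), the following Python expands such a codon and reports which residues it can encode. The function names are illustrative and do not come from the patent.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes (lower case, as in the listing).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "k": "gt", "m": "ac",
    "s": "cg", "w": "at", "n": "acgt",
}

# Complement of each IUPAC code, for reading reverse-strand oligonucleotides.
COMPLEMENT = {
    "a": "t", "t": "a", "c": "g", "g": "c",
    "r": "y", "y": "r", "k": "m", "m": "k",
    "s": "s", "w": "w", "n": "n",
}

# Standard genetic code; codons are enumerated with bases in the order t, c, a, g.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expand_codon(degenerate):
    """Return every concrete codon matched by a degenerate codon such as 'nrk'."""
    choices = (IUPAC[base] for base in degenerate.lower())
    return ["".join(bases) for bases in product(*choices)]


def encoded_residues(degenerate):
    """Return the one-letter residues ('*' = stop) a degenerate codon can encode."""
    return {CODON_TABLE[codon] for codon in expand_codon(degenerate)}


def reverse_complement(seq):
    """Return the reverse complement of a sequence written in IUPAC codes."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


if __name__ == "__main__":
    for codon in ("nrk", "nyk"):
        residues = "".join(sorted(encoded_residues(codon)))
        print(codon, len(expand_codon(codon)), residues)
    print(reverse_complement("myn"), reverse_complement("mrn"))
```

Under these assumptions, nrk and nyk each expand to 16 codons, and together they cover all 20 amino acids plus a single stop codon (tag).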
Claims (16)
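1. An isolated mutant Arom multifunctional enzyme (AROM) polypeptide, which is the amino acid sequence shown in SEQ ID NO: 20 or SEQ ID NO: 22.
2. The isolated mutant AROM polypeptide of claim 1, wherein the mutant AROM polypeptide further comprises, at the N-terminus or C-terminus of the polypeptide, a purification tag, a chloroplast transit peptide, a mitochondrial transit peptide, an amyloplast peptide, a signal peptide, or a secretion tag.
3. An isolated nucleic acid encoding the mutant AROM polypeptide of claim 1.
4. A recombinant host comprising a heterologous nucleic acid encoding the mutant AROM polypeptide of claim 1, wherein the recombinant host is a microorganism.
5. The recombinant host of claim 4, wherein the recombinant host further comprises a nucleic acid encoding a 3-dehydroshikimate dehydratase (3DSD) polypeptide, a nucleic acid encoding an aromatic carboxylic acid reductase (ACAR) polypeptide, a nucleic acid encoding a phosphopantetheinyl transferase (PPTase) polypeptide, a nucleic acid encoding a uridine 5'-diphosphoglucosyl transferase (UGT) polypeptide, and/or a nucleic acid encoding a vanillyl alcohol oxidase (VAO).
6. The recombinant host of claim 5, wherein the host further comprises a gene encoding an O-methyltransferase (OMT).
7. The recombinant host of any one of claims 4-6, wherein the recombinant host is Saccharomyces cerevisiae, Schizosaccharomyces pombe, or Escherichia coli.
8. A method for producing vanillin and/or vanillin β-D-glucoside, the method comprising:
(a) providing a recombinant host capable of producing vanillin, wherein the recombinant host carries a heterologous nucleic acid encoding the mutant Arom multifunctional enzyme (AROM) polypeptide of claim 1;
(b) culturing the recombinant host for a time sufficient for the recombinant host to produce vanillin and/or vanillin glucoside; and
(c) isolating vanillin and/or vanillin glucoside from the recombinant host or from the culture supernatant, thereby producing vanillin and/or vanillin β-D-glucoside.
9. The method of claim 8, wherein the mutant AROM polypeptide further comprises, at the N-terminus or C-terminus of the polypeptide, a purification tag, a chloroplast transit peptide, a mitochondrial transit peptide, an amyloplast peptide, a signal peptide, or a secretion tag.
10. The method of claim 8, wherein the host is a microorganism.
11. The method of claim 8, wherein the host is Saccharomyces cerevisiae, Schizosaccharomyces pombe, or Escherichia coli.
12. The method of claim 8, wherein the host is a Saccharomyces cerevisiae (S. cerevisiae) that comprises a pdc1 deletion and a gdh1 deletion and overexpresses glutamate dehydrogenase 2.
13. The method of claim 8, wherein the host is a plant or a plant cell.
14. The method of claim 8, wherein the host is a Physcomitrella plant or plant cell, or a tobacco plant or plant cell.
15. The method of claim 8, wherein the host further comprises a gene encoding a 3-dehydroshikimate dehydratase (3DSD), a gene encoding an aromatic carboxylic acid reductase (ACAR), a gene encoding a uridine 5'-diphosphoglucosyl transferase (UGT), a gene encoding a phosphopantetheinyl transferase (PPTase), and/or a gene encoding a vanillyl alcohol oxidase (VAO).
16. The method of claim 8, wherein the host further comprises a gene encoding an O-methyltransferase (OMT).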
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201161521090P | 2011-08-08 | 2011-08-08 | |
US61/521,090 | 2011-08-08 | ||
US201161522096P | 2011-08-10 | 2011-08-10 | |
US61/522,096 | 2011-08-10 | ||
CN201280048260.5A CN103987840A (zh) | 2011-08-08 | 2012-08-07 | Compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201280048260.5A Division CN103987840A (zh) | 2011-08-08 | 2012-08-07 | Compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108570464A (zh) | 2018-09-25 |
CN108570464B (zh) | 2021-12-21 |
Family
ID=47668880
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201280048260.5A Pending CN103987840A (zh) | 2011-08-08 | 2012-08-07 | Compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside |
CN201810330216.2A Active CN108570464B (zh) | 2011-08-08 | 2012-08-07 | Compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside |
CN201510390606.5A Active CN105063104B (zh) | 2011-08-08 | 2012-08-07 | Compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201280048260.5A Pending CN103987840A (zh) | 2011-08-08 | 2012-08-07 | Compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510390606.5A Active CN105063104B (zh) | 2011-08-08 | 2012-08-07 | Compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside |
Country Status (7)
Country | Link |
---|---|
US (3) | US10208293B2 (zh) |
EP (3) | EP2742126B1 (zh) |
CN (3) | CN103987840A (zh) |
BR (1) | BR112014003041B1 (zh) |
ES (2) | ES2730102T3 (zh) |
MX (1) | MX355785B (zh) |
WO (1) | WO2013022881A1 (zh) |
Families Citing this family (42)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104769121B (zh) | 2012-11-05 | 2021-06-01 | Evolva SA | Vanillin synthase |
WO2014128252A1 (en) * | 2013-02-21 | 2014-08-28 | Eviagenics S.A. | Biosynthesis of o-methylated phenolic compounds |
WO2014150504A2 (en) | 2013-03-15 | 2014-09-25 | The Regents Of The University Of California | Tissue specific reduction of lignin |
US10000782B2 (en) | 2013-07-16 | 2018-06-19 | International Flavors & Fragrances Inc. | Recombinant host cell for the biosynthesis of vanillin or vanillin beta-D-glucoside |
US20170172184A1 (en) * | 2014-02-12 | 2017-06-22 | Evolva Sa | Methods of Improving Production of Vanillin |
EP2957629A1 (en) | 2014-06-18 | 2015-12-23 | Rhodia Opérations | Improved production of vanilloids by fermentation |
EP2957635A1 (en) * | 2014-06-18 | 2015-12-23 | Rhodia Opérations | Improved selectivity of the production of vanilloids in a recombinant unicellular host |
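CN104878025B (zh) * | 2015-06-01 | 2017-11-03 | Anhui Agricultural University | Galloyl glucosyltransferase CsUGT84A22 gene, its encoded protein, and use thereof |
CN108350412B (zh) | 2015-10-27 | 2022-02-11 | Ajinomoto Co., Inc. | Method for producing aldehyde |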
US10538746B2 (en) * | 2016-03-17 | 2020-01-21 | INVISTA North America S.à r.l. | Polypeptides and variants having improved activity, materials and processes relating thereto |
WO2017165397A1 (en) | 2016-03-22 | 2017-09-28 | University Of Georgia Research Foundation, Inc. | Genetically engineered microbes and methods for producing citramalate |
EP3438245B1 (en) * | 2016-03-28 | 2023-01-11 | Research Institute Of Innovative Technology For The Earth | Transformant, and method for producing protocatechuic acid or salt thereof using same |
JP2019536449A (ja) | 2016-10-26 | 2019-12-19 | Ajinomoto Co., Inc. | Method for producing objective substance |
EP3532631A1 (en) | 2016-10-26 | 2019-09-04 | Ajinomoto Co., Inc. | Method for producing l-methionine or metabolites requiring s-adenosylmethionine for synthesis |
CN109890970B (zh) * | 2016-10-26 | 2023-04-04 | Ajinomoto Co., Inc. | Method for producing objective substance |
CN109952380B (zh) | 2016-10-26 | 2023-07-14 | Ajinomoto Co., Inc. | Method for producing objective substance |
WO2018079685A1 (en) | 2016-10-26 | 2018-05-03 | Ajinomoto Co., Inc. | Method for producing objective substance |
EP3532610B1 (en) | 2016-10-27 | 2023-09-27 | Ajinomoto Co., Inc. | Method for producing aldehyde |
US10858676B2 (en) | 2017-05-22 | 2020-12-08 | Ajinomoto Co., Inc. | Method for producing objective substance |
US11680279B2 (en) | 2017-11-29 | 2023-06-20 | Ajinomoto Co., Inc. | Method for producing objective substance |
US11447800B2 (en) | 2018-03-29 | 2022-09-20 | Firmenich Sa | Method for producing vanillin |
WO2020027251A1 (en) | 2018-08-03 | 2020-02-06 | Ajinomoto Co., Inc. | Method for producing objective substance |
JPWO2020226087A1 (zh) | 2019-05-08 | 2020-11-12 | ||
WO2021022216A1 (en) | 2019-08-01 | 2021-02-04 | Amyris, Inc. | Modified host cells for high efficiency production of vanillin |
JP2022550463A (ja) | 2019-10-01 | 2022-12-01 | Empyrean Neuroscience, Inc. | Genetic engineering of fungi to modulate tryptamine expression |
US11441164B2 (en) | 2019-11-15 | 2022-09-13 | Cb Therapeutics, Inc. | Biosynthetic production of psilocybin and related intermediates in recombinant organisms |
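CN111676251A (zh) * | 2019-12-31 | 2020-09-18 | Shanghai Renmei Biotechnology Co., Ltd. | Method for preparing caffeic acid and vanillin, and method for preparing the reaction catalyst therefor |
CN112481321B (zh) * | 2020-09-14 | 2023-08-29 | Qiqihar Longjiang Fufeng Biotechnology Co., Ltd. | Production process for granular threonine |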
EP4214325A1 (en) | 2020-09-15 | 2023-07-26 | Amyris, Inc. | Culture compositions and methods of their use for high yield production of vanillin |
CN112481322A (zh) * | 2020-12-27 | 2021-03-12 | Zhao Lankun | High-efficiency fermentation production process for threonine |
FR3120628A1 (fr) | 2021-03-15 | 2022-09-16 | Rhodia Operations | Process for purifying vanillin or its derivatives obtained by a biotechnological process |
FR3120627B1 (fr) | 2021-03-15 | 2024-01-19 | Rhodia Operations | Process for purifying vanillin or a vanillin derivative obtained by a biotechnological process |
FR3120629B1 (fr) | 2021-03-15 | 2024-01-19 | Rhodia Operations | Process for purifying vanillin or a vanillin derivative obtained by a biotechnological process |
BR112023018940A2 (pt) | 2021-03-19 | 2023-11-28 | Amyris Inc | Genetically modified host cell capable of producing vanillin or glucovanillin, method for producing vanillin or glucovanillin, and vanillin or glucovanillin |
CN113234611B (zh) * | 2021-05-07 | 2023-01-20 | Tianjin University | Engineered Saccharomyces cerevisiae strain and its use in preparing protocatechuic acid |
CN115612699B (zh) * | 2021-07-14 | 2024-10-25 | Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences | Copolymer film, method for its enzymatic self-assembly synthesis at the gas-liquid interface, and use thereof |
CN114181877B (zh) * | 2021-12-08 | 2024-06-07 | Beijing University of Chemical Technology | Genetically engineered bacterium for synthesizing vanillin and use thereof |
WO2023130075A2 (en) | 2021-12-31 | 2023-07-06 | Empyrean Neuroscience, Inc. | Genetically modified organisms for producing psychotropic alkaloids |
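KR20240155248A (ko) | 2022-02-15 | 2024-10-28 | Danstar Ferment AG | Method for recovering and purifying vanillin |
CN116656641A (zh) * | 2022-02-18 | 2023-08-29 | CAS Center for Excellence in Molecular Plant Sciences | Caffeic acid O-methyltransferase mutant and use thereof |
CN115558611B (zh) * | 2022-10-24 | 2024-02-23 | Guizhou University | Highly flocculent Schizosaccharomyces japonicus strain and use thereof |
CN115975833B (zh) * | 2022-10-25 | 2024-07-12 | Xiamen University | Recombinant Saccharomyces cerevisiae strain for producing vanillin and construction method thereof |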
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6372461B1 (en) * | 1998-09-18 | 2002-04-16 | Board Of Directors Operating Michigan State University | Synthesis of vanillin from a carbon source |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4684534A (en) | 1985-02-19 | 1987-08-04 | Dynagram Corporation Of America | Quick-liquifying, chewable tablet |
US5011678A (en) | 1989-02-01 | 1991-04-30 | California Biotechnology Inc. | Composition and method for administration of pharmaceutically active substances |
US5484956A (en) | 1990-01-22 | 1996-01-16 | Dekalb Genetics Corporation | Fertile transgenic Zea mays plant comprising heterologous DNA encoding Bacillus thuringiensis endotoxin |
US6946587B1 (en) | 1990-01-22 | 2005-09-20 | Dekalb Genetics Corporation | Method for preparing fertile transgenic corn plants |
US5204253A (en) | 1990-05-29 | 1993-04-20 | E. I. Du Pont De Nemours And Company | Method and apparatus for introducing biological substances into living cells |
US5128253A (en) | 1991-05-31 | 1992-07-07 | Kraft General Foods, Inc. | Bioconversion process for the production of vanillin |
WO1994013614A1 (en) | 1992-12-10 | 1994-06-23 | Quest International B.V. | Production of vanillin |
US6649186B1 (en) | 1996-09-20 | 2003-11-18 | Ethypharm | Effervescent granules and methods for their preparation |
JPH10117776A (ja) | 1996-10-22 | 1998-05-12 | Japan Tobacco Inc | インディカイネの形質転換方法 |
AU734764B2 (en) | 1997-07-15 | 2001-06-21 | David Michael & Co., Inc. | Improved vanillin production |
US6368625B1 (en) | 1998-08-12 | 2002-04-09 | Cima Labs Inc. | Orally disintegrable tablet forming a viscous slurry |
US6060078A (en) | 1998-09-28 | 2000-05-09 | Sae Han Pharm Co., Ltd. | Chewable tablet and process for preparation thereof |
DK1234046T3 (da) | 1999-12-01 | 2009-03-16 | Royal Veterinary & Agricultural Univ | UDP-glucose: aglycon-glucosyltransferase |
US6316029B1 (en) | 2000-05-18 | 2001-11-13 | Flak Pharma International, Ltd. | Rapidly disintegrating solid oral dosage form |
2012
- 2012-08-07 EP EP12822242.9A patent/EP2742126B1/en active Active
- 2012-08-07 WO PCT/US2012/049842 patent/WO2013022881A1/en active Application Filing
- 2012-08-07 BR BR112014003041-3A patent/BR112014003041B1/pt active IP Right Grant
- 2012-08-07 CN CN201280048260.5A patent/CN103987840A/zh active Pending
- 2012-08-07 CN CN201810330216.2A patent/CN108570464B/zh active Active
- 2012-08-07 CN CN201510390606.5A patent/CN105063104B/zh active Active
- 2012-08-07 EP EP16184537.5A patent/EP3118304B1/en active Active
- 2012-08-07 US US14/236,991 patent/US10208293B2/en active Active
- 2012-08-07 ES ES16184537T patent/ES2730102T3/es active Active
- 2012-08-07 EP EP18215503.6A patent/EP3514226B1/en active Active
- 2012-08-07 MX MX2014001571A patent/MX355785B/es active IP Right Grant
- 2012-08-07 ES ES12822242T patent/ES2719304T3/es active Active
2019
- 2019-01-09 US US16/243,670 patent/US11008592B2/en active Active
2021
- 2021-04-14 US US17/230,533 patent/US20210230646A1/en not_active Abandoned
Non-Patent Citations (4)
Title |
---|
De novo biosynthesis of vanillin in fission yeast (Schizosaccharomyces pombe) and baker's yeast (Saccharomyces cerevisiae); Esben H. Hansen et al.; Appl Environ Microbiol; 20090313; Vol. 75, No. 9; pp. 2765-2774 * |
Simultaneous analysis of catechol-O-methyl transferase activity, S-adenosylhomocysteine and adenosine; Ilkka Reenilä and Pekka Rauhala; Biomed Chromatogr; 20090723; Vol. 24, No. 3; pp. 294-300 * |
Synthesis of Vanillin from Glucose; Kai Li et al.; J. Am. Chem. Soc.; 19980925; Vol. 120, No. 40; pp. 10545-10546 * |
The pentafunctional arom enzyme of Saccharomyces cerevisiae is a mosaic of monofunctional domains; Kenneth Duncan et al.; Biochem J.; 19870901; Vol. 246, No. 2; pp. 375-386 * |
Also Published As
Publication number | Publication date |
---|---|
ES2730102T3 (es) | 2019-11-08 |
ES2719304T3 (es) | 2019-07-09 |
EP3514226A1 (en) | 2019-07-24 |
US20140245496A1 (en) | 2014-08-28 |
MX2014001571A (es) | 2014-09-08 |
EP3118304B1 (en) | 2019-03-13 |
US10208293B2 (en) | 2019-02-19 |
EP3118304A1 (en) | 2017-01-18 |
CN105063104B (zh) | 2019-08-13 |
CN108570464A (zh) | 2018-09-25 |
BR112014003041B1 (pt) | 2021-10-13 |
US20190136270A1 (en) | 2019-05-09 |
EP2742126B1 (en) | 2018-11-07 |
MX355785B (es) | 2018-04-30 |
EP3514226B1 (en) | 2023-03-29 |
EP2742126A1 (en) | 2014-06-18 |
BR112014003041A2 (pt) | 2017-10-17 |
US11008592B2 (en) | 2021-05-18 |
CN103987840A (zh) | 2014-08-13 |
EP2742126A4 (en) | 2015-04-22 |
WO2013022881A8 (en) | 2013-03-21 |
CN105063104A (zh) | 2015-11-18 |
WO2013022881A1 (en) | 2013-02-14 |
US20210230646A1 (en) | 2021-07-29 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108570464B (zh) | Compositions and methods for the biosynthesis of vanillin or vanillin β-D-glucoside | |
US10066252B1 (en) | Compositions and methods for the biosynthesis of vanillin or vanillin beta-D glucoside | |
US12123042B2 (en) | Production of steviol glycosides in recombinant hosts | |
EP3283614B1 (en) | Production of non-caloric sweeteners using engineered whole-cell catalysts | |
WO2018083338A1 (en) | Production of steviol glycosides in recombinant hosts |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||