AU765833B2 - Compounds and methods for immunotherapy and diagnosis of tuberculosis - Google Patents

Info

Publication number
AU765833B2
Authority
AU
Australia
Prior art keywords
ala
gly
pro
leu
val
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU71762/00A
Other versions
AU7176200A (en
Inventor
Antonio Campos-Neto
Davin C. Dillon
Raymond Houghton
Steven G. Reed
Yasir A. Skeiky
Daniel R. Twardzik
Thomas H. Vedvick
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Corixa Corp
Original Assignee
Corixa Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU71586/96A external-priority patent/AU727602B2/en
Application filed by Corixa Corp filed Critical Corixa Corp
Priority to AU71762/00A priority Critical patent/AU765833B2/en
Publication of AU7176200A publication Critical patent/AU7176200A/en
Application granted granted Critical
Publication of AU765833B2 publication Critical patent/AU765833B2/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Landscapes

  • Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
  • Peptides Or Proteins (AREA)

Description

AUSTRALIA
Patents Act 1990
DIVISIONAL APPLICATION
Regulation 3.2

Name of Applicant: Corixa Corporation
Actual Inventor(s): REED, Steven; SKEIKY, Yasir; DILLON, Davin C.; CAMPOS-NETO, Antonio; HOUGHTON, Raymond; VEDVICK, Thomas and TWARDZIK, Daniel R.
Address for Service: DAVIES COLLISON CAVE, Patent Attorneys, Level 3, 303 Coronation Drive, Milton, Queensland, 4064, Australia
Invention Title: "Compounds and methods for immunotherapy and diagnosis of tuberculosis"
Details of Parent Application No: 71586/96

The following statement is a full description of this invention, including the best method of performing it known to me/us:

Description

COMPOUNDS AND METHODS FOR IMMUNOTHERAPY AND DIAGNOSIS OF TUBERCULOSIS

Technical Field

The present invention relates generally to detecting, treating and preventing Mycobacterium tuberculosis infection. The invention is more particularly related to polypeptides comprising a Mycobacterium tuberculosis antigen, or a portion or other variant thereof, and the use of such polypeptides for diagnosing and vaccinating against Mycobacterium tuberculosis infection.
Background of the Invention

Tuberculosis is a chronic, infectious disease that is generally caused by infection with Mycobacterium tuberculosis. It is a major disease in developing countries, as well as an increasing problem in developed areas of the world, with about 8 million new cases and 3 million deaths each year. Although the infection may be asymptomatic for a considerable period of time, the disease is most commonly manifested as an acute inflammation of the lungs, resulting in fever and a nonproductive cough. If left untreated, serious complications and death typically result.

Although tuberculosis can generally be controlled using extended antibiotic therapy, such treatment is not sufficient to prevent the spread of the disease.
Infected individuals may be asymptomatic, but contagious, for some time. In addition, although compliance with the treatment regimen is critical, patient behavior is difficult to monitor. Some patients do not complete the course of treatment, which can lead to ineffective treatment and the development of drug resistance.
Inhibiting the spread of tuberculosis requires effective vaccination and accurate, early diagnosis of the disease. Currently, vaccination with live bacteria is the most efficient method for inducing protective immunity. The most common Mycobacterium employed for this purpose is Bacillus Calmette-Guerin (BCG), an avirulent strain of Mycobacterium bovis. However, the safety and efficacy of BCG is a source of controversy and some countries, such as the United States, do not vaccinate the general public. Diagnosis is commonly achieved using a skin test, which involves intradermal exposure to tuberculin PPD (protein-purified derivative). Antigen-specific T cell responses result in measurable induration at the injection site by 48-72 hours after injection, which indicates exposure to Mycobacterial antigens. Sensitivity and specificity have, however, been a problem with this test, and individuals vaccinated with BCG cannot be distinguished from infected individuals.
While macrophages have been shown to act as the principal effectors of M. tuberculosis immunity, T cells are the predominant inducers of such immunity. The essential role of T cells in protection against M. tuberculosis infection is illustrated by the frequent occurrence of M. tuberculosis in AIDS patients, due to the depletion of CD4 T cells associated with human immunodeficiency virus (HIV) infection.
Mycobacterium-reactive CD4 T cells have been shown to be potent producers of gamma-interferon (IFN-γ), which, in turn, has been shown to trigger the antimycobacterial effects of macrophages in mice. While the role of IFN-γ in humans is less clear, studies have shown that 1,25-dihydroxy-vitamin D3, either alone or in combination with IFN-γ or tumor necrosis factor-alpha, activates human macrophages to inhibit M. tuberculosis infection. Furthermore, it is known that IFN-γ stimulates human macrophages to make 1,25-dihydroxy-vitamin D3. Similarly, IL-12 has been shown to play a role in stimulating resistance to M. tuberculosis infection. For a review of the immunology of M. tuberculosis infection see Chan and Kaufmann in Tuberculosis: Pathogenesis, Protection and Control, Bloom (ed.), ASM Press, Washington, DC, 1994.
Accordingly, there is a need in the art for improved vaccines and methods for preventing, treating and detecting tuberculosis. The present invention fulfills these needs and further provides other related advantages.
Summary of the Invention

Briefly stated, this invention provides compounds and methods for preventing and diagnosing tuberculosis. In one aspect, polypeptides are provided comprising an immunogenic portion of a soluble M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative substitutions and/or modifications. In one embodiment of this aspect, the soluble antigen has one of the following N-terminal sequences:

Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 120)
Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro; (SEQ ID No. 121)
Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 122)
Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro; (SEQ ID No. 123)
Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val; (SEQ ID No. 124)
Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID No. 125)
Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-Ser; (SEQ ID No. 126)
Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly; (SEQ ID No. 127)
Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn; (SEQ ID No. 128)
Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser; (SEQ ID No. 134)
Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp; (SEQ ID No. 135) or
Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly; (SEQ ID No. 136)

wherein Xaa may be any amino acid.
In a related aspect, polypeptides are provided comprising an immunogenic portion of an M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative substitutions and/or modifications, the antigen having one of the following N-terminal sequences:

Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Ala-Gly-Ile-Val-Pro-Gly-L Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137) or
Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 129)

wherein Xaa may be any amino acid.
In another embodiment, the antigen comprises an amino acid sequence encoded by a DNA sequence selected from the group consisting of the sequences recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101, the complements of said sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID Nos.: 1, 2, 4-10, 13-25, 52, 99 and 101 or a complement thereof under moderately stringent conditions.
In a related aspect, the polypeptides comprise an immunogenic portion of a M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative substitutions and/or modifications, wherein the antigen comprises an amino acid sequence encoded by a DNA sequence selected from the group consisting of the sequences recited in SEQ ID Nos.: 26-51, the complements of said sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID Nos.: 26-51 or a complement thereof under moderately stringent conditions.
In related aspects, DNA sequences encoding the above polypeptides, expression vectors comprising these DNA sequences and host cells transformed or transfected with such expression vectors are also provided.
In another aspect, the present invention provides fusion proteins comprising a first and a second inventive polypeptide or, alternatively, an inventive polypeptide and a known M. tuberculosis antigen.
Within other aspects, the present invention provides pharmaceutical compositions that comprise one or more of the above polypeptides, or a DNA molecule encoding such polypeptides, and a physiologically acceptable carrier. The invention also provides vaccines comprising one or more of the polypeptides as described above and a non-specific immune response enhancer, together with vaccines comprising one or more DNA sequences encoding such polypeptides and a non-specific immune response enhancer.
In yet another aspect, methods are provided for inducing protective immunity in a patient, comprising administering to a patient an effective amount of one or more of the above polypeptides.
In further aspects of this invention, methods and diagnostic kits are provided for detecting tuberculosis in a patient. The methods comprise contacting dermal cells of a patient with one or more of the above polypeptides and detecting an immune response on the patient's skin. The diagnostic kits comprise one or more of the above polypeptides in combination with an apparatus sufficient to contact the polypeptide with the dermal cells of a patient.
These and other aspects of the present invention will become apparent upon reference to the following detailed description and attached drawings. All references disclosed herein are hereby incorporated by reference in their entirety as if each was incorporated individually.
Brief Description of the Drawings and Sequence Identifiers

Figure 1A and B illustrate the stimulation of proliferation and interferon-γ production in T cells derived from a first and a second M. tuberculosis-immune donor, respectively, by the 14 Kd, 20 Kd and 26 Kd antigens described in Example 1.
Figure 2 illustrates the stimulation of proliferation and interferon-γ production in T cells derived from an M. tuberculosis-immune individual by the two representative polypeptides TbRa3 and TbRa9.
SEQ. ID NO. 1 is the DNA sequence of TbRa1.
SEQ. ID NO. 2 is the DNA sequence of TbRa10.
SEQ. ID NO. 3 is the DNA sequence of TbRa11.
SEQ. ID NO. 4 is the DNA sequence of TbRa12.
SEQ. ID NO. 5 is the DNA sequence of TbRa13.
SEQ. ID NO. 6 is the DNA sequence of TbRa16.
SEQ. ID NO. 7 is the DNA sequence of TbRa17.
SEQ. ID NO. 8 is the DNA sequence of TbRa18.
SEQ. ID NO. 9 is the DNA sequence of TbRa19.
SEQ. ID NO. 10 is the DNA sequence of TbRa24.
SEQ. ID NO. 11 is the DNA sequence of TbRa26.
SEQ. ID NO. 12 is the DNA sequence of TbRa28.
SEQ. ID NO. 13 is the DNA sequence of TbRa29.
SEQ. ID NO. 14 is the DNA sequence of TbRa2A.
SEQ. ID NO. 15 is the DNA sequence of TbRa3.
SEQ. ID NO. 16 is the DNA sequence of TbRa32.
SEQ. ID NO. 17 is the DNA sequence of TbRa35.
SEQ. ID NO. 18 is the DNA sequence of TbRa36.
SEQ. ID NO. 19 is the DNA sequence of TbRa4.
SEQ. ID NO. 20 is the DNA sequence of TbRa9.
SEQ. ID NO. 21 is the DNA sequence of TbRaB.
SEQ. ID NO. 22 is the DNA sequence of TbRaC.
SEQ. ID NO. 23 is the DNA sequence of TbRaD.
SEQ. ID NO. 24 is the DNA sequence of YYWCPG.
SEQ. ID NO. 25 is the DNA sequence of AAMK.
SEQ. ID NO. 26 is the DNA sequence of TbL-23.
SEQ. ID NO. 27 is the DNA sequence of TbL-24.
SEQ. ID NO. 28 is the DNA sequence of
SEQ. ID NO. 29 is the DNA sequence of TbL-28.
SEQ. ID NO. 30 is the DNA sequence of TbL-29.
SEQ. ID NO. 31 is the DNA sequence of
SEQ. ID NO. 32 is the DNA sequence of TbH-8.
SEQ. ID NO. 33 is the DNA sequence of TbH-9.
SEQ. ID NO. 34 is the DNA sequence of TbM-1.
SEQ. ID NO. 35 is the DNA sequence of TbM-3.
SEQ. ID NO. 36 is the DNA sequence of TbM-6.
SEQ. ID NO. 37 is the DNA sequence of TbM-7.
SEQ. ID NO. 38 is the DNA sequence of TbM-9.
SEQ. ID NO. 39 is the DNA sequence of TbM-12.
SEQ. ID NO. 40 is the DNA sequence of TbM-13.
SEQ. ID NO. 41 is the DNA sequence of TbM-14.
SEQ. ID NO. 42 is the DNA sequence of
SEQ. ID NO. 43 is the DNA sequence of TbH-4.
SEQ. ID NO. 44 is the DNA sequence of TbH4-FD.
SEQ. ID NO. 45 is the DNA sequence of TbH-12.
SEQ. ID NO. 46 is the DNA sequence of Tb38-1.
SEQ. ID NO. 47 is the DNA sequence of Tb38-4.
SEQ. ID NO. 48 is the DNA sequence of TbL-17.
SEQ. ID NO. 49 is the DNA sequence of
SEQ. ID NO. 50 is the DNA sequence of TbL-21.
SEQ. ID NO. 51 is the DNA sequence of TbH-16.
SEQ. ID NO. 52 is the DNA sequence of DPEP.
SEQ. ID NO. 53 is the deduced amino acid sequence of DPEP.
SEQ. ID NO. 54 is the protein sequence of DPV N-terminal Antigen.
SEQ. ID NO. 55 is the protein sequence of AVGS N-terminal Antigen.
SEQ. ID NO. 56 is the protein sequence of AAMK N-terminal Antigen.
SEQ. ID NO. 57 is the protein sequence of YYWC N-terminal Antigen.
SEQ. ID NO. 58 is the protein sequence of DIGS N-terminal Antigen.
SEQ. ID NO. 59 is the protein sequence of AEES N-terminal Antigen.
SEQ. ID NO. 60 is the protein sequence of DPEP N-terminal Antigen.
SEQ. ID NO. 61 is the protein sequence of APKT N-terminal Antigen.
SEQ. ID NO. 62 is the protein sequence of DPAS N-terminal Antigen.
SEQ. ID NO. 63 is the deduced amino acid sequence of TbRa1.
SEQ. ID NO. 64 is the deduced amino acid sequence of TbRa10.
SEQ. ID NO. 65 is the deduced amino acid sequence of TbRa11.
SEQ. ID NO. 66 is the deduced amino acid sequence of TbRa12.
SEQ. ID NO. 67 is the deduced amino acid sequence of TbRa13.
SEQ. ID NO. 68 is the deduced amino acid sequence of TbRa16.
SEQ. ID NO. 69 is the deduced amino acid sequence of TbRa17.
SEQ. ID NO. 70 is the deduced amino acid sequence of TbRa18.
SEQ. ID NO. 71 is the deduced amino acid sequence of TbRa19.
SEQ. ID NO. 72 is the deduced amino acid sequence of TbRa24.
SEQ. ID NO. 73 is the deduced amino acid sequence of TbRa26.
SEQ. ID NO. 74 is the deduced amino acid sequence of TbRa28.
SEQ. ID NO. 75 is the deduced amino acid sequence of TbRa29.
SEQ. ID NO. 76 is the deduced amino acid sequence of TbRa2A.
SEQ. ID NO. 77 is the deduced amino acid sequence of TbRa3.
SEQ. ID NO. 78 is the deduced amino acid sequence of TbRa32.
SEQ. ID NO. 79 is the deduced amino acid sequence of
SEQ. ID NO. 80 is the deduced amino acid sequence of TbRa36.
SEQ. ID NO. 81 is the deduced amino acid sequence of TbRa4.
SEQ. ID NO. 82 is the deduced amino acid sequence of TbRa9.
SEQ. ID NO. 83 is the deduced amino acid sequence of TbRaB.
SEQ. ID NO. 84 is the deduced amino acid sequence of TbRaC.
SEQ. ID NO. 85 is the deduced amino acid sequence of TbRaD.
SEQ. ID NO. 86 is the deduced amino acid sequence of YYWCPG.
SEQ. ID NO. 87 is the deduced amino acid sequence of TbAAMK.
SEQ. ID NO. 88 is the deduced amino acid sequence of Tb38-1.
SEQ. ID NO. 89 is the deduced amino acid sequence of TbH-4.
SEQ. ID NO. 90 is the deduced amino acid sequence of TbH-8.
SEQ. ID NO. 91 is the deduced amino acid sequence of TbH-9.
SEQ. ID NO. 92 is the deduced amino acid sequence of TbH-12.
SEQ. ID NO. 93 is the amino acid sequence of Tb38-1 Peptide 1.
SEQ. ID NO. 94 is the amino acid sequence of Tb38-1 Peptide 2.
SEQ. ID NO. 95 is the amino acid sequence of Tb38-1 Peptide 3.
SEQ. ID NO. 96 is the amino acid sequence of Tb38-1 Peptide 4.
SEQ. ID NO. 97 is the amino acid sequence of Tb38-1 Peptide 5.
SEQ. ID NO. 98 is the amino acid sequence of Tb38-1 Peptide 6.
SEQ. ID NO. 99 is the DNA sequence of DPAS.
SEQ. ID NO. 100 is the deduced amino acid sequence of DPAS.
SEQ. ID NO. 101 is the DNA sequence of DPV.
SEQ. ID NO. 102 is the deduced amino acid sequence of DPV.
SEQ. ID NO. 103 is the DNA sequence of ESAT-6.
SEQ. ID NO. 104 is the deduced amino acid sequence of ESAT-6.
SEQ. ID NO. 105 is the DNA sequence of TbH-8-2.
SEQ. ID NO. 106 is the DNA sequence of TbH-9FL.
SEQ. ID NO. 107 is the deduced amino acid sequence of TbH-9FL.
SEQ. ID NO. 108 is the DNA sequence of TbH-9-1.
SEQ. ID NO. 109 is the deduced amino acid sequence of TbH-9-1.
SEQ. ID NO. 110 is the DNA sequence of TbH-9-4.
SEQ. ID NO. 111 is the deduced amino acid sequence of TbH-9-4.
SEQ. ID NO. 112 is the DNA sequence of Tb38-1F2 IN.
SEQ. ID NO. 113 is the DNA sequence of Tb38-2F2 RP.
SEQ. ID NO. 114 is the deduced amino acid sequence of Tb37-FL.
SEQ. ID NO. 115 is the deduced amino acid sequence of Tb38-IN.
SEQ. ID NO. 116 is the DNA sequence of Tb38-1F3.
SEQ. ID NO. 117 is the deduced amino acid sequence of Tb38-1F3.
SEQ. ID NO. 118 is the DNA sequence of Tb38-1F5.
SEQ. ID NO. 119 is the DNA sequence of Tb38-1F6.
SEQ. ID NO. 120 is the deduced N-terminal amino acid sequence of DPV.
SEQ. ID NO. 121 is the deduced N-terminal amino acid sequence of AVGS.
SEQ. ID NO. 122 is the deduced N-terminal amino acid sequence of AAMK.
SEQ. ID NO. 123 is the deduced N-terminal amino acid sequence of YYWC.
SEQ. ID NO. 124 is the deduced N-terminal amino acid sequence of DIGS.
SEQ. ID NO. 125 is the deduced N-terminal amino acid sequence of AEES.
SEQ. ID NO. 126 is the deduced N-terminal amino acid sequence of DPEP.
SEQ. ID NO. 127 is the deduced N-terminal amino acid sequence of APKT.
SEQ. ID NO. 128 is the deduced amino acid sequence of DPAS.
SEQ. ID NO. 129 is the protein sequence of DPPD N-terminal Antigen.
SEQ ID NO. 130-133 are the protein sequences of four DPPD cyanogen bromide fragments.
SEQ ID NO. 134 is the N-terminal protein sequence of XDS antigen.
SEQ ID NO. 135 is the N-terminal protein sequence of AGD antigen.
SEQ ID NO. 136 is the N-terminal protein sequence of APE antigen.
SEQ ID NO. 137 is the N-terminal protein sequence of XYI antigen.
Detailed Description of the Invention

As noted above, the present invention is generally directed to compositions and methods for preventing, treating and diagnosing tuberculosis. The compositions of the subject invention include polypeptides that comprise at least one immunogenic portion of an M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative substitutions and/or modifications. Polypeptides within the scope of the present invention include, but are not limited to, immunogenic soluble M. tuberculosis antigens. A "soluble M. tuberculosis antigen" is a protein of M. tuberculosis origin that is present in M. tuberculosis culture filtrate. As used herein, the term "polypeptide" encompasses amino acid chains of any length, including full length proteins (i.e., antigens), wherein the amino acid residues are linked by covalent peptide bonds. Thus, a polypeptide comprising an immunogenic portion of one of the above antigens may consist entirely of the immunogenic portion, or may contain additional sequences. The additional sequences may be derived from the native M. tuberculosis antigen or may be heterologous, and such sequences may (but need not) be immunogenic.
"Immunogenic," as used herein, refers to the ability to elicit an immune response (e.g., cellular) in a patient, such as a human, and/or in a biological sample. In particular, antigens that are immunogenic (and immunogenic portions or other variants of such antigens) are capable of stimulating cell proliferation, interleukin-12 production and/or interferon-γ production in biological samples comprising one or more cells selected from the group of T cells, NK cells, B cells and macrophages, where the cells are derived from an M. tuberculosis-immune individual. Polypeptides comprising at least an immunogenic portion of one or more M. tuberculosis antigens may generally be used to detect tuberculosis or to induce protective immunity against tuberculosis in a patient.
The compositions and methods of this invention also encompass variants of the above polypeptides. A "variant," as used herein, is a polypeptide that differs from the native antigen only in conservative substitutions and/or modifications, such that the ability of the polypeptide to induce an immune response is retained. Such variants may generally be identified by modifying one of the above polypeptide sequences, and evaluating the immunogenic properties of the modified polypeptide using, for example, the representative procedures described herein.
A "conservative substitution" is one in which an amino acid is substituted for another amino acid that has similar properties, such that one skilled in the art of peptide chemistry would expect the secondary structure and hydropathic nature of the polypeptide to be substantially unchanged. In general, the following groups of amino acids represent conservative changes: ala, pro, gly, glu, asp, gln, asn, ser, thr; cys, ser, tyr, thr; val, ile, leu, met, ala, phe; lys, arg, his; and phe, tyr, trp, his.
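The grouping above can be illustrated with a short sketch. It is not part of the specification: the function names and the hyphen-separated three-letter input format are assumptions made for illustration, and the sketch covers substitutions only, not the deletions, additions or other modifications that a variant may also carry. Residues such as Xaa that appear in no group are treated as matching only themselves.

```python
# Conservative-substitution groups as listed in the text (three-letter codes,
# lower case). A residue may belong to more than one group.
GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative(a: str, b: str) -> bool:
    """True if a and b are identical or share at least one group."""
    a, b = a.lower(), b.lower()
    return a == b or any(a in group and b in group for group in GROUPS)


def differs_only_conservatively(native: str, variant: str) -> bool:
    """Compare two hyphen-separated sequences of equal length.

    Returns True only if every position is identical or a conservative
    substitution under the groups above (hypothetical helper).
    """
    n, v = native.split("-"), variant.split("-")
    if len(n) != len(v):
        return False
    return all(is_conservative(x, y) for x, y in zip(n, v))
```

For instance, replacing Asp with Glu and Val with Leu in the fragment Asp-Pro-Val-Asp stays within the listed groups, whereas replacing Asp with Lys does not.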
Variants may also (or alternatively) be modified by, for example, the deletion or addition of amino acids that have minimal influence on the immunogenic properties, secondary structure and hydropathic nature of the polypeptide. For example, a polypeptide may be conjugated to a signal (or leader) sequence at the N-terminal end of the protein which co-translationally or post-translationally directs transfer of the protein. The polypeptide may also be conjugated to a linker or other sequence for ease of synthesis, purification or identification of the polypeptide (e.g., poly-His), or to enhance binding of the polypeptide to a solid support. For example, a polypeptide may be conjugated to an immunoglobulin Fc region.
In a related aspect, combination polypeptides are disclosed. A "combination polypeptide" is a polypeptide comprising at least one of the above immunogenic portions and one or more additional immunogenic M. tuberculosis sequences, which are joined via a peptide linkage into a single amino acid chain. The sequences may be joined directly (i.e., with no intervening amino acids) or may be joined by way of a linker sequence (e.g., Gly-Cys-Gly) that does not significantly diminish the immunogenic properties of the component polypeptides.
In general, M. tuberculosis antigens, and DNA sequences encoding such antigens, may be prepared using any of a variety of procedures. For example, soluble antigens may be isolated from M. tuberculosis culture filtrate by procedures known to those of ordinary skill in the art, including anion-exchange and reverse phase chromatography. Purified antigens are then evaluated for their ability to elicit an appropriate immune response (e.g., cellular) using, for example, the representative methods described herein. Immunogenic antigens may then be partially sequenced using techniques such as traditional Edman chemistry. See Edman and Berg, Eur. J. Biochem. 80:116-132, 1967.
Immunogenic antigens may also be produced recombinantly using a DNA sequence that encodes the antigen, which has been inserted into an expression vector and expressed in an appropriate host. DNA molecules encoding soluble antigens may be isolated by screening an appropriate M. tuberculosis expression library with anti-sera (e.g., rabbit) raised specifically against soluble M. tuberculosis antigens. DNA sequences encoding antigens that may or may not be soluble may be identified by screening an appropriate M. tuberculosis genomic or cDNA expression library with sera obtained from patients infected with M. tuberculosis. Such screens may generally be performed using techniques well known to those of ordinary skill in the art, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989.
DNA sequences encoding soluble antigens may also be obtained by screening an appropriate M. tuberculosis cDNA or genomic DNA library for DNA sequences that hybridize to degenerate oligonucleotides derived from partial amino acid sequences of isolated soluble antigens. Degenerate oligonucleotide sequences for use in such a screen may be designed and synthesized, and the screen may be performed, as described (for example) in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989 (and references cited therein). Polymerase chain reaction (PCR) may also be employed, using the above oligonucleotides in methods well known in the art, to isolate a nucleic acid probe from a cDNA or genomic library. The library screen may then be performed using the isolated probe.
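As an illustration of how a degenerate oligonucleotide might be chosen from a partial amino acid sequence, the following sketch counts how many distinct DNA sequences could encode a peptide under the standard genetic code and picks the least degenerate window. It is an assumption-laden example, not the procedure of Sambrook et al.: the function names are hypothetical, the input is hyphen-separated three-letter codes, and undetermined residues (Xaa) are not handled.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    total = 1
    for residue in peptide.split("-"):
        total *= len(CODONS[THREE_TO_ONE[residue]])
    return total


def best_window(peptide: str, size: int) -> str:
    """Return the window of `size` residues with the lowest degeneracy."""
    residues = peptide.split("-")
    windows = [
        "-".join(residues[i:i + size])
        for i in range(len(residues) - size + 1)
    ]
    return min(windows, key=degeneracy)
```

Regions rich in Met and Trp (one codon each) give the smallest oligonucleotide pools, which is why a window such as Tyr-Tyr-Trp-Cys scores better here than one containing Pro, Gly or Leu.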
Alternatively, genomic or cDNA libraries derived from M. tuberculosis may be screened directly using peripheral blood mononuclear cells (PBMCs) or T cell lines or clones derived from one or more M. tuberculosis-immune individuals. In general, PBMCs and/or T cells for use in such screens may be prepared as described below. Direct library screens may generally be performed by assaying pools of expressed recombinant proteins for the ability to induce proliferation and/or interferon-γ production in T cells derived from an M. tuberculosis-immune individual.
Alternatively, potential T cell antigens may be first selected based on antibody reactivity, as described above.
Regardless of the method of preparation, the antigens (and immunogenic portions thereof) described herein (which may or may not be soluble) have the ability to induce an immunogenic response. More specifically, the antigens have the ability to induce proliferation and/or cytokine production (i.e., interferon-γ and/or interleukin-12 production) in T cells, NK cells, B cells and/or macrophages derived from an M. tuberculosis-immune individual. The selection of cell type for use in evaluating an immunogenic response to an antigen will, of course, depend on the desired response. For example, interleukin-12 production is most readily evaluated using preparations containing B cells and/or macrophages. An M. tuberculosis-immune individual is one who is considered to be resistant to the development of tuberculosis by virtue of having mounted an effective T cell response to M. tuberculosis (i.e., substantially free of disease symptoms). Such individuals may be identified based on a strongly positive (i.e., greater than about 10 mm diameter induration) intradermal skin test response to tuberculosis proteins (PPD) and an absence of any signs or symptoms of tuberculosis disease. T cells, NK cells, B cells and macrophages derived from M. tuberculosis-immune individuals may be prepared using methods known to those of ordinary skill in the art. For example, a preparation of PBMCs (i.e., peripheral blood mononuclear cells) may be employed without further separation of component cells. PBMCs may generally be prepared, for example, using density centrifugation through Ficoll™ (Winthrop Laboratories, NY). T cells for use in the assays described herein may also be purified directly from PBMCs. Alternatively, an enriched T cell line reactive against mycobacterial proteins, or T cell clones reactive to individual mycobacterial proteins, may be employed. Such T cell clones may be generated by, for example, culturing PBMCs from M. tuberculosis-immune individuals with mycobacterial proteins for a period of 2-4 weeks. This allows expansion of only the mycobacterial protein-specific T cells, resulting in a line composed solely of such cells. These cells may then be cloned and tested with individual proteins, using methods known to those of ordinary skill in the art, to more accurately define individual T cell specificity. In general, antigens that test positive in assays for proliferation and/or cytokine production (i.e., interferon-γ and/or interleukin-12 production) performed using T cells, NK cells, B cells and/or macrophages derived from an M. tuberculosis-immune individual are considered immunogenic. Such assays may be performed, for example, using the representative procedures described below. Immunogenic portions of such antigens may be identified using similar assays, and may be present within the polypeptides described herein.
The ability of a polypeptide (e.g., an immunogenic antigen, or a portion or other variant thereof) to induce cell proliferation is evaluated by contacting the cells (e.g., T cells and/or NK cells) with the polypeptide and measuring the proliferation of the cells. In general, the amount of polypeptide that is sufficient for evaluation of about 10⁵ cells ranges from about 10 ng/mL to about 100 µg/mL and preferably is about 10 µg/mL. The incubation of polypeptide with cells is typically performed at 37°C for about six days. Following incubation with polypeptide, the cells are assayed for a proliferative response, which may be evaluated by methods known to those of ordinary skill in the art, such as exposing cells to a pulse of radiolabeled thymidine and measuring the incorporation of label into cellular DNA. In general, a polypeptide that results in at least a three fold increase in proliferation above background (i.e., the proliferation observed for cells cultured without polypeptide) is considered to be able to induce proliferation.
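The three fold criterion can be expressed as a simple ratio of incorporated label with and without polypeptide. The sketch below is illustrative only: the function names and the use of counts per minute (cpm) as the unit of thymidine incorporation are assumptions, not terms from the specification.

```python
def stimulation_index(cpm_with_polypeptide: float, cpm_background: float) -> float:
    """Fold increase in proliferation over cells cultured without polypeptide."""
    if cpm_background <= 0:
        raise ValueError("background incorporation must be positive")
    return cpm_with_polypeptide / cpm_background


def induces_proliferation(
    cpm_with_polypeptide: float,
    cpm_background: float,
    fold_threshold: float = 3.0,  # "at least a three fold increase"
) -> bool:
    """Apply the fold-increase criterion described in the text."""
    return stimulation_index(cpm_with_polypeptide, cpm_background) >= fold_threshold
```

With a background of 3000 cpm, a reading of 9000 cpm would meet the criterion and a reading of 5000 cpm would not.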
The ability of a polypeptide to stimulate the production of interferon-γ and/or interleukin-12 in cells may be evaluated by contacting the cells with the polypeptide and measuring the level of interferon-γ or interleukin-12 produced by the cells. In general, the amount of polypeptide that is sufficient for the evaluation of about 10⁵ cells ranges from about 10 ng/mL to about 100 µg/mL and preferably is about 10 µg/mL. The polypeptide may, but need not, be immobilized on a solid support, such as a bead or a biodegradable microsphere, such as those described in U.S. Patent Nos. 4,897,268 and 5,075,109. The incubation of polypeptide with the cells is typically performed at 37°C for about six days. Following incubation with polypeptide, the cells are assayed for interferon-γ and/or interleukin-12 (or one or more subunits thereof), which may be evaluated by methods known to those of ordinary skill in the art, such as an enzyme-linked immunosorbent assay (ELISA) or, in the case of IL-12 P70 subunit, a bioassay such as an assay measuring proliferation of T cells. In general, a polypeptide that results in the production of at least 50 pg of interferon-γ per mL of cultured supernatant (containing 10⁴-10⁵ T cells per mL) is considered able to stimulate the production of interferon-γ. A polypeptide that stimulates the production of at least 10 pg/mL of IL-12 P70 subunit, and/or at least 100 pg/mL of IL-12 P40 subunit, per 10⁵ macrophages or B cells (or per 3 x 10⁵ PBMC) is considered able to stimulate the production of IL-12.
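The cytokine cut-offs can be sketched as threshold checks on measured concentrations. This is a hypothetical illustration: the function and parameter names are invented, concentrations are assumed to be in pg/mL, and the IL-12 P70 cut-off is deliberately left as a caller-supplied argument rather than hard-coded. Normalisation to cell number is not modelled.

```python
def stimulates_ifn_gamma(ifn_gamma_pg_per_ml: float, threshold: float = 50.0) -> bool:
    """At least 50 pg of interferon-gamma per mL of culture supernatant."""
    return ifn_gamma_pg_per_ml >= threshold


def stimulates_il12(
    p70_pg_per_ml: float,
    p40_pg_per_ml: float,
    *,
    p70_threshold: float,          # supplied by the caller (assumption)
    p40_threshold: float = 100.0,  # "at least 100 pg/mL of IL-12 P40 subunit"
) -> bool:
    """Either subunit reaching its cut-off is sufficient (and/or)."""
    return p70_pg_per_ml >= p70_threshold or p40_pg_per_ml >= p40_threshold
```

Because the criterion is stated as and/or, a sample that meets only the P40 cut-off is still scored as stimulating IL-12 production.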
In general, immunogenic antigens are those antigens that stimulate proliferation and/or cytokine production (i.e., interferon-γ and/or interleukin-12 production) in T cells, NK cells, B cells and/or macrophages derived from at least about 25% of M. tuberculosis-immune individuals. Among these immunogenic antigens, polypeptides having superior therapeutic properties may be distinguished based on the magnitude of the responses in the above assays and based on the percentage of individuals for which a response is observed. In addition, antigens having superior therapeutic properties will not stimulate proliferation and/or cytokine production in vitro in cells derived from more than about 25% of individuals that are not M. tuberculosis-immune, thereby eliminating responses that are not specifically due to M. tuberculosis-responsive cells. Those antigens that induce a response in a high percentage of T cell, NK cell, B cell and/or macrophage preparations from M. tuberculosis-immune individuals (with a low incidence of responses in cell preparations from other individuals) have superior therapeutic properties.
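The two 25% criteria above amount to comparing responder rates in two donor groups. The following sketch assumes each donor has already been scored as a responder or non-responder in the assays; the function names and the dictionary keys are illustrative assumptions.

```python
def response_rate(responses):
    """Fraction of donors scored as responders (iterable of booleans)."""
    responses = list(responses)
    if not responses:
        raise ValueError("at least one donor is required")
    return sum(1 for r in responses if r) / len(responses)


def classify_antigen(immune_responses, non_immune_responses, cutoff=0.25):
    """Apply the ~25% criteria to per-donor responder calls.

    immunogenic: responses in at least `cutoff` of immune donors.
    specific:    immunogenic, and responses in no more than `cutoff`
                 of non-immune donors.
    """
    immunogenic = response_rate(immune_responses) >= cutoff
    low_background = response_rate(non_immune_responses) <= cutoff
    return {"immunogenic": immunogenic, "specific": immunogenic and low_background}
```

The magnitude of the response, which the text also uses to rank antigens, is not modeled here.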
Antigens with superior therapeutic properties may also be identified based on their ability to diminish the severity of M. tuberculosis infection in experimental animals, when administered as a vaccine. Suitable vaccine preparations for use on experimental animals are described in detail below. Efficacy may be determined based on the ability of the antigen to provide at least about a 50% reduction in bacterial numbers and/or at least about a 40% decrease in mortality following experimental infection. Suitable experimental animals include mice, guinea pigs and primates.
Antigens having superior diagnostic properties may generally be identified based on the ability to elicit a response in an intradermal skin test performed on an individual with active tuberculosis, but not in a test performed on an individual who is not infected with M. tuberculosis. Skin tests may generally be performed as described below, with a response of at least 5 mm induration considered positive.
Immunogenic portions of the antigens described herein may be prepared and identified using well known techniques, such as those summarized in Paul, Fundamental Immunology, 3d ed., Raven Press, 1993, pp. 243-247 and references cited therein. Such techniques include screening polypeptide portions of the native antigen for immunogenic properties. The representative proliferation and cytokine production assays described herein may generally be employed in these screens. An immunogenic portion of a polypeptide is a portion that, within such representative assays, generates an immune response (e.g., proliferation, interferon-γ production and/or interleukin-12 production) that is substantially similar to that generated by the full length antigen. In other words, an immunogenic portion of an antigen may generate at least about 20%, and preferably about 100%, of the proliferation induced by the full length antigen in the model proliferation assay described herein. An immunogenic portion may also, or alternatively, stimulate the production of at least about 20%, and preferably about 100%, of the interferon-γ and/or interleukin-12 induced by the full length antigen in the model assay described herein.
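The comparison of a portion against the full length antigen can be sketched as a ratio of assay readouts. The function name and arguments below are illustrative assumptions; both readouts are taken to come from the same assay and to be background-corrected.

```python
def is_immunogenic_portion(portion_response, full_length_response, min_fraction=0.20):
    """True if the portion yields at least `min_fraction` of the full length response.

    Both arguments are readouts from the same assay (e.g., pg/mL of cytokine);
    the default of 0.20 reflects the "at least about 20%" criterion.
    """
    if full_length_response <= 0:
        raise ValueError("full length response must be positive")
    return portion_response / full_length_response >= min_fraction
```

A portion giving 25 units where the full length antigen gives 100 units would qualify; one giving 10 units would not.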
Portions and other variants of M. tuberculosis antigens may be generated by synthetic or recombinant means. Synthetic polypeptides having fewer than about 100 amino acids, and generally fewer than about 50 amino acids, may be generated using techniques well known to those of ordinary skill in the art. For example, such polypeptides may be synthesized using any of the commercially available solid-phase techniques, such as the Merrifield solid-phase synthesis method, where amino acids are sequentially added to a growing amino acid chain. See Merrifield, J. Am. Chem. Soc. 85:2149-2154, 1963. Equipment for automated synthesis of polypeptides is commercially available from suppliers such as Applied BioSystems, Inc., Foster City, CA, and may be operated according to the manufacturer's instructions. Variants of a native antigen may generally be prepared using standard mutagenesis techniques, such as oligonucleotide-directed site-specific mutagenesis. Sections of the DNA sequence may also be removed using standard techniques to permit preparation of truncated polypeptides.
Recombinant polypeptides containing portions and/or variants of a native antigen may be readily prepared from a DNA sequence encoding the polypeptide using a variety of techniques well known to those of ordinary skill in the art. For example, supernatants from suitable host/vector systems which secrete recombinant protein into culture media may be first concentrated using a commercially available filter. Following concentration, the concentrate may be applied to a suitable purification matrix such as an affinity matrix or an ion exchange resin. Finally, one or more reverse phase HPLC steps can be employed to further purify a recombinant protein.
Any of a variety of expression vectors known to those of ordinary skill in the art may be employed to express recombinant polypeptides of this invention. Expression may be achieved in any appropriate host cell that has been transformed or transfected with an expression vector containing a DNA molecule that encodes a recombinant polypeptide. Suitable host cells include prokaryotes, yeast and higher eukaryotic cells. Preferably, the host cells employed are E. coli, yeast or a mammalian cell line such as COS or CHO. The DNA sequences expressed in this manner may encode naturally occurring antigens, portions of naturally occurring antigens, or other variants thereof.
In general, regardless of the method of preparation, the polypeptides disclosed herein are prepared in substantially pure form. Preferably, the polypeptides are at least about 80% pure, more preferably at least about 90% pure and most preferably at least about 99% pure. In certain preferred embodiments, described in detail below, the substantially pure polypeptides are incorporated into pharmaceutical compositions or vaccines for use in one or more of the methods disclosed herein.
In certain specific embodiments, the subject invention discloses polypeptides comprising at least an immunogenic portion of a soluble M. tuberculosis antigen having one of the following N-terminal sequences, or a variant thereof that differs only in conservative substitutions and/or modifications:

Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 120)
Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser; (SEQ ID No. 121)
Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 122)
Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro; (SEQ ID No. 123)
Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val; (SEQ ID No. 124)
Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID No. 125)
Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ser-Pro-Pro-Ser; (SEQ ID No. 126)
Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly; (SEQ ID No. 127)
Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn; (SEQ ID No. 128)
Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser; (SEQ ID No. 134)
Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp; (SEQ ID No. 135) or
Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly; (SEQ ID No. 136)

wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence encoding the antigen identified as above is provided in SEQ ID No. 52, and the polypeptide encoded by SEQ ID No. 52 is provided in SEQ ID No. 53. A DNA sequence encoding the antigen defined as above is provided in SEQ ID No. 101; its deduced amino acid sequence is provided in SEQ ID No. 102. A DNA sequence corresponding to antigen above is provided in SEQ ID No. 24, a DNA sequence corresponding to antigen is provided in SEQ ID No. 25 and a DNA sequence corresponding to antigen is provided in SEQ ID No. 99; its deduced amino acid sequence is provided in SEQ ID No. 100.
In a further specific embodiment, the subject invention discloses polypeptides comprising at least an immunogenic portion of an M. tuberculosis antigen having one of the following N-terminal sequences, or a variant thereof that differs only in conservative substitutions and/or modifications:

Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137) or
Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID No. 129)

wherein Xaa may be any amino acid, preferably a cysteine residue.
In other specific embodiments, the subject invention discloses polypeptides comprising at least an immunogenic portion of a soluble M. tuberculosis antigen (or a variant of such an antigen) that comprises one or more of the amino acid sequences encoded by the DNA sequences of SEQ ID Nos.: 1, 2, 4-10, 13-25 and 52; the complements of such DNA sequences; or DNA sequences substantially homologous to any of the foregoing sequences.
In further specific embodiments, the subject invention discloses polypeptides comprising at least an immunogenic portion of an M. tuberculosis antigen (or a variant of such an antigen), which may or may not be soluble, that comprises one or more of the amino acid sequences encoded by the DNA sequences of SEQ ID Nos.: 26-51; the complements of such DNA sequences; or DNA sequences substantially homologous to any of the foregoing sequences.
In the specific embodiments discussed above, the M. tuberculosis antigens include variants that are encoded by DNA sequences which are substantially homologous to one or more of the DNA sequences specifically recited herein. "Substantial homology," as used herein, refers to DNA sequences that are capable of hybridizing under moderately stringent conditions. Suitable moderately stringent conditions include prewashing in a solution of 5X SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing at 50°C-65°C, 5X SSC, overnight or, in the case of cross-species homology, at 45°C, 0.5X SSC; followed by washing twice at 65°C for 20 minutes with each of 2X, 0.5X and 0.2X SSC (containing 0.1% SDS). Such hybridizing DNA sequences are also within the scope of this invention, as are nucleotide sequences that, due to code degeneracy, encode an immunogenic polypeptide that is encoded by a hybridizing DNA sequence.
In a related aspect, the present invention provides fusion proteins comprising a first and a second inventive polypeptide or, alternatively, a polypeptide of the present invention and a known M. tuberculosis antigen, such as the 38 kD antigen described above or ESAT-6 (SEQ ID Nos. 103 and 104), together with variants of such fusion proteins. The fusion proteins of the present invention may also include a linker peptide between the first and second polypeptides.
A DNA sequence encoding a fusion protein of the present invention is constructed using known recombinant DNA techniques to assemble separate DNA sequences encoding the first and second polypeptides into an appropriate expression vector. The 3' end of a DNA sequence encoding the first polypeptide is ligated, with or without a peptide linker, to the 5' end of a DNA sequence encoding the second polypeptide so that the reading frames of the sequences are in phase to permit mRNA translation of the two DNA sequences into a single fusion protein that retains the biological activity of both the first and the second polypeptides.
A peptide linker sequence may be employed to separate the first and the second polypeptides by a distance sufficient to ensure that each polypeptide folds into its secondary and tertiary structures. Such a peptide linker sequence is incorporated into the fusion protein using standard techniques well known in the art. Suitable peptide linker sequences may be chosen based on the following factors: their ability to adopt a flexible extended conformation; their inability to adopt a secondary structure that could interact with functional epitopes on the first and second polypeptides; and the lack of hydrophobic or charged residues that might react with the polypeptide functional epitopes. Preferred peptide linker sequences contain Gly, Asn and Ser residues. Other near neutral amino acids, such as Thr and Ala, may also be used in the linker sequence. Amino acid sequences which may be usefully employed as linkers include those disclosed in Maratea et al., Gene 40:39-46, 1985; Murphy et al., Proc. Natl. Acad. Sci. USA 83:8258-8262, 1986; U.S. Patent No. 4,935,233 and U.S. Patent No. 4,751,180. The linker sequence may be from 1 to about 50 amino acids in length. Peptide sequences are not required when the first and second polypeptides have nonessential N-terminal amino acid regions that can be used to separate the functional domains and prevent steric interference.
The ligated DNA sequences are operably linked to suitable transcriptional or translational regulatory elements. The regulatory elements responsible for expression of DNA are located only 5' to the DNA sequence encoding the first polypeptide. Similarly, stop codons required to end translation and transcription termination signals are only present 3' to the DNA sequence encoding the second polypeptide.
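The in-frame ligation described above can be illustrated with a short sequence-level sketch. It is not a cloning protocol: the function names are hypothetical, the inputs are assumed to be coding sequences given 5' to 3' on the sense strand, and regulatory elements are not modeled. The sketch joins the first coding sequence, an optional linker and the second coding sequence, and checks that the only stop codon lies at the 3' end of the second sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA string into consecutive triplets."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def build_fusion_orf(first_cds, second_cds, linker=""):
    """Join two coding sequences (and an optional linker) in one reading frame."""
    first_cds, second_cds, linker = (s.upper() for s in (first_cds, second_cds, linker))
    for name, part in (("first", first_cds), ("second", second_cds), ("linker", linker)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} sequence length is not a multiple of 3")

    # Remove a terminal stop codon from the first sequence so translation continues.
    first_codons = codons(first_cds)
    if first_codons and first_codons[-1] in STOP_CODONS:
        first_codons = first_codons[:-1]

    upstream = first_codons + codons(linker)
    if any(c in STOP_CODONS for c in upstream):
        raise ValueError("stop codon found upstream of the second polypeptide")

    second_codons = codons(second_cds)
    if not second_codons or second_codons[-1] not in STOP_CODONS:
        raise ValueError("second sequence must end with a stop codon")
    if any(c in STOP_CODONS for c in second_codons[:-1]):
        raise ValueError("internal stop codon in the second sequence")

    return "".join(upstream + second_codons)
```

For instance, `build_fusion_orf("ATGGCCTAA", "ATGAAATGA", linker="GGTTCT")` drops the first stop codon and inserts a Gly-Ser linker (codons GGT and TCT) between the two coding regions.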
In another aspect, the present invention provides methods for using one or more of the above polypeptides or fusion proteins (or DNA molecules encoding such polypeptides) to induce protective immunity against tuberculosis in a patient. As used herein, a "patient" refers to any warm-blooded animal, preferably a human. A patient may be afflicted with a disease, or may be free of detectable disease and/or infection. In other words, protective immunity may be induced to prevent or treat tuberculosis.
In this aspect, the polypeptide, fusion protein or DNA molecule is generally present within a pharmaceutical composition and/or a vaccine. Pharmaceutical compositions may comprise one or more polypeptides, each of which may contain one or more of the above sequences (or variants thereof), and a physiologically acceptable carrier. Vaccines may comprise one or more of the above polypeptides and a non-specific immune response enhancer, such as an adjuvant or a liposome (into which the polypeptide is incorporated). Such pharmaceutical compositions and vaccines may also contain other M. tuberculosis antigens, either incorporated into a combination polypeptide or present within a separate polypeptide.
Alternatively, a vaccine may contain DNA encoding one or more polypeptides as described above, such that the polypeptide is generated in situ. In such vaccines, the DNA may be present within any of a variety of delivery systems known to those of ordinary skill in the art, including nucleic acid expression systems, bacterial and viral expression systems. Appropriate nucleic acid expression systems contain the necessary DNA sequences for expression in the patient (such as a suitable promoter and terminating signal). Bacterial delivery systems involve the administration of a bacterium (such as Bacillus Calmette-Guerin) that expresses an immunogenic portion of the polypeptide on its cell surface. In a preferred embodiment, the DNA may be introduced using a viral expression system (e.g., vaccinia or other pox virus, retrovirus, or adenovirus), which may involve the use of a non-pathogenic (defective), replication competent virus. Techniques for incorporating DNA into such expression systems are well known to those of ordinary skill in the art. The DNA may also be "naked," as described, for example, in Ulmer et al., Science 259:1745-1749, 1993 and reviewed by Cohen, Science 259:1691-1692, 1993. The uptake of naked DNA may be increased by coating the DNA onto biodegradable beads, which are efficiently transported into the cells.
In a related aspect, a DNA vaccine as described above may be administered simultaneously with or sequentially to either a polypeptide of the present invention or a known M. tuberculosis antigen, such as the 38 kD antigen described above. For example, administration of DNA encoding a polypeptide of the present invention, either "naked" or in a delivery system as described above, may be followed by administration of an antigen in order to enhance the protective immune effect of the vaccine.
Routes and frequency of administration, as well as dosage, will vary from individual to individual and may parallel those currently being used in immunization using BCG. In general, the pharmaceutical compositions and vaccines may be administered by injection (e.g., intracutaneous, intramuscular, intravenous or subcutaneous), intranasally (e.g., by aspiration) or orally. Between 1 and 3 doses may be administered for a 1-36 week period. Preferably, 3 doses are administered, at intervals of 3-4 months, and booster vaccinations may be given periodically thereafter.
Alternate protocols may be appropriate for individual patients. A suitable dose is an amount of polypeptide or DNA that, when administered as described above, is capable of raising an immune response in an immunized patient sufficient to protect the patient from M. tuberculosis infection for at least 1-2 years. In general, the amount of polypeptide present in a dose (or produced in situ by the DNA in a dose) ranges from about 1 pg to about 100 mg per kg of host, typically from about 10 pg to about 1 mg, and preferably from about 100 pg to about 1 µg. Suitable dose sizes will vary with the size of the patient, but will typically range from about 0.1 mL to about 5 mL.
While any suitable carrier known to those of ordinary skill in the art may be employed in the pharmaceutical compositions of this invention, the type of carrier will vary depending on the mode of administration. For parenteral administration, such as subcutaneous injection, the carrier preferably comprises water, saline, alcohol, a fat, a wax or a buffer. For oral administration, any of the above carriers or a solid carrier, such as mannitol, lactose, starch, magnesium stearate, sodium saccharine, talcum, cellulose, glucose, sucrose, and magnesium carbonate, may be employed. Biodegradable microspheres (e.g., polylactic galactide) may also be employed as carriers for the pharmaceutical compositions of this invention. Suitable biodegradable microspheres are disclosed, for example, in U.S. Patent Nos. 4,897,268 and 5,075,109.
Any of a variety of adjuvants may be employed in the vaccines of this invention to nonspecifically enhance the immune response. Most adjuvants contain a substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or mineral oil, and a nonspecific stimulator of immune responses, such as lipid A, Bordetella pertussis or Mycobacterium tuberculosis. Suitable adjuvants are commercially available as, for example, Freund's Incomplete Adjuvant and Freund's Complete Adjuvant (Difco Laboratories) and Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ). Other suitable adjuvants include alum, biodegradable microspheres, monophosphoryl lipid A and quil A.
In another aspect, this invention provides methods for using one or more of the polypeptides described above to diagnose tuberculosis using a skin test. As used herein, a "skin test" is any assay performed directly on a patient in which a delayed-type hypersensitivity (DTH) reaction (such as swelling, reddening or dermatitis) is measured following intradermal injection of one or more polypeptides as described above. Such injection may be achieved using any suitable device sufficient to contact the polypeptide or polypeptides with dermal cells of the patient, such as a tuberculin syringe or 1 mL syringe. Preferably, the reaction is measured at least 48 hours after injection, more preferably 48-72 hours.
The DTH reaction is a cell-mediated immune response, which is greater in patients that have been exposed previously to the test antigen (i.e., the immunogenic portion of the polypeptide employed, or a variant thereof). The response may be measured visually, using a ruler. In general, a response that is greater than about 0.5 cm in diameter, preferably greater than about 1.0 cm in diameter, is a positive response, indicative of tuberculosis infection, which may or may not be manifested as an active disease.
The polypeptides of this invention are preferably formulated, for use in a skin test, as pharmaceutical compositions containing a polypeptide and a physiologically acceptable carrier, as described above. Such compositions typically contain one or more of the above polypeptides in an amount ranging from about 1 µg to about 100 µg, preferably from about 10 µg to about 50 µg in a volume of 0.1 mL. Preferably, the carrier employed in such pharmaceutical compositions is a saline solution with appropriate preservatives, such as phenol and/or Tween 80.
In a preferred embodiment, a polypeptide employed in a skin test is of sufficient size such that it remains at the site of injection for the duration of the reaction period. In general, a polypeptide that is at least 9 amino acids in length is sufficient. The polypeptide is also preferably broken down by macrophages within hours of injection to allow presentation to T-cells. Such polypeptides may contain repeats of one or more of the above sequences and/or other immunogenic or nonimmunogenic sequences.
The following Examples are offered by way of illustration and not by way of limitation.
EXAMPLES
EXAMPLE 1
PURIFICATION AND CHARACTERIZATION OF POLYPEPTIDES FROM M. TUBERCULOSIS CULTURE FILTRATE
This example illustrates the preparation of M. tuberculosis soluble polypeptides from culture filtrate. Unless otherwise noted, all percentages in the following example are weight per volume.
M. tuberculosis (either H37Ra, ATCC No. 25177, or H37Rv, ATCC No. 25618) was cultured in sterile GAS media at 37°C for fourteen days. The media was then vacuum filtered (leaving the bulk of the cells) through a 0.45 µ filter into a sterile 2.5 L bottle. The media was next filtered through a 0.2 µ filter into a sterile 4 L bottle and NaN3 was added to the culture filtrate to a concentration of 0.04%. The bottles were then placed in a 4°C cold room.
The culture filtrate was concentrated by placing the filtrate in a 12 L reservoir that had been autoclaved and feeding the filtrate into a 400 ml Amicon stir cell which had been rinsed with ethanol and contained a 10,000 kDa MWCO membrane. The pressure was maintained at 60 psi using nitrogen gas. This procedure reduced the 12 L volume to approximately 50 ml.
The culture filtrate was dialyzed into 0.1% ammonium bicarbonate using an 8,000 kDa MWCO cellulose ester membrane, with two changes of ammonium bicarbonate solution. Protein concentration was then determined by a commercially available BCA assay (Pierce, Rockford, IL).
The dialyzed culture filtrate was then lyophilized, and the polypeptides resuspended in distilled water. The polypeptides were dialyzed against 0.01 mM 1,3 bis[tris(hydroxymethyl)-methylamino]propane, pH 7.5 (Bis-Tris propane buffer), the initial conditions for anion exchange chromatography. Fractionation was performed using gel profusion chromatography on a POROS 146 II Q/M anion exchange column, 4.6 mm x 100 mm (Perseptive BioSystems, Framingham, MA), equilibrated in 0.01 mM Bis-Tris propane buffer pH 7.5. Polypeptides were eluted with a linear 0-0.5 M NaCl gradient in the above buffer system. The column eluent was monitored at a wavelength of 220 nm.
The pools of polypeptides eluting from the ion exchange column were dialyzed against distilled water and lyophilized. The resulting material was dissolved in 0.1% trifluoroacetic acid (TFA) pH 1.9 in water, and the polypeptides were purified on a Delta-Pak C18 column (Waters, Milford, MA), 300 Angstrom pore size, 5 micron particle size (3.9 x 150 mm). The polypeptides were eluted from the column with a linear gradient from 0-60% dilution buffer (TFA in acetonitrile). The flow rate was 0.75 ml/minute and the HPLC eluent was monitored at 214 nm. Fractions containing the eluted polypeptides were collected to maximize the purity of the individual samples. Approximately 200 purified polypeptides were obtained.
The purified polypeptides were then screened for the ability to induce T-cell proliferation in PBMC preparations. The PBMCs from donors known to be PPD skin test positive and whose T-cells were shown to proliferate in response to PPD and crude soluble proteins from MTB were cultured in medium comprising RPMI 1640 supplemented with 10% pooled human serum and 50 µg/ml gentamicin. Purified polypeptides were added in duplicate at concentrations of 0.5 to 10 µg/mL. After six days of culture in 96-well round-bottom plates in a volume of 200 µl, 50 µl of medium was removed from each well for determination of IFN-γ levels, as described below. The plates were then pulsed with 1 µCi/well of tritiated thymidine for a further 18 hours, harvested and tritium uptake determined using a gas scintillation counter. Fractions that resulted in proliferation in both replicates three fold greater than the proliferation observed in cells cultured in medium alone were considered positive.
IFN-γ was measured using an enzyme-linked immunosorbent assay (ELISA). ELISA plates were coated with a mouse monoclonal antibody directed to human IFN-γ (PharMingen, San Diego, CA) in PBS for four hours at room temperature. Wells were then blocked with PBS containing 5% non-fat dried milk for 1 hour at room temperature. The plates were then washed six times in PBS/0.2% Tween-20, and samples diluted 1:2 in culture medium in the ELISA plates were incubated overnight at room temperature. The plates were again washed and a polyclonal rabbit anti-human IFN-γ serum diluted 1:3000 in PBS/10% normal goat serum was added to each well. The plates were then incubated for two hours at room temperature, washed and horseradish peroxidase-coupled anti-rabbit IgG (Sigma Chemical Co., St. Louis, MO) was added at a 1:2000 dilution in PBS/5% non-fat dried milk. After a further two hour incubation at room temperature, the plates were washed and TMB substrate added. The reaction was stopped after 20 min with 1 N sulfuric acid. Optical density was determined at 450 nm using 570 nm as a reference wavelength. Fractions that resulted in both replicates giving an OD two fold greater than the mean OD from cells cultured in medium alone, plus 3 standard deviations, were considered positive.
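The ELISA positivity rule above can be sketched as a cutoff computed from the medium-alone wells. The wording admits more than one reading; the sketch assumes the cutoff is twice the mean medium-alone OD plus three standard deviations of those wells, and uses the sample standard deviation. Function names and the example values are illustrative assumptions.

```python
from statistics import mean, stdev


def elisa_cutoff(medium_ods, fold=2.0, n_sd=3.0):
    """Cutoff OD = fold * mean(medium-alone ODs) + n_sd * sample SD of those ODs.

    This is one reading of the criterion (an assumption); at least two
    medium-alone wells are needed to compute a standard deviation.
    """
    return fold * mean(medium_ods) + n_sd * stdev(medium_ods)


def fraction_is_positive(replicate_ods, medium_ods):
    """A fraction is positive only if every replicate exceeds the cutoff."""
    cutoff = elisa_cutoff(medium_ods)
    return all(od > cutoff for od in replicate_ods)
```

Requiring every replicate to exceed the cutoff mirrors the "both replicates" condition in the text.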
For sequencing, the polypeptides were individually dried onto Biobrene™ (Perkin Elmer/Applied BioSystems Division, Foster City, CA) treated glass fiber filters. The filters with polypeptide were loaded onto a Perkin Elmer/Applied BioSystems Division Procise 492 protein sequencer. The polypeptides were sequenced from the amino terminus using traditional Edman chemistry. The amino acid sequence was determined for each polypeptide by comparing the retention time of the PTH amino acid derivative to the appropriate PTH derivative standards.
Using the procedure described above, antigens having the following N-terminal sequences were isolated:

Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Xaa-Asn-Tyr-Gly-Gln-Val-Val-Ala-Ala-Leu; (SEQ ID No. 54)
Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser; (SEQ ID No. 55)
Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-Lys-Glu-Gly-Arg; (SEQ ID No. 56)
Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro; (SEQ ID No. 57)
Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val; (SEQ ID No. 58)
Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro; (SEQ ID No. 59)
Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala-Pro-Pro-Ala; (SEQ ID No. 60) and
Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly; (SEQ ID No. 61)

wherein Xaa may be any amino acid.
An additional antigen was isolated employing a microbore HPLC purification step in addition to the procedure described above. Specifically, 20 µl of a fraction comprising a mixture of antigens from the chromatographic purification step previously described was purified on an Aquapore C18 column (Perkin Elmer/Applied Biosystems Division, Foster City, CA) with a 7 micron pore size, column size 1 mm x 100 mm, in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions were eluted from the column with a linear gradient of 1%/minute of acetonitrile (containing 0.05% TFA) in water (0.05% TFA) at a flow rate of 80 µl/minute. The eluent was monitored at 250 nm. The original fraction was separated into 4 major peaks plus other smaller components and a polypeptide was obtained which was shown to have a molecular weight of 12.054 Kd (by mass spectrometry) and the following N-terminal sequence:

Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-Thr-Ser-Leu-Leu-Asn-Asn-Leu-Ala-Asp-Pro-Asp-Val-Ser-Phe-Ala-Asp (SEQ ID No. 62).
This polypeptide was shown to induce proliferation and IFN-γ production in PBMC preparations using the assays described above.
Additional soluble antigens were isolated from M. tuberculosis culture filtrate as follows. M. tuberculosis culture filtrate was prepared as described above. Following dialysis against Bis-Tris propane buffer, at pH 5.5, fractionation was performed using anion exchange chromatography on a Poros QE column, 4.6 x 100 mm (Perseptive Biosystems), equilibrated in Bis-Tris propane buffer pH 5.5. Polypeptides were eluted with a linear 0-1.5 M NaCl gradient in the above buffer system at a flow rate of 10 ml/min. The column eluent was monitored at a wavelength of 214 nm.
The fractions eluting from the ion exchange column were pooled and subjected to reverse phase chromatography using a Poros R2 column, 4.6 x 100 mm (Perseptive Biosystems). Polypeptides were eluted from the column with a linear gradient from 0-100% acetonitrile (containing TFA) at a flow rate of 5 ml/min. The eluent was monitored at 214 nm.
Fractions containing the eluted polypeptides were lyophilized and resuspended in 80 µl of aqueous 0.1% TFA and further subjected to reverse phase chromatography on a Vydac C4 column, 4.6 x 150 mm (Western Analytical, Temecula, CA), with a linear gradient of 0-100% acetonitrile (containing TFA) at a flow rate of 2 ml/min. Eluent was monitored at 214 nm.
The fraction with biological activity was separated into one major peak plus other smaller components. Western blot of this peak onto PVDF membrane revealed three major bands of molecular weights 14 Kd, 20 Kd and 26 Kd. These polypeptides were determined to have the following N-terminal sequences, respectively:

Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser; (SEQ ID No. 134)
Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp; (SEQ ID No. 135) and
Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly; (SEQ ID No. 136)

wherein Xaa may be any amino acid.
Using the assays described above, these polypeptides were shown to induce proliferation and IFN-γ production in PBMC preparations. Figs. 1A and B show the results of such assays using PBMC preparations from a first and a second donor, respectively.
DNA sequences that encode the antigens designated as and above were obtained by screening a genomic M. tuberculosis library using 32P end labeled degenerate oligonucleotides corresponding to the N-terminal sequence and containing M. tuberculosis codon bias. The screen performed using a probe corresponding to antigen above identified a clone having the sequence provided in SEQ ID No. 101. The polypeptide encoded by SEQ ID No. 101 is provided in SEQ ID No. 102. The screen performed using a probe corresponding to antigen above identified a clone having the sequence provided in SEQ ID No. 52. The polypeptide encoded by SEQ ID No. 52 is provided in SEQ ID No. 53. The screen performed using a probe corresponding to antigen above identified a clone having the sequence provided in SEQ ID No. 24, and the screen performed with a probe corresponding to antigen identified a clone having the sequence provided in SEQ ID No. 25.
The above amino acid sequences were compared to known amino acid sequences in the gene bank using the DNA STAR system. The database searched contains some 173,000 proteins and is a combination of the Swiss, PIR databases along with translated protein sequences (Version 87). No significant homologies to the amino acid sequences for antigens and were detected.
The amino acid sequence for antigen was found to be homologous to a sequence from M. leprae. The full length M. leprae sequence was amplified from genomic DNA using the sequence obtained from GENBANK. This sequence was then used to screen the M. tuberculosis library described below in Example 2 and a full length copy of the M. tuberculosis homologue was obtained (SEQ ID No. 99).
The amino acid sequence for antigen was found to be homologous to a known M. tuberculosis protein translated from a DNA sequence. To the best of the inventors' knowledge, this protein has not been previously shown to possess T-cell stimulatory activity. The amino acid sequence for antigen was found to be related to a sequence from M. leprae.
In the proliferation and IFN-γ assays described above, using three PPD positive donors, the results for representative antigens provided above are presented in Table 1:

TABLE 1
RESULTS OF PBMC PROLIFERATION AND IFN-γ ASSAYS
Sequence    Proliferation    IFN-γ
I++
In Table 1, responses that gave a stimulation index (SI) of between 2 and 4 (compared to cells cultured in medium alone) were scored as +, an SI of 4-8 or 2-4 at a concentration of 1 µg or less was scored as ++, and an SI of greater than 8 was scored as +++. The antigen of sequence was found to have a high SI (+++) for one donor and lower SI (++ and +) for the two other donors in both proliferation and IFN-γ assays.
These results indicate that these antigens are capable of inducing proliferation and/or interferon-γ production.
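The scoring rule used for Table 1 can be written as a small function. The sketch below is illustrative only: the function name `score_response`, the tier labels "+", "++" and "+++", the "-" returned below the lowest tier, the concentration unit (µg) and the handling of values that fall exactly on a boundary are all assumptions, since the text gives only the SI ranges.

```python
def score_response(si: float, conc_ug: float) -> str:
    """Map a stimulation index (SI) to a Table 1 style score.

    si      -- response relative to cells cultured in medium alone
    conc_ug -- antigen concentration at which the SI was measured (assumed µg)
    """
    if si > 8:
        return "+++"
    # An SI of 4-8 scores "++"; so does an SI of 2-4 seen at 1 µg or less.
    if si >= 4 or (si >= 2 and conc_ug <= 1):
        return "++"
    if si >= 2:
        return "+"
    return "-"  # assumption: below the lowest scored tier


for si, conc in [(10.0, 5.0), (3.0, 0.5), (3.0, 5.0), (1.5, 1.0)]:
    print(si, conc, score_response(si, conc))
```

The concentration only matters in the 2-4 band, where a response obtained with 1 µg or less of antigen is promoted by one tier.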
EXAMPLE 2
USE OF PATIENT SERA TO ISOLATE M. TUBERCULOSIS ANTIGENS
This example illustrates the isolation of antigens from M. tuberculosis lysate by screening with serum from M. tuberculosis-infected individuals.
Desiccated M. tuberculosis H37Ra (Difco Laboratories) was added to a 2% NP40 solution, and alternately homogenized and sonicated three times. The resulting suspension was centrifuged at 13,000 rpm in microfuge tubes and the supernatant put through a 0.2 micron syringe filter. The filtrate was bound to Macro Prep DEAE beads (BioRad, Hercules, CA). The beads were extensively washed with mM Tris pH 7.5 and bound proteins eluted with 1M NaCl. The 1M NaCl eluate was dialyzed overnight against 10 mM Tris, pH 7.5. Dialyzed solution was treated with DNase and RNase at 0.05 mg/ml for 30 min. at room temperature and then with α-D-mannosidase, 0.5 U/mg at pH 4.5 for 3-4 hours at room temperature. After returning to pH 7.5, the material was fractionated via FPLC over a Bio Scale-Q-20 column (BioRad). Fractions were combined into nine pools, concentrated in a Centriprep (Amicon, Beverly, MA) and then screened by Western blot for serological activity using a serum pool from M. tuberculosis-infected patients which was not immunoreactive with other antigens of the present invention.
The most reactive fraction was run in SDS-PAGE and transferred to PVDF. A band at approximately 85 Kd was cut out yielding the sequence:
Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-Asn-Val-His-Leu-Val; (SEQ ID No. 137), wherein Xaa may be any amino acid.
Comparison of this sequence with those in the gene bank as described above, revealed no significant homologies to known sequences.
EXAMPLE 3
PREPARATION OF DNA SEQUENCES ENCODING M. TUBERCULOSIS ANTIGENS
This example illustrates the preparation of DNA sequences encoding M. tuberculosis antigens by screening an M. tuberculosis expression library with sera obtained from patients infected with M. tuberculosis, or with anti-sera raised against soluble M. tuberculosis antigens.
A. PREPARATION OF M. TUBERCULOSIS SOLUBLE ANTIGENS USING RABBIT ANTI-SERA
Genomic DNA was isolated from the M. tuberculosis strain H37Ra. The DNA was randomly sheared and used to construct an expression library using the Lambda ZAP expression system (Stratagene, La Jolla, CA). Rabbit anti-sera were generated against secretory proteins of the M. tuberculosis strains H37Ra, H37Rv and Erdman by immunizing a rabbit with concentrated supernatant of the M. tuberculosis cultures. Specifically, the rabbit was first immunized subcutaneously with 200 µg of protein antigen in a total volume of 2 ml containing 10 µg muramyl dipeptide (Calbiochem, La Jolla, CA) and 1 ml of incomplete Freund's adjuvant. Four weeks later the rabbit was boosted subcutaneously with 100 µg antigen in incomplete Freund's adjuvant. Finally, the rabbit was immunized intravenously four weeks later with 50 µg protein antigen. The anti-sera were used to screen the expression library as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques expressing immunoreactive antigens were purified. Phagemid from the plaques was rescued and the nucleotide sequences of the M. tuberculosis clones deduced.
Thirty-two clones were purified. Of these, 25 represent sequences that have not been previously identified in human M. tuberculosis. Recombinant antigens were expressed and purified antigens used in the immunological analysis described in Example 1. Proteins were induced by IPTG and purified by gel elution, as described in Skeiky et al., J. Exp. Med. 181:1527-1537, 1995. Representative sequences of DNA molecules identified in this screen are provided in SEQ ID Nos.: 1-25. The corresponding predicted amino acid sequences are shown in SEQ ID Nos. 63-87.
On comparison of these sequences with known sequences in the gene bank using the databases described above, it was found that the clones referred to hereinafter as TbRA2A, TbRA16, TbRA18, and TbRA29 (SEQ ID Nos. 76, 68, 70, show some homology to sequences previously identified in Mycobacterium leprae but not in M. tuberculosis. TbRA11, TbRA26, TbRA28 and TbDPEP (SEQ ID Nos.: 74, 53) have been previously identified in M. tuberculosis. No significant homologies were found to TbRA1, TbRA3, TbRA4, TbRA9, TbRA10, TbRA13, TbRA17, TbRa19, TbRA29, TbRA32, TbRA36 and the overlapping clones 25 and TbRA12 (SEQ ID Nos. 63, 77, 81, 82, 64, 67, 69, 71, 75, 78, 80, 79, 66). The clone TbRa24 is overlapping with clone TbRa29.
The results of PBMC proliferation and interferon-γ assays performed on representative recombinant antigens, and using T-cell preparations from several different M. tuberculosis-immune patients, are presented in Tables 2 and 3, respectively.
TABLE 2
RESULTS OF PBMC PROLIFERATION TO REPRESENTATIVE SOLUBLE ANTIGENS
Columns: Antigen; Patient 1 to 13
Rows: TbRa1, TbRa3, TbRa9, TbRa10, TbRa11, TbRa12, TbRa16, TbRa24, TbRa26, TbRa29, TbRaB, TbRaC, TbRaD, AAMK, YY, DPEP, Control
nt = not tested

TABLE 3
RESULTS OF PBMC INTERFERON-γ PRODUCTION TO REPRESENTATIVE SOLUBLE ANTIGENS
In Tables 2 and 3, responses that gave a stimulation index (SI) of between 1.2 and 2 (compared to cells cultured in medium alone) were scored as ±, an SI of 2-4 was scored as +, an SI of 4-8 or 2-4 at a concentration of 1 µg or less was scored as ++, and an SI of greater than 8 was scored as +++. In addition, the effect of concentration on proliferation and interferon-γ production is shown for two of the above antigens in the attached Figure. For both proliferation and interferon-γ production, TbRa3 was scored as and TbRa9 as These results indicate that these soluble antigens can induce proliferation and/or interferon-γ production in T-cells derived from an M. tuberculosis-immune individual.
B. USE OF PATIENT SERA TO IDENTIFY DNA SEQUENCES ENCODING M. TUBERCULOSIS ANTIGENS
The genomic DNA library described above, and an additional H37Rv library, were screened using pools of sera obtained from patients with active tuberculosis. To prepare the H37Rv library, M. tuberculosis strain H37Rv genomic DNA was isolated, subjected to partial Sau3A digestion and used to construct an expression library using the Lambda Zap expression system (Stratagene, La Jolla, CA).
Three different pools of sera, each containing sera obtained from three individuals with active pulmonary or pleural disease, were used in the expression screening. The pools were designated TbL, TbM and TbH, referring to relative reactivity with H37Ra lysate (TbL = low reactivity, TbM = medium reactivity and TbH = high reactivity) in both ELISA and immunoblot format. A fourth pool of sera from seven patients with active pulmonary tuberculosis was also employed. All of the sera lacked increased reactivity with the recombinant 38 kD M. tuberculosis H37Ra phosphate-binding protein.
All pools were pre-adsorbed with E. coli lysate and used to screen the H37Ra and H37Rv expression libraries, as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques expressing immunoreactive antigens were purified.
Phagemid from the plaques was rescued and the nucleotide sequences of the M. tuberculosis clones deduced.
Thirty-two clones were purified. Of these, 31 represented sequences that had not been previously identified in human M. tuberculosis. Representative sequences of the DNA molecules identified are provided in SEQ ID Nos.: 26-51 and 105. Of these, TbH-8 and TbH-8-2 (SEQ. ID NO. 105) are non-contiguous DNA sequences from the same clone, and TbH-4 (SEQ. ID NO. 43) and TbH-4-FWD (SEQ. ID NO. 44) are non-contiguous sequences from the same clone. Amino acid sequences for the antigens hereinafter identified as Tb38-1, TbH-4, TbH-8, TbH-9, and TbH-12 are shown in SEQ ID Nos.: 88-92. Comparison of these sequences with known sequences in the gene bank using the databases identified above revealed no significant homologies to TbH-4, TbH-8, TbH-9 and TbM-3, although weak homologies were found to TbH-9. TbH-12 was found to be homologous to a 34 kD antigenic protein previously identified in M. paratuberculosis (Acc. No. S28515). Tb38-1 was found to be located 34 base pairs upstream of the open reading frame for the antigen ESAT-6 previously identified in M. bovis (Acc. No. U34848) and in M. tuberculosis (Sorensen et al., Infec. Immun. 63:1710-1717, 1995).
Probes derived from Tb38-1 and TbH-9, both isolated from an H37Ra library, were used to identify clones in an H37Rv library. Tb38-1 hybridized to Tb38-1F2, Tb38-1F3, Tb38-1F5 and Tb38-1F6 (SEQ. ID NOS. 112, 113, 116, 118, and 119). (SEQ ID NOS. 112 and 113 are non-contiguous sequences from clone Tb38-1F2.) Two open reading frames were deduced in Tb38-1F2; one corresponds to Tb37FL (SEQ. ID. NO. 114), the second, a partial sequence, may be the homologue of Tb38-1 and is called Tb38-IN (SEQ. ID NO. 115). The deduced amino acid sequence of Tb38-1F3 is presented in SEQ. ID. NO. 117. A TbH-9 probe identified three clones in the H37Rv library: TbH-9-FL (SEQ. ID NO. 106), which may be the homologue of TbH-9 (H37Ra), TbH-9-1 (SEQ. ID NO. 108), and TbH-9-4 (SEQ. ID NO. 110), all of which are highly related sequences to TbH-9. The deduced amino acid sequences for these three clones are presented in SEQ ID NOS. 107, 109 and 111.
The results of T-cell assays performed on Tb38-1, ESAT-6 and other representative recombinant antigens are presented in Tables 4A, B and 5, respectively, below:

TABLE 4A
RESULTS OF PBMC PROLIFERATION TO REPRESENTATIVE ANTIGENS

TABLE 4B
RESULTS OF PBMC INTERFERON-γ PRODUCTION TO REPRESENTATIVE ANTIGENS

TABLE 5
SUMMARY OF T-CELL RESPONSES TO REPRESENTATIVE ANTIGENS
Columns: Antigen; Proliferation (patient 4, patient 5, patient 6); Interferon-γ (patient 4, patient 5, patient 6); total
Rows: TbH9 (total 13), TbM7 (total 4), TbHS (total 8), TbL23, TbH4 (total 7), control

These results indicate that both the inventive M. tuberculosis antigens and ESAT-6 can induce proliferation and/or interferon-γ production in T-cells derived from an M. tuberculosis-immune individual. To the best of the inventors' knowledge, ESAT-6 has not been previously shown to stimulate human immune responses.

A set of six overlapping peptides covering the amino acid sequence of the antigen Tb38-1 was constructed using the method described in Example 4. The sequences of these peptides, hereinafter referred to as pep1-6, are provided in SEQ ID Nos. 93-98, respectively. The results of T-cell assays using these peptides are shown in Tables 6 and 7. These results confirm the existence, and help to localize T-cell epitopes within Tb38-1 capable of inducing proliferation and interferon-γ production in T-cells derived from an M. tuberculosis-immune individual.

TABLES 6 AND 7
RESULTS OF T-CELL ASSAYS TO Tb38-1 PEPTIDES

EXAMPLE 4
PURIFICATION AND CHARACTERIZATION OF A POLYPEPTIDE FROM TUBERCULIN PURIFIED PROTEIN DERIVATIVE
An M. tuberculosis polypeptide was isolated from tuberculin purified protein derivative (PPD) as follows.
PPD was prepared as published with some modification (Seibert, F. et al., Tuberculin purified protein derivative. Preparation and analyses of a large quantity for standard. The American Review of Tuberculosis 44:9-25, 1941).
M. tuberculosis Rv strain was grown for 6 weeks in synthetic medium in roller bottles at 37°C. Bottles containing the bacterial growth were then heated to 100°C in water vapor for 3 hours. Cultures were sterile filtered using a 0.22 µ filter and the liquid phase was concentrated 20 times using a 3 kD cut-off membrane. Proteins were precipitated once with 50% ammonium sulfate solution and eight times with ammonium sulfate solution. The resulting proteins (PPD) were fractionated by reverse phase liquid chromatography (RP-HPLC) using a C18 column (7.8 x 300 mm; Waters, Milford, MA) in a Biocad HPLC system (Perseptive Biosystems, Framingham, MA).
Fractions were eluted from the column with a linear gradient from 0-100% buffer (0.1% TFA in acetonitrile). The flow rate was 10 ml/minute and eluent was monitored at 214 nm and 280 nm.
Six fractions were collected, dried, suspended in PBS and tested individually in M. tuberculosis-infected guinea pigs for induction of delayed type hypersensitivity (DTH) reaction. One fraction was found to induce a strong DTH reaction and was subsequently fractionated further by RP-HPLC on a microbore Vydac C18 column (Cat. No. 218TP5115) in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions were eluted with a linear gradient from 5-100% buffer (0.05% TFA in acetonitrile) with a flow rate of 80 µl/minute. Eluent was monitored at 215 nm. Eight fractions were collected and tested for induction of DTH in M. tuberculosis-infected guinea pigs. One fraction was found to induce strong DTH of about 16 mm induration. The other fractions did not induce detectable DTH. The positive fraction was submitted to SDS-PAGE gel electrophoresis and found to contain a single protein band of approximately 12 kD molecular weight.
This polypeptide, hereinafter referred to as DPPD, was sequenced from the amino terminal using a Perkin Elmer/Applied Biosystems Division Procise 492 protein sequencer as described above and found to have the N-terminal sequence shown in SEQ ID No.: 129. Comparison of this sequence with known sequences in the gene bank as described above revealed no known homologies. Four cyanogen bromide fragments of DPPD were isolated and found to have the sequences shown in SEQ ID Nos.: 130-133.
The ability of the antigen DPPD to stimulate human PBMC to proliferate and to produce IFN-γ was assayed as described in Example 1. As shown in Table 8, DPPD was found to stimulate proliferation and elicit production of large quantities of IFN-γ, more than that elicited by commercial PPD.
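As a worked illustration of the Table 8 proliferation values, the following Python sketch converts the reported counts per minute (CPM) into stimulation indices. It assumes that the stimulation index is the ratio of CPM with a stimulator to CPM in medium alone, in line with the "compared to cells cultured in medium alone" wording used for the earlier tables; the names `TABLE_8_CPM` and `stimulation_indices` are hypothetical, and only the rows legible in Table 8 are included.

```python
# Proliferation values (CPM) as reported in Table 8 for donors A and B.
TABLE_8_CPM = {
    "A": {"Medium": 1089, "PPD (commercial)": 8394, "DPPD": 13451},
    "B": {"Medium": 450, "PPD (commercial)": 3929},
}


def stimulation_indices(cpm_by_stimulator: dict) -> dict:
    """Divide each stimulator's CPM by the medium-alone CPM (assumed SI definition)."""
    baseline = cpm_by_stimulator["Medium"]
    return {
        name: cpm / baseline
        for name, cpm in cpm_by_stimulator.items()
        if name != "Medium"
    }


for donor, values in TABLE_8_CPM.items():
    for name, si in stimulation_indices(values).items():
        print(f"donor {donor}: {name} SI = {si:.1f}")
```

For donor A this gives a higher index for DPPD than for commercial PPD, which is the comparison the paragraph above draws from the raw counts.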
TABLE 8
RESULTS OF PROLIFERATION AND INTERFERON-γ ASSAYS TO DPPD

PBMC Donor   Stimulator         Proliferation (CPM)   IFN-γ (OD450)
A            Medium             1,089                 0.17
             PPD (commercial)   8,394                 1.29
             DPPD               13,451                2.21
B            Medium             450                   0.09
             PPD (commercial)   3,929                 1.26

EXAMPLE 5
SYNTHESIS OF SYNTHETIC POLYPEPTIDES
Polypeptides may be synthesized on a Millipore 9050 peptide synthesizer using FMOC chemistry with HPTU (O-Benzotriazole-N,N,N',N'-tetramethyluronium hexafluorophosphate) activation. A Gly-Cys-Gly sequence may be attached to the amino terminus of the peptide to provide a method of conjugation or labeling of the peptide. Cleavage of the peptides from the solid support may be carried out using the following cleavage mixture: trifluoroacetic acid:ethanedithiol:thioanisole:water:phenol. After cleaving for 2 hours, the peptides may be precipitated in cold methyl-t-butyl-ether. The peptide pellets may then be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and lyophilized prior to purification by C18 reverse phase HPLC. A gradient of 0%-60% acetonitrile (containing 0.1% TFA) in water (containing 0.1% TFA) may be used to elute the peptides. Following lyophilization of the pure fractions, the peptides may be characterized using electrospray mass spectrometry and by amino acid analysis.
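Characterization by electrospray mass spectrometry amounts to comparing an observed m/z with the value expected from the peptide sequence. The sketch below shows one way that expected value could be computed, using standard monoisotopic residue masses; the function names `peptide_mass` and `mz` are hypothetical, the peptide is assumed to be unmodified with free termini, and the Gly-Cys-Gly tag is used only as a convenient example sequence.

```python
# Standard monoisotopic residue masses (Da), one-letter amino acid codes.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of the charge-carrying proton


def peptide_mass(sequence: str) -> float:
    """Monoisotopic neutral mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence: str, charge: int) -> float:
    """Expected m/z of the [M + nH]n+ ion seen in electrospray."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


tag = "GCG"  # the Gly-Cys-Gly sequence mentioned above
print(f"{tag}: M = {peptide_mass(tag):.4f}, [M+H]+ = {mz(tag, 1):.4f}")
```

Longer peptides typically appear in several charge states, so calling `mz` with charges of 2 or 3 gives the additional peaks one would look for in the spectrum.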
From the foregoing, it will be appreciated that, although specific embodiments of the invention have been described herein for the purpose of illustration, various modifications may be made without deviating from the spirit and scope of the invention.
Throughout this specification, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or integer or group of elements or integers but not the exclusion of any other element or integer or group of elements or integers.
The reference to any prior art in this specification is not, and should not be taken as, an acknowledgement or any form of suggestion that that prior art forms part of the common general knowledge in Australia.
SEQUENCE LISTING

GENERAL INFORMATION:
APPLICANTS: Corixa Corporation
(ii) TITLE OF INVENTION: COMPOUNDS AND METHODS FOR IMMUNOTHERAPY AND DIAGNOSIS OF TUBERCULOSIS
(iii) NUMBER OF SEQUENCES: 137
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: SEED and BERRY LLP
STREET: 6300 Columbia Center, 701 Fifth Avenue
CITY: Seattle
STATE: Washington
COUNTRY: USA
ZIP: 98104-7092
(v) COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30
(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER:
FILING DATE: 27-AUG-1996
CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
NAME: Maki, David J.
REGISTRATION NUMBER: 31,392
REFERENCE/DOCKET NUMBER: 210121.411PC
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: (206) 622-4900 TELEFAX: (206) 682-6031 INFORMATION FOR SEQ ID NO:1: SEQUENCE CHARACTERISTICS: LENGTH: 166 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: CaGAGGCACCG GTAGTTTGAA CCAAACGCAC AATCGACGGG CAAACGAACG GAAGAACACA ACCATGAAGA TGGTGAAATC GATCGCCGCA GGTCTGACCG CCGCGGCTGC AATCGGC-GCC
GCTGCGGCCG
GTCGTCTTCG
GCCCAGTTGA
GGCAGTCTGG
AAGAAGGCCG
GCGGCCGCCG
GTCACGCAGA-
ATGGAGTTGC
GCTACGCCGC
GTNTGCGCAG
GTGTGACTTC
GCGCGCCACT
CCAGCCTGCT
TCGAGGGCGG
CCGAGCACGG
GTTCGGCCAC
ACGTCACGTT
TGCAGGCCGC
CCGCCTGGTG
GGNCGCACGC
GATCATGGCT
GCCGTTGGAC
CAACAGCCTC
CATCGGGGGC
GGATCTGCCG
CGCCGACGTT
CGTGAATCAA
AGGGNAACTG
ACGCGTCCAT
ACCGCCCGGT
GGCGGCCCGC
CCGGCATCCG
GCCGATCCCA
ACCGAGGCGC
CTGTCGTTCA
TCCGTCTCGG
GGCGGCTGGA
ATTGGCGGGC
GTCGAACACT
GCAAGCCGTC
TCGTATACCA
CCCCTGACGT
ACGTGTCGTT
GCATCGCCGA.
GCGTGACGAA
GTCCGAAGCT
TGCTGTCACG
CGGNTTCAGC
GATGGAGCCG
CCCGACCGCC
TGCGMACAAG
CCACAAGCTG
CATCCAGCCG
CTCGTCGCCG
CGCATCGGCG
CCGCTGTTCA
120 180 240 300 360 420 480 540 600 660 720 766 CGCGCGTGTA GCACGGTGCG CTCGAGATAG GTGGTGNCTC GNCACCAGNG ANCACCCCCN NNTCGNCNNT TCTCGNTGNT GNATGA (2Y INFORMATION FOR SEQ ID NO:2: LENGTH: 752 base pairs TYPE: nucleic acid STRANOEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: ATGCATCACC ATCACCATCA CGATGAAGTC ACGGTAGAGA CGACCTCCGT
CTTCCGCGCA
GACTTCCTCA GCGAGCTGGA CGCTCCTGCG CAAGCGGGTA CGGAGAGCGC
GGTCTCCGGG
GTGGAAGGGC TCCCGCCGGG CTCGGCGTTG CTGGTAGTCA AACGAGGCCC
CAACGCCGGG
TCCCGGTTCC TACTCGACCA AGCCATCACG TCGGCTGGTC GGCATCCCGA CAGCGACATA T1TCTCGACG ACGTGACCGT GAGCCGTCGC CATGCTGAAT TCCGGTTGGA AMACAACGAA TTCAATGTCG TCGATGTCGG GAGTCTCAAC GGCACCTACG TCAACCGCGA GCCCGTGGAT TCGGCGGTGC TGGCGAACGG CGACGAGGTC CAGATCGGCA AGCTCCGGTT GGTGTTC1TG ACCGGACCCA AGCAAGGCGA GGATGACGGG AGTACCGGGG GCCCGTGAGC
GCACCCGATA
GCCCCGCGCT GGCCGGGATG TCGATCGGGG CGGTCCTCCG ACCTGCTACG
ACCGGATTT
CCCTGATGTC CACCATCTCC MAGATTFCGAT TCTTGGGAGG CTTGAGGGTC
NGGGTGACCC
CCCCGCGGGC CTCATTCNGG GGTNTCGGCN GG1TTCACCC CNTACCNACT GCCNCCCGGN TTGCNAATTC NTTCTTCNCT GCCCNNAAAG GGACCNTTAN CTTGCCGCTN
GAAANGGTNA
TCCNGGGCCC NTCCThGMAN CCCCNTCCCC
CT
INFORMATION FOR SEQ ID NO:3: SEQUENCE CHARACTERISTICS: LENGTH: 813 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 120 180 240 300 360 420 480 540 600 660 720 752 9 9 9* .9 (xi SEJENCE DESCRIPTION: SEQ ID NO:3: CATATGCATC ACCATCACCA TCACACTTCT AACCGCCCAG
CGCGTCGGG,
CCACGCGACA CCGGGCCCGA TCGATCTGCT AGCTTGAGTC TGGTCAGG-C, CAGCGCGATG CCCTATGTV GTCGTCGACT CAGATATCU-C
GGCMATCCA
GCGGCCGGCG GTGCTGCAAA -CTACTCCCGG AGGAATTTCG ACGTGCGCA1 ATGCTGGTCA CGGCTGTCGT TTGCTCTGT TGTTCGGGTG
TGGCCACGGC
ACCTACTGCG AGGAGTTGMA AGGCACCGAT ACCGGCCAGG
CGTGCCAGAT
GACCCGGCCT ACAACATCAA CATCAGCCTG CCCAGTTACT
ACCCCGACCA
GMAAATTACA TCGCCCAGAC GCGCGACAAG TTCCTCAGCG
CGGCCACATC
CGCGAAGCCC CCTACGAATT GAATATCACC TCGG"CCACAT
ACCAGTCCGC
CGTGGTACGC AGGCCGTGGT GCTCAMGGTC TACCACAACG
CCGGCGGCAC
ACCACGTACA AGGCCTTCGA TTGGGACCAG GCCTATCGCA
AGCCAATCAC
CTGTGGCAGG CTGACACCGA TCCGCTGCCA GTCGTCTTCC
CCATTGTTGC
GAGCAACGCA GACCGGGACA ACWGGTATCG ATAGCCGCCN
AATGCCGGCT
TGAAATTATC ACAAC1TCGC AGTCACNAAA NMA INFORMATION FOR SEQ ID NO:4: SEQUENCE CHARACTERISTICS: LENGTH: 447 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear G GCGTCGAGCA
TCGTCGTCAG
k TCTCCCGCCT
-CAAGATCTTC
CGCGCCCAAG
TCAAATGTCC
GAAGTCGCTG
GTCCACTCCA
GATACCGCCG
GCACCCAACG
CTATGACACG
AAGGTGAACT
TGGAACCCNG
120 180 240 300 360 420 480 540 600 660 720 780 813 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: CGGTATGAAC ACGGCCGCGT CCGATAACTT CCAGCTGTCC CAGGGTGGGC AGGGATTCGC CATTCCGATC GGGCAGGCGA TGGCGATCGC GGGCCAGATC CGATCGGGTG GGGGGTCACC 120 CACCGTTCAT ATCGGGCCTA CCGCCTTCCT CGGC1TGGGT GTTGTCGACA ACAACGGCAA 180 CGGCGCACGA GTCCAACGCG TGGTCGGGAG CGCTCCGGCG GCAAGTCTCG GCATCTCCAC 240 CGGCGACGTG ATCACCGCGG TCGACGGCGC TCCGATCAAc TCGGCCACCG CGATGGCGGA 300 CGCGCTTMAC GGGCATCATC CCGGTGACGT CATCTCGGTG AACTGGCAAA CCAAGTCGGG 360 CGGCACGCGT ACAGGGAACG TGACATTGGC CGAGGGACCC CCGGCCTGAT TTCGTCGYGG 420 ATACCACCCG CCGGCCGGCC M]TGGA 447 INFORMATION FOR SEQ ID SEQUENCE
CHARACTERISTICS:
LENGTH: 604 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID GTCCCACTGC GGTCGCCGAG TATGTCGCCC AGCAAATGTC TGGCAGCCGC CCAACGGAAT **CCGGTGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120 *:AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180 CCGGCGACGG NGAGCGCCGG AATGGCGCGA GTGAGGAGGT GGNCAGTCAT GCCCAGNGTG 240 ATCCAATCAA CCTGNATTCG GNCTGNGGGN CCATTGACA ATCGAGGTAG TGAGCGCAAA 300 TGAATGATGG AAAACGGGNG GNGACL2TCCG NTGTTCTGGT GGTGNTAGGT GNCTGNCTGG 360 NGThGNGGNT ATCAGGATGT TC1TCGNCGA AANCTGATGN CGAGGMACAG GGTGThCCCG NNANNCCNAN GGNGTCCNAN CCCNNNNTCC TCGNCGANAT CANANAGNCG
NTTGATGNGA
NAAAAGGGTG GANCAGNNNN AANThGNGGN CCNAANAANC NNNANNGNNG NNAGNThGNT NNNTNTTNNC ANNNNNNNTG NNGNNGNNCN NNNCAANCNN NTNNNNGNAA
NNGGNTTNT
NAAT
INFORMATION FOR SEQ 10 NO:6: 0i) SEQUENCE
CHARACTERISTICS:
LENGTH: 633 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 420 480 540 600 604 0000 00 (xi SEQUENCE DESCRIPTION: SEQ ID NO: 6: TTGCANGTCG MACCACCTCA CTAAAGGGAA CAAAAGCTNG AGCTCCACCG
CGGTGGCGGC
CGCTCTAGMA CTAGTGKATh YYYCKGGCTG CAGSAATYCG GYACGAGCAT TAGGACAGTC TAACGGTCCT GTTACGGTGA TCGAATGACC GACGACATCC TGCTGATCGA CACCGACGAA CGGGTGCGMA CCCTCACCCT CAACCGGCCG CAGTCCCGYA ACGCGCTCTC
GGCGGCGCTA
CGGGATCGGT TT1TCGCGGY GTTGGYCGAC GCCGAGGYCG ACGACGACAT
CGACGTCGTC
ATCCTCACCG GYGCCGATCC GGTGTTCTGC GCCGGACTGG ACCTCAAGGT AGCTGGCCGG GCAGACCGCG CTGCCGGACA TCTCACCGCG GTGGGCGGCC ATGACCAAGC
CGGTGATCGG
CGCGATCAAC GGCGCCGCGG TCACCGGCGG GCTCGAACTG GCGCTGTACT
GCGACATCCT
GATCGCCTCC GAGCACGCCC GCTTCGNCGA CACCCACGCC CGGGTGGGGC TGCTGCCCAC CTGGGGACTC AGTGTGTGCT TGCCGCAAAA GGTCGGCATC GGNCTGGGCC GGTGGATGAG CCTGACCGGC GACTACCTGT CCGTGACCGA CGC 120 180 240 300 360 420 480 540 600 633 INFORMATION FOR SEQ ID NO:7: SEQUENCE CHARACTERISTICS: LENGTH: 1362 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: CGACGACGAC GGCGCCGGAG AGCGGGCGCG AACGGCGATC
GACGCGGCCC
CGGCACCACC CAGGAGGGAG TCGAATCATG AAATTTGTCA
ACCATATTGA
CCCCGCCGAG CCGGCGGCGC GGTiCGCCGAG GTCTATGCCG
AGGCCCGCCG
CGGCTGCCCG AGCCGCTCGC CATGCTGTCC CCGGACGAGG
GACTGCTCAC
GCGACGTTGC GCGAGACACT GCTGGTGGGC CAGGTGCCGC
GTGGCCGCAA
GCCGCCGCCG TCGCGGCCAG CCTGCGCTGC CCCTGGTGCG
TCGACGCACA
CTGTACGCGG CAGGCCAAAC CGACACCGCC GCGGCGATCT
TGGCCGGCAC
GCCGGTGACC CGAACGCGCC GTATGTGGCG TGGGCGGCAG
GAACCGGGAC
CCGCCGGCAC CGTTCGGCCC GGATGTCGCC GCCGAATACC
TGGGCACCGC
CACTTCATCG CACGCCTGGT CCTGGTGCTG CTGGACGAAA CCTTCCTGCC
G
CGCGCCCAAC AGCTCATGCG CCGCGCCGGT GGACTGGTGT TCGCCCGCAA
G
GAGCATCGGC CGGGCCGCTC CACCCGCCGG CTCGAGCCGC GAACGCTGCC
C
GCATGGGCMA CACCGTCCGA GCCCATAGCA ACCGCGTTCG CCGCGCTCAG
C
GACACCGCGC CGCACCTGCC GCCACCGACT CGTCAGGTGG TCAGGCGGGT
C
TGGCACGGCG AGCCAATGCC GATGAGCAGT CGCTGGACGA ACQAGCACAC Ci
TGGCCAGAGT
GCCCGTCGCG
CGAGTTCGGC
CGCCGGCTGG
GGAAGCCGTC
CACCACCATG
kGCACCTGCC kCCGGCGGGA iGTGCM1TTC
;GGGGGCCCG
IGTGCGCGCG
GACGATCTG
CACCACCTG
GTGGGGTCG
GCCGAGCTG
120 180 240 300 360 420 480 540 600 660 720.
780 840 900 CCCGCCGACC TGCACGCGCC CACCCGTCTT GCCCTGCTGA CCGGCCTGGC CCCGCATCAG GTGACCGACG ACGACGTCGC CGCGGCCCGA TCCCTGCTCG ACACCGATGC GGCGCTGG1T GGCGCCCTGG CCTGGGCCGC CTTCACCGCC GCGCGGCGCA TCGGCACCTG GATCGGCGCC GCCGCC?'GAGG GCCAGGTGTC GCGGCAAMAC CCGACTGGGT GAGTGTGCGC
GCCCTGTCGG
TAGGGTGTCA TCGCTGGCCC GAGGGATCTC GCGGCGGCGA ACGGAGGTGG
CGACACAGGT
GGAAGCTGCG CCCACTGGCT TGCGCCCCAA CGCCGTCGTG GGCGTTCGGT
TGGCCGCACT
GGCCGATCAG GTCGGCGCCG GCCCTTGGCC GAAGGTCCAG CTCAACGTGC CGTCACCGAA GGACCGGACG GTCACCGGGG GTCACCCTGC GCGCCCAAGG
A
INFORMATION FOR SEQ ID NO:8: SEQUENCE CHARACTERISTICS: LENGTH: 1458 base pairs TYPE: nucleic acid STRANJEINESS: single TOPOLOGY: linear 960 1020 1080 1140 1200 1260 1320 1362 9 9 9 9999** 9 .9 9 9* 9* 99 9* 9.
(xi SEQUENCE DESCRIPTION: SEQ ID NO:8: GCGACGACCC CGATATGCCG GGCACCGTAG CGAAAGCCGT CGCCGACGCA GTATCGCTCC CGTTGAGGAC ATTCAGGACT GCGTGGAGGC CCGGCTGGGG TGGATG.ACGT GGCCCGTGTT TACATCATCT ACCGGCAGCG GCGCGCCGAG CTAAGGCCTT GCTCGGCGTG CGGGACGAGT TAAAGCTGAG CTTGGCGGCC TGCGCGAGCG CTATCTGCTG CACGACGAGC AGGGCCGGCC GGCCGAGTCG TGATGGACCG ATCGGCGCGC TGTGTCGCGG CGGCCGAGGA CCAGTATGAG CGAGGCGGTG GGCCGAGCGG TTCGCCACGC TATTACGCAA CCTGGAATTC CGCCCACGTT GATGAACTCT GGCACCGACC TGGGACTGCT CGCCGGCTGT
CTCGGGCGCG
GAAGCCGGTC
CTGCGGACGG
GTGACGGTAC
ACCGGCGAGC
CCGGGCTCGT
CTGCCGAATT
TTTGTTCTGC
120 180 240' 300 360 420 480 CGA1TGAGGA TTCGCTGCAA TCGATCTTTG CGACGCTGGG ACAGGCCGCC GAGCTGCAGC 540 GGGCTGGAGG CGGCACCGGA TATGCGTTCA GCCACCTGCG ACCCGCCGGG GATCGGGTGG 600 CCTCCACGGG CGGCACGGCC AGCGGACCGG TGTCGT1TCT ACGGCTGTAT GACAGTGCCG 660 CGGGTGTGGT CTCCATGGGC GGTCGCCGGC GTGGCGCCTG TATGGCTGTG CTTGATGTGT .720 CGCACCCGGA TATCTGTGAT TTCGTCACCG CCAAGGCCGA ATCCCCCAGC GAGCTCCCGC 780 ATTTCAACCT ATCGGTTGGT GTGACCGACG CGTTCCTGCG GGCCGTCGAA CGCAACGGCC 840 TACACCGGCT GGTCAATCCG CGAACCGGCA AGATCGTCGC GCGGATGCCC GCCGCCGAGC 900 TGTTCGACGC CATCTGCAAA GCCGCGCACG CCGGTGGCGA TCCCGGGCTG GTG1TVCTCG 960 ACACGATCAA TAGGGCAAAC CCGGTGCCGG GGAGAGGCCG CATCGAGGCG ACCAACCCGT 1020 GCGGGGAGGT CCCACTGCTG CCTTACGAGT CATGTAATCT CGGCTCGATC AACCTCGCCC 1080 .9GGATGCTCGC CGACGGTCGC GTCGACTGGG ACCGGCTCGA GGAGGTCGCC GGTGTGGCGG 1140 TGCGGTTCCT TGATGACGTC ATCGATGTCA GCCGCTACCC CTTCCCCGAA CTGGGTGAGG 1200 9CGGCCCGCGC CACCCGCAAG ATCGGGCTGG GAGTCATGGG TTTGGCGGAA CTGC1TGCCG 1260 CACTGGGTAT TCCGTACGAC AGTGMAGAAG CCGTGCGGTT AGCCACCCGG CTCATGCGTC 1320 *GCATACAGCA GGCGGCGCAC ACGGCATCGC GGAGGCTGGC CGAAGAGCGG GGCGCATTCC 1380 CGGCGTTCAC CGATAGCCGG TTCGCGCGGT CGGGCCCGAG GCGCAACGCA CAGGTCACCT 1440 CCGTCGCTCC GACGGGCA 1458 9. INFORMATION FOR SEQ ID NO:9: SEQUENCE CHARACTERISTICS: LENGTH: 862 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: ACGGTGTAAT CfGTGCTGGAT CTGGAACCGC GTGGCCCGCT ACCTACCGAG ATCTACTGGC GGCGCAGGGG GCTGGCCCTG GGCATCGCGG TCGTCGTAGT CGGGATCGCG GTGu-CCATCG 120 TCATCGCCTT CGTCGACAGC AGCGCCGGTG CCAAACCGGT CAGCGCCGAC AAGCCGGCCT 180 CCGCCCAGAG CCATCCGGGC TCGCCGGCAC CCCAAGCACC CCAGCCGGCC GGGCAAACCG 240 MAGGTAACGC CGCCGCGGCC. 
CCGCCGCAGG GCCAAAACCC CGAGACACCC ACGCCCACCG 300 CCGCGGTGCA GCCGCCGCCG GTGCTCAAGG AAGGGGACGA TTGCCCCGAT TCGACGCTGUG 360 CCGTCAAAGG" TTTGACCAAC GCGCCGCAGT ACTACGTCGG CGACCAGCCG AAGTTCACCA 420 TGGTGGTCAC CMACATCGG3C CTGGTGTCCT GTAAACGCGA CG1TGGGGCC GCGGTGTTGuG 480 CCGCCTACGT TT 'ACTCGCTG GACAACAAGC GGTTGTGGTC CAACCTGGAC TGCGCGCCCT 540 CGAATGAGAC GCTGGTCAAG ACGTTTTCCC CCGGTGAGCA GGTAACGACC GCGI3TGACCT 600 GGACCGGGAT GGGATCGGCG CCGCGCTGCC CATTGCCGCG GCCGGCGATC GGGCCGGGC-A 660 V066CCTACAATCT CGTGGTACAA CTGGGCAATC TGCGCTCGCT GCCGGTTCCG TTCATCCTGA 720 ATCAGCCGCC GCCGCCGCCC GGGCCGGTAC CCGCTCCGGG TCCAGCGCAG GCGCCTCCGC 780 *CGGAGTCTCC CGCGCMAGGC GGATAATTAT TGATCGCTGA TGGTCGATTC CGCCAGCTGT 840 S GACAACCCCT CGCCTCGTGC CG 862 0:600:(2) INFORMATION FOR SEQ ID w* SEQUENCE CHARACTERISTICS: LENGTH: 622 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID FFGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG
GTGCTGCCGC
GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG TTGG1TGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA TCGCCGCGCA GTGTTCAAAG CTCGGATATA CGGTGGCACC
CATGGAACAG
TGGTGGTTGG CCGGGCACTT GTCGTCGTCG TTGACGATCG
CACGGCGCAC(
ACCACAGCGG GCCGCTTGTC ACCGAGCTGC TCACCGAGGC
CGGGTTTGTTC
TGGTGGCGGT GTCGGCCGAC GAGGT CGAGA TCCGAAATGC GCTGAACACA GCGGGGTGGA CCTGGTGGTG TCGGTCGGCG GGACCGGNGT GACGNCTCGC CGGAAGCCAC CCGNGACATT CT INFORMATION FOR SEQ ID NO:11: SEQUENCE CHARACTERISTICS: LENGTH: 1200 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
CAATGACAAA
GAACGCTGGA
CGCGGACGCG
CTTTCAGGAT
GTGATGAAGG
CGTGCGGAGT
GGCGATGAAG
3TCGACGGCG
CGGTGATCG
~ATGTCACCC
120 180 240 300 360 420 480 540 600 622 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: GGCGCAGCGG TAAGCCTGTT GGCCGCCGGC ACACTGGTGT TGACAGCATG CGGCGGTGGC ACCAACAGCT CGTCGTCAGG CGCAGGCGGA ACGTCTGGGT CGGTGCACTG CGGCGGCAAG AAGGAGCTCC ACTCCAGCGG CTCGACCGCA CMAGAAAATG CCATGGAGCA GTTCGTCTAT 120 180
GCCTACGTC
GGGGTGACC
CCGTCGACC
CCGACGGTG
CTTGACGGA(
CAGATCCAA(
CGCAGCGAC?
GGGGCGTGGG
GGGAACAACG
TGGTCGTTTG
GATCCAGTCG
GGACAAGGCA
TCTTACCCGA
ACCGGTACTG
GACCAATACG
AATGCTATTT
GGGTCGCAAT
~C GATCGTGCCC GGGCTACACG TTGGACTACA C AGTF7CTCAA CAACGAACC GATTTCGCCG G GTCAACCTGA CCGGTCGGCG
GAGCGGTGCG
T TCGGCCCGAT CGCGATCACC
TACAATATCA
CCACTACCGC CMAGATTTTC AACGGCACCA SCCCTCAACTC CGGCACCGAC
CTGCCGCCAA
AGTCCGGTAC GTCGGACAAC
TTCCAGAAAT
GCAAAGGCGC CAGCGAAACG
TTCAGCGGGG
GAACGUTGGGC CCTACTGCAG
ACGACCGACGC
CGGTGGGTAA GCAG1TGMAC ATGGCCCAGA 1 CGATCACCAC CGAGTCGGTC GGTAAGACAA
T
ACGACCTGGT ATTGGACACG TCGTCGTTCT
A
TCGTGCTGGC GACCTATGAG ATCGTCTGCT
C'
CGGTAAGGGC GTTTATGCAA GCCGCGATTG G GCTCCATTCC GTTGtrCCCAAA TCGTTCCAAG
C
CTTGACCTAG TGAAGGGAAT TCGACGGTGA
G(
TTGGGCCGTA TCAGCTATTG CGGCTGCTGG
GC
ACGCCAACC
GCTCGGATG
G]TCCCCGG
AGGGCGTGAI
TCACCGTGT(
CAGCCGATTAC
KCCTCGACGC
3CGTCGGCG7 iGTCGATCAC
CATCACGTC
CGCCGGGGC
CAGACCCAC
GAAATACCC
TCCAGGCCA
AAATTGGC
:GATGCCGT
CGAGGCGG
iG GTCCGGTGCC T CCCGTTGMAT C ATGGGACCTG 3CACGCTGAAT 3GAATGATCCA 3CGTTATCTTC
TGTATCCAAC
CGGCGCCAGC
CTACAACGAG
GGCGGGTCCG
CAAGATCATG
CCAGCCTGGC
GGATGCGACG
AGAAGGCCTG
GGCCGCGGTG
TCCGCAGGTA
GATGGGCGAG
240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 INFORMATION FOR SEQ ID NO:12: (1 SEQUENCE CHARACTERISTICS: LENGTH: 1155 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 58 (xi SEQUENCE DESCRIPTION: SEQ ID NO:12: GCAAGCAGCT GfCAGGTCGTG CTGTTCGACG AACTGGGCAT GCCGAAGACC AAACGCACCA AGACCGGCTA CACCACGGAT GCCGACGCGC TGCAGTCGTT G1TCGACAAG ACCGGGCATC 120 CGTF7CTGCA ACATCTGCTC GCCCACCGCG ACGTCACCCG GCTCAAGGTC ACCGTCGACG 180 GGTTGCTCCA AGCGGTGGCC GCCGACGGCC GCATCCACAC CACGTTCAAC CAGACGATCG 240 CCGCGACCGG CCGGCTCTCC TCGACCGAAC CCMACCTGCA GAACATCCCG ATCCGCACCG 300 ACGCGGGCCG GCGGATCCGG GACGCGTTCG TGGTCGGGGA CGGTTACGCC GAGTTGATGA 360 CGGCCGACTA CAGCCAGATC GAGATGCGGA TCATGGGGCA CCTGTCCGGG GACGAGGGCC 420 TCATCGAGGC GTTCMACACC GGGGAGGACC TGTATTCGTT CGTCGCGTCC CGGGTGTTCG 480 GTGTGCCCA CGACGAGGTC ACGCGG TGCGGCGCCG GTCuiu'G.uuu ATGTCCTACG 540 GGCTGGTTTA CGGGTTGAGC GCCTACGGCC TGTCGCAGCA GTTGI4AAATC TCCACCGAGG 600 MAGCCAACGA GCAGATGGAC GCGTATTTCG CCCGATTCGG CGGGGTGCGC GACTACCTGC 660 GCGCCGTAGT CGAGCGGGCC CGCAAGGACG GCTACACCTC GACGGTGCTG GGCCGTCGCC 720 *GCTACCTGCC CGAGCTGGAC AGCAGCAACC GTCAAGTGCG GGAGGCCGCC GAGCGGGCGG 780 CGCTGAACGC GCCGATCCAG GGCAGCGCGG CCGACATCAT CAAGGTGGCC ATGATCCAGG 840 TCGACAAGGG GCTCAACGAG GCACAGCTGG CGTCGCGCAT GCTGCTGCAG GTCCACGACG 900 *.AGCTGCTG1-r CGAAATCGCC CCCGGTGAAC GCGAGCGGGT CGAGGCCCTG GTGCGCGACA 960 AGATGGGCG CGCTTACCCG CTCGACGTLL CGCTGGAGGT Giuuuiuuuu TACGGCCGCA 1020 GCTGGGACGC GGCGGCGCAC TGAGTGCCGA GCGTGCATCT GGGGCGGGAA 1TCGGCGATT 1080 TTTCCGCCCT GAGTTCACGC TCGGCGCAAT CGGGACCGAG TVGTCCAGC GTGTACCCGT 1140 CGAGTAGCCT CGTCA 1155 INFORMATION FOR SEQ ID NO:13: (i SEQUENCE
CHARACTERISTICS:
LENGTH: 1771 base pai rs TYPE: nucleic acid STRANDEDNESS: single' TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: GAGCGCCGTC TGGTGTTTGA ACGGTTTTAC CGGTCGGCAT CGGCACGGGC GTTGCCGGGT TCGGGCCTCG GGTTGGCGAT CGTCAAACAG GTGGTGCTCA ACCACGGCGG ATTGCTGCGC 120 ATCGMAGACA CCGACCCAGG CGGCCAGCCC CCTGGMACGT CGAT1TACG"T GCTGCTCCCC 180 GGCCGTCGGA TGCCGATTCC GCAGCTTCCC GGTGCGACGG CTGGCGCTCG GAGCACGGAC 240 *ATCGAGAACT CTCGGGGTTC GGCGAACGTT ATCTCAGTGG AATCTCAGTC CACGCGCGCA 300 ACCTAGTTGT GCAGTTACTG TTGAAAGCCA CACCCATGCC AGTCCACGCA TGGCCAAGTT 360 GGCCCGAGTA GTGGGCCTAG TACAGGAAGA GCAACCTAGC GACATGACGA ATCACCCACG 420 GTATTCGCGA CCGCCGCAGC AGCCGGGAAC CCCAGGT-TAT GCTCAGGGGC AGCAGCAAAC 480 GTACAGCCAG CAGTTCGACT GGCGTTACCC ACCGTCCCCG CCCCCGCAGC CAACCCAGTA 540 CCGTCMACCC TACGAGGCGT TGGGTGGTAC CCGGCCGGGT CTGATACCTG GCGTGATTCC 600 GACCATGACG CCCCCTCCTG GGATGGTTGG CCAACGCCCT CGTGCAGGCA TGTTGGCCAT 660 CGGCGCGGTG ACGATAGCGG TGGTGTCCGC CGGCATCGGC GGCGCGGCCG CATCCCTGGT 720 2.CGGGTTCAAC CGGGCACCCG CCGGCCCCAG CGGCGGCCCA GTGGCTGCCA GCGCGGCGCC 780 AAGCATCCCC GCAGCAAACA TGCCGCCGGG GTCGGTCGAA CAGGTGGCGG CCAAGGTGGT 840 GCCCAGTGTC GTCATGTTGG MAACCGATCT GGGCCGCCAG TCGGAGGAGG GCTCCGGCAT 900 CATTCTGTCT GCCGAGGGGC TGATCTTGAC CAACAACCAC GTGATCGCGG CGGCCGCCAA 960 GCCTCCCCTG GGCAGTCCGC CGCCGAAAAC GACGGTAACC TTCTCTGACG GGCGGACCGC 1020 ACCCTTCACG GTGGTGGGGG CTGACCCCAC CAGTGATATC GCCGTCGTCC GTGTTCAGGG 1080 CGTCTCCGGG CTCACCCCGA TCTCCCTGGG TTCCTCCTCG GACCTGAGGG TCGGTCAGCc 1140 GGTGCTGGCG ATCGGGTCGC CGCTCGGTTT GGAGGGCACC GTGACCACGG GGATCGTCAG 1200 CGCTCTCAAC CGTCCAGTGT CGACGACCGG CGAGGCCGGC AACCAGAACA CCGTGCTGGA 1260 CGCCATTCAG ACCGACGCCG CGATCAACCC CGGTAACTCC GGGGGCGCGC TGGTGAACAT 1320 GAACGCTCAA CTCGTCGGAG TCAACTCGGC CATTGCCACG CTGGGCGCGG ACTCAGCCGA 1380 TGCGCAGAGC GGCTCGATCG GTCTCGGTTT7 TGCGATTCCA GTCGACCAGG CCAAGCGCAT 1440 CGCCGACGAG TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC 1500 CAATGACAA GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC 1560 *:GAACGCTGGA GTGCCGAAGG GCGTCGTTGT 
CACCAAGGTC GACGACCGCC CGATCAACAG 1620 CGCGGACGCG TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC 1680 CTFFCAGGAT CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA 1740 GTGATGAAGG TCGICCGCGCA, GTGTTCAAAG C 1771 INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 1058 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi SEUNEDSRPIN.E DN:4 CTCCCC GTGGGC *TTGATATGTC CGGTC GAT C6 CTCACC GTGCGCCG CTCTACGAAAGACC CCG GGCGCA GGAATTCG AGC 620 61 AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180 CCGGCGACGG CGAGCGCCGG AATGGCGCGA. GTGAGGAGGC GGGCAGTCAT GC11CCAGCGTG 240 ATCCAATCAA CCTGCA1TCG GCCTGCGGGC CCATTTGACA ATCGAGGTAG TGAGCGCAAA 300 TGAATGATGG AAAACGGGCG GTGACGTCCG CTGTTCTGGT GGTGCTAGGT GCCTGCCTGG 360 CGTTGTGGCT ATCAGGATGT TCTTCGCCGA AACCTGATGC CGAGGAACAG. GGTGTTCCCG 420 TGAGCCCGAC GGCGTCCGAC CCCGCGCTCC TCGCCGAGAT CAGGCAGTCG CTTGATGCGA 480 CAAAAGGGTT GACCAGCGTG CACGTAGCGG TCCGAACAAC CGGGAAAGTC GACAC-CTTGC 540 TGGGTATTAC CAGTGCCGAT GTCGACGTCC GGGCCAATCC GCTCGCGGCA AAGGGCGTAT 600 GCACCTACMA CGACGAGCAG GGTGTCCCGT TTCGGGTACA AGGCGACAAC ATCTCGGTGA 660 AACTGTTCGA CGACTGGAGC AATCTCGGCT CGAT17CTGA ACTGTCAACT TCACGCGTGC 720 TCGATCCTGC CGCTGGGGTG ACGCAGCTGC TGTCCGGTGT CACGAACCTC CAAGCGCAAG 780 GTACCGAAGT GATAGACGGA ATTTCGACCA CCAAAATCAC CGGGACCATC CCCGCGAGCT 840 CTGTCAAGAT GCTTGATCCT GGCGCCMAGA GTGCAAGGCC GGCGACCGTG TGGATTGCCC 900 AGGACGGCTC GCACCACCTC GTCCGAGCGA GCATCGACCT CGGATCCGGG TCGATTCAGC 960 *TCACGCAGTC GAAATGGAAC GAACCCGTCA ACGTCGACTA GGCCGAAGTT GCGTCGACGC 1020 G1TGNTCGMA ACGCCCTTGT GMACGGTGTC AACGGNAC 1058 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 542 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID GAATTCGGCA CGAGAGGTGA TCGACATCAT CGGGACCAGC CCCACATCCT GGGAACAGGC GGCGGCGGAG GCGGTCCAGC GGGCGCGGGA TAGCGTCGAT GACATCCGCG
TCGCTCGGGT
CA1TGAGCAG GACATGGCCG TGGACAGCGC CGGCAAGATC ACCTACCGCA TCAAGCTCGA AGTGTCGTTC AAGATGAGGC CGGCGCAACC GCGCTAGCAC GGGCCGGCGA
GCAAGACGCA
AAATCGCACG GT1TGCGGTT GATFCGTGCG ATT1TGTGTC TGCTCGCCGA GGCCTACCAG GCGCGGCCCA GGTCCGCGTG CTGCCGTATC CAGGCGTGCA TCGCGATTCC
GGCGGCCACG
CCGGAGTTAA TGCTTCGCGT CGACCCGAAC TGGGCGATCC GCCGGNGAGC TGATCGATGA CCGTGGCCAG CCCGTCGATG CCCGAGTTGC CCGAGGAAAC GTGCTGCCAG GCCGGTAGGA AGCGTCCGTA GGCGGCGGTG CTGACCGGCT CTGCCTGCGC CCTCAGTGCG GCCAGCGAGC
GG
INFORMATION FOR SEQ ID NO:16: 0i) SEQUENCE CHARACTERISTICS: LENGTH: 913 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 120 180 240 300 360 420 480 540 542 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: CGGTGCCGCC CGCGCCTCCG TTGCCCCCAT TGCCGCCGTC GCCGATCAGC TGCGCATCGC- CACCATCACC GCCTTTGCCG CCGGCACCGC CGGTGGCGCC GGGGCCGCCG ATGCCACCGC TTGACCCTGG CCGCCGGCGC CGCCATTGCC ATACAGCACC CCGCCGGGGG
CACCGTTACC
GCCGTCGCCA CCGTCGCCGC CGCTGCCGTT TCAGGCCGGG GAGGCCGAAT
GAACCGCCGC
CAAGCCCGCC GCCGGCACCG TTGCCGCCTT TTCCGCCCGC CCCGCCGGCG CCGCCAA1TG 120 180 240 300 CCGAACAGCC AMGCACCGTT GCCGCCAGCC GCCGCCGGAC CCGCCATTAC CGCCGTTCCC GT1TGCCGCC MTATTCGGC GGGCACCGCC CACCGAAACA ACAGCCCAAC GGTGCCGCCG TCACCGCCAG CACCGCCGTT AATGTTTATG CCGGGCGCCG GAGNGCGTGC
CCGCCGGCGC
CGGCCCCGCC GGACCCACCG
GTCCCGCCGA
TGGTGCTGCT GAAGCCGTTA
GCGCCC-GTTC
CGGCCCCGCC GTTGCCGTAC AGCCACCCCC TGCCGCCGTT GCCGCCATTG CCGCCGTTCC CGCCGGCGGC CGC INFORMATION FOR SEQ ID NO:17: Ci) SEQUENCE CHARACTERISTICS LENGTH: 1872 base pi TYPE: nucleic acid STRANDEDNESS: singlE TOPOLOGY: linear CCGCCGCCGT TMACGGCGCT GCCGGGCGCC GTTCGGTGCC CCGCCG1TAC CGGCGCCGCC AGACCCGCCG GGGCCACCAT TGCCGCCGGG GCCCCGCCGT 1TGCCGCCAT CACCGGCCAT MACCCGGTAC CGCCAGCGCG GCCCCTATTG CGCCAACGCC CAAAAGCCCG GGGTTGCCAC TCCCCCCGTT GCCGCCGGTG CCGCCGCCLAT CGCSGGTTCC GGCGGTGGCG CCNTGGCCGC CGGTGGCGCC G1TGCCGCCA TTGCCGCCAT CGCCGCCACC GCCGGNTTGG CCGCCGGCGC 360 420 480 540 600 660 720 780 840 900 913 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: GACTACGTTG GTGTAGAAAA ATCCTGCCGC CCGGACCCTT MGGCTGGGA CAATTTCTGA TAGCTACCCC GACACAGGAG GTTACGGGAT GAGCAATTCG CGCCGCCGCT CACTCAGGTG GTCATGGTTG CTGAGCGTGC TGGCTGCCGT CGGGCTGGGC CTGGCCACGG CGCCGGCCCA GGCGGCCCCG CCGGCCTTGT CGCAGGACCG GTTCGCCGAC TTCCCCGCGC TGCCCCTCGA 120 180 240
CCCGTCCGCG ATGGTCGCCC AAGTGGCGCC ACAGGTGGTC AACATCAACA CCAAACTGGG 300 CTACAACAAC GCCGTGGGCG CCGGGACCGG CATCGTCATC GATCCCAACG GTGTCGTGCT 360 GACCAACAAC CACGTGATCG CGGGCGCCAC CGACATCAAT GCGTFCAGCG TCGGCTCCGG 420 CCAAACCTAC GGCGTCGATG TGGTCGGGTA TGACCGCACC CAGGATGTCG CGGTGCTGCA 480 GCTGCGCGGT GCCGGTGGCC TGCCGTCGGC GGCGATCGGT GGCGGCGTCG CGGTTGGTGA 540 GCCCGTCGTC GCGATGGGCA ACAGCGGTGG GCAGGGCG ACGCCCCGTG CGGTGCCTGG 600 CAGGGTGGTC .GCGCTCGGCC AQ4ACCGTGCA GGCGTCGGAT TCGCTGACCG GTGCCGAAGA 660 GACATTGAAC GGGTTGATCC AGTTCGATGC CGCMATCCAG CCCGGTGATT CGGGCGGGCC 720 CGTCGTCAAC GGCCTAGGAC AGGTiGGTCGG TATGMCACG GC'CGCGTCCG ATAACTTCCA 780 GCTGTCCCAG GGTGGGCAGG GAJTCGCCAT TCCGATCGGG CAGGCGATGG CGATCGCGGG 840 CCAAATCCGA TCGGGTGGGG GGTCACCCAC CGTTCATATC GGGCCTACCG CCTTCCTCGG 900 CTTGGrGTGTT GTCGACAACA ACGGCAACGG CGCACGAGTC CAACGCGTGG TCGGAAGCGC 960 TCCGGCGGCA AGTCTCGGCA TCTCCACCGG CGACGTGATC ACCGCGGTCG ACGGCGCTCC 1020 GATCAACTCG GCCACCGCGA TGGCGGACGC GCTTMACGGG CATCATCCCG GTGACGTCAT 1080 CTCGGTGAAC TGGCAAACCA AGTCGGGCGG CACGCGTACA GGGMACGTGA CA1TGGCCGA 1140 GGGACCCCCG GCCTGATG TCGCGGATAC CACCCGCCGG CCGGCCAATT GGATTGGCGC 1200 CAGCCGTGAT TGCCGCGTGA GCCCCCGAGT TCCGTCTCCC GTGCGCGTGG CA1TGTGGAA 1260 GCAATGMACG AGGCAGAACA CAGCG1TGAG. CACCCTCCCG TGGAGGGCAG TTACGTCGAA 1320 GGCGGTGTGG. TCGAGCATCC GGATGCCAAG GACTTCGGCA GCGCCGCCGC CCTGCCCGCC 1380 GATCCGACCT GGTFAAGCA CGCCGTCTTC TACGAGGTGC TGGTCCGGGC GTTCTTCGAC 1440 GCCAGCGCGG ACGGTTCCGN CGATCTGCGT GGACTCATCG ATCGCCTCGA CTACCTGCAG 1500 TGGC1TGGCA TCGACTGCAT CTGTTGCCGC CGTTCCTACG ACTCACCGCT GCGCGACGGC 1560 GGTTACGACA TTCGCGACTT CTACAAGGTG CTGCCCGMAT TGGGCACCGT CGACGATTTC 1620 GTCGCCCTGG TCGACACCGC TCACCGGCGA GGTATCCGCA TCATCACCGA
CCTGGTGATG
AATCACACCT C-GGAGTCGCA CCCCTGGTTT GAGGAGTCCC GC'CGCGACCC
AGACGGACCG
TACGGTGACT ATTACGTGTG GAGCGACACC AGCGAGCGCT ACACCGACGC
CCGGATCATC
TTCGTCGACA CCGAAGAGTC GAACTGGTCA TTCGATCCTG TCCGCCGACA
GTTNCTACTG
GCACCGA1TC
TT
INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 1482 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
CTTCGCCGMA ACCTGATGCC GAGGAACAGG GTGTTCCCGT GAGCCCGACG GCGTCCGACC 60
CCGCGCTCCT CGCCGAGATC AGGCAGTCGC TTGATGCGAC AAAAGGGTTG ACCAGCGTGC 120
ACGTAGCGGT CCGAACAACC GGGAAAGTCG ACAGCTTGCT GGGTATTACC AGTGCCGATG 180
TCGACGTCCG GGCCAATCCG CTCGCGGCAA AGGGCGTATG CACCTACAAC GACGAGCAGG 240
GTGTCCCGTT TCGGGTACAA GGCGACAACA TCTCGGTGAA ACTGTTCGAC GACTGGAGCA 300
ATCTCGGCTC GATTTCTGAA CTGTCAACTT CACGCGTGCT CGATCCTGCC GCTGGGGTGA 360
CGCAGCTGCT GTCCGGTGTC ACGAACCTCC MAGCGCAAGG TACCGAAGTG ATAGACGGAA 420
TTTCGACCAC CAAAATCACC GGGACCATCC CCGCGAGCTC TGTCMAGATG CTTGATCCTG 480
GCGCCAAGAG TGCAAGGCCG GCGACCGTGT GGATTGCCCA GGACGGCTCG CACCACCTCG 540
TCCGAGCGAG CATCGACCTC GGATCCGGGT CGATTCAGCT CACGCAGTCG AAATGGAACG 600
AACCCGTCAA CGTCGACTAG GCCGAGTTG CGTCGACGCG TTGCTCGAAA CGCCCTT7GTG 660
MCGGTGTCA ACGGCACCCG AAAACTGACC CCCTGACGGC ATCTGAMAAT TGACCCCCTA 720
GACCGGGCGG TTGGTGGTTA TTCTTCGGTG GTTCCGGCTG GTGGGACGCG GCCGAGGTCG 780
CGGTCTTTGA GCCGGTAGCT GTCGCCTTTG AGGGCGACGA CTTCAGCATG GTGGACGGG 840
CGGTCGATCA TGGCGGCAGC AACGACGTCG TCGCCGCCGA AAACCTCGCC CCACCGGCCG 900
AAGGCCTTAT TGGACGTGAC GATCAAGCTG GCCCGCTCAT ACCGGGAGGA CACCAGCTGG 960
AAGAAGAGGT TGGCGGCCTC GGGCTCAAAC GGAATGTAAC CGACTTCGTC AACCACCAGG 1020
AGCGGATAGC GGCCAAACCG GGTGAGTTCG GCGTAGATGC GCCCGGCGTG GTGAGCCTCG 1080
GCGAACCGTG CTACCCATTC GGCGGCGGTG GCGAACAGCA CCCGATGACC CGCCTGACAC 1140
GCGCGTATCG CCAGGCCGAC CGCAAGATGA GTCTTCCCGG TGCCAGGCGG GGCCCAAAAA 1200
CACGACGTTA TCGCGGGCGG TGATGAAATC CAGGGTGCCC AGATGTGCGA TGGTGTCGCG 1260
TTTGAGGCCA CGAGCATGCT CAAAGTCGAA CTCTTCCAAC GACTTCCGAA CCGGGAAGCG 1320
GGCGGCGCCG ATGCGGCCCT CACCACCATG GGACTCCCGG GCTGACACTT CCCGCTGCAG 1380
GCAGGCGGCC AGGTATTCTT CGTGGCTCCA GTTCTCGGCG CGGGCGCGAT CGGCCAGCCG 1440
GGACACTGAC TCACGCAGGG TGGGAGCTTT CAATGCTCTT GT 1482
INFORMATION FOR SEQ ID NO:19: SEQUENCE
CHARACTERISTICS:
LENGTH: 876 base pairs TYPE: nucleic acid 5 STRANDEDNESS: single CD) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: GAATTCGGCA CGAGCCGGCG ATAGCTTCTG GGCCGCGGCC GACCAGATGG CTCGAGGGTT CGTGCTCGGG GCCACCGCCG GGCGCACCAC CCTGACCGGT GAGGGCCTGC AACACGCCGA 120 CGGTCACTCG TTGCTGC TGG ACGCCACCAA CCCGGCGGTG t^1TGCCTACG ACCCGGCCTTj 180 CGCCTACGAA ATCGGCTACA TCGNGGAAAG CGGACTGGCC AGGATGTGC'G GGGAGAACCC 240 GGAGAACATC TTCTTCTACA TCACCGTCTA CAACGAGCCG TACGTGCAGC CGCCGGAGCC 300 GGAGAACTTC GATCCCGAGG GCGTGCTGGG GGGTATCTAC CGNTATCACG CGGCCACCGA 360 GCAACGCACC MACAAGGNGC AGATCCTGGC CTCCGGGGTA GCGATGCCCG CGGCGCTGCG 420 GGCAGCACAG ATGCTGGCCG CCGAGTGGGA TGTCGCCGCC GACGTGTGGT CGGTIGACCAG 480 TTGGGGCGAG CTAAACCGCG ACGGGGTGGT CATCGAGACC GAGAAGCTCC GCCACCCCGA 540 TCGGCCGGCG GGCGTGCCCT ACGTGACGAG AGCGCTGGAG AATGCTCGGG GCCCGt3TGAT 600 CGCGGTGTCG" GACTGGATGC GCGCGGTCCC CGAGCAGATC CGACCGTGGG TG-CCGGGCAC 660 ATACCTCACG TTGGCCACCG ACGGG1TCGG TTTTTCCGAC ACTCGGCCCG CCGGTCGTCG 720 .*TTACTTCAAC ACCGACGCCG AATCCCAGGT TGGTCGCGGT TITGGGAGGG GTTGGCCGGG 780 *.TCGACGGGTG MATATCGACC CATTCGGTGC CGGTCGTGGG CCGCCCGCCC AGTTACCCGG 840 ATTCGACGAA GGTGGGGGGT TGCGCCCGAN TAAGTT 876 INFORMATION FOR SEQ ID Ci) SEQUENCE CHARACTERISTICS: LENGTH: 1021 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi SEUNEDSRPIN*E DN:0 *TCCCG .CGAGA *TGCCAGGCAA CCCCT AGA A6 CAGA1TCATA ACGAATTCAC AGCGGCACA AGCGAAGACC TGCCGCAGTT GGCGAAGCA CATGCMATGA TGCTCGTGCA ACACCTGCT GTAGACACGG TGCGAAACCA GTTCGACAGJ CAGGAACGCA CAGTCACCGA CCAGGTCGG' GAT'TTCCTCG GCGAGCAGTT CATGCAGTG( TTGATGGCAA CCCTGGTGCG GGTTGCCGA] AACTTCGTCG CACGTGAAGT GGATGTGGCG GGGGGCCGCC TCTAGATCCC TGGGGGGGAT TCCAGCCAGG CCTFGGTGCG
GCCGGGGTGG
CGGNAAAAGT CGATGTCCTC GTACTCATCG GCTGCCGAGC GGTCAACGAG TTGCGGATAT GCGGTTGGCC CGACCGCCGT GGCCGCACTG AACAACGTCG GCAGGAGGGG TGGAGCCCGC CATCAACACC GCACGGGATC GATCTGCGGA GAGCGCCAGC AGTTGTTT1T CCACCAGCGA A CAATATGTCG CGATCGCGGT TTATTTCGAC T T17ACAGCC AAGCGGTCGA GGAACGAAAC C GACCGCGACC TTCGTGTCGA AA1TCCCGGC A CCCCGCGAGG CACTGGCGCT GGCGCTCGAT T CGGCTGACAC CGGTGGCCCG CGACGAGGGC 3 TTCTGCAGG AACAGATCGA AGAGGTGGCC CGGGCCGGGG CCAACCTGTT CGAGCTAGAG CCGIGCCGCAT CAGGCGCCCC GCACGCTGCC *CAGCGAGTGG TCCCGTTCGC CCGCCCGTCT TGAGTACCAA TCCAGGCCAC CCCGACCTCC ACGTTCCAGG AGTACACCGC CCGGCCCTGA TCCTT'TMCG GAGGCAGTGA GGGTCCCACG CTGGTCAGGT ATCGGGGGGT C1TGC-CGAGC CGGATCCGCA GACCGGGGGG GCGAAAACGA GGGGGGTGCG GGMATACCGA ACCGGTGTAG AGCGTT1CG GGTCATCGGN GGCNNTTAAG 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1021
T
INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 321 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
CGTGCCGACG AACGGAAGAA CACAACCATG AAGATGGTGA AATCGATCGC CGCAGGTCTG 60
ACCGCCGCGG CTGCAATCGG CGCCGCTGCG GCCGGTGTGA CTTCGATCAT GGCTGGCGGN 120
CCGGTCGTAT ACCAGATGCA GCCGGTCGTC TTCGGCGCGC CACTGCCGTT GGACCCGGNA 180
TCCGCCCCTG ANGTCCCGAC CGCCGCCCAG TGGACCAGNC TGCTCAACAG NCTCGNCGAT 240
CCCAACGTGT CGTTTGNGAA CAAGGGNAGT CTGGTCGAGG GNGGNATCGG NGGNANCGAG 300
GGNGNGNATC GNCGANCACA A 321
INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 373 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
TCTTATCGGT TCCGGTTGGC GACGGGTTTT GGGNGCGGGT GGTTAACCCG CTCGGCCAGC 60
CGATCGACGG GCGCGGAGAC GTCGACTCCG ATACTCGGCG CGCGCTGGAG CTCCAGGCGC 120
CCTCGGTG1GT GNACCGGCAA GGCGTGAAGG AGCCGTTGNA GACCGGGATC AAGGCGATTG 180
ACGCGATGAC CCCGATCGGC CGCGGGCAGC GCCAGCTGAT CATCGGGGAC CGCAAGACCG 240
GCAAAAACCG CCGTCTGTGT CGGACACCAT CCTCAAACCA GCGGGAAGAA CTGGGAGTCC 300
GGTGGATCCC MAGAAGCAGG TGCGCTTGTG TATACGTTGG CCATCGGGCA AGAAGGGGAA 360
CTTACCATCG CCG 373
INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 352 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
GTGACGCCGT GATGGGATTC CTGGGCGGGG CCGGTCCGCT GGCGGTGGTG GATCAGCAAC 60
TGGTTACCCG GGTGCCGCAA GGCTGGTCGT TTGCTCAGGC AGCCGCTGTG CCGGTGGTGT 120
TCTTGACGGC CTGGTACGGG TTGGCCGA1T TAGCCGAGAT CAAGGCGGGC GAATCGGTGC 180
TGATCCATGC CGGTACCGGC GGTGTGGGCA TGGCGGCTGT GCAGCTGGCT CGCCAGTGGG 240
GCGTGGAGGT TTTCGTCACC GCCAGCCGTG GNAAGTGGGA CACGCTGCGC GCCATNGNGT 300
1TGACGACGA NCCATATCGG NGATTCCCNC ACAThCGMAG TTCCGANGGA GA 352
INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 726 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
GAAATCCGCG TTCATTCCGT TCGACCAGCG GCTGGCGATA ATCGACGAAG TGATCMAGCC 60
GCGGTTCGCG GCGCTCATGG GTCACAGCGA GTAATCAGCA AGTTCTCTGG TATATCGCAC 120
CTAGCGTCCA GTTGCTTGCC AGATCGCTTT CGTACCGTCA TCGCATGTAC CGGTTCGCGT 180
GCCGCACGCT CATGCTGGCG GCGTGCATCC TGGCCACGGG TGTGGCGGGT CTCGGGGTCG 240
GCGCGCAGTC CGCAGCCCAA ACCGCGCCGG TGCCCGACTA CTACTGGTGC CCGGGGCAGC 300
C1TTCGACCC CGCATGGGGG CCCAACTGGG ATCCCTACAC CTGCCATGAC GACTTCCACC 360
GCGACAGCGA CGGCCCCGAC CACAGCCGCG ACTACCCCGG ACCCATCCTC GAAGGTCCCG 420
TGCTTGACGA TCCCGGTGCT GCGCCGCCGC CCCCGGCTGC CGGTGGCGGC GCATAGCGCT 480
CGTTGACCGG GCCGCATCAG CGMATACGCG TATAAACCCG GGCGTGCCCC CGGCMAGCTA 540
CGACCCCCGG CGGGGCAGAT TTACGCTCCC GTGCCGATGG ATCGCGCCGT CCGATGACAG 600
AAAATAGGCG ACGGT17GG CAACCGCTTG GAGGACGCTT GMAGGGAACC TGTCATGAAC 660
GGCGACAGCG CCTCCACCAT CGACATCGAC AAGGTTGTTA CCCGCACACC CGTTCGCCGG 720
ATCGTG 726
INFORMATION FOR SEQ ID
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 580 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID
CGCGACGACG ACGAACGTCG GGCCCACCAC CGCCTATGCG TTGATGCAGG CGACCGGGAT 60
GGTCGCCGAC CATATCCAAG CATGCTGGGT GCCCACTGAG CGACCTTTTG ACCAGCCGGG 120
CTGCCCGATG GCGGCCCGGT GMAGTCATTG CGCCGGGGCT TGTGCACCTG ATGAACCCGA 180
ATAGGGAACA ATAGGGGGGT GATTTGGCAG TTCAATGTCG GGTATGGCTG GAAATCCAAT 240
GGCGGGGCAT GCTCGGCGCC GACCAGGCTC GCGCAGGCGG GCCAGCCCGA ATCTGGAGGG 300
AGCACTCMAT GGCGGCGATG AAGCCCCGGA CCGGCGACGG TCCTTTGGAA GCAACTAAGG 360
AGGGGCGCGG CA1TGTGATG CGAGTACCAC TTGAGGGTGG CGGTCGCCTG GTCGTCGAGC 420
TGACACCCGA CGAAGCCGCC GCACTGGGTG ACGMACTCAA AGGCGTVACT AGCTAAGACC 480
AGCCCAACGG CGMATGGTCG GCGTTACGCG CACACCTFCC GGTAGATGTC CAGTGTCTGC 540
TCGGCGATGT ATGCCCAGGA GAACTC1TGG ATACAGCGCT 580
INFORMATION FOR SEQ ID NO:26: (i) SEQUENCE
CHARACTERISTICS:
LENGTH: 160 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear Cxi) SEQUENCE DESCRIPTION: SEQ ID NO:26: :AACGGAGGCG CCGGGGGT TGGCGGGGCC GGGGCGGTCG GCGGCAACGG CGGGGCCGGC GGTACCGCCG GGTTGT TCGG TGTCGGCGGG GCCGGTGGGG CCGGAGGCAA CGGCATCGCC 120 GGTGTCACGG GTACGTCGGC CAGCACACCG GGTGGATCCG 160 INFORMATION FOR SEQ ID NO:27: Ci) SEQUENCE
CHARACTERISTICS:
LENGTH: 272 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi SEUNEDSRPIN*E DN:7 GAACGT *GTGTA *TCCACGTTGC GTGGC*TCC C6 GCACCAA CCGAGGTGAT GACCCAAC GTTGTCGCA CGCTCGAGGC. GTCAGCGTC 620 AAGGCGATGG GAATCGACAA GCTGCGGGTA ATTCATACCG GATGGACCC CGTCGTCGCT 180 GAACGCGMAC AGTGGGACGA CGGCAACMAC ACGTTGGCGT TGGCSCCCGG TGTCGTTGTC 240 GCCTACGAGC GCMACGTACA GACCAACGCC CG 272 INFORMATION FOR SEQ ID NO:28: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 317 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: GCAGCCGGTG GTTCTCGGAC TATCTGCGCA CGGTGACGCA GCGCGACGTG CGCGAGCTGA AGCGGATCGA GCAGACGGAT CGCCTGCCGC GGTTCATGCG CTACCTGGCC GCTATCACCG 120 CGCAGGAGCT GMACGTGGCC GAAGCGGCGC GGGTCATCGG GGTCGACGCG GGGACGATCC 180 GTTCGGATCT GGCGTGGTTC GAGACGGTCT ATCTGGTACA TCGCCTGCCC GCCTGGTCGC 240 ***GG.AATCTGAC CGCGAAGATC AAGAAGCGGT CAAAGATCCA CGTCGTCGAC AGTGGCTTCG 300 CGGCCTGGTT GCG 317 INFORMATION FOR SEQ ID NO:29: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 182 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: GATCGTGGAG CTGTCGATGA ACAGCGTTGC CGGACGCGCG GCGGCCAGCA CGTCGGTGTA GCAGCGCCGG ACCACCTCGC CGGTGGGCAG CATGGTGATG ACCACGTCGG
CCTCGGCCAC
CGCTTCGGGC GCGCTACGAA ACACCGCGAC ACCGTGCGCG GCGGCGCCGG
ACGCCGCCGT
GG
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 308 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 120 180 182 S. S S
(xi) SEQUENCE DESCRIPTION: SEQ ID GATCGCGAAG TTGGTGAGC AGGTGGTCGA CGCGAAAGTC TGGGCGCCTG
CGAAGCGGGT
CGGCGTTCAC GAGGCGAAGA CACGCCTGTC CGAGCTGCTG CGGCTCGTCT
ACGGCGGGCA
GAGGTTGAGA TTGCCCGCCG CGGCGAGCCG GTAGCAAAGC TTGTGCCGCT
GCATCCTCAT
GAGACTCGGC GGTTAGGCAT TGACCATGGC GTGTACCGCG TGCCCGACGA TTT7GGACGCT CCGTTGTCAG ACGACGTGCT CGAACGCTT CACCGGTGAA GCGCTACCTC
ATCGACACCC
ACGT1TGG INFORMATION FOR SEQ ID NO:31: SEQUENCE CHARACTERISTICS: LENGTH: 267 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 120 180 240 300 308 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: CCGACGACGA GCAACTCACG TGGATGATGG TCGGCAGCGG CATTGAGGAC GGAGAGMATC CGGCCGAAGC TGCCGCGCGG CAAGTGCTCA TAGTGACCGG CCGTAGAGGG CTCCCCCGAT 120 GGCACCGGAC TATTCTGGTG TGCCGCTGGC CGGTAAGAGC GGGTAAAAGA ATGTGAGGGG 180 ACACGATGAG CMATCACACC TACCGAGTGA TCGAGATCGT CGGGACCTCG CCCGACGGCG 240 TCGACGCGGC AATCCAGGGC GGTCTGG 267 INFORMATION FOR SEQ ID NO:32: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 189 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: CTCGTGCCGA AAATGG GGGGACACGA TGAGAATC CACCTACCGA 'GiuiTwwAwA TCGTCGGGAC CTCGCCCGAC GGCGTCGACG CGGCAATCCA GGGCGGTCTG GCCCGAGCTG 120 *a.CGCAGACCAT GCGCGCGCTG GACTGGTTCG MAGTACAGTC AATTCGAGGC CACCTGGTCG 180 ACGGAGCGG 189 INFORMATION FOR SEQ ID NO:33: a' SEQUENCE CHARACTERISTICS: LENGTH: 851 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: CTGCAGGGTG GCGTGGATGA GCGTCACCGC GGGGCAGGCC GAGCTGACCG CCGCCCAGGT CCGGGTTGCT GCGGCGGCCT ACGAGACGGC GTATGGGCTG ACGGTGCCCC CGCCGGTGAT 120 CGCCGAGMAC CGTGCTGAAC TGATGATVCT GATAGCGACC AACCTCTTGG GGCAAAACAC 180 CCCGGCGATC GCGGTCAACG AGGCCGAATA CGGCGAGATG TGGGCCCAAG ACGCCGCCGC 240 GATGTTFGGC TACGCCGCGG CGACGGCGAC GGCGACGGCG ACGTTGCTGC CGTTCGAGGA 300 GGCGCCGGAG ATGACCAGCG CGGGTGGGCT CCTCGAGCAG GCCGCCGCGG TCGAGGAGGC 360 CTCCGACACC GCCGCGGCGA ACCAGTTGAT GAACAATGTG CCCCAGGCGC TGAAACAGIT 420 GGCCCAGCCC ACGCAGGGCA CCACGCC1TC TTCCAAGCTG GGTGGCCTGT GGAAGACGGT 480 CTCGCCGCAT CGGTCGCCGA TCAGCAACAT GGTGTCGATG GCCAACAACC ACATGTCGAT 540 GACCAACTCG GGTGTGTCGA TGACCAACAC CTTGAGCTCG ATGTTGAAGG GCT1TGCTCC 600 be GGCGGCGGCC GCCCAGGCCG TGCAAACCGC GGCGCAAAAC GGGGTCCGGG CGATGAGCTC 660 o GCTGGGCAGC TCGCTGGGUT CTTCGGGTCT GGGCGGTGGG GTGGCCGCCA AC1TGGGTCG 720 GGCGGCCTCG GTACGGTATG GTCACCGGA TGGCGAAA TATGCANAGT CTGGTCGGCG 
780 GAACGGTGGT CCGGCGTAAG GTTACCCCC GTTTCTGGA TGCGGTGAMC TTCGTCAACG 840 GAAACAGTVA **b851 INFORMATION FOR SEQ ID NO:34: i) SEQUENCE
CHARACTERISTICS:
LENGTH: 254 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: GATCGATCGG GCGGAAA11 GGACCAGATT CGCCTCCGGC GATAACCCMA TCAATCGAAC CTAGAT1TAT TCCGTCCAGG GGCCCGAGTA ATGGCTCGCA GGAGAGGMAC CTTACTGCTG CGGGCACCTG TCGTAGGTCC TCGATACGGC GGA/GGCGTC GACATTTTCC ACCGACACCC CCATCCAMAC GTTCGAGGGC CACTCCAGCT TGTGAGCGAG GCGACGCAGT CGCAGGCTGC GC1TGGTCMA GATC INFORMATION FOR SEQ ID Ci) SEQUENCE CHARACTERISTICS; LENGTH: 408 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 120 180 240 254
(xi SEQUENCE DESCRIPTION: SEQ ID CGGCACGAGG ATCCTGACCG AAGCGGCCGC CGCCAAGGCG AAGTCGCTGT GGGACGGGAC GATCTGGCGC TGCGGATCGC GG1TCAGCCG GGGGGGTGCG CTATAACCTT TrCTTCGACG ACCGGACGCT GGATGGTGAC CAAACCGCGG TGTCAGGTTG ATCGTGGACC GGATGAGCGC GCCGTATGTG GAAGGCGCGT CGTCGACACT ATTGAGAAGC AAGGNTTCAC CATCGACAAT CCCMCGCCA CGCGTGCGGG GATTCGTTCA ACTGATAAAA CGC TAGTACG ACCCCGCGGT TACQAGGACA CCAAGACCTG ACCGCGCTGG AAAAGCAACT GAGCGATG INFORMATION FOR SEQ ID, NO:36: SEQUENCE CHARACTERISTICS:
TGGACCAGGA
CTGGATTGCG
AGTTCGGTGG
CGATCGATTT
CCGGCTCCTG
GCGCAACACG
120 180 240 300 360 408 SS S 0O 0S S. S S @5 S 55 78 LENGTH: 181 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: GCGGTGTCGG CGGATCCGGC GGGTGGTGA ACGGCAACG CGGGGCCGGC GGGGCCGGCG GGACCGGCGC TAACGGTGGT GCCGGCGCCA ACGCCTGGHT GTTCGGGGCC GGCGGGTCCG 120 GCGGNGCCGG CACCAATGGT GGNGTCGGCG GGTCCGGCGG ATTTGTCTAC GGCAACG .GCG 180 G 181 INFORMATION FOR SEQ ID NO:37: SEQUENCE
CHARACTERISTICS:
LENGTH: 290 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: GCGGTG:CG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGTGTCGGC GGCCGGGGCG **GCGACGGCGT CTTTGCCGGT GCCGGCGGCC AGGGCGGCCT CGGTGGGCAG GGCGGCAATG 120 GCGGCGGCTC CACCGGCGGC AACGGCGGTC TTGGCGGCGC GGGCGGTGGC GGAGGCAACG 180 CCCCGGACGG CGGCTTCGGT GGCAACGGCG GTAAGGGTGG CCAGGGCGGN ATTGGCGGCG 240 GCACTCAGAG CGCGACCGGC CTCGGNGGTG ACGGCGGTGA CGGCGGTGAC 290 INFORMATION FOR SEQ ID NO:38: 79 SEQUENCE CHARACTERISTICS: LENGTH: 34 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: GATCCAGTGG CATGGNGGGT GTCAGTGGAA GCAT 34 INFORMATION FOR SEQ ID NO:39: SEQUENCE CHARACTERISTICS: LENGTH: 155 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: GATCGCTGCT CGTCCCCCCC TTGCCGCCGA CGCCACCGGT CCCACCGTTA CCGAACAAGC TGGCGTGGTC GCCAGCACCC CCGGCACCGC CGACGCCGGA GTCGAACAAT GGCACCGTCG 120 TATCCCCACC ATTGCCGCCG GNCCCACCGG CACCG 155 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 53 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear so (xi) SEQUENCE DESCRIPTION: SEQ ID ATGGCGTrCA CGGGGCGCCG GGGACCGGGC AGCCCGGNGG GGCCGGGGGG TGG 53 INFORMATION FOR SEQ ID NO:41: SEQUENCE
CHARACTERISTICS:
LENGTH: 132 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: GATCCACCGC GGGTGCAGAC GGTGCCCGCG GCGCCACCCC GACCAGCGGC GGCAACGGCG GCACCGGCGG CAACGGCGCG AACGCCACCG TCGTCGGNGG GGCCGGCGGG GCCGGCGGCA 120 AGGGCGGCAA CG 132 INFORMATION FOR SEQ ID NO:42: C0) SEQUENCE
CHARACTERISTICS:
LENGTH: 132 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: GATCGGCGGC CGGNACGGNC GGGGACGGCG GCAAGGGCGG NMACGGGGGC GCCGNAGCCA CCNGCCAAGA ATCCTCCGNG TCCNCCAATG GCGCGMATGG CGGACAGGGC GGC.AACGGCG 120 GCANCGGCGG CA 132 INFORMATION FOR SEQ ID NO:43: SEQUENCE CHARACTERISTICS: LENGTH: 702 base pairs (B8) TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID N0:43: CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTCAGT TTAGCGACGA TAATGGCTAT 120 180
AGCACTAAGG
AGATTTTGAA
CCATCACACC
CCGACAACAT
CGCTGCGCAA
ACAACGACGG
CGGCCGAACT
TCAAAGAAGC
AGGATGATCC
CAGGGCCAAC
GTGCGAACTC
GCGGGAATAC
CGCGGCCAAG
CGMAGGAACT
MCCGATACG
GGCAAGGAAG
GATATGACGC
GAGGTGGAGG
ACGGNGGNTA
CTGGCGGCCG
GNGTATGGCG
GTGCAGGCAG
CCGAGGGTGG
CTCGAAACGG
AGTCGCAGAC
CCCCGATGGC
AAAACGCCGC
GTGCCAAAGA
AGGTTGATGA
AATCGGCCGG
CCACGGCCGG
GCGACCAAGG
CGTGACGGTG
GGACCCACCG
CCAACAGNTG
GCGGCAGCGT
GGAGGCTGCG
GGCCGTCGGA
TGAACCCAAC
CGCATCGCTC
GATCAGCAAG
ACTGATGTCC
GTNTTGTCCG
CTGGCGACCT
ACCGCGCTGG
GGGGACAGTT
TTCATGGATC
GCGCACTGNG
240 300 360 420 480 540 600 660 702 GGGATGGGTG GMACACTTNC ACCCTGACGC TGCAAGGCGA CG INFORMATION FOR SEQ ID NO:44: (M SEQUENCE CHARACTERISTICS: LENGTH: 298 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: GAAGCCGCAG CGCTGTCGGG CGACGTGGCG GTCAAAGCGG CATCGCTCGG
TGGCGGTGGA
GGCGGCGGGG TGCCGTCGGC GCCGTTGGGA TCCGCGATCG GGGGCGCCGA
ATCGGTGCGG
CCCGCTGGCG CTGGTGACAT TGCCGGCTTA GGCCAGGGAA GGGCCGGCGG
CGGCGCCGCG
CTGGGCGGCG GTGGCATGGG AATGCCGATG GGTGCCGCGC ATCAGGGACA
AGGGGGCGCC
AAGTCCAAGG GTTCTCAGCA GGAAGACGAG GCGCTCTACA CCGAGGATCC
TCGTGCCG
INFORMATION FOR SEQ ID 0I) SEQUENCE CHARACTERISTICS: LENGTH: 1058 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 6 0 120 180 240 298 (xi) SEQUENCE DESCRIPTION: SEQ ID CGGCACGAGG ATCGAATCGC GTCGCCGGGA GCACAGCGTC
GCACTGCACC
CCATGACCTA CTCGCCGGGT AACCCCGGAT ACCCGCAAGC
GCAGCCCGCA
GAGGCGTCAC ACCCTCGTTC GCCCACGCCG ATGAGGGTGC
GAGCAAGCTA
TGAACATCGC GGTGGCAGTG CTCGGTCTGG CTGCGTACTT
CGCCAGCTTC
TCACCCTCAG TACCGMACTC GGGGGGGGTG ATGGCGCAGT
GTCCGGTGAC
CGGTCGGGGT GGCTCTGCTG GCTGCGCTGC TTGCCGGGGT
GGTTCTGGTG
AGAGCCATGT GACGGTAGTT GCGGTGCTCG GGGTACTCGG
CGTATTCTG
AGTGGAGGAG
GGCTCCTACG
CCGATGTACC
GGCCCAATGT
ACTGGGCTGC
CC TAAGGC CA
ATGGTCTCGG
120 180 240 300 360 420 CGACGTTTAA CAAGCCCAGC GCCTATTCGA CC TCATCGTGTT CCAGGCGGTT GCGGCAGTCC TG CCGCGCCGGC GCCGCGGCCC AAGTTCGACC CG ACGGGCAGTA CGGGGTGCAG CCGGGTGGGT AC CGGGACTGCA GTCGCCCGGC CCGCAGCAGT CTi ACGGCGGCTA TTCGTCCAGT CCGAGCCAAT CG CCCAGCCGCC GGCGCAGTCC GGGTCGCAAC AA1 CCGGCTTTCC GAGCTTCAGC CCACCACCAC CG( GTTCGGCTCC AGTCAACTAT TCMAACCCCA GCC GGGCGCCGGT CTMACCGGGC GTTCCCGCGT CC~ GGGTGTCAGC AAGCGCGGAC GATCCTCGTG CCG INFORMATION FOR SEQ ID NO:46: Ci SEQUENCE CHARACTERISTICS: LENGTH: 327 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
:GGTTGGGC
GCGCTCTT
TATGGACA
TACGGTCA
CGCAGCC
3GCAGTGG
CGCACCA
TCAGTGC
~GGGGCGA
GTCGCGC
AATTC
ATTGTGGG1T GTG1TGGCTT GGTGGAGACC GGCGCTATCA GTACGGGCGG TACGGGCAGT GCAGGGTGCT CAGCAGGCCG TCCCGGATAT GGGTCGCAGT ATACACTGCT CAGCCCCCGG GGGCCCATCC ACGCCACCTA CGGGACGGGG TCGCAGGCTG GCAGTCGTCG TCCCCCGGGG GTGTGCGCGA AGAGTGAACA 480 540 600 660 720 780 840 900 960 1020 1058 oo..
0*0* 0 0 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: CGGCACGAGA GACCGATGCC GCTACCCTCG CGCAGGAGGC AGGTMATTTC GAGCGGATCT CCGGCGACCT GAAAACCCAG ATCGACCAGG TGGAGTCGAC GGCAGGTTCG TTGCAGGGCC AGTGGCGCGG CGCGGCGGGG ACGGCCGCCC AGGCCGCGGT GGTGCGCTTC CAAGAAGCAG CCAATAAGCA GAAGCAGGAA CTCGACGAGA TCTCGACGMA TATTCGTCAG GCCGGCGTCC AATACTCGAG GGCCGACGAG GAGCAGCAGC AGGCGCTGTC CTCGCAAATG GGCTTCTGAC 120 180 240 300 84 CCGCTAATAC GAAAAGAAAC GGAGCAA 327 INFORMATION FOR SEQ ID NO:47: SEQUENCE
CHARACTERISTICS:
LENGTH: 170 base pairs TYPE: nucleic acid STRANDEDNESS; single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: CGGTCGCGAT GATGGCGTTG TCGMACGTGA CCGA1TCT1GT ACCGCCGTCG TTGAGATCAA CCAACAACGT G1TGGCGTCG GCAAATGTGC CGNACCCGTG GATCTCGGTG ATCTTGTTCT 120 TCTTCATCAG GAAGTGCACA CCGGCCACCC TGCCCTCGGN TACCTITCGG 170 INFORMATION FOR SEQ ID NO:48: Ci) SEQUENCE
CHARACTERISTICS:
LENGTH: 127 base pairs TYPE: nucleic acid STRANOEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: GATCCGGCGG CACGGGGGGT GCCGGCGGCA GCACCGCTGG CGCTGGCGGC.AACGGCGGGG. CCGGGGGTGG CGGCGGAACC GGTGGGTTGC TCTTCGGCAA CGGCGGTGCC GGCGGGCACG 120 GGGCCGT 127 INFORMATION FOR SEQ ID NO:49: SEQUENCE CHARACTERISTICS: LENGTH: 81 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: CGGCGGCAAG GGCGGCACCG CCGGCAACGG GAGCGGCGCG GCCGGCGGCA ACGGCGGCAA CGGCGGCTCC GC- TCAACG G 81 INFORMATION FOR SEQ 10 SEQUENCE CHARACTERISTICS: LENGTH: 149 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID GATCAGGGCT GGCCGGCTCC GGCCAGAAGG GCGGTAACGG AGGAGCTGCC GGATTGTTTG GCAACGGCGG GGCCGGNGGT GCCGGCGCGT CCAACCMAGC CGGTAACGGC GGNGCCGGCG 120 GAAACGGTGG TGCCGGTGGG CTGATCTGG 149 INFORMATION FOR SEQ ID NO:51: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 355 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi SEQUENCE DESCRIPTION: SEQ ID NO: 51: CGGCACGAGA TCACACCTAC CGAGTGATCG AGATCGTCGG GACCTCGCCC GACGGTGTCG ACGCGGNAAT CCAGGGCGGT CTGGCCCGAG CTGCGCAGAC CATGCGCGCG CTGGACTGGT 120 TCGAAGTACA GTCMATTCGA GGCCACCTGG TCGACGGAGC GGTCGCGCAC TTCCAGGTGA 180 CTATGAAAGT CGGCTTCCGC CTGGAGGATT CCTGAACC1T CAAGCGCGGC CGATAACTGA 240 GGTGCATCAT TAAGCGACTT 1TCCAGAACA TCCTGACGCG CTCGMACGC GGT-TCAGCCG 300 ACGGTGGCTC CGCCGAGGCG CTGCCTCCAA AATCCCTGCG ACAATTCGTC GGCGG 355 INFORMATION FOR SEQ ID NO:52: SEQUENCE
CHARACTERISTICS:
LENGTH: 999 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:52: ATGCATCACC ATCACCATCA CATGCATCAG GTGGACCCCA ACTTGACACG TCGCMAGGGA *CGATTGGCGG CACTGGCTAT CGCGGCGATG GCCAGCGCCA GCCTGGTGAC CGTTGCGGTG 120 CCCGCGACCG CCAACGCCGA TCCGGAGCCA GCGCCCCCGG TACCCACAAC GGCCGCCTCG 180 CCGCCGTCGA CCGCTGCAGC GCCACCCGCA CCGGCGACAC CTGTTGCCCG. CCCACCACCG 240 GCCGCCGCCA ACACGCCGAA TGCCCAGCCG GGCGATCCCA ACGCAGCACC TCCGCCGGCC 300 GACCCGAACG CACCGCCGCC ACCTGTCATT GCCCCAAACG CACCCCAACC TGTCCGGATC 360 GACAACCCGG TTGGAGGATT CAGCTTCGCG CTGCCTGCTG GCTGGGTGGA GTCTGACGCC 420 GCCCACTTCG ACTACGGTTC AGCACTCCTC AGi GGACAGCCGC CGCCGGTGGC CAATGACACC
CG
CTTTACGCCA GCGCCGAAGC CACCGACTCC
AC
GGTGAGTTCT ATATGCCCTA CCCGGGCACC
CGC
GCCAACGGGG TGTCTGGMAG CGCGTCGTAT
TAG
CCGMACGGCC AGATCTGGAC GGGCGTAATC
GGC
GGGCCCCCTC AGCGCTGGTT TGTGGTATGG
CTC
GGCGCGGCCA AGGCGCTGGC CGAATCGATC
CGG(
GCACCGGCTC CTGCAGAGCC CGCTCCGGCG
CCGC
CCGACGACAC CGACACCGCA GCGGACCT TA CCGG INFORMATION FOR SEQ ID NO: 53: (i SEQUENCE
CHARACTERISTICS:
LENGTH: 332 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
CAAAAC,
FATCGTGC
GCCGCGG
ATCAACC
GAAGTCA
TCGCCCG
GGGACCG
'CTTTGG
CGCCGG
CCTGA
CCGGGGACCC
GCCATTTCCC
TCGGCCGGCT
AGACCAAAAG
CCCGG1TGGG
CTCGGACATG
AGGAAACCGT
CTCGCTCGAC
AGTTCAGCGA
TCCGAGTAAG
CGGCGAACGC
ACCGGACGCC
CCAACAACCC GGTGGACAAG TCGCCCCGCC
GCCGGCGCCG
CCGGGGMAGT
CGCTCCTACC
480 540 600 660 720 780 840 900 960 999 b b S
(xi SEQUENCE DESCRIPTION: SEQ ID NO: 53: Met His His His His His His Met His Gin Val Asp Pro Asn Leu Thr 1 5 10 Arg Arg Lys Gly Arg Leu Ala Ala L eu Ala Ile Ala Ala Met Al a Ser 25 Ala Ser Leu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ala Asp Pro 40 Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr 55 Ala Ala Ala Ala Ali Ala Pro Pro Pro Asn Ala Pro 115 Phe Ala Leu 130 Tyr Gly Ser A 145 Gly Gin Pro P Leu Asp Gin L 1 Ala Ala Arg Le 195 Gly Thr Arg II 210 Ser Gly Ser Al 225 Pro Asn Gly Glr Ala Pro Asp Ala 260 Thr Ala Asn Asn 275 Ser Ile Arg Pro 290 Pro Pro Asn Thr Ala Asp 100 Gin Pro Pro Ala C la Leu L 1 ro Pro V 165 ys Leu Th 30 .u Gly Se e Asn Gi a Ser Tyi 23( n Ile Tr
F
245 SGly Pro Pro Val Leu Val v r a
I
r r 3 Ala 70 Pro Pro tal A ly T 1 eu S 50 1i A 'r Al r As 1 Gl 21 Tyr Thr Pro Asp Ala 295 Pro Ala Thr Pro Asn Ala Gin Pro 90 Asn Ala Pro Pro 105 \rg Ile Asp Asn I 120 rp Val Glu Ser 35 er Lys Thr Thr G 1 la Asn Asp Thr A 170 a Ser Ala Glu A 185 p Met Gly Glu Pt 200 u Thr Val Ser Le 5 r Glu Val Lys Ph 23! Gly Val Ile GI 250 G1n Arg Trp Phe 265 Lys Gly Ala Ala 280 Pro Pro Pro Ala
F
r
E
I
Val Ala 75 Gly Asp Pro Pro 'ro Val sp Ala I 140 ly Asp P 55 -g Ile V a Thr A e Tyr Me 2C u Asp Al 220 SSer As SSer Pr Val Val Lys Ala 285 Pro Ala 300 Pro Pro Va Gly 125 Ala H 'ro P al L sp SE 19 ;t Pr a As p Pr o Al ITrp 270 Leu Pro 1
I
3 r e c n Pro Pro Asn Ala A le Ala P ly Phe S( is Phe As o Phe Pr 16 u Gly Ar 175 r Lys Ala 0 J Tyr Pro SGly Val Ser Lys 240 Ala Asn 255 Leu Gly Ala Glu Ala Pro Pro la ro er
*P
Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly 305 310 315 Glu Val Ala Pro Thr 320 Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala 325 330 INFORMATION FOR SEQ ID NO:54: SEQUENCE CHARACTERISTICS: LENGTH: 20 amino acids TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54: Asp 1 Pro Val Asp Ala Val Ile Asn Thr Thr Xaa Asn Tyr Gly Gln Val 10 Val Ala Ala Leu INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 1 5 10 INFORMATION FOR SEQ ID NO:56: SEQUENCE
CHARACTERISTICS:
LENGTH: 19 amino acids TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 1 5 10 Glu Gly Arg INFORMATION FOR SEQ ID NO:57: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear 1 5 10 INFORMATION FOR SEQ ID NO:58: SEQUENCE
CHARACTERISTICS:
LENGTH: 14 amino acids TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: Asp Ile Gly Ser Glu Ser Thr Glu 1 5 Asp Gln Gln Xaa Ala Val INFORMATION FOR SEQ ID NO:59: SEQUENCE CHARACTERISTICS: LENGTH: 13 amino acids TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
Ala 1 Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 17 amino acids TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Ala Ala Ala Ala Pro Pro 1 5 10
Ala INFORMATION FOR SEQ ID NO:61: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61: Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 1 5 10 INFORMATION FOR SEQ ID NO:62: SEQUENCE
CHARACTERISTICS:
LENGTH: 30 amino acids TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62: Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gin Gin Thr Ser 1 5 10 Leu Leu Asn Asn Leu Ala Asp Pro Asp Val Ser Phe Ala Asp 25 INFORMATION FOR SEQ ID NO:63: SEQUENCE CHARACTERISTICS: LENGTH: 187 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63: Thr Gly Ser Leu Asn Gin Thr His Asn Arg 1 5 10 Asn Thr Thr Met Lys Met Val Lys Ser Ilie 25 Ala Ala Ala Ilie Gly Ala Ala Ala Ala Gly 40 Gly Gly Pro Val Val Tyr Gin Met Gin Pro 55 Leu Pro Leu Asp Pro Ala Ser Ala Pro Asp 70 Leu Thr Ser Leu Leu Asn Ser Leu Ala Asp F 85 90 Asn Lys Gly Ser Leu Val Glu Gly Gly Ile G 100 105 Ilie Ala Asp His Lys Leu Lys Lys Ala Ala G 115 120 Leu Ser Phe Ser Val Thr Asn Ilie Gin Pro A 130 135 Thr Ala Asp Val Ser Val Ser Gly Pro Lys L 145 1501 Gin Asn Val Thr Phe Val Asn Gin Gly Gly Tr 165 170 Ser Ala Met Glu Leu Leu Gin Ala Ala Gly Xa 180 185 INFORMATION FOR SEQ ID NO:64: SEQUENCE CHARACTERISTICS: Ar Al Va Va Val1 15 'ro ly Ilu 1la 55 -p a g Ala a Ala 1 Thr 1 Val Pro Asn Gly His G Ala A 140 Ser S4 Met L As
GI
Sea PhE Thr tal -hr ly la ;n Glu Arg y Leu Thr r Ilie Met SGly Ala Ala Ala Ser Phe Glu Ala A 110 Asp Leu P Gly Ser A Pro Val Tt Ser Arg A] 175 Lys Al a Al a Pro 31 n 1i a rg ro Ila ir Ia
(A)
(8)
(C)
LENGTH: 148 amino acids
TYPE: amino acid STRANDEDNESS; single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64: Asp Giu Val Thr Val Glu Thr Thr Sen Val Phe Arg Ala Asp 1 .5 in Phe Leu Ser Glu Leu Asp Ala Pro Ala Gin Ala Gly Thr Glu Ser Ala Val Sen Val Lys Arg Giy Val Giu Gly Leu Pro Pro Gly Sen Ala Leu Leu Gly Pro Asn Ala Gly Ser Arg 55 Phe Leu Leu Asp Gin Ala Ilie Thr Ser Giy Ang His Pro Asp 70 Sen Asp Ilie Phe Leu Asp Asp Val Thr Sen Arg Arg His Al a Giu Phe Arg Leu Glu Asn Msn Glu Phe Asn Vai 90 Val Asp Va] Gly Ser Leu Asn 100 Gly Thr Tyr Va] Asn Ang. Giu Pro Vai 105 -110 Asp Ser Ala 115 Val Leu Ala Msn Gly 120 Asp Giu Vai Gin Ilie Gly Lys Leu 125 Ang Leu 130 Val Phe Leu Thr Gly 135 Pro Lys Gin Gly Giu Asp Asp Gly Sen Thr Giy Gly Pro 145 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 230 amino acids TYPE: amino acid STRANOEDNESS: single TOPOLOGY: linear (xi Th 1 Gl2 Gi r Asr :Phe .:65 Leu Gi u 0. .Asp 0 Sen 0. 145 Ile Al a SEQUENCE DESCRIPTION: SEQ ID r Sen Asn Arg Pro Ala Arg Arg Gly Arg 5 10 y Pro Asp Ang Sen Ala Sen Leu Sen Leu 25 iAng Asp Ala Leu Cys Leu Ser Ser Thr 40 Leu Pro Pro Ala Ala Gly Giy Ala Ala 55 Asp Val Arg Ilie Lys Ilie Phe Met Leu 70 Cys Cys Sen Gly Val Ala Thr Ala Ala 90 Leu Lys Gly Thn Asp Thr Gly Gin AlaC 100 105 Pro Ala Tyr Asn Ilie Asn Ilie Ser Leu P 115 120 Lys Sen Leu Glu Asn Tyr Ile Ala Gin T 130 135 Ala Ala Thn Sen Sen Thn Pro Ang Giu A 150 1 Thn Sen Ala Thr Tyr Gin Sen Ala Ilie Pi 165 170 Val Val Leu Xaa Val Tyr His Msn Ala GI 180 185 Arg Al Val Ar Gin Il Asn Tyr V'al Thr 75 'no Lys .ys Gin 'no Sen hr Ang 140 la Pro 55 ro Pro ly Gly a Pro Arg Asp g His Arg Ang e Sen Ang Gin Sen Ang Ang Ala Val Val Thr Tyr Cys Ilie Gin Met I 110 Tyr Tyn Pro 125 Asp Lys Phe L Tyr Giu Leu A 1 Ang Gly Thr G 175 T hr His Pro T 190 Thn Gi n Sen Msn Leu B0 31 u )en eu sn i n hr Thr Thr Tyr Lys Ala Phe Asp Trp Asp Gln Ala Tyr Arg Lys Pro Ile 195 200 205 Thr Tyr'Asp Thr Leu Trp Gln Ala Asp Thr Asp Pro Leu Pro Val Val 210 215 220 Phe Pro Ile Val Ala Arg 225 230 
INFORMATION FOR SEQ ID NO:66: SEQUENCE
CHARACTERISTICS:
LENGTH: 132 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66: Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe 1 5 10 Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser 25 Gly Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly 40 Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val 55 Val Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val 65 70 75 Ile Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala 90 Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp 100 105 110 Gln Thr Lys Ser Gly Gly Thr Arg 115 120 Thr Gly Asn Val Thr Leu Ala Glu 125 Gly Pro Pro Ala INFORMATION FOR SEQ ID NO:67: (i) SEQUENCE
CHARACTERISTICS:
CA) LENGTH: 100 amino acids TYPE: ami no aci d STRANDEDNESS: single TOPOLOGY; linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:67: Val1 1 Pro Leu 2 .ng Sen Pro Sen Met Sen Pro Ser Lys Cys Leu Ala Ala Ala Gin Ang Asn Pro Val Ile Ang Arg Arg 25 Arg Leu Sen Asn Pro Pro Sen Ala Gly Pro Ang Lys Tyr Ang Sen Met Pro Sen Pro Ala Thr .40 Ala Met Al a 50 Ang ValI Ang Arg Ang 55 Ala Ilie Trp Arg Gly Pro Ala Thn Xaa Sen Ala Gly Met Ala Arg Val Ang Ang Trp Val Met Pro Xaa Ie Gin Sen Thn Sen Giu Ang Lys 100 Xaa Ilie Ang Xaa Xaa Gly Pro Phe Asp Asn 90 Ang Gly INFORMATION FOR SEQ ID NO:68: Ci SEQUENCE
CHARACTERISTICS:
LENGTH: 163 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(xi Met Leu Arg Ile Leu 65 Thr A Arg A Asp A Ala Al His Ar 145 Asp Ar
INFORMA
SEQUENCE DESCRIPTION: SEQ ID NO:68: Thr Asp Asp Ile Leu Leu Ile Asp Thr Asp 10 Thr Leu Asn Arg Pro Gin Ser Arg Asn Al a 25 Asp Arg Phe Phe Ala Xaa Leu Xaa Asp Ala 40 sp Val Val Ilie Leu Thr Gly Ala Asp Pro 30 sp Leu Lys Vai Ala Gly Arg Ala Asp Arg 70 75 ]ia VIal Gly Gly His Asp Gin Ala Gly Asp 90 rg Gly His Aig Arg Ala Arg Thr Gly Ala V 100 105 rg Leu Arg Ala Arg Pro Leu Arg Arg His P 115 120 Ia Ala His Leu Gly Thr Gin Cys Val Leu A 30 1351 g Xaa Gly Pro Val Asp Giu Pro Asp Arg Ai 150 155 g Arg TION FOR SEQ ID NO:69:
GI
Le
GL
Val1 kl a ~rg al ro la u Arg Val u Ser Ala Xaa Asp Phe Cys Ala Gly Arg Asp Leu Arg 110 Arg Pro C 125 Ala Lys G Leu Pro
V
Ar Al As Al His Gin -iis ily al g Thr a Leu p Asp Gly Leu Arg Pro Gly Arg Arg 160 SEQUENCE CHARACTERISTICS: LENGTH: 344 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi Me 1 G1 Lei Al S C 0 65 Cys Gi y Gly S0* Pro Leu 145 Leu i) SEQUENCE DESCRIPTION: SEQ ID NO:69: ?t Lys Phe Val Asn His Ile Glu Pro Val 5 10 y Ala Val Ala Glu Val Tyr Ala Glu Ala 25 u Pro Glu Pro Leu Ala Met Leu Ser Pro 40 a Gly Trp Ala Thr Leu Arg Glu Thr Leu 50 55 Gly Arg Lys Glu Ala Val Ala Ala Ala 70 Pro Trp Cys Val Asp Ala His Thr Thr 85 90 Thr Asp Thr Ala Ala Ala lle Leu Ala C 100 105 Asp Pro Asn Ala Pro Tyr Val Ala Trp A 115 120 Ala Gly Pro Pro Ala Pro Phe Gly Pro A 130 135 Gly Thr Ala Val Gin Phe His Phe Ile A 150 1 Leu Asp Glu Thr Phe Leu Pro Gly Gly Pi 165 170 Al Ar As Le Val 75 let 1ly la sp la 55 ro la Pro Ar g Arg G1 p Glu G1 u Val G1 SAla Ali SLeu Tyr Thr Ala Ala Gly 125 Val Ala 140 Arg Leu Arg Ala ,g Arg Ala Gly u Phe Gly Arg y Leu Leu Thr y Gin Val Pro i Ser Leu Arg SAla Ala Gly Pro Ala Ala 110 Thr Gly Thr Ala Glu Tyr Val Leu Val 160 G1n Gin Leu 175 Met Arg Arg Ala Gly Gly Leu Val 180 Phe Ala Arg Lys Val 185 Ang Ala Glu 190 Thr Leu Pro His Arg Pro 195 Asp Asp Leu 210 Gly Arg Ser Thr Arg 200 Arg Leu Giu Pro Arg 205 Ala Trp Ala Thr 215 Pro Ser Glu Pro Ala Thr Ala Phe Al a 225 Ala Leu Ser His His Leu Asp Thr Ala 230 Pro His 235 Leu Pro Pro Pro 240 Thr Ang Gln Val Ang Arg Val Val Gly 250 Ser Trp His Gly Glu Pro 255 Met Pro Met Ala Asp Leu 275 Ser 260 Ser Arg Trp Thr As n 265 Glu His Thr Ala Glu Leu Pro 270 Gly Leu Ala His Ala Pro Thr Arg 280 Leu Ala Leu Leu Thr 285 Pro His 290 Gln Val Thr Asp Asp Val Ala Ala Ala Arg Sen Leu Leu Asp 305 Thr Asp Ala Ala Leu 310 Val Gly Ala Leu Al a 315 Trp Ala Ala Phe Thr 320 Ala Ala Arg Arg Ile 325 Gly Thr Trp Ilie Gly Ala Ala Ala Glu Giy Gin 330 335 Val Sen Arg Gin Asn Pro Thr Gly 340 INFORMATION FOR SEQ ID Ci) SEQUENCE CHARACTERISTICS: LENGTH: 485 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY; linear
L
A
I.
GI
Ar Th Asj Thr Asn 145 Ie Gi u Arg Pro xi SEQUENCE DESCRIPTION: SEQ ID NO: \sp Asp Pro Asp Met Pro Gly Thr Val Ala 10 .eu Gly Arg Gly Ile Ala Pro Val Giu Asp 25 ia Arg Leu Gly Giu Ala Gly Leu Asp Asp 40 le Tyr Arg Gin Arg Arg Ala Giu Leu Arg 55 Fy Val Arg Asp Glu Leu Lys Leu Ser Leu g Giu Arg Tyr Leu Leu His Asp Glu Gin G 90 r Gly Glu Leu Met Asp Arg Ser Ala Arg C 100 105 P Gin Tyr Glu Pro Gly Ser Ser Arg Arg T 115 120 Leu Leu Arg Asn Leu Glu Phe Leu Pro Ai 130 135 ISer Gly Thr Asp Leu Gly Leu Le u Ala Gi 150 .15 Glu Asp Ser Leu Gin Ser Ilie Phe Ala Th 165 170 Leu Gin Arg Ala Giy Gly Gly Thr Gly Ty 180 185 Pro Ala Gly Asp Arg Va] Ala Ser Thr Gi: 195 200 Val Ser Phe Leu Arg Leu Tyr Asp Ser Al~ 210 215
L
I
T1 y '5 r r .ys Ala Va 1 le Gin Asp al Ala Arg ir Ala Lys a Ala Val y Arg Pro s Val Ala A 1 Ala Giu A 125 Ser Pro Tf 140 Cys Phe V~ Leu Gly GI Ala Phe Se 19 Gly Thr Al 205 Ala Gly Va1 220 3
VI
Al Th hr n 0 a ]ia Asp ys Va]I 0 a I Tyr a Leu ir Va] L a Glu S iAla G Phe A Leu ME Leu Pr 16 Ala Al 175 His Le Ser Gi Val Set Al a Gi u Ile Leu eu ;er lu la 0 a
U
Y
Met Gly 225 His Pro Glu Leu Arg Ala Gly Lys I 290 Cys, Lys A 305 Thr Ile A~ *Thr Asn Pr .Leu Gly Se *35 Trp Asp Ar 370 Asp Val 11E 385 *Ala Arg Ala *Leu Leu Ala Leu Ala Thr 435 Ser Arg Arg 450 Gly Arg Arg Ang 230 Asp Ile Cys Asp 245 Pro His Phe Asn 260 Val Giu Arg Asn le Val -Ala Ang I la Ala His Ala G 310 sn Arg Ala Asn P 325 0o Cys Gly Glu Vj 340 r Ile Asn Leu Al 5 g Leu Glu Giu Va 37, SAsp Val Sen Ar( 390 *Thr Arg Lys IIE 405 Ala Leu Gly Ilie 420 Ang Leu Met Ang Leu Ala Glu Glu 455 Gly Phe Leu I Giy L 2 let P ~95 ly Gl ro Va 1l Pr a Arc 36( 1 Ala 5 g Tyr Gly Pro Arg 440 Arg 102 Ala Cys Met Val Thr Ala 250 er Val Gly 265 eu His Arg 80 ro Ala Ala G ly Asp Pro G 3 1 Pro Gly Ai 330 o Leu Leu Pr 345 1 Met Leu Al IGly Val Al] Pro Phe Prc 395 Leu Gly Val 410 Tyr Asp Ser 425 Ilie Gin Gin Gly Ala Phe Al a 235 Lys ValI eu ;iu L 3 ly L4 15 rg GI o Ty a As a Val 38( Glu Met Gi u Al a Pro 460 Val Leu Asp Val Ala Giu Sen Pro 255 Thr Asp Al a Phe 270 Asn Pro Arg 285 eu Phe Asp Ala I 00 eu Val Phe LeuA 3 y Arg Ilie Glu A 335 'r Giu Ser Cys Ai 350 p Gly Ang Val As 365 Arg Phe Leu As Leu Gly Giu Al 40( Gly Leu Ala Glu 415 Glu Ala Val Ang 430 Ala His Thr Ala 445 Ala Phe Thr Asn Ser 240 Ser Leu Thr Sp ia p p
P
a Ser 465 Arg Phe Ala Arg Ser Gly Pro Arg Arg 470 Asn Ala Gin Val 475 Thr Ser 480 Val Ala Pro Thr Gly 485 INFORMATION FOR SEQ ID NO:71: SEQUENCE CHARACTERISTICS: LENGTH: 267 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:71: Gly Val Ile Val Leu Asp Leu Glu Pro Arg Gly Pro Leu Pro Thr Glu Val Val lle Tyr Trp Val Gly Ile 35 Arg Arg Arg Gly Leu Ala 20 Leu Gly Ile Ala Ala Val Ala IleVal 40 Ile Ala Phe Val Asp Ser Ser Ala Gly Ala 50 Lys Pro Val Ser Ala 55 Asp Lys Pro Ala Ser Ala Gin Ser His Pro Gly Ser Pro Ala Gin Ala Pro Gin Ala Gly Gin Thr Glu Gly Asn Ala Ala Ala Ala Pro Pro G1n Gly 90 Gin Asn Pro Glu Thr Pro Thr Pro Thr Asp Cys Pro 115 Ala 100 Ala Val Gin Pro Pro Pro Val Leu Lys Glu Gly Asp 110 Asp Ser Thr Leu Ala 120 Val Lys Gly Leu Thr Asn Ala Pro 125 Gin Tyr Tyr Val Gly Asp Gin Pro Lys Phe Thr Met Val Val Thr Asn Ile Gly Leu Val Sen Cys Lys 150 Arg Asp Val G1 y 155 Ala Ala Val Leu Al a 160 Ala Tyr Val Tyr Sen 165 Leu Asp Asn Lys Leu Trp Sen Asn Leu Asp 175 Cys Ala Pro Gin Val Thr 195 Asn Glu Thr Leu Val 185 Lys Thr Phe Sen Pro Gly Glu 190 Ala Pro Arg Thr Ala Val Thr Thr Gly Met Gly Ser 205 Cys Pro 210 Leu Pro Ang Pro Al a 215 Ile Gly Pro Gly Tyr Asn Leu Va] Val1 225 Gin Leu Gly Asn Leu 230 Arg Ser Leu Pro Val 235 Pro Phe Ile Leu As n 240.
Gln Pro Pro Ala Pro Pro Pro Pro 245 Pro Gly Pro Val Ala Pro Gly Pro Ala Gln 255 Pro Glu Ser Pro Ala Gln Gly Gly 260 265 INFORMATION FOR SEQ ID NO:72: SEQUENCE CHARACTERISTICS: LENGTH: 97 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72: Leu Ile Ser Thr Gly Lys Ala Sen His Ala 1 5 10 Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys 25 Sen Leu Gly Val Gin Val Ie Val Glu Val Val Ala Gly Gly Ala Ala Ala Asn Ala Gly 40 Val Pro Lys Gly Val Val Val Thr Lys VaT Asp Asp Arg Pro Ile Asn Ser Ala Asp Leu Val Ala Ala Ang Ser Lys Ala Pro Gly Ala Thr Val Al a 75 Leu Thr Phe Gin Asp Pro Sen Gly Gly Ser Arg Thr Val Gln Gin INFORMATION FOR SEQ ID NO:73: Thr Leu Gly Lys Ala Glu Ci) SEQUENCE CHARACTERISTICS: CA) LENGTH: 364 amino acids TYPE: amino acid CC) STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:73: Gly Ala Ala Val Sen Leu Leu Ala Ala Gly 1 5 10 Cys Gly Gly Gly Thr Asn Sen Ser Sen Ser 25 Thr Leu Val Leu Thr Al a Gly Ala Gly Gly Thr Sen Ser Gly Sen Gly Sen Val His Cys Gly Gly Lys Lys Glu Leu His Sen 40 Thn Al a Gin Giu Asn Ala Glu Gin Phe Val Ala Tyr Val Ang Sen Cys Pro Gly Tyr Th r 70 Leu Asp Tyr Asn Al a Asn 75 Gly Ser Gly Al a Gly Val Thr Gin Phe Leu Asn Asn Glu Thr Asp Phe Ala Gly Ser Asp 90 Val Pro Leu Asn Pro Ser Thr Gly Gin Pro Asp Arg Ser Ala Glu Arg 100 105 110 Cys Gly Ser Pro Ala Trp Asp Leu Pro Thr Val Phe Gly Pro Ile Ala 115 120 125 Ile Thr Tyr Asn Ile Lys Gly Val Ser Thr Leu Asn Leu Asp Gly Pro 130 135 140 Thr Thr Ala Lys I.le Phe Asn Gly Thr Ile Thr Val Trp Asn Asp Pro 145 150 155 160 Gin Ile Gin Ala Leu Asn Ser Gly Thr Asp Leu Pro Pro Thr Pro lle 165 170 175 Ser Val le Phe Arg Ser Asp Lys Ser Gly Thr Ser Asp Asn Phe Gin 180 185 190 Lys Tyr Leu Asp Gly Val Ser Asn Gly Ala Trp Gly Lys Gly Ala Ser 195 200 205 Glu Thr Phe Ser Gly Gly Val Gly Val Gly Ala Ser Gly Asn Asn Gly 210 215 220 Thr Ser Ala Leu Leu Gin Thr Thr Asp Gly Ser Ile Thr Tyr Asn Glu 225 230 235 240 Trp Ser Phe Ala Val Gly Lys Gin Leu Asn Met Ala Gin Ile Ile Thr 245 250 255 Ser Ala Gly Pro Asp Pro Val Ala Ile Thr Thr Glu Ser Val Gly Lys 260 265 270 Thr Ile Ala Gly Ala Lys Ile Met Gly Gin Gly Asn Asp Leu Val Leu 275 .280 285 Asp Thr Ser Ser Phe Tyr Arg Pro Thr Gin Pro Gly Ser 
Tyr Pro Ile 290 295 300 Val Leu Ala Thr Tyr Glu Ile Val Cys Ser Lys Tyr Pro Asp Ala Thr 305 310 315 320 Thr Gly Thr Ala Val Arg Ala Phe Met Gin 325 330 Ala Ala Ile Gly Pro Gly 335 Gin Glu Gly G1n Ala Lys 355 Leu 340 Asp Gln Tyr Gly Ser 345 Ile Pro Leu Pro Lys Ser Phe 350 Leu Ala Ala Ala Val 360 Asn Ala Ile Ser INFORMATION FOR SEQ ID NO:74: SEQUENCE CHARACTERISTICS: LENGTH: 309 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:74: Gin Ala Ala Ala Gly Arg Ala Val Arg Arg 1 5 10 Thr Gly His Ala Glu Asp Ala Val Gin Thr His Val Val Arg Gin Asp Arg Leu His His 25 Gly Cys Arg Arg Ala Gin Asp Arg Ala Ser Val 40 Ser Ala Thr Ser Ala Arg Pro Pro Arg Arg His Pro Ala G1n 55 Gly His Arg Arg Arg Val Ala Pro Ser 9 Gly 65 Gly Arg Arg Arg Pro 70 His Pro His His Gin Pro Asp Asp Arg Arg Asp Arg Pro Leu Leu Asp Arg Thr Gin Pro Ala Glu His Pro Asp Pro His Arg 100 Arg Gly Pro Ala Asp Pro Gly Arg Val 105 Arg Gly Arg 110 Gly Arg Leu Arg Arg Val Asp Asp Gly Arg Leu Gin Pro Asp Arg Asp 9* 9.
115 120 Ala Asp His Gly Ala Pro Val Arg 130 135 Gin His Arg Gly Gly Pro Val Phe 145 150 Cys Ala His Arg Arg Gly His Arg 165 Asp Val Leu Arg Ala Gly Leu Arg 180.
Ala Val Glu Asn Leu His Arg Gly 195 200 Phe Arg Pro Ilie Arg Arg Gly Ala 210 215 Ala Gly Pro Gin Gly Arg Leu His L.
225 230 Leu Pro Ala Arg Ala Gly Gin Gin G 245 Arg Ala Gly Gly Ala Glu Arg Ala A 260 2 His Gin Gly Gly His Asp Pro Gly A 275 280 Ala Gly Val Ala His Ala Ala Ala G 290 295 Asn Arg Pro Arg Arg 305 INFORMATION FOR SEQ ID Ci) SEQUENCE CHARACTERISTICS: CA) LENGTH: 580 amino acids CB) TYPE: amino acid CC) STRANDEDNESS: single TOPOLOGY: linear Gly Arg Gly Val Arg Arg 155 Arg Val Ala 170 Val Giu'Arg 185 Ser Gi n Arg krg Leu Pro .eu Asp GlyA 235 1n Pro Ser S 250 sp Pro Gly G 65 rg Gin Gly A ly Pro Arg Ai 34 Pr 14 Va Al Let Aa kla 1 a er In Ila rg 00 125 o His *0 1 Pro a Pro .i Arg Asp 205 Arg A Gly P Ala G Arg G 2 Gin Ai 285 Ala Al Ar Gil
GI!
Prc 190 ii y ~rg ro rg a g Gi y Va 'Gli 17~ )Val Arc Ser Ser Gly 255 Arg Gly Val1 y Val I Arg 160 ni Gly IAla Val Arg Pro 240 Arg Hi s, Thr Arg (xi Sen Ang Sen Sen Arg SEQUENCE DESCRIPTION: SEQ ID Ala Val Trp Cys Leu Asn Gly Phe Thn 10 Cys Arg Val Arg Ala Sen Gly Trp Arg 25 Thn Thr Ala Asp Cys Cys Al a Ser Lys 40 Pro Leu Glu Ang Arg Phe Thr Cys Cys 55 Phe Arg Sen Phe Pro Val Arg Arg- Leu Gly Sen Th n Sen A~la Ang Sen Pro Pro Leu Hi s Asn Thn Al a Gly Ang Ang Gin Va I Al a His Tnp, Al a Gly Arg Gly Cys Al a Cys Thr a.
Sen Ang Thn Leu Gly Val Arg Arg Thn Leu Sen Gin 90 Pro Al a Gi u Pro 145 Tyr Pro Arg Sen Gi u 130 Gi n Sen Thr Al a Pro 115 Gin Gin Gin Gin Gin 100 Arg Pro Pro Gin Tyr 180 Pro Met Sen Gly Phe 165 Arg Sen Al a Asp Thr 150 Asp .Gl n Cys Lys Met 135 Pro Trp Pro Al a Leu 120 Thr Gly Ang Tyr Val1 105 Al a Asn Tyr Tyr Gl u 185 Thr Arg His Al a Pro 170 Al a Val1 Val1 Pro Gin 155 Prno Leu Gi u Val1 Ang 140 Gly Sen Gly Trp, Sen Gly 125 Tynr Gin Pro Gly Asn His 110 Leu Sen Gin Pro Thr 190 Leu Thn Val1 Pro Gi n Pro 175 Ang Sen Hi s Gi n Pro Th n 160 Gin Pro Giy Leu Ilie Pro Gly Val Ie Pro Thn Met Thn Pro Pro Pro Giy Met 195 20020 205 Val Arg Gin Arg Pro Arg Ala Gly Met Leu Ala Ile Gly Ala Val Thr 210 215 220 Ile Ala Val Val Ser Ala Gly Ile Gly Gly Ala Ala Ala Ser Leu Val 225 230 235 240 Gly Phe Asn Arg Ala Pro Ala Gly Pro Ser Gly Gly Pro Val Ala Ala 245 250 255 Ser Ala Ala Pro Ser Ile Pro Ala Ala Asn Met Pro Pro Gly Ser Val 260 265 270 Glu Gin Val Ala Ala Lys Val Val Pro Ser Val Val Met Leu Glu Thr 275 280 285 Asp Leu Gly Arg Gin Ser Glu Glu Gly Ser Gly Ile Ile Leu Ser Ala 290 295 300 Glu Gly Leu Ile Leu Thr Asn Asn His Val Ile Ala Ala Ala Ala Lys 305 310 315 320 ro Pro Leu Gly Ser Pro Pro Pro Lys Thr Thr Val Thr Phe Ser Asp 325 330 335 0 Gly Arg Thr Ala Pro Phe Thr Val Val Gly Ala Asp Pro Thr Ser Asp 340 345 350 Ile Ala Val Val Arg Val Gin Gly Val Ser Gly Leu Thr Pro Ile Ser 355 360 365 Leu Gly Ser Ser Ser Asp Leu Arg Val Gly Gin Pro Val Leu Ala Ile 370 375 380 Gly Ser Pro Leu Gly Leu Glu Gly Thr Val Thr Thr Gly Ile Val Ser 385 390 395 400 Ala Leu Asn Arg Pro Val Ser Thr Thr Gly Glu Ala Gly Asn Gin Asn 405 410 415 Thr Val Leu Asp Ala Ile Gin Thr Asp Ala Ala Ile Asn Pro Gly Asn 420 425 430 Ser Gly Gly Ala Leu Val Asn Met Asn Ala Gin Leu Val Gly Val Asn 435 440 445 Ser Ala Ile Ala Thr Leu Gly Ala 450 455 Asp Ser Ala Asp Ala Gin Ser Gly 460 Ser 465 Ile Gly Leu Gly Phe Ala Ile Pro Val Asp 470 475 Gin Ala Lys Arg lle 480 Ala Asp Glu Val Gin Val Val Val Ala 515 Leu lle 485 Ser Thr 
Gly Lys Ala 490 Ser His Ala Ser Leu Gly 495 Thr 500 Asn Asp Lys Asp Thr 505 Pro Gly Ala Lys Ile Val Glu 510 Lys Gly Val Gly Gly Ala Ala Asn Ala Gly Val Pro 525 Val Val 530 Thr Lys Val Asp Ala Val Arg Ser 550 Arg Pro lle Asn Ser 540 Ala Asp Ala Leu Val 545 Ala Lys Ala Pro Gly Thr Val Ala Leu Thr 560 Phe Gln Asp Pro Ser Gly Gly Ser Arg 565 Thr 570 Val Gin Val Thr Leu Gly 575 Lys Ala Glu Gin 580 INFORMATION FOR SEQ ID NO:76: SEQUENCE CHARACTERISTICS: LENGTH: 233 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:76: Met Asn Asp Gly Lys Arg Ala Val Thr Ser Ala Val Leu Val Val Leu 1 5 10 Gly Ala Cys Leu Ala Leu Trp Leu Ser Gly Cys Ser Ser Pro Lys Pro 25 Val Ser Pro Thr Ala Ser Asp Pro Asp Al a Ala Giu Leu Leu G13 Ile Thr Leu Al a Val1 Leu Al a 145 Gly Ile Arg Ser G13 Lys Gi n Gly 130 Gly Thr Pro Pro Val le Gly Gly 115 Se r Val Gi u Ala Ala His~ Thr Val 100 Asp Ile Thr Val1 Ser 180 Va Sei Cys Asr Ser Gi n Ie 165 Ser Val1 1 Ala 70 rAla ~Thr Ilie Giu Leu 150 Asp Val1 Trp Val Pro 40 Arg Gin 55 ValI Arg Asp Val Tyr Asn Ser Val 120 Leu Sen 135 Leu Ser Gly IlieI Lys Met L Ser Thr Asp Asp 105 Lys Th r LeL Thr Val1 90 Glu Leu Ser Asp Ala Thr Gly Lys Val 75 Ang Ala Asn Gin Gly Val Phe Asp Asp 125 Arg Val Leu 140 Lys AspF Pro Pro 110 Trp Asp Gi n Fh r ys i s hr G12 pSer LeL Phe Sen Pro Al a Gly 175 Sen Leu Gin ~Leu Leu Ala Arg Asn Al a Gi n 160 Th n Al a Val1 Ser .00.
0 31y Val Thr 155 ;er Thr Thn 170 .eu Asp Pro 85 Vln Asp Gly ly Sen Ilie Asn Lys Gly Ser Gl n
Ilie Giy 195 Al a 200 Ser 1
G
G
Arg Ala Ser Ile Asp Leu 210 215 Lys Trp Asn Glu Pro Val Asn Val Asp 225 230 INFORMATION FOR SEQ ID NO:77: (i) SEQUENCE CHARACTERISTICS: LENGTH: 66 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:77: Val Ile Asp Ile Ile Gly Thr Ser Pro Thr Ser Trp Glu Gln 1 5 10 Ala Ala Ala Glu Ala Ala Arg Val Thr Tyr Arg Gln Arg Ala Arg Asp Ser Val Asp Asp Ile Arg Val Gly Lys Ile Ile Glu Gln Asp Met Ala Val Asp Ser Ile Lys Leu Glu 55 Val Ser Phe Lys Met Arg Pro Ala Gln r Pro Arg INFORMATION FOR SEQ ID NO:78: SEQUENCE CHARACTERISTICS: LENGTH: 69 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: Val 1 Pro Pro Ala Pro Pro Leu Pro Pro Leu Pro Pro Ser Pro 5 10 Ile Ser Cys Ala Ser Pro Pro Ser Pro Pro Leu Pro 25 Pro Ala Pro Pro Val Ala Pro Gly Pro Pro Met Pro Pro Leu Asp Pro 40 Trp Pro Pro Ala Pro Pro Leu Pro Tyr Ser Thr Pro Pro Gly 55 Ala Pro Leu Pro Pro Ser Pro Pro Ser Pro Pro Leu Pro INFORMATION FOR SEQ ID NO:79: SEQUENCE
CHARACTERISTICS:
LENGTH: 355 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:79: Met Ser Asn Ser Arg Arg Arg Ser Leu Arg Trp Ser Trp Leu
Leu Ser Val Leu Ala Ala Pro Pro Val Gly Leu Gly Leu Asp Arg 40 Ala Thr Ala Pro Ala Gln Ala Pro Ala Leu Ala Leu Ser Gln Phe Ala Asp Phe
Pro Leu Asp Pro Ser Ala Met 55 Val Ala Gln Val Ala Pro Gln Val Val Asn 65 Ile Asn Thr Lys Gly Tyr Asn Asn Val Gly Ala Gly Thr Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val Ile Ala Gly Ala Thr Asp Ile Asn Ala 100 105 Phe Ser Val Gly Ser Gly Gln 110 Thr Val Gly 145 Gly Gly Leu Gly G 2 Ala A 225 Ile P Gly G Gly Va Gly Se 29 Thr Al 305 Ala Le Thr Ly
T
L
1,
G
G1 G1 As 1 ro ly 1l r 10 a u s yr G 1 eu G 30 ly V In G n Th n Gl 19 y Pr 0 a Sei
II
SSer Val 275 Ala Val Asn Ser ily V< 15 In Le al Al ly G1 r Va 181 y Le 5 o Val r Asp e Gly Pro 260 Asp Pro Asp Gly Gly 340 al Asp Val eu Arg Gly a Val Gly 150 y Thr Pro 165 1 Gin Ala 0 u Ile Gin SVal Asn E SAsn Phe G 230 Gin Ala M 245 Thr Val H Asn Asn G Ala Ala S 2 Gly Ala Pr 310 His His Pr 325 Gly Thr Ar Val Ala 135 Glu Arg Ser Phe Gly 215 In L et A is I ly A 2 er L 95 ro I o G1 g Th
G
1
G
P
Al As As 20 Lei .Le la le sn 80 eu le ly ir 115 ly Tyr 20 ly Gly ro Val a Val p Ser I 185 p Ala 0 u Gly G i Ser G Ile A 2! SGly P 265 Gly Al Gly II Asn Se Asp Va 33 Gly As 345
A
Le Va Pr 17 Le lr In la 50 ro la e r 1 0 n sp A eu PI I1 Al 15 o G1 0 u Th a 11 SVal SGly 235 Gly Thr Arg Ser Ala 315 Ile Val rg Thr Gin 125 ro Ser Ala 140 a Met Gly 5 y Arg Val r Gly Ala i e Gin Pro 205 SVal Gly M 220 Gly Gin G Gin Ile A Ala Phe L 2 Val Gin Ar 285 Thr Gly A 300 Thr Ala Me Ser Val As Thr Leu Al 35
A
Al As Va Gl 191 G1 let ly rg eu rg sp t ;n a 0 sp Val la Ile ;n Ser 1 Ala 175 u Glu 0 y Asp SAsn 1 Phe A 2 Ser G 255 Gly L Val V Val I Ala Ac 3 Trp G1 335 Glu G1 Ala Gly Gly 160 Leu Thr Ser rhr la ly eu al le sp n y Pro Pro Ala 355 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 205 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID Ser Pro Lys Pro Asp Ala Glu Giu Gin Gly Val Pro Val Ser Pro Thr 1 5 10 Ala Ser Asp Pro Ala Leu Leu Ala Glu Ile Arg Gin Ser Leu Asp Ala 25 :Thr Lys Gly Leu Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys 35 40 Val Asp Ser Leu Leu Gly Ilie Ihr Ser Ala Asp Val Asp Val Arg Ala 55 Asn Pro Leu Ala Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gin Gly .965 70 75 Val Pro Phe Arg Val Gin Gly Asp Msn Ilie Ser Val Lys Leu Phe Asp 9985 90 Asp Trp Ser Asn Leu Gly Ser Ilie Ser Glu Leu Ser Thr Ser Arg Val 9..100 105 110 .Leu Asp Pro Ala Ala Gly Val Thr Gin Leu Leu Ser Gly Val Thr Asn 115 120 125 Leu Gin Ala Gin Gly Thr Glu Vai Ilie Asp Gly Ilie Ser Thr Thr Lys 130 135 140 Ilie Thr Gly Thr Ilie Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly 117 145 150 155 160 Ala Lys Ser Ala Arg Pro Ala Thr Val Trp Ile Ala Gin Asp Gly Ser 165 170 175 His His Leu Val Arg Ala Ser lie Asp Leu Gly Ser Gly Ser Ile Gin 180 185 190 Leu Thr Gin Ser Lys Trp Asn Glu Pro Val Asn Val Asp 195 200 205 INFORMATION FOR SEQ ID NO:81: SEQUENCE CHARACTERISTICS: LENGTH: 286 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:81: *o Gly Asp Ser Phe Trp Ala Ala Ala Asp Gin Met Ala Arg Gly Phe Val 1 5 10 Leu Gly Ala Thr Ala Gly Arg Thr Thr Leu Thr Gly Glu Gly Leu Gin 20 25 His Ala Asp Gly His Ser Leu Leu Leu Asp Ala Thr Asn Pro Ala Val 40 Val Ala Tyr Asp Pro Ala Phe Ala Tyr Glu Ile Gly Tyr Ile Xaa Glu 55 Ser Gly Leu Ala Arg Met Cys Gly Glu Asn Pro Glu Asn Ile Phe Phe 65 70 75 Tyr Ile Thr Val Tyr Asn Glu Pro Tyr Val Gin Pro Pro Glu Pro Glu 90 Asn Phe Asp Pro Glu Gly Val Leu Gly Gly Ile Tyr 
Arg Tyr His Ala 100 105 110 Ala Thr Glu Gin Arg Thr Asn Lys Xi 115 120 Ala Met Pro Ala Ala Leu Arg Ala Al 130 135 Asp Val Ala Ala Asp Val Trp Ser Va 145 150 Arg Asp Gly Val Val Ile Glu Thr Gli 165 Pro Ala Gly Val Pro Tyr Val Thr Arc 180 18E Pro Val Ilie Ala Val Ser Asp Trp Met 195 200 Arg Pro Trp Val Pro Gly Thr Tyr Leu 210 215 Gly Phe Ser Asp Thr Arg Pro Ala Gly 225 230 Ala Glu Ser Gin Val Gly Arg Gly Phe 245 Arg Val Asn Ilie Asp Pro Phe Gly Ala 260 265 Leu Pro Gly Phe Asp Glu Gly Gly Gly 275 280 INFORMATION FOR SEQ ID NO:82: SEQUENCE
CHARACTERISTICS:
LENGTH: 173 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear ia Gin Ilie a Gin Met I Thr Ser 155 ji Lys Leu 170 1Ala Leu Arg Ala Thr Leu G 2 Arg Arg T 235 Gly Arg G 250 Gly Arg Gi Leu Arg Pr Leu Al Leu Al 140 Trp G I Arg Hi~ 31u Asr a I Pro 205 ly Thr yr Phe ly Trp, y Pro o Xaa 285 a Ser Gly a Ala Giu y Gi u Leu sPro Asp 175 1Ala Arg 190 Giu Gin I Asp Gly P Asn Thr A 2 Pro Gly Ai 255 Pro Ala Gi 270 Lys Val1 Trp Asn 160 Ar'g he In a.
119 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:82: Thr Lys Phe His Ala Leu Met Gin Glu Gin Ile His Asn Glu Phe Thr 1 5 10 Ala Ala Gin Gin Tyr Val Ala Ile Ala Val Tyr Phe Asp Ser Glu Asp 25 Leu Pro Gin Leu Ala Lys His Phe Tyr Ser Gin Ala Val Glu Glu Arg 40 Asn His Ala Met Met Leu Val Gin His Leu Leu Asp Arg Asp Leu Arg 55 Val Glu Ile Pro Gly Val Asp Thr Val Arg Asn Gin Phe Asp Arg Pro 70 75 Arg Glu Ala Leu Ala Leu Ala Leu Asp Gin Glu Arg Thr Val Thr Asp 90 Gin Val Gly Arg Leu Thr Ala Val Ala Arg Asp Glu Gly Asp Phe Leu 100 105 110 Gly Glu Gin Phe Met Gin Trp Phe Leu Gin Glu Gin Ile Glu Glu Val 115 120 125 Ala Leu Met Ala Thr Leu Val Arg Val Ala Asp Arg Ala Gly Ala Asn 130 135 140 Leu Phe Glu Leu Glu Asn Phe Val Ala Arg Glu Val Asp Val Ala Pro 145 150 155 160 Ala Ala Ser Gly Ala Pro His Ala Ala Gly Gly Arg Leu 165 170 INFORMATION FOR SEQ ID NO:83: SEQUENCE CHARACTERISTICS: LENGTH: 107 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 120 (xi SEQUENCE DESCRIPTION: SEQ ID NO: 83: Arg Ala Asp 1 Ala Ala Gly Val Thr Sen Glu Arg Lys 5 Asn Thr Thr Met Lys Met Val Lys 10 Se Ilie Leu Thr Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly Met Gin Pro Ile Met Ala Gly Gly 40 Pro Val Val Tyr Gin Val Val Phe Gly Ala Pro Pro Leu Asp Pro Thr Xaa Leu Leu 75 Sen Ala Pro Xaa Val1Pro Thr Ala Ala Gin Trp 70 Asn Xaa Leu Xaa Asp Pro Asn Val Ser Phe 85 Xaa Asn Lys Gly Ser Leu Val Giu Gly Gly Ile Gly Gly Xaa Glu Gly Xaa Xaa Arg Arg Xaa Gin 100 105 INFORMATION FOR SEQ ID NO:84: SEQUENCE CHARACTERISTICS: LENGTH: 125 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:84: Val Leu Sen Val Pro Val Gly Asp Gly Phe Trp Xaa Ary Val 1 5 9* *9 Asn Pro Leu Gly Gin Pro Ilie Asp Gly Any Gly 25 Asp Val Asp Sen Asp Thr Arg Arg Ala Leu Glu Leu Gin Ala 40 Val LYS Giu Pro Leu Xaa Thr Gly 55 Pro Ilie Gly Arg Gly Gin Arg Gin 70 Gly Lys Asn Arg Arg Leu Cys Arg Glu Leu Gly Val Arg Trp Ilie Pro 100 Val Giy His Arg Ala Arg Arg Gly1 115 120 INFORMATION FOR SEQ ID 
SEQUENCE
CHARACTERISTICS:
LENGTH: 117 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 121 Pro Ser Val Ilie Lys Ala Leu Ilie Ilie 75 Thr Pro Ser 90 krg Ser Arg 105 rhr Tyr His Va Gly Ser Cys ~rg I Xaa Arg Gin Gly Asp Ala Met Thr Asp Arg Lys Thr Asn Gin Arg Giu Ala Cys Val Tyr 110 Arg 125 S S
*5 (xi SEQUENCE DESCRIPTION: SEQ ID NO: Cys Asp Ala Val Met Gly Phe Leu Gly Gly 1 5 10 Val Asp Gin Gin Leu Val Thr Arg Val Pro 20 25 Gin Ala Ala Ala Val Pro Val Val Phe Leu 40 Ala Asp Leu Ala Glu Ile Lys Ala Gly Glu 55 Gly Thr Gly Gly Val Gly Met Ala Ala Val Ala Gin Thr Ser Gin Gly Pro Gly Trp Al a Trp Val Leu Leu Al a Leu Ser Ty r Ilie Arg Al a Phe Gly His Gin Val1 Al a Leu Al a Trp Gly Val Glu Val Phe Val Thr Ala Arg Ala Xaa Xaa Phe Asp Asp Xaa 100 Arg Ser Ser Xaa Gly 115 INFORMATION FOR SEQ ID N0:86: SEQUENCE CHARACTERISTICS: LENGTH: 103 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear Ser Arg Gly Lys Trp Asp Thr Leu 90 Pro 105 Tyr Arg Xaa Phe Pro His Xaa 110 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:86: Met 1 Tyr Arg Phe Ala Cys Arg Thr Leu Met Leu Ala Ala Cys Ile Leu Ala Thr Gly Thr Ala Pro Ala Gly Leu Gly Gly Ala Gin Ser Ala Ala Gin Pro Phe Asp Val Pro Asp Tyr Tyr 40 Trp Cys Pro Gly Gin Pro Ala Trp Gly Pro Asn Asp Ser Asp Gly 70 Trp Asp Pro Tyr Thr Cys His Asp Asp Phe His Arg Pro Asp His Ser Arg 75 Asp Tyr Pro Gly Pro Ile Leu Glu Gly Pro Val Leu Asp Asp Pro Gly Ala Ala Pro Pro Pro Pro Ala Ala Gly Gly Gly Ala 100 123 INFORMATION FOR SEQ ID NO:87: SEQUENCE CHARACTERISTICS: LENGTH: 88 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID, NO:87: Val Gin Cys Arg Val Tr-p Leu Glu Ilie Gln 1 5 10 Trp Arg Gly Met Leu Gly Ala Asp Gin Ser Met Ala Ala Arg Ala Gly Giy Pro Ala 20 25 Arg Ilie Trp Arg Giu His Leu GiU Ala Ala Met Lys Pro Arg Thr 40 Gly Asp Gly Pro Thr Lys 50 Glu Gly Arg Gly Ile 55 Val Met Arg Val Pro Leu Giu Gly Giy Gly Arg Leu Asp Giu Leu Val Val Giu 70 Leu Thr Pro Asp Glu Ala Ala Ala Leu Giy 75 Lys Gly Val Thr Ser INFORMATION FOR SEQ ID NO:88: SEQUENCE CHARACTERISTICS: LENGTH: 95 amino acids TYPE: amino acid.
STRANDEDNESS: single (D) TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:88: Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile I Ser Gly Asp Ser Leu Gln Lys Thr Gln Ile Gln Val Glu Ser Thr Ala Gly Ala Gln Ala Gly Gln Trp Arg Gly 40 Ala Ala Gly Thr Ala Ala Val Asp Glu Val Arg Phe Gln Glu Ala Ala Asn Lys Lys Gln Glu Leu Ile Ser Thr Asn Ile Arg Gln Ala Gly 75 Val Gln Tyr Ser Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser 90 Ser Gln Met Gly Phe
a INFORMATION FOR SEQ ID NO:89: SEQUENCE CHARACTERISTICS: LENGTH: 166 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:89: Met Thr Gin Ser Gin Thr Val Thr Val Asp 1 5 10 Arg Ala Asn Glu Val Glu Ala Pro Met Ala 25 Gin Gin Glu IleLeu Asn Asp Pro Pro Thr Asp Val Ala Gin Gin Pro Ile Thr Pro Cys Glu Xaa Val Leu Ser Ala Asp Leu Thr Xaa Xaa Lys Asn Ala 40 Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala Glu Arg Gin Arg Leu Ala Thr Ser Leu Arg 75 Asn Ala Ala Lys Xaa Leu Asp Asn Asp Gly Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala 90 Glu Gly Thr Ser Ala Glu 115 Gin Ala Glu Ser Ala 105 Gly Ala Val Gly Gly Asp Ser 110 Ala Gly Glu Pro 125 Leu Thr Asp Thr Pro 120 Arg Val Ala Thr Asn Phe 130 Met Asp Leu Lys Glu 135 Ala Ala Arg Lys Leu Glu Thr Gly Asp 140 Gln 145 Gly Ala Ser Leu His Xaa Gly Asp Gly 155 Trp Asn Thr Xaa Thr 160 Leu Thr Leu Gin Gly Asp 165 INFORMATION FOR SEO ID SEQUENCE CHARACTERISTICS: LENGTH: 5 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID Arg Ala Glu Arg Met 1 INFORMATION FOR SEQ ID NO:91: SEQUENCE CHARACTERISTICS: LENGTH: 263 amino acids TYPE: amino acid STRANDEDNESS: single 126 TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:91: Val Ala Trp Met Ser Val Thr Ala Gly Gin Ala Glu Leu Thr Ala Ala a..
1 Gin Val1 Ilie Gi u Gi y Gi u Al a Asn Thr 145 His Sen Leu Val1 Pro Al a Al a Tyr Gi u Al a Asn 130 Thr Arg Met Lys Arg Pro Thr Gi u Al a Al a Val1 115 Val1 Pro Sen Thr Gi y Val1 Pro Asn Tyr Al a Pro 100 Gi u Pro Sen Pro Asn 180 Phe 5 Al a Val1 Leu Gly Al a Gl u Gl u Gi n Sen Ile 165 Sen Al a Al a Ile Leu Gi u 70 Thr Met Al a Al a Lys 150 Sen Gly Pro Al a Al a Gly 55 Met Al a Thr Ser Leu 135 Leu As n Val1 Al a Al a Gi u 40 Gin Trp Thr Ser Asp 120 Lys Gi y Met Sen Al a Tyr 25 Asn Asn Al a Al a Al a 105 Thr Gin Gly Val1 Met 185 Al a .Lu Giu Arg Thr Gi n Thr 90 Gly Al a Leu Leu Ser 170 Th r Al a Thr Al a Pro Asp 75 Al a Gi y Al a Al a Trp 155 Met Asn Gin Al a Gi u Al a Al a Thr Leu Al a Gin 140 Lys Al a Thr Al a Tyr Leu Ilie Al a Leu Leu Asn 125 Pro Thr Asn Leu Val1 Gly Met Al a Al a Leu Gi u 110 Gi n Th r Val1 As n Sen 190 Gin Leu Ie Val1 Met Pro Gi n Leu Gi n Sen Hi s 175 Sen Thr Th r Leu Asn Phe Phe Al a Met Gly Pro 160 Met Met Al a 195 Ala Gin Asn Gly Val Arg Ala 210 215 200 Met Ser Ser Leu Gly Ser Ser Leu Gly 220 Ser 225 Ser Gly Leu Gly Gly 230 Gly Val Ala Ala Asn 235 Leu Gly Arg Ala Ala 240 Ser Val Arg Tyr Gly 245 His Arg Asp Gly Gly 250 Lys Tyr Ala Xaa Ser Gly 255 Arg Arg Asn Gly Gly Pro Ala 260 INFORMATION FOR SEO ID NO:92: SEQUENCE CHARACTERISTICS: LENGTH: 303 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear o (xi) SEQUENCE DESCRIPTION: SEQ ID NO:92: r r Met 1 Thr Tyr Ser Pro Gly Asn Pro Gly Tyr Pro Gln Ala G1n 5 10 Pro Ala Gly Ser Tyr Ala Ser Lys Gly Gly Val Thr Pro Ser 25 Phe Ala His Ala Asp Glu Gly Val Leu Gly Leu Pro Met Tyr Leu 40 Asn Ile Ala Val Ala Leu Ala Ala Tyr Phe Ala Ser Phe Gly Pro Met Thr Leu Ser Thr Glu Leu Gly Gly Gly Asp 70 Gly Ala Val Ser Gly Asp Thr Gly Leu 75 Pro Val Gly Val Ala Leu Leu Ala Ala Leu Leu Ala Gly Val Val Leu Val Pro Lys Ala Lys Ser His Val 100 Gly Val Phe Leu Met Val Ser 115 Sen Thr Gly Trp Ala Leu Trp 130 135 Ala Val Ala Ala Val Leu Ala L 145 150 Ala Pro Ala Pro Arg Pro Lys P 165 Tyr Gly Gin Tyr Gly Gin 
Tyr G 180 Gin Gin Gly Ala Gin Gin Ala Al 195 2 Gin Ser Pro Gin Pro Pro Gly Ty 210 215 Sen Ser Pro Sen Gin Sen Gly Se 225 230 Gin Pro Pro Ala Gin Sen Gly Sei 245 Thr Pro Pro Thr Gly Phe Pro Ser 260 Ala Gly Thr Gly Sen Gin Ala Gly 275 1280 Pro Ser Gly Gly Giu Gin Sen Sen 290 295 INFORMATION FOR SEQ ID NO:93: SEQUENCE
CHARACTERISTICS:
LENGTH: 28 amino acids TYPE: amino acid
A
1 Vt h r r Tr Val 105 1 a Thr 20 al Val ~u Leu e Asp F 1 y Val G 185 Gly L Gly SE Gly Ty Gin GlI 254 Phe Ser 265 Sen Ala Sen Pro Val Ala Phe Asn L Leu Ala P 1I /al Giu Tf 155 'no Tyr GI 70 in Pro GI.
au GI n Sei rn Gin Tyr 220 r Thr Ala 235 ni Sen His 0 Pro Pro IPro Val Gly Gly 300 /ai Leu Gly Val 110 .ys Pro Ser Ala 125 he Ilie Val Phe rn Gly Ala Ilie y Gin Tyr Gly 175 y Gly Tyr Tyr C 190 rPro Gly Pro G 205 Gly Gly Tyr S Gin Pro Pro Al Gin Gly Pro Se 255 Pro Pro Val Se 270 Asn Tyr.Sen Asi 285 Ala Pro Val Leu Tyr Gin Thr 160 krg i n en Ia n 129 STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:93: Gly Cys Gly Glu Thr Asp Ala Ala Thr 1 5 Phe Glu Arg Ile Ser Gly Asp Leu Lys pc Leu 10 Ala Gln Glu Ala Gly Asn Thr Gin Ile INFORMATION FOR SEQ ID NO:94: SEQUENCE CHARACTERISTICS: LENGTH: 16 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:94: Asp Gin Val Glu Ser Thr Ala Gly Ser Leu Gin Gly Gin Trp Arg Gly 1 5 10 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 27 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 130 (xi) SEQUENCE DESCRIPTION: SEQ ID Gly Cys Gly Ser Thr Ala Gly Ser Leu Gin 1 5 10 Gly Gin Trp Arg Gly Ala Ala Gly Thr Ala Ala Gin Ala Ala INFORMATION FOR SEQ ID NO:96: SEQUENCE CHARACTERISTICS: LENGTH: 27 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear Val Arg
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:96:
Gly Cys Gly Gly Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln Glu
Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu

INFORMATION FOR SEQ ID NO:97:
SEQUENCE CHARACTERISTICS:
LENGTH: 27 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:97:
Gly Cys Gly Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser Thr
Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg

INFORMATION FOR SEQ ID NO:98:
SEQUENCE CHARACTERISTICS:
LENGTH: 28 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:98:
Gly Cys Gly Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu
Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe

INFORMATION FOR SEQ ID NO:99:
SEQUENCE CHARACTERISTICS:
LENGTH: 507 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99:
ATGAAGATGG TGAAATCGAT CGCCGCAGGT CTGACCGCCG CGGCTGCAAT CGGCGCCGCT 60
GCGGCCGGTG TGACTTCGAT CATGGCTGGC GGCCCGGTCG TATACCAGAT GCAGCCGGTC 120
GTCTTCGGCG CGCCACTGCC GTTGGACCCG GCATCCGCCC CTGACGTCCC GACCGCCGCC 180
CAGTTGACCA GCCTGCTCAA CAGCCTCGCC GATCCCAACG TGTCGTTTGC GAACAAGGGC 240
AGTCTGGTCG AGGGCGGCAT CGGGGGCACC GAGGCGCGCA TCGCCGACCA CAAGCTGAAG 300
AAGGCCGCCG AGCACGGGGA TCTGCCGCTG TCGTTCAGCG TGACGAACAT CCAGCCGGCG 360
GCCGCCGGTT CGGCCACCGC CGACGTTTCC GTCTCGGGTC CGAAGCTCTC GTCGCCGGTC 420
ACGCAGAACG TCACGTTCGT GAATCAAGGC GGCTGGATGC TGTCACGCGC ATCGGCGATG 480
GAGTTGCTGC AGGCCGCAGG GAACTGA 507

INFORMATION FOR SEQ ID NO:100:
SEQUENCE CHARACTERISTICS:
LENGTH: 168 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:100:
Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala Ala Ala Ala
Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala Gly Gly Pro
Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro Leu Pro Leu
Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser
Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn Lys Gly
Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg Ile Ala Asp
His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro Leu Ser Phe
Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala Thr Ala Asp
Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr Gln Asn Val
Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala Ser Ala Met
Glu Leu Leu Gln Ala Ala Gly Asn

INFORMATION FOR SEQ ID NO:101:
SEQUENCE CHARACTERISTICS:
LENGTH: 500 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:101:
CGTGGCAATG TCGTTGACCG TCGGGGCCGG GGTCGCCTCC GCAGATCCCG TGGACGCGGT 60
CATTAACACC ACCTGCAATT ACGGGCAGGT AGTAGCTGCG CTCAACGCGA CGGATCCGGG 120
GGCTGCCGCA CAGTTCAACG CCTCACCGGT GGCGCAGTCC TATTTGCGCA ATTTCCTCGC 180
CGCACCGCCA CCTCAGCGCG CTGCCATGGC CGCGCAATTG CAAGCTGTGC CGGGGGCGGC 240
ACAGTACATC GGCCTTGTCG AGTCGGTTGC CGGCTCCTGC AACAACTATT AAGCCCATGC 300
GGGCCCCATC CCGCGACCCG GCATCGTCGC CGGGGCTAGG CCAGATTGCC CCGCTCCTCA 360
ACGGGCCGCA TCCCGCGACC CGGCATCGTC GCCGGGGCTA GGCCAGATTG CCCCGCTCCT 420
CAACGGGCCG CATCTCGTGC CGAATVCCTG CAGCCCGGGG GATCCACTAG TTCTAGAGCG 480
GCCGCCACCG CGGTGGAGCT 500

INFORMATION FOR SEQ ID NO:102: SEQUENCE
CHARACTERISTICS:
LENGTH: 96 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:102:
Val Ala Met Ser Leu Thr Val Gly Ala Gly Val Ala Ser Ala Asp Pro
Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val Val Ala
Ala Leu Asn Ala Thr Asp Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser
Pro Val Ala Gln Ser Tyr Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro
Gln Arg Ala Ala Met Ala Ala Gln Leu Gln Ala Val Pro Gly Ala Ala
Gln Tyr Ile Gly Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr

INFORMATION FOR SEQ ID NO:103: SEQUENCE
CHARACTERISTICS:
LENGTH: 154 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:103:
ATGACAGAGC AGCAGTGGAA TTTCGCGGGT ATCGAGGCCG CGGCAAGCGC AATCCAGGGA 60
AATGTCACGT CCATTCATTC CCTCCTTGAC GAGGGGAAGC AGTCCCTGAC CAAGCTCGCA 120
GCGGCCTGGG GCGGTAGCGG TTCGGAAGCG TACC 154

INFORMATION FOR SEQ ID NO:104: SEQUENCE
CHARACTERISTICS:
LENGTH: 51 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:104:
Met Thr Glu Gln Gln Trp Asn Phe Ala Gly Ile Glu Ala Ala Ala Ser
Ala Ile Gln Gly Asn Val Thr Ser Ile His Ser Leu Leu Asp Glu Gly
Lys Gln Ser Leu Thr Lys Leu Ala Ala Ala Trp Gly Gly Ser Gly Ser
Glu Ala Tyr

INFORMATION FOR SEQ ID NO:105:
SEQUENCE CHARACTERISTICS:
LENGTH: 282 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105:
CGGTCGCGCA CTTCCAGGTG ACTATGAAAG TCGGCTTCCG NCTGGAGGAT TCCTGAACCT 60
TCAAGCGCGG CCGATAACTG AGGTGCATCA TTAAGCGACT T1CCAGAAC ATCCTGACGC 120
GCTCGAAACG CGGCACAGCC GACGGTGGCT CCGNCGAGGC GCTGNCTCCA AAATCCCTGA 180
GACAATTCGN CGGGGGCGCC TACAAGGAAG TCGGTGCTGA ATTCGNCGNG TATCTGGTCG 240
ACCTGTGTGG TCTGNAGCCG GACGAAGCGG TGCTCGACGT CG 282

INFORMATION FOR SEQ ID NO:106: SEQUENCE
CHARACTERISTICS:
LENGTH: 1565 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear *0 see GTATGCGGCC ACTGAAGTCG CCAATGCGGC GGCGGCCAGC TAAGCCAGGA ACAGTCGGCA goes a 00 CGAGAAACCA CGAGAAATAG GGACACGTMA TGGTGGATTr CGGGGCGTTA CCACCGGAGA 120 *00 TCAACTCCGC GAGGATGTAC GCCGGCCCGG GTTCGGCCTC GCTGGTGGCC GCGGCTCAGA 180 0TGTGGGACAG CGTGGCAGT GACCTGTTF7 CGGCCGCGTC GGCGTT7CAG TCGGTGGTCT 240 GGGGTCTGAC GGTGGGGTCG TGGATAGGTT CGTCGGCGGG TCTGATGGTG GCGGCGGCCT 300-.
:CGCCGTATGT GGCGTGGATG AGCGTCACCG CGGGGCAGGC CGAGCTGACC GCCGCCCAGG 360 TCCGGGTTGC TGCGGCGGCC TACGAGACGG CGTATGGGCT GACGGTGCCC CCGCCGGTGA 420 TCGCCGAGMA CCGTGCTGMA CTGATGATTC TGATAGCGAC CAACCTCTTG GGGCAAAACA 480 CCCCGGCGAT CGCGGTCAAC GAGGCCGAAT ACGGCGAGAT GTGGGCCCAA-GACGCCGCCG 540 CGATGTT7GG CTACGCCGCG GCGACGGCGA CGGCGACGGC GACGTTGCTG CCGTTCGAGG 600 AGGCGCCGGA GATGACCAGC GCGGGTGGGC
T(
CCTCCGACAC CGCCGCGGCG MCCAGTTGA
T
TGGCCCAGCC CACGCAGGGC ACCACGCC1T
CT
TCTCGCCGCA TCGGTCGCCG ATCAGCAACA
TG
TGACCAACTC GGGTGTGTCA ATGACCAACA
CC
CGGCGGCGGC CGCCCAGGCC GTGCAAACCG
CG(
CGCTGGGCAG CTCGCTGGGT TC1 TCGGGTC TG( GGGCGGCCTC GGTCGGTFCG JTGTCGGTGC
CG(
TCACCCCGGC GGCGCGGGCG CTGCCGCTGA
CCA
CCGGGCAGAT GCTGGGCGGG CTGCCGGTGG
GGC
TCAGTGGTGT GCTGCGTGTT CCGCCGCGAC
CCT
GCTAGGAGAG GGGGCGCAGA CTGTCGTTAT
TTG&
CGCGGCCGGC TATGACAACA GTCMATGTGC
ATG'
CAACAAGGAG ACAGGCAACA TGGCCTCACG 1f CATGGCGGGC CGTTTTGAAG TGCACGCCCA
GACC
GGCGTCCGCG CAAAACAMi CCGGTGCGGG
CTGC
AGACA
INFORMATION FOR SEQ ID NO:107: SEQUENCE
CHARACTERISTICS:
LENGTH: 391 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
'CTCGAGC
;AACAATG
TCCAAGC
GTGTCAA
TTGAGCT
~CGCAAA)
~GCGGTG(.
:AGGCCTG
GCCTGAC
AGATGGG
ATGTGAT
~CCAGTG
~CAAGTT
FATGACG
iGTGGAG
AGTGGC
;A GGCCGCCGCG GTCGAGGAGG T GCCCCAGGCG CTGCAACAGC T GGGTGGCCTG TGGAAGACGG T GGCCAACAAC CACATGTCAA CGATGTTGAAG GGC1TTGCTC \CGGGGTCCGG
GCGATGAGCT
GGTGGCCGCC AACTTGGGTC GGCCGCGGCC AACCAGGCAG CAGCGCCGCG GAAAGAGGGC CGCCAGGGCC GGTGGTGGGC GCCGCATTCT CCGGCGGCCG ATCGGCGuGTC TCGGTGT17C ACAGGTATTA GGTCCAGGTT GATCCGCACG CGATGCGGGA GACGAGGCTC GCCGGATGTG ATGGCCGAGG CGACCTCGCT 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1565 138
V
L
Al G I
GJI
Trp 145 Thr Se' Asp Gin Xi) SEQUENCE DESCRIPTION: SEQ 10 NO:107: Met VaT Asp Phe Gly Ala Leu Pro Pro Giu Ile Asn Ser 1 5 10 Tyr Ala Gly Pro Gly Set' Ala Ser Leu Val Ala Ala Ala 25 ~sp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala 40 45 'al Val Trp Gly Leu Thr Val Gly Set' Trp Ilie Gly Ser 55 60 eu Met Val Ala Ala Ala Set' Pro Tyr Vai Ala Trp Met 570 15 Ia Gly Gin Ala Glu Leu Thr Ala Ala Gin Val Arg Val A 85 90 a Tyr Giu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro V 100 105 1 u Asn Arg Ala Giu Leu Met Ilie Leu Ile Ala Thr Asn LE 115 120 125 1 Asn Thr Pro Ala Ilie Ala Val Asn Giu Ala Giu Ty r Gi 130 135 140 Ala Gin Asp Ala Ala Ala Met Phe Gly Ty r Ala Ala Al 150 155 Ala Thr Ala Thr Leu Leu Pro Phe Giu Giu Ala Pro GIt 165 170 Ala Gly Giy Leu Leu Giu Gin Ala Ala Ala Val Giu GIL 180 185 190 Thr Ala Ala Ala Msn Gin Leu Met Asn Asn Val Pro Gin 195 200 205 Gin Leu Ala Gin Pro Thr Gin Gly Thr Thr Pro Ser Ser 210 215 220
C
P
a Ii
?L
y a -l 41a Arg Met In Met Trp he Gin Set' 2t' Ala Gly ~r Val Thr a Ala Ala 1 Ilie Ala Leu Gly Giu Met Thi' Ala 160 Met Thr 175 Ala Ser Ala Leu Lys Leu Gly 225 Gly Leu Trp Lys Thr Val 230 Sen Pro Hi s Arg Ser 235 Pro Ile Ser Asn 240 Met Val Sen Met Ala Asn Asn His Met 245 Ser 250 Met Thr Asn Ser Gly Val 255 Sen Met Thr Ala Ala Ala 275 Asn 260 Thr Leu Ser Ser Met Leu 265 Lys Gly Phe Ala Pro Ala 270 Gin Ala Val Gin Thr 280 Ala Ala Gin Asn Val Arg Ala Met Ser 290 Sen Leu Gly Ser Ser 295 Leu Gly Ser Ser Gly 300 Leu Gly Gly Gly Val1 305 Ala Ala Asn Leu Gly 310 Ang Ala Ala Ser Val1 315 Gly Sen Leu Sen Val1 320 Pro Gin Ala Trp Ala Ala Msn Gin Val Thr Pro Ala Al a Ang 335 9 .9 Al a Leu Gin Met Gly Gly 370 Pro Leu 340 Thr Sen Leu Thr Sen 345 Ala Ala Giu Ang Gly Pro Gly 350 Ang Ala Gly Leu Gly Gly Leu Pro 355 Leu Sen Gly Val Leu 375 Val1 360 Gly Gin Met Gly Ala 365 Ang Val Pro Pro Arg Pro Tyr Val Met 380 Pro His Sen 385 Pro Ala Ala Gly 390 INFORMATION FOR SEQ ID NO:108: SEQUENCE CHARACTERISTICS: LENGTH: 259 base pairs TYPE: nucleic acid STRANDEONESS: single TOPOLOGY: linear 140 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:108: ACCAACACCT TGCACTCNAT GTTGAAGGGC TTAGCTCCGG CGGCGGCTCA GGCCGTGGAA ACCGCGGCGG AAAACGGGGT CTGGGCAATG AGCTCGCTGG GCAGCCAGCT GGGTTCGTCG 120 CTGGGTTCTr CGGGTCTGG CGCTGGGGTG GCCGCCAACT TGGGTCGGGC GGCCTCGGTC 180 GGTTCGTTGT CGGTGCCGCC AGCATGGGCC GCGGCCAACC AGGCGGTCAC CCCGGCGGCG 240 CGGGCGCTGC CGCTGACCA 259 INFORMATION FOR S-EQ ID NO:109: Ci) SEQUENCE
CHARACTERISTICS:
LENGTH: 86 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID, NO:109: Thr Asn Thr Leu His Sen Met Leu Lys Gly Leu Ala Pro Ala Ala Ala 10 Gin Ala Val Glu Thr Ala Ala GluAsn Gly Val Trp Ala Met Ser Ser 20 25 Leu Gly Sen Gln Leu Gly Sen Sen Leu Gly Ser Ser Gly Leu Gly Ala 3540 Gly Val Ala Ala Asn Leu Gly Ang Ala Ala Sen Val Gly Ser Leu Sen 50 55. Val Pro Pro Ala Tnp Ala Ala Ala Asn Gln Ala Val Thn Pro Ala Ala 70 75 Arg Ala Leu Pro Leu Thr INFORMATION FOR SEQ ID NO:110: Ci) SEQUENCE CHARACTERISTICS: LENGTH: 1109 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:110: TAC1TGAGAG AATTTGACCT GTTGCCGACG TTGTT7GCTG TCCATCA1TG GTGCTAGTTA TGGCCGAGCG GAAGGA]TAT CGAAGTGGTG GACTTCGGGG CG1TACCACC GGAGATCAAC TCCGCGAGGA TGTACGCCGG CCCGGG17CG GCCTCGCTGG TGGCCGCCGC GAAGATGTGUG GACAGCGTGG" CGAGTGACCT GTTTTCGGCC GCGTCGGCGT TTCAGTCGGT GGTCTGGGGT 0*
CTGACGACGG
TATGTGGCGT
GTTGCTGCG
GAGAACCGTG
GCGATCGCGG
TGGCTACG
CCACTGATCA
GACACCGCCG
CAGCCCACGA
CCGCATCTGT
AACTCGGGTG
GCGGCTCAGG
GATCGTGGAT
GGATGAGCGT
CGGCCTACGA
CTGAACTGAT
TCAACGAGGC
CCGCCACGGC
CCMACCCCGG
CGGCGAACCA
AAAGCATCTG
CGCCGCTCAG
TGTCMATGGC
CCGTGGAAAC
AGGTTCGTCG
CACCGCGGGG
GACGGCGTAT
GATTCTGATA
CGAATACGGG
GGCGACGGCG
CGGGCTCCTT
G1TGAIGAAC
GCCGTTCGAC
CMACATCGTG
CAGCACCTTG
CGCGGCGCAA
GCGGGTCTGA
CAGGCCGAGC
GGGCTGACGG
GCGACCAACC
GAGATGTGGG
ACCGAGGCGT
GAGCAGGCCG
AATGTGCCCC
CAACTGAGTG
TCGATGCTCA
CACTCAATGT
AACGGGGTCC
TGGTGGCGGC
TGACCGCCGC
TGCCCCCGCCL
TC1TGGGGCA
CCCAAGACGC
TGCTGCCGTT
TCGCGGTCGA
AAGCGCTGCA
AACTCTGGAA
GGCCT.CGCCG
CCAGGTCCGG"
GGTGATCGCC
AAACACCCCG
CGCCGCGATG
CGAGGACGCC
GGAGGCCATC
ACAACTGGCC
AGCCATCTCG
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 ACAACCACGT GTCGATGACC
TGAAGGGCTT
AGGCGATGAG
TGCTCCGGCG
CTCGCTGGG"C
AGCCAGCTGG GTTCGTCGCT GGGTTCTTCG GGTCTGGGCG CTGGGGTGGC CGCCAACTTG GGTCGGGCGG CCTCGGTCGG TTCGTTGTCG GTGCCGCAGG CCTGGGCCGC GGCCMACCAG GCGGTCACCC CGGCGGCGCG GGCGCTGCC INFORMATION FOR SEQ ID NO: 11l: SEQUENCE CHARACTERISTICS: LENGTH: 341 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear 1020 1080 1109 (xi Val1 1 SEQUENCE DESCRIPTION: SEQ ID NO:111: Vai Asp Phe Gly Ala Leu Pro Pro Giu Ile Asn Ser Ala Arg Met Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Asp Ser Val Ala Ser Asp Gly Leu Thr Val Val Leu Met Ala Gly Trp Leu Phe Ser Ala 40 Thr Gly Ser Trp 55 Ser Pro Tyr Val Al a Ser Al a Lys Met Trp Phe Gi n Ser Ser Ala Gly Ile Al a Gly Ser Val Ala Ala Ala 70 Gin Ala Glu Leu 85 Giu Thr Ala Tyr 100 Arg Ala Giu Leu Trp Arg 75 Met Ser Val Thr Val Ala Ala Ala Pro Val Ilie Ala Al a Tyr Giu Asn Thr Ala Ala Gly Leu Thr 105 Met Ile Leu 120 Gln Val Pro Pro 110 Leu Leu Gly Ile Ala Thr 11 Asn 125 WO 97/09428 Gin Asn Thr Pro Ala Ilie Ala 130 135 Trp Ala Gin Asp Ala Ala Ala 145 150 Thr Ala Thr Giu Ala Leu Leu 165 Asn Pro Giy Giy Leu Leu Giu 180 Asp Thr Ala Ala Ala Asn Gin 195 Gin Gin Leu Ala Gin Pro Thr 210 215 Sen Giu Leu Trp Lys Ala Ilie 225 230 Ilie Val Ser Met Leu Asn Asn 245 Ser Met Ala Ser Thr Leu His 260 Ala Ala Gin Ala Val Glu Thr 275 Ser Sen Leu Gly Sen Gin LeuC 290 295 Giy Aia Gly Val Ala Ala Asn L.
305 310 Leu Ser Val Pro Gin Ala Trp A 325 Ala Ala Arg Ala Leu 340 INFORMATION FOR SEQ ID NO:112: Ci) SEQUENCE CHARACTERISTICS: I W Me Pr G1i Let 20( Lys Ser Hi s Sen la ?80 ily .eu 1l a I At P' o Ph n~ Al 18 .i Me Sei Prc Val1 Met 265 Al a Sen Gi y Al a Glu Ala ~e Gly Tyr 155 e Glu Asp 170 a Val Ala 5 t Asn Asn rlie Trp )His Leu 235 Ser Met1 250 Leu Lys G Gin Asn G Ser Leu G 34 Arg Ala A 315 Ala Asn Gl 330 Gi Al Al Va Val Prc 220 ;er hr iiy ly ly 00 la In u Ty a Al a Pr 1 G1.
IPrc 205 Phe Pro Asn Phe Val1 285 Sen Sen Al a r Gly Giu a Thn Ala oLeu Ilie 175 j Giu Ala 190 Gin Ala Asp Gin Leu Sen Sen Gly 'V 255 Ala Pro A 270 Gin Ala M Ser Giy L Val Gly S Val Thn Pi Met Al a 160 Th n Ilie Leu Leu ks n ral I1 a et eu -o 144 LENGTH: 1256 base pairs TYPE: nucleic acid STRANOEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:112: CATCGGAGG AGTGATCACC ATGCTGTGGC ACGCAATGCC ACCGGAGNTA AATACCGCAC GGCTGATGGC CGGCGCGGGT CCGGCTCCMA TGCTTGCGGC GGCCGCGGGA TGGCAGACGC 120 TTTCGGCGGC TCTGGACGCT CAGGCCGTCG AGTTGACCGC GCGCCTGAAC TCTCTGGGAG 180 AAGCCTGGAC TGGAGGTGGC AGCGACMAGG CGCTTGCGGC TGCAACGCCG ATGGTGGTCT 240 GGCTACAC CGCGTCAACA CAGGcCCMGA CCCGTGCGAT GCAGGCGACG GCGCAAGCCG 300 CGGCATACAC CCAGGCCATG GCCACGACGC CGTCGCTGCC GGAGATCGCC GCCAACCACA 360 :TCACCCAGGC CGTCCTTACG GCCACCAACT TCTTCGGTAT CAACACGATC CCGATCGCGT 420 TGACCGAAT GGA1TATTTC ATCCGTATGT GGAACCAGGC AGCCCTGGCA ATGGAGGTCT 480 ACCAGGCCGA GACCGCGGTT MCACGCTTTF TCGAGAAGCT CGAGCCGATG GCGTCGATCC 540 TTGATCCCGG CGCGAGCCAG AGCACGACGA ACCCGATCTT CGGAATGCCC TCCCCTGGCA 600 GCTCAACACC GGTTGGCCAG TTrGCCGCCGG CGGCTACCCA GACCCTCGGC GAACTGGGTG 660 AGATGAGCGG CCCGATGCAG CAGCTGACCC AGCCGCTGCA GCAGGTGACG TCGTFGTTCA 720 GCCAGGTGG CGGCACCGGC GGCGGCAACC CAGCCGACGA GGMAGCCGCG CAGATGGGCC 780 :TGCTCGGCAC CAGTCCGCTG TCGAACCATC CGCTGGCTGG TGGATCAGGC CCCAGCGCGG 840 1 GCGCGGGCCT GCTGCGCGCG GAGTCGCTAC CTGGCGCAGG TGGGTCGTTG ACCCGCACGC 900 CGCTGATGTC TCAGCTGATC GAAAAGCCGG TTGCCCCCTC GGTGATGCCG GCGGCTGCTG 960 CCGGATCGTC GGCQACGGGT GGCGCCGCTC CGGTGGGTGC GGGAGCGATG GGCCAGGGTG 1020 CGCAATCCGG CGGCTCCACC AGGCCGGGTC TGGTCGCGCC GGCACCGCTC GCGCA6GAGC GTGAAGAAGA CGACGAGGAC GACTGGGACG MAGAGGACGA CTGGTGAGCT CCCGTAATGA CMACAGACTT CCCGGCCACC CGGGCCGGAA GAC1TGCCAA CATT1TGGCG AGGAAGGTAA AGAGAGAAAG TAGTCCAGCA TGGCAGAGAT GMAGACCGAT GCCGCTACCC TCGCGC INFORMATION FOR SEQ ID NO:113.
Ci) SEQUENCE CHARACTERISTICS: LENGTH: 432 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear 1080 1140 1200 1256
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:113: CTAGTGGATG GGACCATGGC CATTTCTGC AGTCTCACTG CCTTCTGTGT GCACGCCGGC GGAAACGAAG CACTGGGGTC GAAGAACGGC TGCGCTGCCA AGCTTCCATA CCTTCGTGCG GCCGGAAGAG CTTGTCGTAG TCGGCCGCCA TCAGAGTGCG CTCAAACGTA TAAACACGAG AAAGGGCGAG ACCGACGGMA GCCCGATCCC GTGTTTCGCT ATTCTACGCG AACTCGGCGT TGCCCTATGC GTGACGTTGC CTTCGGTCGA AGCCATTGCC TGACCGGCTT CGCTGATCGT TTCTGCAGCG CGTTGTTCAG CTCGGTAGCC GTGGCGTCCC ATFFTGCTG TACGCCTCCG AA INFORMATION FOR SEQ ID NO:114: SEQUENCE CHARACTERISTICS: LENGTH: 368 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear TGACATTT7G
TATCGTCCGG
TGACAACCTC
GGTCGAACTC
GAACATCCCA
CCGCGCCAGG
GACACCCTGG
120 180 240 300 360 420 432
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:114: Met Leu Trp His Ala Met Pro Pro Giu Xaa Asn 1 5 10 Ala Gly Ala Gly Pro Ala Pro Met Leu Ala Ala 25 Thr Leu Ser Ala Ala Leu Asp Ala Gin Ala Val 40 Leu Asn Ser Leu Gly Giu Ala Trp Thr Gly Gly 55 Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu G 70 75 Gin Ala Lys Thr Arg Ala Met Gin Ala Thr Ala G 90 Thr Gin Ala Met Ala Thr Thr Pro Ser Leu Pro G 100 105 His Ilie Thr Gin Ala Val Leu Thr Ala Thr Asn Ph 115 120 Thr Ilie Pro Ilie Ala Leu Thr Glu Met Asp Tyr Ph 130 135 14 Asn Gin Ala Ala Leu Ala Met Glu Val Tyr Gin Al 145 150 155 Asn Thr Leu Phe Glu Lys Leu Giu Pro Met Ala Ser 165 170 Gly Ala Ser Gin Sen Thr Thr Asn Pro Ilie Phe Gly 180 185 Gly Ser Ser Thr Pro Vai Gly Gin Leu Pro Pro Ala 195 200 Thr Ala A Glu L 4, Gly SE Iln Th In Al lu Il ~e PhE 125 e Ilie 0 a Giu Ile Met Al a 205 kia Arg Leu ~la Gly Trp, eu Thr Ala r Asp Lys ir Ala Ser a Ala Ala e Ala Ala A 110 Gly Ilie A,, Arg Met Tr Thr Ala Va 16 Leu Asp Pr 175 Pro Ser Pr( 190 Thr Gin Thr Met Gin Arg Al a Fhr Sn n 0 0 Leu Gly Gin Leu Gly Glu Met Ser Gly Pro Met Gin Gin Leu Thr Gin 210 215 220 Pro Leu Gin Gin Val 225 Thr 230 Ser Leu Phe Ser Gin 235 Val Gly Gly Thr Gly 240 Gly Gly Asn Pro Al a 245 Asp Glu Glu Ala Gin Met Gly Leu Leu Gly 255 Thr Ser Pro Ala Gly Ala 275 Leu 260 Ser Asn His Pro Leu 265 Ala Gly Gly Ser Gly Pro Ser 270 Ala Gly Gly Gly Leu Leu Arg Al a 280 Glu Ser Leu Pro Gl y 285 Ser Leu 290 Thr Arg Thr Pro Leu 295 Met Ser Gin Leu Glu Lys Pro Val Pro Ser Val Met Pro 310 Ala Ala Ala Ala Gly 315 Ser Ser Ala Thr Gly 320 Gly Ala Ala Pro Gly Ala Gly Ala Met 330 Gly Gin Gly Ala Gin Ser 335 Gly Gly Ser Thr 340 Arg Pro Gly Leu Ala Pro Ala Pro Leu Ala Gin 350 Glu Arg Glu Glu 355 Asp Asp Glu Asp 360 Asp Trp Asp Glu Gi u 365 Asp Asp Trp 9
INFORMATION FOR SEQ ID NO:115:
SEQUENCE CHARACTERISTICS:
LENGTH: 12 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:115:
Met Ala Glu Met Lys Thr Asp Ala Ala Thr Leu Ala

INFORMATION FOR SEQ ID NO:116:
SEQUENCE CHARACTERISTICS:
LENGTH: 396 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:116:
GATCTCCGGC GACCTGAAAA CCCAGATCGA CCAGGTGGAG TCGACGGCAG GTTCGTTGCA 60
GGGCCAGTGG CGCGGCGCGG CGGGGACGGC CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA 120
AGCAGCCAAT AAGCAGAAGC AGGAACTCGA CGAGATCTCG ACGAATATTC GTCAGGCCGG 180
CGTCCAATAC TCGAGGGCCG ACGAGGAGCA GCAGCAGGCG CTGTCCTCGC AAATGGGCTT 240
CTGACCCGCT AATACGAAAA GAAACGGAGC AAAAACATGA CAGAGCAGCA GTGGAATTTC 300
GCGGGTATCG AGGCCGCGGC AAGCGCAATC CAGGGAAATG TCACGTCCAT TCATTCCCTC 360
CTTGACGAGG GGAAGCAGTC CCTGACCAAG CTCGCA 396

INFORMATION FOR SEQ ID NO:117:
SEQUENCE CHARACTERISTICS:
LENGTH: 80 amino acids
TYPE: amino acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:117:
Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala
Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln
Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu
Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser
Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe

INFORMATION FOR SEQ ID NO:118:
SEQUENCE CHARACTERISTICS:
LENGTH: 387 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:118:
GTGGATCCCG ATCCCGTGTT TCGCTATTCT ACGCGAACTC GGCGTTGCCC TATGCGAACA 60
TCCCAGTGAC GTTGCCTTCG GTCGAAGCCA TTGCCTGACC GGCTTCGCTG ATCGTCCGCG 120
CCAGGTTCTG CAGCGCGTTG TTCAGCTCGG TAGCCGTGGC GTCCCATTTT TGCTGGACAC 180
CCTGGTACGC CTCCGAACCG CTACCGCCCC AGGCCGCTGC GAGCTTGGTC AGGGACTGCT 240
TCCCCTCGTC AAGGAGGGAA TGAATGGACG TGACATTTCC CTGGATTGCG CTTGCCGCGG 300
CCTCGATACC CGCGAAATTC CACTGCTGCT CTGTCATGTT TTTGCTCCGT TTCTTTTCGT 360
ATTAGCGGGT CAGAAGCCCA TTTGCGA 387

INFORMATION FOR SEQ ID NO:119: SEQUENCE
CHARACTERISTICS:
LENGTH: 272 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ, ID NO:119: CGGCACGAGG ATCTCGGTTG GCCCMACGGC GCTGGCGAGG GCTCCGTTCC GGGGGCGAGC TGCGCGCCGG ATGCTTCCTC TGCCCGCAGC CGCGCCTGGA TGGATGGACC AGTTGCTACc 120 TTCCCGACGT TTCGTTCGGT GTCTGTGCGA TAGCGGTGAC CCCGGCGCGC ACGTCGGGAG 180 TGTTGGGGGG CAGGCCGGGT CGGTGGTTCG GCCGGGGACG CAGACGGTCT GGACGGAACG 240 :.GGCGGGGGTT CGCCGA1TGG CATCT1TTGCC CA 272 INFORMATION FOR SEQ ID NO:120: SEQUENCE
CHARACTERISTICS:
LENGTH: 20 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:120:
Asp Pro Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val
Val Ala Ala Leu

INFORMATION FOR SEQ ID NO:121:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:121:
Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser

INFORMATION FOR SEQ ID NO:122:
SEQUENCE CHARACTERISTICS:
LENGTH: 19 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:122:
Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys
Glu Gly Arg

INFORMATION FOR SEQ ID NO:123:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:123:
Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro

INFORMATION FOR SEQ ID NO:124:
SEQUENCE CHARACTERISTICS:
LENGTH: 14 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:124:
Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val

INFORMATION FOR SEQ ID NO:125:
SEQUENCE CHARACTERISTICS:
LENGTH: 13 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:125:
Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro

INFORMATION FOR SEQ ID NO:126:
SEQUENCE CHARACTERISTICS:
LENGTH: 17 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:126:
Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro
Ser

INFORMATION FOR SEQ ID NO:127:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:127:
Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly

INFORMATION FOR SEQ ID NO:128:
SEQUENCE CHARACTERISTICS:
LENGTH: 30 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:128:
Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser
Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn

INFORMATION FOR SEQ ID NO:129:
SEQUENCE CHARACTERISTICS:
LENGTH: 22 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:129:
Asp Pro Pro Asp Pro His Gln Xaa Asp Met Thr Lys Gly Tyr Tyr Pro
Gly Gly Arg Arg Xaa Phe

INFORMATION FOR SEQ ID NO:130:
SEQUENCE CHARACTERISTICS:
LENGTH: 7 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:130:
Asp Pro Gly Tyr Thr Pro Gly

INFORMATION FOR SEQ ID NO:131:
SEQUENCE CHARACTERISTICS:
LENGTH: 10 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ix) FEATURE:
OTHER INFORMATION: /note= "The Second Residue Can Be Either a Pro or Thr"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:131:
Xaa Xaa Gly Phe Thr Gly Pro Gln Phe Tyr

INFORMATION FOR SEQ ID NO:132:
SEQUENCE CHARACTERISTICS:
LENGTH: 9 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(ix) FEATURE:
OTHER INFORMATION: /note= "The Third Residue Can Be Either a Gln or Leu"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:132:
Xaa Pro Xaa Val Thr Ala Tyr Ala Gly

INFORMATION FOR SEQ ID NO:133:
SEQUENCE CHARACTERISTICS:
LENGTH: 9 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:133:
Xaa Xaa Xaa Glu Lys Pro Phe Leu Arg

INFORMATION FOR SEQ ID NO:134:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:134:
Xaa Asp Ser Glu Lys Ser Ala Thr Ile Lys Val Thr Asp Ala Ser

INFORMATION FOR SEQ ID NO:135:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:135:
Ala Gly Asp Thr Xaa Ile Tyr Ile Val Gly Asn Leu Thr Ala Asp

INFORMATION FOR SEQ ID NO:136:
SEQUENCE CHARACTERISTICS:
LENGTH: 15 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:136:
Ala Pro Glu Ser Gly Ala Gly Leu Gly Thr Val Gln Ala

INFORMATION FOR SEQ ID NO:137:
SEQUENCE CHARACTERISTICS:
LENGTH: 21 amino acids
TYPE: amino acid
STRANDEDNESS:
TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:137:
Xaa Tyr Ile Ala Tyr Xaa Thr Thr Ala Gly Ile Val Pro Gly Lys Ile
Asn Val His Leu Val

Claims (22)

1. An isolated polypeptide comprising an immunogenic portion of a M. tuberculosis antigen, or a variant of said antigen that differs only in conservative substitutions and/or modifications, wherein said antigen comprises an amino acid sequence encoded by a DNA sequence selected from the group consisting of the sequences recited in SEQ ID NOS: 4 and 17, the complements of said sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID NOS: 4 and 17 or a complement thereof under moderately stringent conditions.
2. An isolated DNA molecule comprising a nucleotide sequence encoding a polypeptide according to claim 1.
3. An expression vector comprising a DNA molecule according to claim 2.
4. A host cell transformed with an expression vector according to claim 3.
5. The host cell of claim 4 wherein the host cell is selected from the group consisting of E. coli, yeast and mammalian cells.
6. A pharmaceutical composition comprising one or more polypeptides according to claim 1 and a physiologically acceptable carrier.
7. A pharmaceutical composition comprising one or more DNA molecules according to claim 2 and a physiologically acceptable carrier.
8. A vaccine comprising one or more polypeptides according to claim 1 and a non-specific immune response enhancer.
9. The vaccine of claim 8 wherein the non-specific immune response enhancer is an adjuvant.
10. A vaccine comprising one or more DNA molecules according to claim 2 and a non-specific immune response enhancer.
11. The vaccine of claim 10 wherein the non-specific immune response enhancer is an adjuvant.
12. A method for inducing protective immunity in a patient, comprising administering to a patient a pharmaceutical composition according to claims 6 or 7.
13. A method for inducing protective immunity in a patient, comprising administering to a patient a vaccine according to any one of claims 8-11.
14. A fusion protein comprising two or more polypeptides according to claim 1.
15. A fusion protein comprising one or more polypeptides according to claim 1 and TbH9.
16. A pharmaceutical composition comprising a fusion protein according to claims 14 or 15 and a physiologically acceptable carrier.
17. A vaccine comprising a fusion protein according to claims 14 or 15 and a non-specific immune response enhancer.
18. The vaccine of claim 17 wherein the non-specific immune response enhancer is an adjuvant.
19. A method for inducing protective immunity in a patient, comprising administering to a patient a pharmaceutical composition according to claim 16.
20. A method for inducing protective immunity in a patient, comprising administering to a patient a vaccine according to claims 17 or 18.
21. A method for detecting tuberculosis in a patient, comprising: contacting dermal cells of a patient with one or more polypeptides according to claim 1; and detecting an immune response on the patient's skin and therefrom detecting tuberculosis in the patient.
22. The method of claim 21 wherein the immune response is induration.
23. A diagnostic kit comprising: a polypeptide according to claim 1; and apparatus sufficient to contact said polypeptide with the dermal cells of a patient.
24. A polypeptide comprising an immunogenic portion of M. tuberculosis antigen, or a variant of said antigen that differs only in conservative substitutions and/or modifications, wherein said antigen comprises an amino acid sequence selected from the group consisting of SEQ ID NOS: 66 and 79.
25. A fusion protein comprising at least one polypeptide according to claim 24 and a second polypeptide.
26. A polypeptide according to claim 1, or a DNA molecule according to claim 2, or an expression vector according to claim 3, or a host cell according to any one of claims 4 or 5, or a pharmaceutical composition according to any one of claims 6 or 7, or a vaccine according to any one of claims 8-11, or a method according to any one of claims 12 or 13, or a fusion protein according to any one of claims 14 or 15, or a pharmaceutical composition according to claim 16, or a vaccine according to any one of claims 17 or 18, or a method according to any one of claims 19-22, or a diagnostic kit according to claim 23, or a polypeptide according to claim 24, or a fusion protein according to claim 25, substantially as hereinbefore described with reference to the figures and/or examples.
DATED this 29th day of July 2003
Corixa Corporation
DAVIES COLLISON CAVE
Patent Attorneys for the applicant
AU71762/00A 1995-09-01 2000-11-22 Compounds and methods for immunotherapy and diagnosis of tuberculosis Ceased AU765833B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU71762/00A AU765833B2 (en) 1995-09-01 2000-11-22 Compounds and methods for immunotherapy and diagnosis of tuberculosis

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
US08/523436 1995-09-01
US08/533634 1995-09-22
US08/620874 1996-03-22
US08/659683 1996-06-05
US08/680574 1996-07-12
AU71586/96A AU727602B2 (en) 1995-09-01 1996-08-30 Compounds and methods for immunotherapy and diagnosis of tuberculosis
AU71762/00A AU765833B2 (en) 1995-09-01 2000-11-22 Compounds and methods for immunotherapy and diagnosis of tuberculosis

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU71586/96A Division AU727602B2 (en) 1995-09-01 1996-08-30 Compounds and methods for immunotherapy and diagnosis of tuberculosis

Publications (2)

Publication Number Publication Date
AU7176200A AU7176200A (en) 2001-03-08
AU765833B2 AU765833B2 (en) 2003-10-02

Family

ID=28795560

Family Applications (1)

Application Number Title Priority Date Filing Date
AU71762/00A Ceased AU765833B2 (en) 1995-09-01 2000-11-22 Compounds and methods for immunotherapy and diagnosis of tuberculosis

Country Status (1)

Country Link
AU (1) AU765833B2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1995001440A1 (en) * 1993-07-02 1995-01-12 Statens Seruminstitut Diagnostic skin test for tuberculosis
WO1995001441A1 (en) * 1993-07-02 1995-01-12 Statens Serumsinstitut Tuberculosis vaccine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LOWRIE ET AL, VACCINE 1994 12(16) 1537-1540 *

Also Published As

Publication number Publication date
AU7176200A (en) 2001-03-08

Similar Documents

Publication Publication Date Title
AU727602B2 (en) Compounds and methods for immunotherapy and diagnosis of tuberculosis
US6290969B1 (en) Compounds and methods for immunotherapy and diagnosis of tuberculosis
US8084042B2 (en) Compounds and methods for immunotherapy and diagnosis of tuberculosis
EP1012293B1 (en) Compounds for immunotherapy and diagnosis of tuberculosis and methods of their use
PT2154248E (en) Compounds and methods for diagnosis of tuberculosis
WO1997009428A9 (en) Compounds and methods for immunotherapy and diagnosis of tuberculosis
US6350456B1 (en) Compositions and methods for the prevention and treatment of M. tuberculosis infection
US6338852B1 (en) Compounds and methods for diagnosis of tuberculosis
SA99200488B1 (en) Formulations and methods for treatment and prevention of infection with the bacterium M. tuberculosis
AU765833B2 (en) Compounds and methods for immunotherapy and diagnosis of tuberculosis
MXPA99003392A (en) Compounds and methods for immunotherapy and diagnosis of tuberculosis

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)