AU780210B2 - Mammalian genes involved in viral infection and tumor suppression - Google Patents
- Publication number
- AU780210B2 (AU 27484/02 A)
- Authority
- AU
- Australia
- Prior art keywords
- seq
- gene
- nucleic acid
- cell
- cells
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Landscapes
- Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Description
P/00/011 28/5/91 Regulation 3.2
AUSTRALIA
Patents Act 1990
ORIGINAL
COMPLETE SPECIFICATION
STANDARD PATENT
Name of Applicant: Vanderbilt University
Actual Inventor: Mr Donald H Rubin, Edward L Organ, Raymond D Dubois
Address for service is: WRAY ASSOCIATES, 239 Adelaide Terrace, Perth, WA 6000
Attorney code: WR
Invention Title: Mammalian Genes Involved in Viral Infection and Tumor Suppression
The following statement is a full description of this invention, including the best method of performing it known to me:-
MAMMALIAN GENES INVOLVED IN VIRAL INFECTION AND TUMOR SUPPRESSION
BACKGROUND
Field of the Invention
The present invention provides methods of identifying cellular genes used for viral growth or for tumor progression. Thus, the present invention relates to nucleic acids related to, and methods of, reducing or preventing viral infection and suppressing tumor progression. The invention also relates to methods for screening for additional such genes.
Background art
Various projects have been directed toward isolating and sequencing the genome of various animals, notably the human. However, most methodologies provide nucleotide sequences for which no function is linked or even suggested, thus limiting the immediate usefulness of such data.
The present invention, in contrast, provides methods of screening only for nucleic acids that are involved in a specific process, viral infection or tumor progression, and further, for nucleic acids useful in treatments for these processes, because by this method only nucleic acids which are also nonessential to the cell are isolated. Such methods are highly useful, since they ascribe a function to each isolated gene, and thus the isolated nucleic acids can immediately be utilized in various specific methods and procedures.
For example, the present invention provides methods of isolating nucleic acids encoding gene products used for viral infection, but nonessential to the cell. Viral infections of the intestine and liver are significant causes of human morbidity and mortality. Understanding the molecular mechanisms of such infections will lead to new approaches in their treatment and control.
Viruses can establish a variety of types of infection. These infections can be generally classified as lytic or persistent, though some lytic infections are considered persistent. Generally, persistent infections fall into two categories: chronic (productive) infection, in which infectious virus is present and can be recovered by traditional biological methods; and latent infection, in which the viral genome is present in the cell but infectious virus is generally not produced except during intermittent episodes of reactivation. Persistence generally involves stages of both productive and latent infection.
Lytic infections can also persist under conditions where only a small fraction of the total cells are infected (smoldering (cycling) infection). The few infected cells release virus and are killed, but the progeny virus again only infect a small number of the total cells. Examples of such smoldering infections include the persistence of lactic dehydrogenase virus in mice (Mahy, Br. Med. Bull. 41: 50-55 (1985)) and adenovirus infection in humans (Porter, D.D. pp. 784-790 in Baron, ed. Medical Microbiology 2d ed. (Addison-Wesley, Menlo Park, CA 1985)).
Furthermore, a virus may be lytic for some cell types but not for others. For example, evidence suggests that human immunodeficiency virus (HIV) is more lytic for T cells than for monocytes/macrophages, and therefore can result in a productive infection of T cells that can result in cell death, whereas HIV-infected mononuclear phagocytes may produce virus for considerable periods of time without cell lysis (Klatzmann, et al. Science 225:59-62 (1984); Koyanagi, et al. Science 241:1673-1675 (1988); Sattentau, et al. Cell 52:631-633 (1988)).
Traditional treatments for viral infection include pharmaceuticals aimed at specific virus-derived proteins, such as HIV protease or reverse transcriptase, or recombinant (cloned) immune modulators (host derived), such as the interferons.
However, the current methods have several limitations and drawbacks, which include high rates of viral mutation that render anti-viral pharmaceuticals ineffective. For immune modulators, limited effectiveness, limiting side effects, and a lack of specificity all limit the general applicability of these agents. Also, the rate of success with current antivirals and immune modulators has been disappointing.
The current invention focuses on isolating genes that are not essential for cellular survival when disrupted in one or both alleles, but which are required for virus replication. This may occur with a dose effect, in which knock-out of one allele may confer the phenotype of virus resistance on the cell. As targets for therapeutic intervention, inhibition of these cellular gene products, including proteins, parts of proteins (modification enzymes that include, but are not restricted to, glycosylation and lipid modifiers [myristylate]), lipids, transcription elements and RNA regulatory molecules, may be less likely to have profound toxic side effects, and virus mutation is less likely to overcome the 'block' and replicate successfully.
The present invention provides a significant improvement over previous methods of attempted therapeutic intervention against viral infection by addressing the cellular genes required by the virus for growth. Therefore, the present invention also provides an innovative therapeutic approach to intervention in viral infection by providing methods to treat viruses by inhibiting the cellular genes necessary for viral infection.
Because these genes, by virtue of the means by which they are originally detected, are nonessential to the cell's survival, these treatment methods can be used in a subject without serious detrimental effects to the subject, as has been found with previous methods. The present invention also provides the surprising discovery that virally infected cells are dependent upon a factor in serum to survive. Therefore, the present invention also provides a method for treating viral infection by inhibiting this serum survival factor. Finally, these discoveries also provide a novel method for removing virally infected cells from a cell culture by removing, inhibiting or disrupting this serum survival factor in the culture so that non-infected cells selectively survive.
The selection of tumor suppressor gene(s) has become an important area in the discovery of new targets for therapeutic intervention in cancer. Since the discovery that cells are restricted from promiscuous entry into the cell cycle by specific genes that are capable of suppressing a 'transformed' phenotype, considerable time has been invested in the discovery of such genes. Some of these genes include the gene associated with retinoblastoma (Rb) and the p53 (apoptosis related) encoding gene. The present invention provides a method, using gene-trapping, to select cell lines that have a transformed phenotype from cells that are not transformed and to isolate from these cells a gene that can suppress a malignant phenotype. Thus, by the nature of the isolation process, a function is associated with the isolated genes. The capacity to select tumor suppressor genes quickly can provide unique targets in the process of treating or preventing, and even for diagnostic testing of, cancer.
DETAILED DESCRIPTION OF THE INVENTION
The present invention utilizes a "gene trap" method along with a selection process to identify and isolate nucleic acids from genes associated with a particular function. Specifically, it provides a means of isolating cellular genes necessary for viral infection but not essential for the cell's survival, and it provides a means of isolating cellular genes that suppress tumor progression.
The present invention also provides a core discovery that virally infected cells become dependent upon at least one factor present in serum for survival, whereas noninfected cells do not exhibit this dependence. This core discovery has been utilized in the present invention in several ways. First, inhibition of the "serum survival factor" can be utilized to eradicate persistently virally infected cells from populations of non-infected cells. Inhibition of this factor can also be used to treat virus infection in a subject, as further described herein. Additionally, inhibition of or withdrawal of the serum survival factor in tissue culture allows for the detection of cellular genes required for viral replication yet nonessential for an uninfected cell to survive. The present invention further provides several such cellular genes, as well as methods of treating viral infections by inhibiting the functioning of such genes.
Furthermore, the present invention provides a method for isolation of cellular genes utilized in tumor progression.
The present method provides several cellular genes that are necessary for viral growth in the cell but are not essential for the cell to survive. These genes are important for lytic and persistent infection by viruses. These genes were isolated by generating gene trap libraries by infecting cells with a retrovirus gene trap vector; selecting for cells in which a gene trap event occurred (i.e., in which the vector had inserted such that a cellular promoter promotes transcription of the promoterless marker gene, i.e., inserted into a functioning gene); starving the cells of serum; infecting the selected cells with the virus of choice while continuing serum starvation; and adding back serum to allow visible colonies to develop, which colonies were cloned by limiting dilution. Genes into which the retrovirus gene trap vector inserted were then isolated from the colonies using probes specific for the retrovirus gene trap vector. Thus nucleic acids isolated by this method are isolated portions of genes.
Thus the present invention provides a method of identifying a cellular gene necessary for viral growth in a cell and nonessential for cellular survival, comprising (a) transferring into a cell culture growing in serum-containing medium a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, (c) removing serum from the culture medium, (d) infecting the cell culture with the virus, and (e) isolating from the surviving cells a cellular gene within which the marker gene is inserted, thereby identifying a gene necessary for viral growth in a cell and nonessential for cellular survival. The present invention also provides a method of identifying a cellular gene used for viral growth in a cell and nonessential for cellular survival, comprising transferring into a cell culture growing in serum-containing medium a vector encoding a selective marker gene lacking a functional promoter, selecting cells expressing the marker gene, removing serum from the culture medium, infecting the cell culture with the virus, and isolating from the surviving cells a cellular gene within which the marker gene is inserted, thereby identifying a gene necessary for viral growth in a cell and nonessential for cellular survival. In any selected cell type, such as Chinese hamster ovary cells, one can readily determine if serum starvation is required for selection. If it is not, serum starvation may be eliminated from the steps.
Alternatively, instead of removing serum from the culture medium, a serum factor required by the virus for growth can be inhibited, such as by the administration of an antibody that specifically binds that factor. Furthermore, if it is believed that there are no persistently infected cells in the culture, the serum starvation step can be eliminated and the cells grown in usual medium for the cell type. If serum starvation is used, it can be continued for a time after the culture is infected with the virus. Serum can then be added back to the culture. If some other method is used to inactivate the factor, it can be discontinued, inactivated or removed (such as removing the anti-factor antibody, with a bound antibody directed against that antibody) prior to adding fresh serum back to the culture. Cells that survive are mutants having an inactivating insertion in a gene necessary for growth of the virus. The genes having the insertions can then be isolated by isolating sequences having the marker gene sequences. This mutational process disturbs a wild type function. A mutant gene may produce at a lower level a normal product, it may produce a normal product not normally found in these cells, it may cause the overproduction of a normal product, it may produce an altered product that has some functions but not others, or it may completely disrupt a gene function. Additionally, the mutation may disrupt an RNA that has a function but is never translated into a protein. For example, the alpha-tropomyosin gene has a 3' RNA that is very important in cell regulation but never is translated into protein. (Cell pg 1107-1117, 12/17/93).
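The logic of the selection can be illustrated with a toy model. The following is a minimal sketch only, not the laboratory procedure: the class names, the gene names and the three boolean properties are hypothetical, each cell is assumed to carry at most one insertion, and dose effects, serum handling and persistently infected cells are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gene:
    name: str                 # hypothetical label, not a gene from the sequence listing
    essential_for_cell: bool  # disrupting it kills the cell
    needed_by_virus: bool     # disrupting it blocks viral growth
    expressed: bool           # has an active cellular promoter


@dataclass(frozen=True)
class Cell:
    trapped: Optional[Gene]   # gene disrupted by the vector insertion, if any


def expresses_marker(cell: Cell) -> bool:
    # The promoterless marker is transcribed only when inserted into an expressed gene.
    return cell.trapped is not None and cell.trapped.expressed


def survives_disruption(cell: Cell) -> bool:
    # A cell survives the insertion only if the disrupted gene is nonessential.
    return cell.trapped is None or not cell.trapped.essential_for_cell


def survives_infection(cell: Cell) -> bool:
    # In this toy model a cell survives the virus only if a gene the virus needs is disrupted.
    return cell.trapped is not None and cell.trapped.needed_by_virus


def screen(cells: List[Cell]) -> List[str]:
    """Return names of trapped genes recovered from cells surviving both selections."""
    selected = [c for c in cells if expresses_marker(c) and survives_disruption(c)]
    survivors = [c for c in selected if survives_infection(c)]
    return sorted({c.trapped.name for c in survivors})


EXAMPLE_LIBRARY = [
    Cell(Gene("geneA", essential_for_cell=True, needed_by_virus=True, expressed=True)),
    Cell(Gene("geneB", essential_for_cell=False, needed_by_virus=True, expressed=True)),
    Cell(Gene("geneC", essential_for_cell=False, needed_by_virus=False, expressed=True)),
    Cell(Gene("geneD", essential_for_cell=False, needed_by_virus=True, expressed=False)),
    Cell(None),
]
```

In this sketch only a gene that is expressed, nonessential to the cell and needed by the virus is recovered, which mirrors why the method ascribes a function to each isolated gene.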
As used herein, a cellular gene "nonessential for cellular survival" means a gene for which disruption of one or both alleles results in a cell viable for at least a period of time which allows viral replication to be inhibited for preventative or therapeutic uses or use in research. A gene "necessary for viral growth" means the gene product, either protein or RNA, secreted or not, is necessary, either directly or indirectly in some way, for the virus to grow, and therefore, in the absence of that gene product (i.e., a functionally available gene product), at least some of the cells containing the virus die.
For example, such genes can encode cell cycle regulatory proteins, proteins affecting the vacuolar hydrogen pump, or proteins involved in protein folding and protein modification, including but not limited to: phosphorylation, methylation, glycosylation, myristylation or other lipid moiety, or protein processing via enzymatic processing. Some examples of such genes are exemplified herein, wherein some of the isolated nucleic acids correspond to genes such as vacuolar H+ATPase, alpha tropomyosin, gas5 gene, ras complex, N-acetyl-glucosaminyltransferase I mRNA, and calcyclin.
Any virus capable of infecting the cell can be used for this method. Virus can be selected based upon the particular infection desired to study. However, it is contemplated by the present invention that many viruses will be dependent upon the same cellular genes for survival; thus a cellular gene isolated using one virus can be used as a target for therapy for other viruses as well. Any cellular gene can be tested for relevancy to any desired virus using the methods set forth herein, in general, by inhibiting the gene or its gene product in a cell and determining if the desired virus can grow in that cell. Some examples of viruses include HIV (including HIV-1 and HIV-2); parvovirus; papillomaviruses; hantaviruses; influenza viruses (e.g., influenza A, B and C viruses); hepatitis viruses A to G; caliciviruses; astroviruses; rotaviruses; coronaviruses, such as human respiratory coronavirus; picornaviruses, such as human rhinovirus and enterovirus; ebola virus; human herpesvirus 1-S V-; human cytomegalovirus; human adenovirus; Epstein-Barr virus; hantaviruses; and, for animals, the animal counterpart to any above-listed human virus, and animal retroviruses, such as simian immunodeficiency virus, avian immunodeficiency virus, bovine immunodeficiency virus, feline immunodeficiency virus, equine infectious anemia virus, caprine arthritis encephalitis virus or visna virus.
The nucleic acids comprising cellular genes of this invention were isolated by the above method and as set forth in the examples. The invention includes a nucleic acid comprising the nucleotide sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:22, SEQ ID NO:27, SEQ ID NO:32, SEQ ID NO:37, SEQ ID NO:42, SEQ ID NO:47, SEQ ID NO:52, SEQ ID NO:57, SEQ ID NO:62, SEQ ID NO:67, SEQ ID NO:72, SEQ ID NO:18, SEQ ID NO:23, SEQ ID NO:28, SEQ ID NO:33, SEQ ID NO:38, SEQ ID NO:43, SEQ ID NO:48, SEQ ID NO:53, SEQ ID NO:58, SEQ ID NO:63, SEQ ID NO:68, SEQ ID NO:73, SEQ ID NO:19, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:34, SEQ ID NO:39, SEQ ID NO:44, SEQ ID NO:49, SEQ ID NO:54, SEQ ID NO:59, SEQ ID NO:64, SEQ ID NO:69, SEQ ID NO:20, SEQ ID NO:25, SEQ ID NO:30, SEQ ID NO:35, SEQ ID NO:40, SEQ ID NO:45, SEQ ID NO:50, SEQ ID NO:55, SEQ ID NO:60, SEQ ID NO:65, SEQ ID NO:70, SEQ ID NO:21, SEQ ID NO:26, SEQ ID NO:31, SEQ ID NO:36, SEQ ID NO:41, SEQ ID NO:46, SEQ ID NO:51, SEQ ID NO:56, SEQ ID NO:61, SEQ ID NO:66, SEQ ID NO:71, SEQ ID NO:74 or SEQ ID NO:75 (this list is sometimes referred to herein as "SEQ ID NO:5 through SEQ ID NO:75" for brevity). Thus these nucleic acids can contain, in addition to the nucleotides set forth in each SEQ ID NO in the sequence listing, additional nucleotides at either end of the molecule. Such additional nucleotides can be added by any standard method, as known in the art, such as recombinant methods and synthesis methods. Examples of such nucleic acids comprising the nucleotide sequence set forth in any entry of the sequence listing contemplated by this invention include, but are not limited to, for example, the nucleic acid placed into a vector; a nucleic acid having one or more regulatory regions (e.g., promoter, enhancer, polyadenylation site) linked to it, particularly in a functional manner, i.e., such that an mRNA or a protein can be produced; and a nucleic acid including additional nucleic acids of the gene, such as a larger or even full length genomic fragment of the gene, a partial or full length cDNA, or a partial or full length RNA. Making and/or isolating such larger nucleic acids is further described below and is well known and standard in the art.
The invention also provides a nucleic acid encoding the protein encoded by the gene comprising the nucleotide sequence set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:22, SEQ ID NO:27, SEQ ID NO:32, SEQ ID NO:37, SEQ ID NO:42, SEQ ID NO:47, SEQ ID NO:52, SEQ ID NO:57, SEQ ID NO:62, SEQ ID NO:67, SEQ ID NO:72, SEQ ID NO:18, SEQ ID NO:23, SEQ ID NO:28, SEQ ID NO:33, SEQ ID NO:38, SEQ ID NO:43, SEQ ID NO:48, SEQ ID NO:53, SEQ ID NO:58, SEQ ID NO:63, SEQ ID NO:68, SEQ ID NO:73, SEQ ID NO:19, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:34, SEQ ID NO:39, SEQ ID NO:44, SEQ ID NO:49, SEQ ID NO:54, SEQ ID NO:59, SEQ ID NO:64, SEQ ID NO:69, SEQ ID NO:20, SEQ ID NO:25, SEQ ID NO:30, SEQ ID NO:35, SEQ ID NO:40, SEQ ID NO:45, SEQ ID NO:50, SEQ ID NO:55, SEQ ID NO:60, SEQ ID NO:65, SEQ ID NO:70, SEQ ID NO:21, SEQ ID NO:26, SEQ ID NO:31, SEQ ID NO:36, SEQ ID NO:41, SEQ ID NO:46, SEQ ID NO:51, SEQ ID NO:56, SEQ ID NO:61, SEQ ID NO:66, SEQ ID NO:71, SEQ ID NO:74 or SEQ ID NO:75, as well as allelic variants and homologs of each such gene. The gene is readily obtained using standard methods, as described below and as is known and standard in the art. The present invention also contemplates any unique fragment of these genes or of the nucleic acids set forth in any of SEQ ID NO:5 through SEQ ID NO:75. Examples of inventive fragments of the inventive genes are the nucleic acids whose sequence is set forth in any of SEQ ID NO:5 through SEQ ID NO:75. To be unique, the fragment must be of sufficient size to distinguish it from other known sequences, most readily determined by comparing any nucleic acid fragment to the nucleotide sequences of nucleic acids in computer databases, such as GenBank. Such comparative searches are standard in the art.
Typically, a unique fragment useful as a primer or probe will be at least about 20 to about 25 nucleotides in length, depending upon the specific nucleotide content of the sequence. Additionally, fragments can be, for example, at least about 30, 40, 50, 100, 200 or 500 nucleotides in length. The nucleic acids can be single or double stranded, depending upon the purpose for which they are intended.
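The uniqueness test described above can be sketched in a few lines. This is an illustration under simplifying assumptions: the function names are hypothetical, the "database" is a plain list of strings rather than GenBank, and only exact matches on either strand are considered, whereas a real comparative search would use an alignment tool that tolerates mismatches.

```python
from typing import Iterable

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_unique_fragment(fragment: str,
                       known_sequences: Iterable[str],
                       min_length: int = 20) -> bool:
    """True if the fragment is long enough and absent (on both strands) from all known sequences."""
    frag = fragment.upper()
    if len(frag) < min_length:
        return False  # too short to serve as a distinguishing primer or probe
    rc = reverse_complement(frag)
    for known in known_sequences:
        known_upper = known.upper()
        if frag in known_upper or rc in known_upper:
            return False
    return True
```

The default `min_length` of 20 follows the lower bound given in the text; it is a parameter because the appropriate length depends on the nucleotide content of the sequence.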
The present invention further provides a nucleic acid comprising the regulatory region of a gene comprising the nucleotide sequences set forth in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:22, SEQ ID NO:27, SEQ ID NO:32, SEQ ID NO:37, SEQ ID NO:42, SEQ ID NO:47, SEQ ID NO:52, SEQ ID NO:57, SEQ ID NO:62, SEQ ID NO:67, SEQ ID NO:72, SEQ ID NO:18, SEQ ID NO:23, SEQ ID NO:28, SEQ ID NO:33, SEQ ID NO:38, SEQ ID NO:43, SEQ ID NO:48, SEQ ID NO:53, SEQ ID NO:58, SEQ ID NO:63, SEQ ID NO:68, SEQ ID NO:73, SEQ ID NO:19, SEQ ID NO:24, SEQ ID NO:29, SEQ ID NO:34, SEQ ID NO:39, SEQ ID NO:44, SEQ ID NO:49, SEQ ID NO:54, SEQ ID NO:59, SEQ ID NO:64, SEQ ID NO:69, SEQ ID NO:74, SEQ ID NO:20, SEQ ID NO:25, SEQ ID NO:30, SEQ ID NO:35, SEQ ID NO:40, SEQ ID NO:45, SEQ ID NO:50, SEQ ID NO:55, SEQ ID NO:60, SEQ ID NO:65, SEQ ID NO:70, SEQ ID NO:21, SEQ ID NO:26, SEQ ID NO:31, SEQ ID NO:36, SEQ ID NO:41, SEQ ID NO:46, SEQ ID NO:51, SEQ ID NO:56, SEQ ID NO:61, SEQ ID NO:66, SEQ ID NO:71 or SEQ ID NO:75. Additionally provided is a construct comprising such a regulatory region functionally linked to a reporter gene. Such reporter gene constructs can be used to screen for compounds and compositions that affect expression of the gene comprising the nucleic acids whose sequence is set forth in any of SEQ ID NO:5 through SEQ ID NO:75.
The nucleic acids set forth in the sequence listing are gene fragments; the entire coding sequence and the entire gene that comprises each fragment are both contemplated herein and are readily obtained by standard methods, given the nucleotide sequences presented in the sequence listing (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989; DNA Cloning: A Practical Approach, Volumes I and II, Glover, D.M. ed., IRL Press Limited, Oxford, 1985). To obtain the entire genomic gene, briefly, a nucleic acid whose sequence is set forth in any of SEQ ID NO:1 through SEQ ID NO:83, or preferably in any of SEQ ID NO:5 through SEQ ID NO:83, or a smaller fragment thereof, is utilized as a probe to screen a genomic library under high stringency conditions, and isolated clones are sequenced. Once the sequence of the new clone is determined, a probe can be devised from a portion of the new clone not present in the previous fragment and hybridized to the library to isolate more clones containing fragments of the gene. In this manner, by repeating this process in organized fashion, one can "walk" along the chromosome and eventually obtain nucleotide sequence for the entire gene. Similarly, one can use portions of the present fragments, or additional fragments obtained from the genomic library, that contain open reading frames to screen a cDNA library to obtain a cDNA having the entire coding sequence of the gene.
Repeated screens can be utilized as described above to obtain the complete sequence from several clones if necessary. The isolates can then be sequenced to determine the nucleotide sequence by standard means such as dideoxynucleotide sequencing methods (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989).
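The "walking" procedure is essentially iterative extension of a known fragment by overlapping clones. The sketch below models it on strings and is an assumption-laden illustration: hybridization is replaced by exact end overlap of at least `min_overlap` characters, the library is an in-memory list, and the function names are hypothetical.

```python
from typing import List


def overlap(left: str, right: str, min_overlap: int) -> int:
    """Length of the longest suffix of `left` equal to a prefix of `right` (0 if below min_overlap)."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def walk(fragment: str, clones: List[str], min_overlap: int = 8) -> str:
    """Extend `fragment` in both directions using overlapping clones until nothing new is added."""
    contig = fragment
    extended = True
    while extended:
        extended = False
        for clone in clones:
            if clone in contig:
                continue  # this clone adds no new sequence
            if contig in clone:
                contig = clone  # the clone spans the whole current contig
                extended = True
                continue
            size = overlap(contig, clone, min_overlap)  # walk in one direction
            if size:
                contig += clone[size:]
                extended = True
                continue
            size = overlap(clone, contig, min_overlap)  # walk in the other direction
            if size:
                contig = clone[:-size] + contig
                extended = True
    return contig
```

Each pass plays the role of one round of screening: the newly added ends of the contig act as the probe for the next round, and the loop stops when a full pass isolates no further overlapping clone.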
The present genes were isolated from rat; however, homologs in any desired species, preferably mammalian, such as human, can readily be obtained by screening a human library, genomic or cDNA, with a probe comprising sequences of the nucleic acids set forth in the sequence listing herein, or fragments thereof, and isolating genes specifically hybridizing with the probe under preferably relatively high stringency hybridization conditions. For example, high salt conditions (e.g., in 6X SSC or 6X SSPE) and/or high temperatures of hybridization can be used. For example, the stringency of hybridization is typically about 5°C to 20°C below the Tm (the melting temperature at which half of the molecules dissociate from their partners) for the given chain length. As is known in the art, the nucleotide composition of the hybridizing region factors in determining the melting temperature of the hybrid. For 20mer probes, for example, the recommended hybridization temperature is typically about 55-58°C.
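The relationship between probe composition, Tm and hybridization temperature can be made concrete with a rough estimate. The sketch below uses the Wallace rule of thumb for short oligonucleotides (2°C per A or T, 4°C per G or C), which is an assumption brought in for illustration and is not stated in this specification; the function names are hypothetical, and in practice the conditions are determined empirically.

```python
from typing import Tuple


def wallace_tm(probe: str) -> float:
    """Estimate Tm in degrees C for a short oligonucleotide: 2*(A+T) + 4*(G+C)."""
    p = probe.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    if at + gc != len(p):
        raise ValueError("probe must contain only A, C, G and T")
    return 2.0 * at + 4.0 * gc


def hybridization_window(probe: str,
                         low_offset: float = 5.0,
                         high_offset: float = 20.0) -> Tuple[float, float]:
    """Return (lowest, highest) hybridization temperature, i.e. Tm - 20 to Tm - 5 by default."""
    tm = wallace_tm(probe)
    return (tm - high_offset, tm - low_offset)
```

For a 20mer with equal numbers of the four bases this estimate gives a Tm of 60°C, so the upper end of the window (Tm minus 5°C) is 55°C, which is in line with the 55-58°C figure quoted for 20mer probes; a more G-C rich probe shifts the window upward.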
Additionally, the rat sequence can be utilized to devise a probe for a homolog in any specific animal by determining the amino acid sequence for a portion of the rat protein, and selecting a probe with optimized codon usage to encode the amino acid sequence of the homolog in that particular animal. Any isolated gene can be confirmed as the targeted gene by sequencing the gene to determine that it contains the nucleotide sequence listed herein as comprising the gene. Any homolog can be confirmed as a homolog by its functionality.
Additionally contemplated by the present invention are nucleic acids, from any desired species, preferably mammalian and more preferably human, having 98%, 95%, 90%, 85%, 80%, 70%, 60%, or 50% homology, or greater, in the region of homology, to a region in an exon of a nucleic acid encoding the protein encoded by the gene comprising the nucleotide sequence set forth in any of SEQ ID NO:5 through SEQ ID NO:75 of the sequence listing or to homologs thereof. Also contemplated by the present invention are nucleic acids, from any desired species, preferably mammalian and more preferably human, having 98%, 95%, 90%, 85%, 80%, 70%, 60%, or 50% homology, or greater, in the region of homology, to a region in an exon of a nucleic acid comprising the nucleotide sequence set forth in any of SEQ ID NO:5 through SEQ ID NO:75 of the sequence listing or to homologs thereof. These genes can be synthesized or obtained by the same methods used to isolate homologs, with stringency of hybridization and washing, if desired, reduced accordingly as homology desired is decreased, and further, depending upon the G-C or A-T richness of any area wherein variability is searched for. Allelic variants of any of the present genes or of their homologs can readily be isolated and sequenced by screening additional libraries following the protocol above. Methods of making synthetic genes are described in U.S. Patent No. 5,503,995 and the references cited therein.
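A percentage such as those listed above is computed over the region of homology once the two sequences have been aligned. The sketch below is a simplified illustration: it assumes the alignment has already been made (equal-length strings with "-" for gaps), it measures percent identity as a stand-in for the "homology" figure in the text, and the function names are hypothetical.

```python
from typing import Optional

THRESHOLDS = (98, 95, 90, 85, 80, 70, 60, 50)  # percentages named in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical bases; columns that are gaps in both are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        columns += 1
        if x == y and x != "-":
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def highest_threshold_met(identity: float) -> Optional[int]:
    """Return the highest listed threshold that the identity meets, or None if below 50%."""
    for threshold in THRESHOLDS:
        if identity >= threshold:
            return threshold
    return None
```

Counting a gap against a base as a mismatch is one of several possible conventions; a different convention would change the percentage, so the choice should be stated whenever such figures are compared.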
The nucleic acid encoding any selected protein of the present invention can be any nucleic acid that functionally encodes that protein. For example, to functionally encode, i.e., allow the nucleic acid to be expressed, the nucleic acid can include, for example, exogenous or endogenous expression control sequences, such as an origin of replication, a promoter, an enhancer, and necessary information processing sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites, and transcriptional terminator sequences. Preferred expression control sequences can be promoters derived from metallothionein genes, actin genes, immunoglobulin genes, CMV, adenovirus, bovine papilloma virus, etc. Expression control sequences can be selected for functionality in the cells in which the nucleic acid will be placed. A nucleic acid encoding a selected protein can readily be determined based upon the amino acid sequence of the selected protein, and, clearly, many nucleic acids will encode any selected protein.
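The statement that many nucleic acids encode any selected protein follows from the degeneracy of the genetic code, and the number can be counted directly. The sketch below assumes the standard genetic code and counts only the coding region, without the stop codon; the function name is hypothetical.

```python
# Number of codons for each amino acid (one-letter code) in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the given peptide."""
    total = 1
    for residue in peptide.upper():
        if residue not in CODON_COUNTS:
            raise ValueError(f"unknown amino acid: {residue!r}")
        total *= CODON_COUNTS[residue]
    return total
```

Because the count is a product over residues, it grows very quickly with length, which is why a codon-optimized probe is chosen for a given species rather than enumerating every possibility.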
The present invention additionally provides a nucleic acid that selectively hybridizes under stringent conditions with a nucleic acid encoding the protein encoded by the gene comprising the nucleotide sequence set forth in any sequence listed herein (i.e., any of SEQ ID NO:5 through SEQ ID NO:75). This hybridization can be specific.
The degree of complementarity between the hybridizing nucleic acid and the sequence to which it hybridizes should be at least enough to exclude hybridization with a nucleic acid encoding an unrelated protein. Thus, a nucleic acid that selectively hybridizes with a nucleic acid of the present protein coding sequence will not selectively hybridize under stringent conditions with a nucleic acid for a different, unrelated protein, and vice versa.
Typically, the stringency of hybridization to achieve selective hybridization involves hybridization in high ionic strength solution (6X SSC or 6X SSPE) at a temperature that is about 12-25°C below the Tm (the melting temperature at which half of the molecules dissociate from their partners), followed by washing at a combination of temperature and salt concentration chosen so that the washing temperature is about 5°C to 20°C below the Tm of the hybrid molecule. The temperature and salt conditions are readily determined empirically in preliminary experiments in which samples of reference DNA immobilized on filters are hybridized to a labeled nucleic acid of interest and then washed under conditions of different stringencies. Hybridization temperatures are typically higher for DNA-RNA and RNA-RNA hybridizations. The washing temperatures can be used as described above to achieve selective stringency, as is known in the art (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989; Kunkel et al. Methods Enzymol. 154:367, 1987). Nucleic acid fragments that selectively hybridize to any given nucleic acid can be used as primers and/or probes for further hybridization or for amplification methods (e.g., polymerase chain reaction (PCR), ligase chain reaction (LCR)). A preferable stringent hybridization condition for a DNA:DNA hybridization can be at about 68°C (in aqueous solution) in 6X SSC or 6X SSPE followed by washing at 68°C.
The present invention additionally provides a protein encoded by a nucleic acid encoding the protein encoded by the gene comprising any of the nucleotide sequences set forth herein (i.e., any of SEQ ID NO:5 through SEQ ID NO:75). The protein can be readily obtained by any of several means. For example, the nucleotide sequence of coding regions of the gene can be translated and then the corresponding polypeptide can be synthesized mechanically by standard methods. Additionally, the coding regions of the genes can be expressed or synthesized, an antibody specific for the resulting polypeptide can be raised by standard methods (see, e.g., Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1988), and the protein can be isolated from other cellular proteins by selective hybridization with the antibody. This protein can be purified to the extent desired by standard methods of protein purification (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989). The amino acid sequence of any protein, polypeptide or peptide of this invention can be deduced from the nucleic acid sequence, or it can be determined by sequencing an isolated or recombinantly produced protein.
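Deducing an amino acid sequence from a coding region amounts to reading codons against the genetic code. The following is a minimal sketch assuming the standard genetic code, a DNA coding strand that is already in frame, and one-letter amino acid symbols; the function name is hypothetical and the example does not use any sequence from the sequence listing.

```python
_BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG ("*" marks a stop codon).
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(_BASES)
    for j, b2 in enumerate(_BASES)
    for k, b3 in enumerate(_BASES)
}


def translate(coding_dna: str) -> str:
    """Translate an in-frame DNA coding sequence, stopping at the first stop codon."""
    seq = coding_dna.upper()
    protein = []
    for start in range(0, len(seq) - 2, 3):
        codon = seq[start:start + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"invalid codon: {codon!r}")
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)
```

Trailing bases that do not complete a codon are ignored in this sketch; for a gene fragment whose reading frame is unknown, all three frames on each strand would need to be examined.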
The terms "peptide," "polypeptide" and "protein" are used interchangeably herein and refer to a polymer of amino acids and include full-length proteins and fragments thereof. As used in the specification and in the claims, "a" can mean one or more, depending upon the context in which it is used. An amino acid residue is an amino acid formed upon chemical digestion (hydrolysis) of a polypeptide at its peptide linkages.
The amino acid residues described herein are preferably in the "L" isomeric form.
However, residues in the "D" isomeric form can be substituted for any L-amino acid residue, as long as the desired functional property is retained by the polypeptide.
Standard polypeptide nomenclature (described in J. Biol. Chem., 243:3552-59 (1969) and adopted at 37 CFR 1.822(b)) is used herein.
As will be appreciated by those skilled in the art, the invention also includes those polypeptides having slight variations in amino acid sequences or other properties.
Amino acid substitutions can be selected by known parameters to be neutral (see, e.g., Robinson WE Jr, and Mitchell WM., AIDS 4:S151-S162 (1990)). Such variations may arise naturally as allelic variations (e.g., due to genetic polymorphism) or may be produced by human intervention (e.g., by mutagenesis of cloned DNA sequences), such as induced point, deletion, insertion and substitution mutants. Minor changes in amino acid sequence are generally preferred, such as conservative amino acid replacements, small internal deletions or insertions, and additions or deletions at the ends of the molecules. Substitutions may be designed based on, for example, the model of Dayhoff, et al. (in Atlas of Protein Sequence and Structure 1978, Nat'l Biomed. Res. Found., Washington, D.C.). These modifications can result in changes in the amino acid sequence, provide silent mutations, modify a restriction site, or provide other specific mutations. Likewise, such amino acid changes result in a different nucleic acid encoding the polypeptides and proteins. Thus, alternative nucleic acids are also contemplated by such modifications.
The present invention also provides cells containing a nucleic acid of the invention. A cell containing a nucleic acid encoding a protein typically can replicate the DNA and, further, typically can express the encoded protein. The cell can be a prokaryotic cell, particularly for the purpose of producing quantities of the nucleic acid, or a eukaryotic cell, particularly a mammalian cell. The cell is preferably a mammalian cell for the purpose of expressing the encoded protein so that the resultant produced protein has mammalian protein processing modifications.
Nucleic acids of the present invention can be delivered into cells by any selected means, in particular depending upon the purpose of the delivery of the compound and the target cells. Many delivery means are well-known in the art. For example, electroporation, calcium phosphate precipitation, microinjection, cationic or anionic liposomes, and liposomes in combination with a nuclear localization signal peptide for delivery to the nucleus can be utilized, as is known in the art.
The present invention also contemplates that the mutated cellular genes necessary for viral growth, produced by the present method, as well as cells containing these mutants can also be useful. These mutated genes and cells containing them can be isolated and/or produced according to the methods herein described and using standard methods.
It should be recognized that the sequences set forth herein may contain minor sequencing errors. Such errors can be corrected, for example, by using the hybridization procedure described above with various probes derived from the described sequences such that the coding sequence can be reisolated and resequenced.
As described in the examples, the present invention provides the discovery of a "serum survival factor" present in serum that is necessary for the survival of persistently virally infected cells. Isolation and characterization of this factor have shown it to be a protein, to have a molecular weight of between about 50 kD and 100 kD, to resist inactivation in low pH (e.g., pH 2) and chloroform extraction, and to be inactivated by boiling for about 5 minutes and in low ionic strength solution (e.g., about 10 mM to about 50 mM). The present invention thus provides a purified mammalian serum protein having a molecular weight of between about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by chloroform extraction, which inactivates when boiled and inactivates in low ionic strength solution, and which when removed from a cell culture comprising cells persistently infected with reovirus selectively substantially prevents survival of cells persistently infected with reovirus.
The factor, fitting the physical characteristics described above, can readily be verified by adding it to non-serum-containing medium (which previously could not support survival of persistently virally infected cells) and determining whether this medium with the added putative factor can now support persistently virally infected cells, particularly cells persistently infected with reovirus. As used herein, a "purified" protein means the protein is at least of sufficient purity such that an approximate molecular weight can be determined.
The amino acid sequence of the protein can be elucidated by standard methods.
For example, an antibody to the protein can be raised and used to screen an expression library to obtain nucleic acid sequence coding the protein. This nucleic acid sequence is then simply translated into the corresponding amino acid sequence. Alternatively, a portion of the protein can be directly sequenced by standard amino acid sequencing methods (amino-terminus sequencing). This amino acid sequence can then be used to generate an array of nucleic acid probes that encompasses all possible coding sequences for a portion of the amino acid sequence. The array of probes is used to screen a cDNA library to obtain the remainder of the coding sequence and thus ultimately the corresponding amino acid sequence.
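The idea of an array of probes covering all possible coding sequences for a short stretch of amino acids can be sketched as follows. This is a minimal illustration, not the procedure used by the applicants: it assumes the standard genetic code, one-letter amino acid codes, and DNA coding-strand output, and the function name `all_coding_sequences` is hypothetical. An actual probe could equally be the complementary strand, depending on the library being screened.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (one-letter code) to every codon that encodes it.
CODONS_FOR = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append(b1 + b2 + b3)


def all_coding_sequences(peptide):
    """Enumerate every DNA sequence that could encode the given peptide."""
    codon_choices = [CODONS_FOR[aa] for aa in peptide.upper()]
    return ["".join(codons) for codons in product(*codon_choices)]


# Example: a hypothetical three-residue stretch Met-Trp-Lys.
for sequence in all_coding_sequences("MWK"):
    print(sequence)
```

Because the number of sequences is the product of the codon counts of each residue, stretches rich in methionine and tryptophan (one codon each) keep the array small, whereas leucine, serine and arginine (six codons each) enlarge it quickly.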
The present invention also provides methods of detecting and isolating additional serum survival factors. For example, to determine if any known serum components are necessary for viral growth, the known components can be inhibited in, or eliminated from, the culture medium, and it can be observed whether viral growth is inhibited by determining if persistently infected cells do not survive. One can add the factor back (or remove the inhibition) and determine whether the factor allows for viral growth.
Additionally, other, unknown serum components can also be found to be essential for viral growth. Serum can be fractionated by various standard means, and fractions added to serum-free medium to determine if a factor is present in a fraction that allows viral growth previously inhibited by the lack of serum. Fractions having this activity can then be further fractionated until the factor is relatively free of other components. The factor can then be characterized by standard methods, such as size fractionation, denaturation and/or inactivation by various means, etc. Preferably, once the factor has been purified to a desired level of purity, it is added to cells in serum-free medium to confirm that it bestows the function of allowing virus to grow when serum-free medium alone did not. This method can be repeated to confirm the requirement for the specific factor for any desired virus, since each serum factor found to be required by any one virus can also be required by many other viruses. In general, the closer the viruses are related and the more similar the infection modes of the viruses, the more likely that a factor required by one virus will be required by the other.
The present invention also provides methods of treating virus infections utilizing applicants' discoveries. The subject of any of the herein described methods can be any animal, preferably a mammal, such as a human, a veterinary animal, such as a cat, dog, horse, pig, goat, sheep, or cow, or a laboratory animal, such as a mouse, rat, rabbit, or guinea pig, depending upon the virus.
The present invention provides a method of reducing or inhibiting, and thereby treating, a viral infection in a subject, comprising administering to the subject an inhibiting amount of a composition that inhibits functioning of the serum protein described herein, i.e., the serum protein having a molecular weight of between about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by chloroform extraction, which inactivates when boiled and inactivates in low ionic strength solution, and which when removed from a cell culture comprising cells persistently infected with the virus prevents survival of at least some cells persistently infected with the virus, thereby treating the viral infection. The composition can comprise, for example, an antibody that specifically binds the serum protein, or an antisense RNA that binds an RNA encoded by a gene functionally encoding the serum protein. Any virus capable of infecting the selected subject to be treated can be treated by the present method. As described above, any serum protein or survival factor found by the present methods to be necessary for growth of any one virus can be found to be necessary for growth of many other viruses. For any given virus, the serum protein or factor can be confirmed to be required for growth by the methods described herein.
The cellular genes identified by the examples using reovirus, a mammalian pathogen, and a rat cell system have general applicability to other virus infections that include all of the known as well as yet to be discovered human pathogens, including, but not limited to: human immunodeficiency viruses (e.g., HIV-1, HIV-2); parvovirus; papillomaviruses; hantaviruses; influenza viruses (e.g., influenza A, B and C viruses); hepatitis viruses A to G; caliciviruses; astroviruses; rotaviruses; coronaviruses, such as human respiratory coronavirus; picornaviruses, such as human rhinovirus and enterovirus; ebola virus; human herpesvirus (e.g., HSV-1-9); human cytomegalovirus; human adenovirus; Epstein-Barr virus; hantaviruses; for animal, the animal counterpart to any above listed human virus, animal retroviruses, such as simian immunodeficiency virus, avian immunodeficiency virus, bovine immunodeficiency virus, feline immunodeficiency virus, equine infectious anemia virus, caprine arthritis encephalitis virus or visna virus.
A protein-inhibiting amount of the composition can be readily determined, such as by administering varying amounts to cells or to a subject and then adjusting the effective amount for inhibiting the protein according to the volume of blood or weight of the subject. Compositions that bind to the protein can be readily determined by running the putatively bound protein on a protein gel and observing an alteration in the protein's migration through the gel. Inhibition of the protein can be determined by any desired means such as adding the inhibitor to complete media used to maintain persistently infected cells and observing the cells' viability. The composition can comprise, for example, an antibody that specifically binds the serum protein. Specific binding by an antibody means that the antibody can be used to selectively remove the factor from serum or inhibit the factor's biological activity and can readily be determined by radioimmunoassay (RIA), bioassay, or enzyme-linked immunosorbent assay (ELISA) technology.
The composition can comprise, for example, an antisense RNA that specifically binds an RNA encoded by the gene encoding the serum protein. Antisense RNAs can be synthesized and used by standard methods (e.g., Antisense RNA and DNA, D. A. Melton, Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1988)).
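At the sequence level, an antisense RNA that binds a target RNA is, in the simplest case, the reverse complement of the targeted stretch. The following minimal sketch illustrates only that relationship; the function name is hypothetical, the input is assumed to contain only the bases A, C, G and U, and practical antisense design (target site choice, length, stability) is outside its scope.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target_rna):
    """Return the reverse complement of an RNA sequence, written 5' to 3'."""
    return target_rna.upper().translate(_RNA_COMPLEMENT)[::-1]


# Hypothetical six-base target, for illustration only.
print(antisense_rna("AUGGCU"))
```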
The present methods provide a method of screening a compound for treating a viral infection, comprising administering the compound to a cell containing a cellular gene functionally encoding a gene product necessary for reproduction of the virus in the cell but not necessary for survival of the cell and detecting the level of the gene product produced, a decrease or elimination of the gene product indicating a compound for treating the viral infection. The present methods also provide a method of screening a compound for effectiveness in treating a viral infection, comprising administering the compound to a cell containing a cellular gene functionally encoding a gene product necessary for reproduction of the virus in the cell but not necessary for survival of the cell and detecting the level of the gene product produced, a decrease or elimination of the gene product indicating a compound effective for treating the viral infection. The cellular gene can be, for example, any gene provided herein (i.e., any of the genes comprising the nucleotide sequences set forth in any of SEQ ID NO:1 through SEQ ID NO:75), or any other gene obtained using the methods provided herein for obtaining such genes. The level of the gene product can be measured by any standard means, such as by detection with an antibody specific for the protein. The level of gene product can be compared to the level of the gene product in a control cell not contacted with the compound. The level of gene product can be compared to the level of the gene product in the same cell prior to addition of the compound. Relatedly, the regulatory region of the gene can be functionally linked to a reporter gene and compounds can be screened for inhibition of the reporter gene. Such reporter constructs are described herein.
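The comparison underlying this screen, gene product level in a treated cell versus a control cell, can be sketched as below. This is an illustrative assumption about how such measurements might be compared, not a protocol from the specification: the function names, the arbitrary measurement units, and the optional `min_decrease` cutoff are all hypothetical.

```python
def relative_level(treated, control):
    """Gene product level in the treated cell as a fraction of the control level."""
    if control <= 0:
        raise ValueError("control level must be positive")
    return treated / control


def indicates_candidate(treated, control, min_decrease=0.0):
    """True if the treated level is decreased relative to control.

    min_decrease is a hypothetical cutoff (fraction of the control level)
    that the decrease must exceed; 0.0 accepts any decrease.
    """
    return (1.0 - relative_level(treated, control)) > min_decrease


# Hypothetical readings in arbitrary units.
print(indicates_candidate(treated=40.0, control=100.0))   # decreased level
print(indicates_candidate(treated=100.0, control=100.0))  # unchanged level
```

The same comparison applies when the control is the same cell measured before the compound is added, or when the measured quantity is a reporter gene product rather than the gene product itself.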
The present invention provides a method of selectively eliminating cells persistently infected with a virus from an animal cell culture capable of surviving for a first period of time in the absence of serum, comprising propagating the cell culture in the absence of serum for a second time period which a persistently infected cell cannot survive without serum, thereby selectively eliminating from the cell culture cells persistently infected with the virus. The second time period should be shorter than the first time period. Thus one can simply eliminate serum from a standard culture medium composition for a period of time (e.g., by removing serum-containing medium from the culture container, rinsing the cells, and adding serum-free medium back to the container), then, after a time of serum starvation, return serum to the culture medium.
Alternatively, one can inhibit a serum survival factor from the culture in place of the step of serum starvation. Furthermore, one can instead interfere with the virus-factor interaction. Such a viral elimination method can periodically be performed for cultured cells to ensure that they remain virus-free. The time period of serum removal can greatly vary, with a typical range being about 1 to about 30 days; a preferable period can be about 3 to about 10 days, and a more preferable period can be about 5 days to about 7 days. This time period can be selected based upon ability of the specific cell to survive without serum as well as the life cycle of the virus (e.g., for reovirus, which has a life cycle of about 24 hours, 3 days' starvation of cells provides dramatic results).
Furthermore, the time period can be shortened by also passaging the cells during the starvation; in general, increasing the number of passages can decrease the time of serum starvation (or serum factor inhibition) needed to get full clearance of the virus from the culture. While passaging, the cells typically are exposed briefly to serum (typically for about 3 to about 24 hours). This exposure both stops the action of the trypsin used to dislodge the cells and stimulates the cells into another cycle of growth, thus aiding in this selection process. Thus a starvation/serum cycle can be repeated to optimize the selective effect. Other standard culture parameters, such as confluency of the cultures, pH, temperature, etc. can be varied to alter the needed time period of serum starvation (or serum survival factor inhibition). This time period can readily be determined for any given viral infection by simply removing the serum for various periods of time, then testing the cultures for the presence of the infected cells (e.g., by ability to survive in the absence of serum and confirmed by quantitating virus in cells by standard virus titration and immunohistochemical techniques) at each tested time period, and then detecting at which time periods of serum deprivation the virally infected cells were eliminated. It is preferable that shorter time periods of serum deprivation that still provide elimination of the persistently infected cells be used. Furthermore, the cycle of starvation, then adding back serum and determining amount of virus remaining in the culture can be repeated until no virally infected cells remain in the culture.
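Choosing the shortest effective deprivation period from such a series of test cultures can be sketched as follows. The data structure (a mapping from days of serum deprivation to whether infected cells were still detected) and the function name are assumptions made for illustration; the detection itself would be done by the assays described above.

```python
def shortest_clearing_period(infected_detected_by_days):
    """Return the shortest tested period (in days) after which no infected
    cells were detected, or None if no tested period cleared the culture.

    infected_detected_by_days maps days of serum deprivation to True when
    infected cells were still detected and False when they were not.
    """
    cleared = [days for days, detected in infected_detected_by_days.items()
               if not detected]
    return min(cleared) if cleared else None


# Hypothetical outcomes for four tested periods.
observations = {1: True, 3: True, 5: False, 7: False}
print(shortest_clearing_period(observations))
```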
Thus, the present method can further comprise passaging the cells, i.e., transferring the cell culture from a first container to a second container. Such transfer can facilitate the selective lack of survival of virally infected cells. Transfer can be repeated several times. Transfer is achieved by standard methods of tissue culture (see, e.g., Freshney, Culture of Animal Cells, A Manual of Basic Technique, 2nd Ed., Alan R. Liss, Inc., New York, 1987).
The present method further provides a method of selectively eliminating from a cell culture cells persistently infected with a virus, comprising propagating the cell culture in the absence of a functional form of the serum protein having a molecular weight of between about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by chloroform extraction, which inactivates when boiled and inactivates in low ionic strength solution, and which when removed from a cell culture comprising cells persistently infected with reovirus substantially prevents survival of cells persistently infected with reovirus. The absence of the functional form can be achieved by any of several standard means, such as by binding the protein to an antibody selective for it (binding the antibody in serum either before or after the serum is added to the cells; if before, the serum protein can be removed from the serum by, e.g., binding the antibody to a column and passing the serum over the column and then administering the survival protein-free serum to the cells), by administering a compound that inactivates the protein, or by administering a compound that interferes with the interaction between the virus and the protein.
Thus, the present invention provides a method of selectively eliminating from a cell culture propagated in serum-containing medium cells persistently infected with a virus, comprising inhibiting in the serum the protein having a molecular weight of between about 50 kD and 100 kD which resists inactivation in low pH and resists inactivation by chloroform extraction, which inactivates when boiled and inactivates in low ionic strength solution, and which when removed from a cell culture comprising cells persistently infected with reovirus substantially prevents survival of cells persistently infected with reovirus. Alternatively, the interaction between the virus and the serum protein can be disrupted to selectively eliminate cells persistently infected with the virus.
Any virus capable of some form of persistent infection may be eliminated from a cell culture utilizing the present elimination methods, including removing, inhibiting or otherwise interfering with a serum protein, such as the one exemplified herein, and also including removing, inhibiting or otherwise interfering with a gene product from any cellular gene found by the present method to be necessary for viral growth yet nonessential to the cell. For example, DNA viruses or RNA viruses can be targeted.
One can readily determine whether cells infected with a selected virus can be selectively removed from a culture through removal of serum by starving cells permissive to the virus of serum (or inhibiting the serum survival factor), adding the selected virus to the cells, adding serum to the culture, and observing whether infected cells die (e.g., by titering levels of virus in the surviving cells with an antibody specific for the virus).
A culture of any animal cell (i.e., any cell that is typically grown and maintained in culture in serum) that can be maintained for a period of time in the absence of serum can be purified from viral infection utilizing the present method. For example, primary cultures as well as established cultures and cell lines can be used. Furthermore, cultures of cells from any animal and any tissue or cell type within that animal that can be cultured and that can be maintained for a period of time in the absence of serum can be used. For example, cultures of cells from tissues typically infected, and particularly persistently infected, by an infectious virus could be used.
As used in the claims, "in the absence of serum" means at a level at which persistently virally infected cells do not survive. Typically, the threshold level is about 1% serum in the media. Therefore, about 1% serum or less can be used, such as about 0.75%, 0.50%, 0.25%, 0.1% or no serum.
As used herein, "selectively eliminating" cells persistently infected with a virus means that substantially all of the cells persistently infected with the virus are killed such that the presence of virally infected cells cannot be detected in the culture immediately after the elimination procedure has been performed. Furthermore, "selectively eliminating" includes that cells not infected with the virus are generally not killed by the method. Some surviving cells may still produce virus but at a lower level, and some may be defective in pathways that lead to death by the virus. Typically, for cells persistently infected with virus to be substantially all killed, more than about 90% of the cells, and more preferably more than about 95%, 98%, 99%, or 99.99% of virus-containing cells in the culture are killed.
The present method also provides a nucleic acid comprising the regulatory region of any of the genes. Such regulatory regions can be isolated from the genomic sequences isolated and sequenced as described above and identified by any characteristics observed that are characteristic for regulatory regions of the species and by their relation to the start codon for the coding region of the gene. The present invention also provides a construct comprising the regulatory region functionally linked to a reporter gene. Such constructs are made by routine subcloning methods, and many vectors are available into which regulatory regions can be subcloned upstream of a marker gene. Marker genes can be chosen for ease of detection of marker gene product.
The present method therefore also provides a method of screening a compound for treating a viral infection, comprising administering the compound to a cell containing any of the above-described constructs, comprising a regulatory region of one of the genes comprising the nucleotide sequence set forth in any of SEQ ID NO:1 through SEQ ID NO:75 functionally linked to a reporter gene, and detecting the level of the reporter gene product produced, a decrease or elimination of the reporter gene product indicating a compound for treating the viral infection. Compounds detected by this method would inhibit transcription of the gene from which the regulatory region was isolated, and thus, in treating a subject, would inhibit the production of the gene product produced by the gene, and thus treat the viral infection.
The present invention additionally provides a method of reducing or inhibiting a viral infection in a subject, comprising administering to the subject an amount of a composition that inhibits expression or functioning of a gene product encoded by a gene comprising the nucleic acid set forth in any of SEQ ID NO:1 through SEQ ID NO:75, or a homolog thereof, thereby treating the viral infection. The composition can comprise, for example, an antibody that binds a protein encoded by the gene. The composition can also comprise an antibody that binds a receptor for a protein encoded by the gene.
Such an antibody can be raised against the selected protein by standard methods, and can be either polyclonal or monoclonal, though monoclonal is preferred. Alternatively, the composition can comprise an antisense RNA that binds an RNA encoded by the gene. Furthermore, the composition can comprise a nucleic acid functionally encoding an antisense RNA that binds an RNA encoded by the gene. Other useful compositions will be readily apparent to the skilled artisan.
The present invention further provides a method of reducing or inhibiting a viral infection in a subject comprising mutating ex vivo in a selected cell from the subject an endogenous gene comprising the nucleic acid set forth in any of SEQ ID NO:1 through SEQ ID NO:75, or a homolog thereof, to a gene form incapable of producing a functional gene product of the gene or a gene form producing a reduced amount of a functional gene product of the gene, and replacing the cell in the subject, thereby reducing viral infection of cells in the subject. The cell can be selected according to the typical target cell of the specific virus whose infection is to be reduced, prevented or inhibited. A preferred cell for several viruses is a hematopoietic cell. When the selected cell is a hematopoietic cell, viruses which can be reduced or inhibited from infection can include, for example, HIV, including HIV-1 and HIV-2.
The present invention also provides a method of reducing or inhibiting a viral infection in a subject comprising mutating ex vivo in a selected cell from the subject an endogenous gene comprising a nucleic acid isolated by a method comprising (a) transferring into a cell culture growing in serum-containing medium a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, (c) removing serum from the culture medium, (d) infecting the cell culture with the virus, and (e) isolating from the surviving cells a cellular gene within which the marker gene is inserted, to a mutated gene form incapable of producing a functional gene product of the gene or to a mutated gene form producing a reduced amount of a functional gene product of the gene, and replacing the cell in the subject, thereby reducing viral infection of cells in the subject. Thus the mutated gene form can be one incapable of producing an effective amount of a functional protein or mRNA, or one incapable of producing a functional protein or mRNA, for example. The method can be performed wherein the virus is HIV. The method can be performed in any selected cell in which the virus may infect with deleterious results. For example, the cell can be a hematopoietic cell. However, many other virus-cell combinations will be apparent to the skilled artisan.
The present invention additionally provides a method of increasing viral infection resistance in a subject comprising mutating ex vivo in a selected cell from the subject an endogenous gene comprising a nucleic acid isolated by a method comprising (a) transferring into a cell culture growing in serum-containing medium a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, (c) removing serum from the culture medium, (d) infecting the cell culture with the virus, and (e) isolating from the surviving cells a cellular gene within which the marker gene is inserted, to a mutated gene form incapable of producing a functional gene product of the gene or a gene form producing a reduced amount of a functional gene product of the gene, and replacing the cell in the subject, thereby reducing viral infection of cells in the subject.
The virus can be HIV, particularly when the cell is a hematopoietic cell. However, many other virus-cell combinations will be apparent to the skilled artisan.
The present invention provides a method of identifying a cellular gene that can suppress a malignant phenotype in a cell, comprising transferring into a cell culture incapable of growing well in soft agar or Matrigel a vector encoding a selective marker gene lacking a functional promoter, selecting cells expressing the marker gene, and isolating from selected cells which are capable of growing in soft agar or Matrigel a cellular gene within which the marker gene is inserted, thereby identifying a gene that can suppress a malignant phenotype in a cell. This method can be performed using any selected non-transformed cell line, of which many are known in the art.
The present invention additionally provides a method of identifying a cellular gene that can suppress a malignant phenotype in a cell, comprising transferring into a cell culture of non-transformed cells a vector encoding a selective marker gene lacking a functional promoter, selecting cells expressing the marker gene, and isolating from selected and transformed cells a cellular gene within which the marker gene is inserted, thereby identifying a gene that can suppress a malignant phenotype in a cell. A non-transformed phenotype can be determined by any of several standard methods in the art, such as the exemplified inability to grow in soft agar, or inability to grow in Matrigel.
The present invention further provides a method of screening for a compound for suppressing a malignant phenotype in a cell comprising administering the compound to a cell containing a cellular gene functionally encoding a gene product involved in establishment of a malignant phenotype in the cell and detecting the level of the gene product produced, a decrease or elimination of the gene product indicating a compound effective for suppressing the malignant phenotype. Detection of the level, or amount, of gene product produced can be measured, directly or indirectly, by any of several methods standard in the art (e.g., protein gel, antibody-based assay, detecting labeled RNA) for assaying protein levels or amounts, and selected based upon the specific gene product.
The present invention further provides a method of suppressing a malignant phenotype in a cell in a subject, comprising administering to the subject an amount of a composition that inhibits expression or functioning of a gene product encoded by a gene comprising the nucleic acid set forth in SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83, or a homolog thereof, thereby suppressing a malignant phenotype. The composition can, for example, comprise an antibody that binds a protein encoded by the gene. The composition can, as another example, comprise an antibody that binds a receptor for a protein encoded by the gene. The composition can comprise an antisense RNA that binds an RNA encoded by the gene. Further, the composition can comprise a nucleic acid functionally encoding an antisense RNA that binds an RNA encoded by the gene.
Diagnostic or therapeutic agents of the present invention can be administered to a subject or an animal model by any of many standard means for administering therapeutics or diagnostics to that selected site or standard for administering that type of functional entity. For example, an agent can be administered orally, parenterally (e.g., intravenously), by intramuscular injection, by intraperitoneal injection, topically, transdermally, or the like. Agents can be administered, e.g., as a complex with cationic liposomes, or encapsulated in anionic liposomes. Compositions can include various amounts of the selected agent in combination with a pharmaceutically acceptable carrier and, in addition, if desired, may include other medicinal agents, pharmaceutical agents, carriers, adjuvants, diluents, etc. Parenteral administration, if used, is generally characterized by injection. Injectables can be prepared in conventional forms, either as liquid solutions or suspensions, solid forms suitable for solution or suspension in liquid prior to injection, or as emulsions. Depending upon the mode of administration, the agent can be optimized to avoid degradation in the subject, such as by encapsulation, etc.
Dosages will depend upon the mode of administration, the disease or condition to be treated, and the individual subject's condition, but will be that dosage typical for and used in administration of antiviral or anticancer agents. Dosages will also depend upon the composition being administered, e.g., a protein or a nucleic acid. Such dosages are known in the art. Furthermore, the dosage can be adjusted according to the typical dosage for the specific disease or condition to be treated. Furthermore, viral titers in culture cells of the target cell type can be used to optimize the dosage for the target cells in vivo, and transformation from varying dosages achieved in culture cells of the same type as the target cell type can be monitored. Often a single dose can be sufficient; however, the dose can be repeated if desirable. The dosage should not be so large as to cause adverse side effects. Generally, the dosage will vary with the age, condition, sex and extent of the disease in the patient and can be determined by one of skill in the art. The dosage can also be adjusted by the individual physician in the event of any complication.
For administration to a cell in a subject, the composition, once in the subject, will of course adjust to the subject's body temperature. For ex vivo administration, the composition can be administered by any standard methods that would maintain viability of the cells, such as by adding it to culture medium (appropriate for the target cells) and adding this medium directly to the cells. As is known in the art, any medium used in this method can be aqueous and non-toxic so as not to render the cells non-viable. In addition, it can contain standard nutrients for maintaining viability of cells, if desired.
For in vivo administration, the complex can be added to, for example, a blood sample or a tissue sample from the patient, or to a pharmaceutically acceptable carrier, e.g., saline and buffered saline, and administered by any of several means known in the art.
Examples of administration include parenteral administration, e.g., by intravenous injection including regional perfusion through a blood vessel supplying the tissue(s) or organ(s) having the target cell(s), or by inhalation of an aerosol, subcutaneous or intramuscular injection, topical administration such as to skin wounds and lesions, direct transfection into, e.g., bone marrow cells prepared for transplantation and subsequent transplantation into the subject, and direct transfection into an organ that is subsequently transplanted into the subject. Further administration methods include oral administration, particularly when the composition is encapsulated, or rectal administration, particularly when the composition is in suppository form. A pharmaceutically acceptable carrier includes any material that is not biologically or otherwise undesirable, i.e., the material may be administered to an individual along with the selected complex without causing any undesirable biological effects or interacting in a deleterious manner with any of the other components of the pharmaceutical composition in which it is contained.
Specifically, if a particular cell type in vivo is to be targeted, for example, by regional perfusion of an organ or tumor, cells from the target tissue can be biopsied and optimal dosages for import of the complex into that tissue can be determined in vitro, as described herein and as known in the art, to optimize the in vivo dosage, including concentration and time length. Alternatively, culture cells of the same cell type can also be used to optimize the dosage for the target cells in vivo.
For either ex vivo or in vivo use, the complex can be administered at any effective concentration. An effective concentration is that amount that results in reduction, inhibition or prevention of the viral infection or in reduction or inhibition of the transformed phenotype of the cells.
A nucleic acid can be administered by any of several means, which can be selected according to the vector utilized, the organ or tissue, if any, to be targeted, and the characteristics of the subject. The nucleic acids, if desired in a pharmaceutically acceptable carrier such as physiological saline, can be administered systemically, such as intravenously, intraarterially, orally, parenterally, or subcutaneously. The nucleic acids can also be administered by direct injection into an organ or by injection into the blood vessel supplying a target tissue. For an infection of cells of the lungs or trachea, the nucleic acid can be administered intratracheally. The nucleic acids can additionally be administered topically, transdermally, etc.
The nucleic acid or protein can be administered in a composition. For example, the composition can comprise other medicinal agents, pharmaceutical agents, carriers, adjuvants, diluents, etc. Furthermore, the composition can comprise, in addition to the vector, lipids such as liposomes, such as cationic liposomes (e.g., DOTMA, DOPE, DC-cholesterol) or anionic liposomes. Liposomes can further comprise proteins to facilitate targeting a particular cell, if desired. A composition comprising a vector and a cationic liposome can be administered to the blood afferent to a target organ or inhaled into the respiratory tract to target cells of the respiratory tract.
Regarding liposomes, see, e.g., Brigham et al., Am. J. Resp. Cell. Mol. Biol. 1:95-100 (1989); Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987); U.S. Pat. No. 4,897,355.
For a viral vector comprising a nucleic acid, the composition can comprise a pharmaceutically acceptable carrier such as phosphate buffered saline or saline. The viral vector can be selected according to the target cell, as known in the art. For example, adenoviral vectors, in particular replication-deficient adenoviral vectors, can be utilized to target any of a number of cells, because of their broad host range. Many other viral vectors are available, and their target cells are known.
EXAMPLES
Selective elimination of virally infected cells from a cell culture
Rat intestinal cell line-1 cells (RIE-1 cells) were standardly grown in Dulbecco's modified Eagle's medium, high glucose, supplemented with 10% fetal bovine serum. To begin the experiment, cells persistently infected with reovirus were grown to near confluence, then serum was removed from the growth medium by removing the medium, washing the cells in PBS, and returning to the flask medium not supplemented with serum. Typically, the serum content was reduced to 1% or less. The cells are starved for serum for several days, or as long as about a month, to bring them to quiescence or growth arrest. Medium containing 10% serum is then added to the quiescent cells to stimulate growth of the cells. Surviving cells are found not to be persistently infected cells by immunohistochemical techniques used to establish whether cells contain any infectious virus (sensitivity to 1 infectious virus per ml of homogenized cells).
Cellular Genomic DNA Isolation
Gene Trap Libraries: The libraries are generated by infecting the RIE-1 cells with a retrovirus vector (U3 gene-trap) at a ratio of less than one retrovirus for every ten cells. When a U3 gene trap retrovirus integrates within an actively transcribed gene, the neomycin resistance gene that the U3 gene trap retrovirus encodes is also transcribed, which makes the cell resistant to the antibiotic neomycin. Cells with gene trap events are able to survive exposure to neomycin while cells without a gene trap event die. The various cells that survive neomycin selection are then propagated as a library of gene trap events. Such libraries can be generated with any retrovirus vector that has the properties of expressing a reporter gene from a transcriptionally active cellular promoter that tags the gene for later identification.
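The reason for keeping the infection ratio below one retrovirus per ten cells can be illustrated numerically. The following is a minimal sketch that assumes integration events per cell follow a Poisson distribution; that statistical model, the function names, and the example ratio of 0.1 are illustrative assumptions and are not part of the protocol described above.

```python
import math


def poisson_pmf(k: int, ratio: float) -> float:
    """Probability that one cell receives exactly k integrations.

    Assumes (hypothetically) Poisson-distributed integrations with a mean
    equal to the retrovirus-to-cell ratio.
    """
    return math.exp(-ratio) * ratio ** k / math.factorial(k)


def fraction_multiply_tagged(ratio: float) -> float:
    """Among cells with at least one integration, the fraction with two or more."""
    p0 = poisson_pmf(0, ratio)
    p1 = poisson_pmf(1, ratio)
    return (1.0 - p0 - p1) / (1.0 - p0)


if __name__ == "__main__":
    # One retrovirus per ten cells, the upper bound stated in the text.
    print(f"{fraction_multiply_tagged(0.1):.3f}")
```

Under this assumed model, lowering the ratio reduces the share of neomycin-resistant cells that carry more than one gene trap event, which keeps the link between a surviving clone and a single tagged gene easier to interpret.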
Reovirus selection: Reovirus infection is typically lethal to RIE-1 cells but can result in the development of persistently infected cells. These cells continue to grow while producing infective reovirus particles. For the identification of gene trap events that confer reovirus resistance to cells, the persistently infected cells must be eliminated or they will be scored as false positives. We have found that RIE-1 cells persistently infected with reovirus are very poorly tolerant to serum starvation, passaging and plating at low density. Thus, we have developed protocols for the screening of the RIE-1 gene trap libraries that select against both reovirus sensitive cells and cells that are persistently infected with reovirus.
1. RIE-1 library cells are grown to near confluence and then the serum is removed from the media. The cells are starved for serum for several days to bring them to quiescence or growth arrest.
2. The library cells are infected with reovirus at a titer of greater than ten reovirus per cell and the serum starvation is continued for several more days.
3. The infected cells are passaged (a process in which they are exposed to serum for three to six hours) and then starved for serum for several more days.
4. The surviving cells are then allowed to grow in the presence of serum until visible colonies develop, at which point they are cloned by limiting dilution.
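Two quantities in the steps above lend themselves to a quick calculation: the share of cells that escape infection at more than ten reovirus per cell (step 2), and the chance that a colony obtained by limiting dilution (step 4) arose from a single cell. The sketch below assumes Poisson statistics for both; the model, the function names, and the seeding density of 0.3 cells per well are hypothetical choices for illustration, not values from the protocol.

```python
import math


def fraction_uninfected(virus_per_cell: float) -> float:
    """Poisson estimate of the fraction of cells that receive no virus particle."""
    return math.exp(-virus_per_cell)


def prob_clonal_given_growth(mean_cells_per_well: float) -> float:
    """Probability that a well showing growth was seeded with exactly one cell.

    Assumes Poisson-distributed seeding and that every seeded cell grows.
    """
    lam = mean_cells_per_well
    p_exactly_one = lam * math.exp(-lam)
    p_at_least_one = 1.0 - math.exp(-lam)
    return p_exactly_one / p_at_least_one


if __name__ == "__main__":
    print(f"uninfected at 10 virus/cell: {fraction_uninfected(10):.1e}")
    print(f"clonal wells at 0.3 cells/well: {prob_clonal_given_growth(0.3):.2f}")
```

Under these assumptions nearly every cell encounters virus in step 2, so survivors are unlikely to be cells that were simply missed, and a sparse seeding density makes most growing wells clonal.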
MEDIA: DULBECCO'S MODIFIED EAGLE'S MEDIUM, HIGH GLUCOSE (DME/HIGH) Hyclone Laboratories cat. no. SH30003.02.
NEOMYCIN: The antibiotic used to select against the cells that did not have a U3 gene trap retrovirus. We used GENETICIN, from Sigma, cat. no. G9516.
RAT INTESTINAL CELL LINE-1 CELLS (RIE-1 CELLS): These cells are from the laboratory of Dr. Ray Dubois (VAMC). They are typically cultured in Dulbecco's Modified Eagle's Medium supplemented with 10% fetal calf serum.
REOVIRUS: Laboratory strains of either serotype 1 or serotype 3 are used. They were originally obtained from the laboratories of Bernard N. Fields (deceased). These viruses have been described in detail.
RETROVIRUS: The U3 gene trap retroviruses used here were developed by Dr. Earl Ruley (VAMC) and the libraries were produced using a general protocol suggested by him.
SERUM: FETAL BOVINE SERUM Hyclone Laboratories cat. no. A-1115-L.
Genes Necessary for Viral Infection
Characteristics of some of the isolated sequences include the following:
SEQ ID NO:1- rat genomic sequence of vacuolar H+-ATPase (chemically inhibiting the activity of the gene product results in resistance to influenza virus and reovirus)
SEQ ID NO:2- rat alpha tropomyosin genomic sequence
SEQ ID NO:3- rat genomic sequence of murine and rat gas5 gene (cell cycle regulated gene)
SEQ ID NO:4- rat genomic sequence of p162 of ras complex, mouse, human (cell cycle regulated gene)
SEQ ID NO:5- similar to N-acetyl-glucosaminyltransferase I mRNA, mouse, human (enzyme located in the Golgi region in the cell; has been found as part of a DNA containing virus)
SEQ ID NO:6- similar to calcyclin, mouse, human, reverse complement (cell cycle regulated gene)
SEQ ID NO:7- contains sequence similar to: LOCUS AA254809 364 bp mRNA EST DEFINITION mz75a10.r1 Soares mouse lymph node NbMLN Mus musculus cDNA clone 719226
SEQ ID NO:8- contains a sequence similar to SW:RSP1_MOUSE Q01730 RSP-1 PROTEIN
SEQ ID NO:9- contains 5' UTR of gb|U25435|HSU25435 Human transcriptional repressor (CTCF) mRNA, complete cds, Length 3780
SEQ ID NO:38- similar to cDNA of retroviral origin
SEQ ID NO:50- trapped AYU-6 genetic element
Isolation of cellular genes that suppress a malignant phenotype
We have utilized a gene-trap method of selecting cell lines that have a transformed phenotype (are potentially tumor cells) from a population of cells (RIE-1 parentals) that are not transformed. The parental cell line, RIE-1 cells, does not have the capacity to grow in soft agar or to produce tumors in mice. Following gene-trapping, cells were screened for their capacity to grow in soft agar. These cells were cloned and genomic sequences were obtained 5' or 3' of the retrovirus vector (SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83). All of the cell lines behave as if they are tumor cell lines, as they also induce tumors in mice.
Of the cell lines, two are associated with the enhanced expression of the prostaglandin synthetase gene II or COX 2. The COX 2 gene has been found to be increased in pre-malignant adenomas in humans and overexpressed in human colon cancer. Inhibitors of COX 2 expression also arrest the growth of the tumor. One of the cell lines, x18 (SEQ ID NO:76), has disrupted a gene that is now represented in the EST (dbest) database, but the gene is not known (not present in GenBank).
(SEQ ID NO:76): >02-X18H-t7.., identical to: gb|W55397|W55397 mb13h04.r1 Life Tech mouse brain Mus at 1.0e-114. x18 has also been sequenced from the vector with the same EST being found.
(SEQ ID NO:77): >x8_b4_2..
(SEQ ID NO:78): >x7_b4..
(SEQ ID NO:79): >x4-b4..
(SEQ ID NO:80):
(SEQ ID NO:81): >x15-b4.
(SEQ ID NO:82): >x13-re.., reverse complement.
(SEQ ID NO:83): >x12 b4..
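Several of the entries above are reported as the reverse complement of the strand that was read. As a minimal sketch of that operation on sequences such as those in the listing, which contain the ambiguity code N, the function below may be useful; the function name and the handling of whitespace are assumptions for illustration.

```python
# Translation table for the four bases plus the ambiguity code N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence.

    Whitespace (such as the ten-base grouping used in sequence listings) is
    removed first. Characters outside A, C, G, T and N are passed through
    unchanged, which is a simplification.
    """
    cleaned = "".join(seq.split())
    return cleaned.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("AAGN TC"))
```

Applying the function twice returns the cleaned input, which is a convenient sanity check when comparing a read against a database entry reported on the opposite strand.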
Each of the genes from which the provided nucleotide sequences are isolated represents a tumor suppressor gene. The mechanism by which the disrupted genes, other than the gene comprising the nucleic acid whose sequence is set forth in SEQ ID NO:76, may suppress a transformed phenotype is at present unknown. However, each one represents a tumor suppressor gene that is potentially unique, as none of the genomic sequences correspond to a known gene. The capacity to quickly select tumor suppressor genes may provide unique targets in the process of treating or preventing (potential for diagnostic testing) cancer.
Isolation of entire genomic genes
An isolated nucleic acid of this invention (whose sequence is set forth in any of SEQ ID NO:1 through SEQ ID NO:83), or a smaller fragment thereof, is labeled with a detectable label and utilized as a probe to screen a rat genomic library (lambda phage or yeast artificial chromosome vector library) under high stringency conditions, i.e., high salt and high temperatures to create hybridization and wash temperature 5-20°C.
Clones are isolated and sequenced by standard Sanger dideoxynucleotide sequencing methods. Once the entire sequence of the new clone is determined, it is aligned with the probe sequence and its orientation relative to the probe sequence determined. A second and a third probe are designed using sequences from either end of the combined genomic sequence, respectively. These probes are used to screen the library and isolate new clones, which are sequenced. These sequences are aligned with the previously obtained sequences, new probes are designed corresponding to sequences at either end, and the entire process is repeated until the entire gene is isolated and mapped. When one end of the sequence cannot isolate any new clone, a new library can be screened. The complete sequence includes regulatory regions at the 5' end and a polyadenylation signal at the 3' end.
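The alignment and probe-design steps of this walking procedure can be sketched in a few lines. The example below is only an illustration: it assumes exact string matching between probe and clone (real sequence comparison tolerates mismatches and ambiguous bases), and the function names and the probe length are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string (A, C, G, T, N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate_probe(clone: str, probe: str):
    """Find the probe in a sequenced clone and report its orientation.

    Returns (start_index, "+") if the probe matches the clone as read,
    (start_index, "-") if its reverse complement matches, or None.
    """
    pos = clone.find(probe)
    if pos != -1:
        return pos, "+"
    pos = clone.find(reverse_complement(probe))
    if pos != -1:
        return pos, "-"
    return None


def end_probes(combined: str, length: int):
    """Take one new probe from each end of the combined genomic sequence."""
    return combined[:length], combined[-length:]


if __name__ == "__main__":
    clone = "TTTTACGTAGGGG"
    print(locate_probe(clone, "ACGTAG"))
    print(locate_probe(clone, reverse_complement("ACGTAG")))
    print(end_probes(clone, 4))
```

Repeating `locate_probe` and `end_probes` on each newly combined sequence mirrors the iteration described above, with the loop ending when neither end probe recovers a new clone.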
Isolation of cDNAs
An isolated nucleic acid (whose sequence is set forth in any of SEQ ID NO:1 through SEQ ID NO:83, and preferably any of SEQ ID NO:5 through SEQ ID NO:83), or a smaller fragment thereof, or additional fragments obtained from the genomic library that contain open reading frames, is labeled with a detectable label and utilized as a probe to screen a cDNA library. Screening a rat cDNA library yields a rat cDNA; screening a human cDNA library yields a human cDNA.
Repeated screens can be utilized as described above to obtain the complete coding sequence of the gene from several clones if necessary. The isolates can then be sequenced to determine the nucleotide sequence by standard means such as dideoxynucleotide sequencing methods.
Serum survival factor isolation and characterization
The lack of tolerance to serum starvation is due to the acquired dependence of the persistently infected cells on a factor (survival factor) that is present in serum.
The serum survival factor for persistently infected cells has a molecular weight between 50 and 100 kD and resists inactivation in low pH (pH 2) and chloroform extraction. It is inactivated by boiling for 5 minutes [once fractionated from whole serum (50 to 100 kD fraction)], and in low ionic strength solution [10 to 50 mM]. The factor was isolated from serum by size fractionation using Centriprep molecular cut-off filters with excluding sizes of 30 and 100 kD (Millipore and Amicon), and dialysis tubing with a molecular exclusion of 50 kD. Polyacrylamide gel electrophoresis and silver staining were used to determine that all of the resulting material was between 50 and 100 kD, confirming the validity of the initial isolation. Further purification was performed using ion exchange chromatography and heparin sulfate adsorption columns, followed by HPLC. Activity was determined following adjusting the pH of the serum fraction (30 to 100 kD fraction) to different pH conditions using HCl and readjusting the pH to pH 7.4 prior to assessment of biologic activity. Low ionic strength sensitivity was determined by dialyzing the fraction containing activity into low ionic strength solution for various lengths of time and readjusting ionic strength to physiologic conditions prior to determining biologic activity by dialyzing the fraction against the media. The biologic activity was maintained in the aqueous solution following chloroform extraction, indicating the factor is not a lipid. The biologic activity was lost after the 30 to 100 kD fraction was placed in a 100°C water bath for 5 minutes.
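The 50 to 100 kD estimate follows from intersecting the size cut-offs used in the fractionation. The sketch below makes that reasoning explicit; it assumes that the activity was retained by the 30 kD filter and the 50 kD dialysis tubing and passed through the 100 kD filter, which is an interpretation of the description rather than a stated result, and the function name is hypothetical.

```python
def size_window(retained_by_kd, passed_through_kd):
    """Infer a molecular-weight window from size-exclusion results.

    retained_by_kd: cut-offs (kD) that retained the activity.
    passed_through_kd: cut-offs (kD) that the activity passed through.
    Returns (lower_bound_kd, upper_bound_kd).
    """
    lower = max(retained_by_kd)
    upper = min(passed_through_kd)
    if lower >= upper:
        raise ValueError("cut-off results are mutually inconsistent")
    return lower, upper


if __name__ == "__main__":
    # Assumed assignment of the cut-offs named in the text.
    print(size_window(retained_by_kd=[30, 50], passed_through_kd=[100]))
```

The same intersection logic explains why the gel result (all material between 50 and 100 kD) is treated as confirming the initial isolation.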
Isolated nucleic acids
Tagged genomic DNAs isolated were sequenced by standard methods using Sanger dideoxynucleotide sequencing. The nucleotide sequences of these nucleic acids are set forth herein as SEQ ID NO:1 through SEQ ID NO:75 (viral infection genes) and SEQ ID NO:76 through SEQ ID NO:83 (tumor suppressor genes). The sequences were run through computer databanks in a homology search. Some of the "6b" sequences [obtained from genomic library 6, flask b] (SEQ ID NO:37, 38, 39, 42, 61, 65, 66, 69) correspond to a known gene, alpha tropomyosin, and some of the others correspond to the vacuolar H+-ATPase. These sequences are associated with both acute and persistent viral infection, and the cellular genes which comprise them, i.e., alpha tropomyosin and vacuolar H+-ATPase, can be targets for drug treatments for viral infection using the methods described above. These genes can be therapy targets particularly because disruption of one or both alleles results in a viable cell.
Throughout the specification, unless the context requires otherwise, the word "comprise" or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.
SEQUENCE LISTING
GENERAL INFORMATION:
APPLICANT: VANDERBILT UNIVERSITY
305 Kirkland Hall Nashville, TN 37240
(ii) TITLE OF INVENTION: MAMMALIAN GENES INVOLVED IN VIRAL INFECTION
(iii) NUMBER OF SEQUENCES: 83
(iv) CORRESPONDENCE ADDRESS:
ADDRESSEE: Needle Rosenberg, P.C.
STREET: 127 Peachtree Street, Suite 1200
CITY: Atlanta
STATE: Georgia
COUNTRY: USA
ZIP: 30303-1811
COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30
(vi) CURRENT APPLICATION DATA:
APPLICATION NUMBER:
FILING DATE:
CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
NAME: Selby, Elizabeth
REGISTRATION NUMBER: 38,298
REFERENCE/DOCKET NUMBER: 22000.0061/P
(ix) TELECOMMUNICATION INFORMATION:
TELEPHONE: 404 688 0770 TELEFAX: 404 688 9880 INFORMATION FOR SEQ ID NO:1: SEQUENCE CHARACTERISTICS: LENGTH: 828 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: AAAAAAAAAT TACCATTTTT GGGNGAACCT TTNATANTTN GTTCCTAGAG GGNGAGTCAG GGGTAAAAAA AACGATNAAG GGAGTTGNGG CGATTGGAGA AGCTATTATG AAGGGATAAA 120 ANACTTAGGT TGAGCCGGCG GGTGGGGTGT ATTCTTGGGG TGGNGAAAAG NNAGATCAAC 180 ATGAGATTTT TTTGTTTTAG GTTTTGCATG TTGTAATGCA ATANTTTAAC CTGATTTTAT 240
GTGCAGGATG
ACCGGTGAGT
ATATGGAGCG
AGACGTCTGC
GGGCCCCGGG
GCCTCGCT CC
ACGCGAGCGC
GGCGGGGTCG
CC CGTC GT CC
CCTGAGGTTT
CCGCGCAGCC
CTACGGCCCC
ATGGAGCAGT
GGCGGGTAGC
CCCACCGGCG
GGCCGCCATC
CCTT CGCATC
TAGCCCGCCG
GTGAGCAGGA
GCAGAGAAGG
GCCCCTGGGG
GGACCAGTGA
AGGGCCCATA
CAAAGTGGTA
TTGNTCT
GCG
CGCCGCTTCG
CCGCCTGCTG
CCTNAAGGGC
ACACAGGAAA
CGGGTATCAT
ccGATGGGCC
AGACCCAGGC
CATTGTCCAA
CAGCCCATGG
GT GCTGGTAT
AGAATCTTCT
AGCTTGCCCT
CNGCANGCNA
AGGAACACCG
TCGNTCCACC
CAAAAAGGTA
AAGGCCGAAC
GGGCTGCTGG
GGGCGTGGCC
TTAGAGCGCA
TTCGTCTGCT
CTTCCCCGCT
GA.AXAkGT GTANTCGAAkC
CTGTATGNTA
GGGTTCGP.GA
GTTGGGCCCC
AGAGCCT GGA CATATCAT GG GC GCCqTGACT CGCTCT1CTCT
TGCAGACATG
300 360 420 480 540 600 660 '720 780 828 GNGGACATTG
AAAGACCCTA
INFORMATION FOR SEQ ID NO:2:* SEQUENCE
CHARACTERISTICS:
LENGTH: 845 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genornic) (xi) SEQUENCE DESCRIPTION; SEQ ID N0:2; T CNCCTAAGA
CCNNGGACCC
AAGGGNANGN
CNCGGCCGNT
GCGNCGNCGG
GATGNNGGGG
~AACGGTGNG
CGCGNAGTTG
CNAAAPAA7A
TTCTAGGNGT
NCCNCCGGGG
CCCCNGGAGC
CNTCCCGACA
TCTATTTNNG
NANGAGANP.G
ACCNAGGGGA
GGNNAAACAN
CCNTGGGCCN
CCCCCCCAAC
AATTG14NAAT
NGNAGNNGTT
GCNGGGGACG
AAANAANNGN
CAN.GNTGNGG
GGAGTTT GTT
GTCTANNAGG
TAGTAGGCGT
NTT CATGGGC
AAAGAGCCKT
ATTGGGCGAA
GATCCANC CT
CATNTNTTCC
GCCCCCCANC
AATATGGC
GG
CCAI-IGT GNCN
TGGTAAGGGG
CCGCTNCGTT
TCCNTCTACC
CCG;TGGCNAA
CNGGNGGCGT
CGTATGTTAG
CNNGCAACAA
TTTADLANCT
TCCNTNACTT
GTTTTNAf CA
CAT'TTTGNNT
CAGCGGNGAC
GC-AGANNTGG
GCCCGGGGTG
CGGCCCTGGA
GTNCCCTGCT
CCCCAT CNAN
TGNCGACAGN
ACCTNTCGAA
ACNAAAGGAN
NNGNCCNGTT
TTCNTCCCCN
CCNG14GGCCC CNGN14CCTGG
ANCAGTAGCC
AGC GGC GGCG
GANGANATTT
TGNAGCCCNG
GNGGAC-CGAmC
GCNCNCCAGT
GGCCNNCGTC
GGACGCGNNA
TGAYLATAGNG
GCNTTAAATT
CGGCAGTGCN
GGAGAGANTN
AGNGCAGGCA
GAGCGGGCKJC
CNN GGGCNGC
NGCCNGTGCC
C-ANCTGCA14T
NAGCTTCCTT
GAT GGGANNN
AATAGATAGG
GTTAGATGGN AATGGAGAnNT ANATACCGGG
CTTAGCTTCG
120 180 240 300 360 420 480 540 600 660 720 '780 840 GGGGG 845 INFORMATION FOR SEQ ID NO: 3: SEQUENCE CHAJRACTERISTICS: LENGTH: 818 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: C. C C C
C.
C.
C.
C
TACACCTrrG
GCCCCGGNCT
ATAAAGCCN
TAANNGTGGG
TNNWNGGCAA
TGAAAAAGAT
TTNTAGACCA
AGTAAGCAGT
AACAPAGCCA
ATAAATAGCA
AAACTAAAGA
GATCCCCCAC
CCACATGGCC
TGAAGGAPTNT
C7TGTGGAAT N CNNN NNGGGC
GGGGCGAGNT
GAGGGAT CAA
AGTGCCACCA
CCATTTATTA
ATTTACATAA
TCTAAGCAGG
AATCCACACT
GCTTTGTAGC
ACCTCTTAGC
GACTCCGCCA
CACAAATCAA
TT CT ATTTA
GTACCGAAGA
CATT CACCNG
CGTAACC-TTA
AGACANATNA
NA-ACTGAGCC
AGATTTNTTC
CTTTTTCGTA
GCCATTCACA
TGCCTGAGCA
TGGTGCCATG
T CGATCCAAC
CTTGCCAAGT
CGGCGCNANT
AGGATTTTCN
GGCNGCAATA
GTATGAAATC
TTTGTTATTT
ACAAACAACA
CCCAGCAATN
TCCATGTGGT
ACCAATTCCA
CGTGCCTTCT
ATTCTCATCT
AGGGCAGCTT
AGACACCT
NCNCTCTTCA AAATTACGC G1ANGAAN
ANAANGTAT
CATAT CATTT
AAACAAANTN
TNNAATNTTT
CCCNNNCCAC
TTAACAGT CC
TCATTCGGCG
A14GAGGCAAGN
CCCATTACAA
CACAATG OCT CTATATCC GT
TACCCCACAT
TATTTGCTCT
CCTTTTCCAN
CCCAAANAAG
T CCCTNTATA
TTTTTTTNNT
AC.AAGTTNCA
COCCACOTAC
GCACTTACAG
TCCATAACTC
GGAAPTACCCA
GCCGCATCT C
CCCACCTACT
TTCATCT CCC
GGCCNGCTAN
120 240 300 360 420 480 540 600 660 720 780 818 INFORMATION FOR SEQ ID NO:4:.
SEQUENCE CHARACTERISTICS: LENGTH: 857 base pairs TYPE: nucleic acid STRAN,1DEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genornic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: TGGAAACANT GNCNTAAACGT TNACTTNNNA GATArrGAN\N AANNTNCCGN AAAANAACCT CNNNNACAAT CTCNCAANNA TTTNAANGAA GCCGAATAA ATCNAAANTC GCANTTAAAA AAALMAGGGN NAVNGNNTTN NGGTTNAANA NAAGGGGGT NTNCCCGTTT
TTTTTTTAGG
ATCCTGGGAG
TAACCNACAG
CAGTAAGGGA TGGGGCCCTA TTCCG1'TAAG
TATTTTGACC
CGCGNCCAGA
TTNTGCGAAN
TCGCAGATNT
GACCGCAGAT
GGAGAAAATT
TCTTGGGGTT
GGCACCCATC
GCCACGTCAG
GCGGAGCATG
CGCAP.AGGCC
GCGAGGGATT
AANGGGGTCT
GGCGAGGCGC
GTGGAAGCTC
GAACCNAAAA
TTTTTA14CAA TTTC CAGGGG
GTCATTTTGG
TTTCNTTTCC
CCNTCCCGNG
CTCACGCTCG
TGCN1TNTAAC TT CNTTTCNG
GCGATAGTTC
TTNGNANAAG
CGAACACCAT
ATGTNTC CGC
GAATGACTGT
CACCTTATGT
ACCCAAASAA
CGACGCCAGC
ATCCGGGGCT
TCTCTGGCCG
CCCTCCGCCT
GGNGNTCCTT
TGACAGGANA
ACAGCCGTTG
TGTAGACACT
C CGNTGGAGC CACAACT GTT
ACGCNTGCGC
CGGGCGGCGG
GCTGGGCGCG
CCTCTTCCCG
CCGGT CAGNA NGAC CTTAAA
GCTTTTTTAG
AGTGGTGGCC
CTCGCTGCCC
GCAGAGAAAG
CGCTGCCGCC
GGCGACTGCT
GTCCAGGCCA
240 300 360 420 480 540 600 660 720 780 CTAGGGAGTT C GCTGAC GCC GGGTGAACTG AGCGTACCGC CTGAAAGACC
CCACAAGTAG
GTTTGGCAAG
TAGAAAG
INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 896 base pairs TYPE: nucleic. acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SE-Q !D
GGGAGAAAGG
TTTAACGGGG
NTAAGAGGGN
AGGGTTCCCG
AGGTGTTTTT
ATT GGCAAGG
GGGGAAAGGT
NTCCATGGAA
GGNTCTTGTG
AGACTAGAGT
GGCGACNTTT ATTGGTCCNG
GAGNGGGGG
GGAGGCCCCG
GGGGGTNTCC
GAGAAT GGGG
TTTTTTTTTT
AAATTTTTTT
CCCCNATTNA
GAGTGGCTTT
CTTGAGAATA
GTTNTAGATT
GNNGAGGAA~T
GNNNTTNTTN
NGGATAAAAN4
TTTGTTCANA
CCTAINCCTCC
ACAAAATGNG
TNTGGNGAAG
TTGTTGGCCA
NTAGGTCTTC
7 CCCGGGGGA
GAATNGTGGN
GATTGGCAAC
AANAGGAA
TTGAAPAATA
TTTCAGNGGA
TTCATTTTCC
GCTTTATNGT
ANGTTTCCAG
NCAAATGGGT
GGAANAAJAAA
GCACCGGGGG
TCACCCCGGN
TGATT CAAGT GT GGGAACAG
GTGTGGCCCA
TTAAC CT-1NA CTT CATTTNT
TCACCAGTCC
CAAGATCCGC
GGCAAGGAAG
TAGTTGTACC
TAAAAAGTA
GGGTTC CCAA
CCCATTGTGT
NNACT GTAAN
AANACTATTT
TTGGCTTTTT
120 IS0 240 300 360 420 480 540 c00 AGTATGGAAA TCACCAGTAA TGGCAATATA ACATCCCTGC TTCTGTTTCT
TAGAAGGCTN
39 NATTACAGTG TGTTCAAACT CCGTGTCATT GCAACAGGTT AAACTAACTT TNTACGTAGG ACATCAGGGT ATTGACATTC TCATCCTAAkA GTCAGTTTGT CTGTTTCCAG AGGAGGAACT GAAGCAGTGG TTCTTTAAGT AACTGACTCA GGGCTTTCCT GCCTGGCGCG CCTGCCAGGC ATNGTGTAGC ATTGTACTGC ATCTTCTTTG ACCAGTTTCC CCAGGTGAAG AGCCTG INFORMATION FOR SEQ ID NO:6, SEQUENCE CHARACTERISTICS: LENGTH-: 937 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomnic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: GGGCCCCCCC CCCCCNAN4TT AA4TTTTNGGG AAGAAAAAAG GGAAA.AAANT TTGGGGTCAG
GAAAAANGAA
NAGAGGTCCC
TTTAT CAGAT
ACATATTAGT
GGNGGTCACC
AGAGATGCGG
CAGCATCAGT
NTGGCTGGTA
CTGAACACGG
AGAATCCTTC
GGGCTGGGAG
TCCCCTGGGC
CCATCCCCTC
GCGT CCCT CT
AGACCCCACA
GTTGGNAANC
TTNNTTCCNN
TACCCGNGNG
CAGATTATAC
ACACAGGAGA
TAGACT GACT
ATTGTTCCCA
GAGGTTCAGC
PATTAGAGGG
ATCCTGCTCC
GTGTGTTTGC
GGGCNTCACC
GACCACTCTT
AAGazTCTGT C
ATGTAGNTTT
GNNGGGGNGN
GG.AAAAGTTT
TCACCTGGGG
ATAGCAAANA
T GTATTATCC GTTCC CTTTT
GTCCCCNTCA
ACACATACCA
ArACTCGATGT
CAGTCCGGAC
CTTGCCTCAG
GATGCTGGCC
TTGGCGCTTC
CACTNCCTGG
GrGCAAGCTAG CAGNATTN GA
AAAAGGGGTT
ACCCTTTACNM
TAGTTAGGAG
GCAGTATTAG
CGNTTGGAGT
CACTGATTCG
GAGTTACGAG
CTCCGGCTTG
GTCCAGGCAA
GCGNTGGGTG
ACTATAAGGC
ATTGTCGACG
TCTAGGGGTT
CAAAGGT
ANAGTGGGGG
CAATTAACTT
GGTGGCGGGA
CACAAN4GAAT AGAGTTGAGAz GACCTT GCCA
AACTTTAAGG
TCACGTGCCA
CACTGGT CTT
CAAGGGCGTG
GGGTTGGGGC
CAGCCAGACT
TGTGGTGAGC
ANNTTAATTT
NGGATCNCCA
CAT TN GAAAN
CATTTATGGT
ACCATATNTT
TTAGAGGCAAL
J CACTGATCT
GAAGGGCAAA
CTCTTGCANT
GAAAGTGAGG
GTGCCAGCAC
GCGACACAGT
TCTCACTGGG
AAGCNTTTTC CTGCCCTGAA INFORMATION -FOR SEQ ID 10:'7: Wi SEQUENCE CHARACTERISTICS: LENGTH: 888 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
A.AAAGGGGGC
AAAANTACTT
GGCCGGAAGC
TTCCTTTccT
TTTTCNGTNA
NATTNN GGTT
TTTTCCCNGC
GTAAATTACA
TAAACATTAC
CCCAAANAAA
CAAGGGANGA
CCTGATTCCN
ATA14GACTTT
TCCTCAGTCA
CCCAGCGGNG GGGGGTTGTC CAAGGAATCA AAANGTGGGG
NGGGGGGGAA
TTAAAAAAGG
CTCGGACNGG
TATNTTAGCA
AAGGAALAGCA
AGNAATTGGN
T TTTAAGACA
ATGGGTA.GG
CCTTCATCCN
AAT14TTCAAA
CATGGGCAGG
TTTGNTGTCA
CCTGATGGNG
GCTTCTGAGC
CNGCCNNANA
TTTCNNTGTT
AATN GCCGGC
GGGGGGGGAN
TCCCAGAGAG
GGCA14GATAN4
GCTTGGCACA
GAGGNAGTTA
AGNAGCCCCN
WTAGGGNACA
CACA GA CANT
ACGCTGCCGT
CT CAGGGT CC
GACNNTCACN
ATANANGAC G TTCNGGGGNG
AGGACAAGGA
CAGGAAACCA
AAACACGGAN
NGCCAAGAAA
TATNNGGCAG
GGCCAGGGTA
ACACAAGCAT
TGGGGAACGT
GAATCAGTGN
GCTCCAGGGA
GANGGGACAC
CAGCAGGCAC
AAAAGGGNAC
NCGAGTTGGG
AAAAAGGGAA
ATNGGCCTGT
CAGGTNATTA
AGTAGGGCA14
TCNTGGCGGG
TAAGCCAAGC
TCAGAGACTC
CAACCTT CCC
TNCCTCGTGG
GCACNGGGAT
NGGGNTTM GG
GAANNGGGTT
C CAAAATT CT CCA-1AGGTA-4 GTATGGAT GT
TCTCACATAT
NTA14GACTCA
CAGGGGCACC
GGAINGTGAGT
TAGCACACAT
120 180 240 300 360 420 480 540 600 660 720 780 840 AGT GGCAANG AC CTCATTCT AAGGAGCTGG CTAGTAGA TCTCGTCTGT CCCACTGAAA INFORMATION FOR SEQ ID NO:8: SEQUENCE
CHARACTERISTICS:
LENGTH: 980 base pairs TYPE: nuclei-c acid STRA4NDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genolnic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: AGAAATGAAA AAGAAGGAAA GCTAAAAATA
GATTATAAGT
GAAAAAAAG AAA3AAGAACA CAGAGAAGAA
TAAAGGAGAA
AGAAAGAAAA AACGGAAAAG PAACCTAGAA
AATAAAAAAA
AGAAAGGAGA AAGACTTACC TAGAGCC CAG AAATAGAGAA AGAAGAGGAG AGAAAAAGGA TTAGAGAGGG
TGAGGTAGAA
AGAAAAAAAC TAACAAAGAT GCATATAAAC
AGAGAGAAGA
GTTCTATTTG
GAAAAAGGAA
CAAAGTATCC
ACTAGAAA
GGA.AGAAAAG
TGATTAAGAT
AAAAAAGAAA
GAGAAAAAAA
GATAAGGAAG
AAAATGGAGA
ACAAGAAAGC
TAGAGAAAAA
GACCAAAGAG
GAAGAAAGAG
GTAGAGAAGG
GCACCGAGCT
LAAAAAAAAAA
TGACAGAAGT
TGAA.ACGGGC
GCCAATGAGG
ACCCGGAGAG
GACGGTACTT
AGAAGGTAGA
GGCAAAAGCA
GACCATTCCC
GAACCTGGTT
TGAJAAAATAA
AGCAGGCAGG
GGCAGAT CCA
AAGGGCAGGA
GTAATTTTTT
TAGTATACAT
CAGGACAAAT
AAGGAATAAG
TACCCCATAG
ATCACACAGG
ATTCCTTCGG
AAGCCAGCCA
CATCCGCAAA
AACCATATCA
TTTTACGGGA
CGTTTTGCCC
AAAACAAAAA
ATAATAGCAC
GGGGGAACGA
CAGGAGTGGT
GCGGAGAACT
GCACCCCAGC
GTCCTCAAGG
AGCCGA1GC CT
AGCGTCCAGC
GAGTGGTCAG
CAGGAGGGGA
CAATAGCAGG
CCC CGGAATC
ATAGCACGGC
AGAAGAGGAT
CCAAACAGAA
GAGCATCGGC
CGGGACGGCT
CAAGTTAGTG
ATTCTTTTGT
GAAGGGGAAA
ACAGTAAAGG
AAAATACAAG
GTTCCGGGCA
GGGAACTCCT
GCAGCCGCAA
GAGGCCCGGA
GCCATGAGAC
GGC CGGAAGC
TATCCCCAAC
420 480 540 600 660 720 780 840 900 960 980 AGAACC GTAA GCTAGAAATA INFORMAT.ON FOR. SEQ ID NO:9: SEQUENCE CHARACTERISTICS: LENGTH-: 845 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) IMOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: TCNCCTAAGA NA14GAGANAC GTTAGATCCN AATGGAGANT
CCNNGGACCC
AAGGGNA14GN
CNCGCCCGNT
GCGNCGNCGG
GATGNNGGGG
AAAC GGTGNG
CCGNAGTTG
CXAAAAAAAA
TTCTAGGNGT
NCCNCCGGGG
CCCCNGGAGC
CNTCCCGACA
ACCNAGGGGA
GCNNAAACAN
CCNTGGGCCN
CCCCCCCAAC
AATTGNNAAT
NGNAGNNGTT
GC14GGGGACG
AA.PN.ANNGN
CA14GNTGNGG
GGAGTTTGTT
GTCTANNAGG
TAGTAGGCGT
AAAGAGCCNT
ATTGGCAA
CAT CCANCCT CAT14TNTTCC GCC CCC CANC ,AATAT CGCGGC C CANCT GN CN
TGGTAACGG
CCGCTNCGTT
TCCNTCTACC
CCGTGGCNAA
C14GG14GGCGT CN14GCAACAA TTTAAAA14CT
TCCNTNACTT
GTTTT NAN CA
CATTTTGNNT
CAGC GGNGAC GGAGA-1NTGG
GCCCGGGGTG
CGGCCCTG3A
CTNCCCTGCT
CCCCATCNA14
TGNCGACAGN
A14ATACCGGG
ACNAAAGGAN
NNGNCCNGTT
TTCNTCCCCN
CCNGNGGCCC
CN CNNCCTGG
ANCAGTAGCC
AGCGGCGC
GANGANATTT
TGNAGCCCN G
GNGGASGCGAC
GC14CNCCAGT
GGCCNNCGTC
CT TA CCT TC
CGGAAAGAGG
T CAAATAGNG
GCNTTAAATT
CGGCAGTGCN
GC;.AAGANTN
AGNGCAGGCA
GAGCGGGCNC
CNNGGGCNGC
NC CCNGTGCC CAN CT GCANT NAGCT T CCT T GAT GGGA.NNN 120 180 240 300 360 420 480 540 600 660 '72 0 780 TCTATTTNNG NTTCATGGGC CGTATGTTAG ACCTNTCGAA GGACGCGNNA
AATACATAGG
GGGGG
INFORMATION FOR SEQ ID SEQUENCE CHARA4CTERISTICS: LENGTH: 528 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (qenomic) (xi) SEQUENCE DESCRIPTION: SEQ ID 840 845 120 180 240 300 360 420 480 528 ACCTTTCNGC GAACGGNGNG GAAAAGNC4C CAAACAAAAA
GACCCCNNTG
4 4
GGATTTNNTA
C CCGGCAAAT14
TGCNCCCNNA
GTTT CGGAAT
TTNNAATTAA
CCTATTACCN
CTTATANGCA
GGATGGAGCA
CTTGGGGGN
TATT CCCGGC
CAGCGGAAT
CCTNNTACCC
CT CGCGTTCN
CATTCTGGGG
GGAACNCCCT
ATTCNGGAGC
TNAGGGGCAA
CCNNATTNTG
CCAATTTCNG
NGANTGCACG
TTGGAAACGA
ACGNAITAGTT
CNGCACAT
GT
GTTTTTTANN
CCCCACGT
ANTATTGAAN
CGJAGANC CNG
GGNTGCCCTT
GANATCCCTN
NACCTTCANT
NGGGNNANCC
GGCGATTGCC
NNTNTCCGAC
NGNGACCCGG
AN4GATNNCTG
CTTTGNNTTT
AN CTAATGC C
CAGCCTGCG
GTTCAATC
CAT CTAACTT
NGGGGNCNTG
GCACTTNTTC
CTGJAGCGTTT
AN4NTCACACG AANC CAT NGA CCNGACCTAT
ATGCGCNGAA
INFORMATION FOR SEQ ID NO:ll: (ii SE-QUENCE
CHARACTERISTICS:
LENGTH: 927 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genornic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: AANACGGTTT AATAAGGGGG ATGTTCAAAA CNCCACTCCC
GGGCAANAAA
AGGGGGCGAG AA}JGGATTCG NGTATAGTTT CCCACCACAA
ACCTNCTTCC
GGGGGNAACC CACCNCATCA TTATGGGCTC AAGGCAGCAC
CCACCCATTT
AGTCACTTTT TTTTGCTAYA ATCAAAGTTC CTTCGAACAT
NTCCTTTTAT
TTGCTGTTAA ATTAGCANTT TNTCNCAGTT TCAAAGTTNT
GGTTCCNGAC
ATTGCTTCAC CGCTTNTTTT GNCGCCAGGAA AGCAGACCCN
TGTTNGCAGG
ANAAAAAATT
ATTTTTT CGG
TTCGGGGGNA
CCAAGGAGTT
NAGNTTTGTA
GCGAGITTCCN
120 180 240 300 360 ATTTTTAGTT CCCATTTGGT GTTTCCNTAG GTAATGGAGT NTGAGTTGAG TCCCTTNTCC TATCAGCCGG GGTGGCATTC CAGCCAGATT AGATTT CAGT NTCNTTTNTA ACAGGGAAGT GCAGCCTTTC CACCCCCArLVJ GAGTGAAC CC TGC CNTTTCA CGTTGGCTTA GCATGCAGAT TNTTTGGCTC CATGCCCGGA TGAAACTTCC ATTATCATAG AATGGCAGGC AGGTCCTTTG GGCCNAATGA GATGGNTCAN TGAGCAAAGG
CGNTTACTGC
TAGTNTTGGA ATTCACAGGG TAGAAGTTGA ANACNTTTGA TAGCAGGGCA GNNGTGGTGC ATHCCTTTAA TTTGGGCTAC NGAACCTTGG CAAGTAGAGG ANGTCGT INFORMATION FOR SEQ ID NO:12: SEQUENCE CHARACTERISTICS: LENGTH: 911 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) CTGCAGACAG TTTGAGTNTA TGTCCAAAGG AGGAATCCAG TAGACACACC CGGCCAGNTT GCTTTTACCC AATTTACTTT GCAGCTGACA TGGGAGGCTT CGGTTAAAAC CAGGAGCCTG CAACCCTGAT GCCTTCAGTT CTCTTCAAAA GTTGTCCCTG TTTGTG;VkAG ATATCCACAJA 420 480 540 600 660 720 780 840 900 927 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: GGGAGTTT GC
ATANNATACT
NTATCGTCTN
NCTCTATNTT
GTGNGGGGAN
GGGGAGANNA
TGCA24AGANA CCTNC GGGGG
GTNNGTNTCC
TCCCNTANCC
TNTATNTNNT
ACTCCCTTCC
TCNTTN4CNTN CCC CCCAAAT
CTACNNTNCC
GGGGN14TCTN
CTCNCCTCAC
CTCNNAGAGT
TACNCTTCTC
AATGGGGCTC
CT TNT NTNTA N CNT GNT CCA
CTGGGGGNGT
ATATATTTGG
CGGGGCCTIJG
NGGAAAACCC
T14GGGCATTT CCCC CTNAN4G
AAATGTTTGN
ATATNTGCGIN
GTATNGNGAA
TCNT CTCT CT
NGAGNCTCTT
AATCN CCTNT
AAAATCTCAA
NTATTATNTN
GGTCNTTACC
AAANTTTATT
TTTTCNAAAC
TTTCTTTTCC
TNT CAAGGGC
GCTCCCCGGG
ACTCTTTCTC
GAACTGNNAG
NTAGAGTGNG
ATATTTCCCC
CNCCATT14TT
ATTTGTGTCT
TNT NTATATN
AAAACCCCNT
NCCNNCCNTT
NGGNT TT C CC
CCTCACCNAA
AAGAGAATAT
NAAAATANNT
NNCCACAN14A
TGTNTNTGGG
ATGTANAAAA
NCCCCTCTCN
14NNANNN GC G CTTNTCC CAA C14TATNTTAT
TTTTNTCTCA
NNGNTCCTTT
CTTTTNNCNT
C CCC NT TT NC
NNTCTCTCTC
CTCTN'TCNCG
AAAGCGCCCA
GCGCGTTCTC
CC14CANNTGT
CCATATATNA
TGTTTNTATT
ACNCTATNTC
ATACNTATAN
CTTTTCNTC14 T CTNTTAAT
CCCNCTCAAA
CTCCCCCCNC
TCTCAGAGNG CCNATTACGC NACAGGGGGN GTCTCAC.7NT ATAANCTCAT 120 180 240 300 360 420 480 540 600 660 720 780 840 CCCCCCCAAA NTGNGAATAC CCTGNTTTTC AGNGGNNNNG AAALAATCCCT CCCC6ANGGN GCCCCCCTCC
T
INFORMATION FOR SEQ ID NO:13: SEQUENCE CHARACTERISTICS: LENGTH: 880 base pairs TYPE: nucleic acid STRANDEDNESS: double (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: GGGCACCAAC GGNGGAAGAG TTTTCCANGG TADNAAGAAAG NAGGANTGGG NCGANAAN1AA
TTANTTTTNA
GAAACCCTCN1
TTTCNTTTTT
TTAAACGTAA
TAANGNAAGNV
14CTTTAANAC
ANGGGTAAGT
CNTTGATCGN
TTTCTTCCTN
AT GGGCADGGT
GNCTGTCACA
GNGNNGGAGA
TTNTGAGCNT
CAAGGGCGTC
AAAAGGNCAC
GACGGTTTTC
GAGCAAATTG
CGCAGNTTTG
GGTTCAAGAG
AGGTNNNAAA
GNTTGG-CACA
GNGGTTGTTT
GGTGCCNCAN
TGGGTACAGA
CAGACACTGC
CGCTNCAGNG
CTGGTCC CNG
TCCACAAGAC
CAGATANAAA
NN GANTNTTA
CCAGCAGGGA
GA14AAACACA AGAnGCCGATG
AATNNGGCTG
GNCCAGGGTA
ACACCGCNTT
GGNGAACGAC
ATCAGTGTTC
TCCCAGGGAC
ANGGGACACT
CAGAGNACAG
AG CGT GN CNA
AAACTTTTNA
AANAGATTCA
ACNGACNAGA
GNTNACAT GG
AAATNGCCNG
CTGTTTATAA
AGTAGGCATN
AAAGAAANGT
AAGCCAAGCG
AGAC-ACT CCA
AACCCTCCGG
CCTGGTGGTA
T GGN4AATGAC
GTAGATAAGT
GGGGNGT'TAA
GGGGAAGCAC
GGNTNGC-TTT
AAAGACCTGG
GTCCAAAATC
CNATAGNTAA
NAArGAAkTGT
TTAAAAATAT
NAT GANT CAC
GGGGCACCCA
GATGTGAGGN
GCACACATTC
TTTTTTCTTA
NPAAAANCCN
GAGATTATCT
TTGNATNCNN
GNNATTAGGG
TTTTTCCTTG
GTGAANNACA
TAAACATNAC
CCCTGGrGCTG
AGGAGACGAC
GATTC CNTCA
NANGAC-TTCC
TTCAGTCNGA
CTTGNGNCTC
120 180 240 300 360 420 480 540 600 660 720 780 840 880 INFORMATION FOR SEQ ID NO:14: SEQUENCE CHARACTERISTICS: LENGTH: 923 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
GC-GAGGAGTA
GTNGCGTAGG
CCANI4GTNAA
TTNNTTGNAA
GCCNGCGNGN
NTGGNNNATA
CCCGCGAGAG
CTNNGANANN
CTNNCACACC
GGTTNTTTAC
CCCANGGTGA
TNCNGAGTCA.
NACTGNTCCA
GTGANNGGAC
CCCNGCPLGAG
CNGGANGGGT
NNACAAAGAG
NCAACTNTGG
CCNAGATTCG
ATCNGGCNAG
CCCNGCTCTC
CCCGNGGNAA
ATNNGGCTGN
CCAGGTI7AGT
NNPAAANNP.A
ACGACNA.NCC
GTGTTCAGAG
GGNNCNNCCC
ACTTCGTGGH
CACT14TNGCA
CCGACGTAAN
ATAGGAACGG
CGGGGGTGGG
AGGGACGGAC
GGAGGGTNGG
ACANGNNGGA
GGGCGNGTCC
TGTTNATCNC
GTCCCNTNCA
AAAAJAAANTC
.MANCTNTTGA
A14TTCNGCGG TCCGGTTGN G
GGTGNCNCAC
ATGNCTTTNT
TNTNT cACAG
GGNCGNNAAC
ACNNAAGGCG
NGGANTATCN
TTNNNNGGTT
C GNGGGTNTT
AAAAIJTCTTN
KATAGGTAGN
NGGTATGTTA
ACCNTCCCGG
NTNACAAGGG
CACCCCTGAT
AGTCNAAGAC
ATTCGTCGGT
TTGTTCTGGG
GNAAGNCGPLN
NTNNCNTNTN
NGNGGCNNNA
TAT CCNTNTT
T-ICNGGNGACN
TNNGGTGAGG
TTCCCTGCTT
TCAACCNNCA
ANACGTTACC
GCNT GNT GNT
ACGACGTGNG
TCCCNCGGNN
TTC14GGNNGG CGGCTTA14GA GCTTCC14AAT A1GAGGAGGG GAAAAGGCC G
GAAGGTTTNW
NGLTN CGA14T
NCCCCAGT-T
AAGNN GCNT C
NTNCNACAGG
NGGGGP.NGT G
NNTG-ATCGGG
TCCTNGGGGC
CAGGTTGNC G
GTNACACAGA
TGACNCTACK
HCNTCTNGGT
GGGTCCT CCC AAAAGNCNGC TTTAGCTGTA ATA INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 830 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID ANANAGAGTA ANTAANANALA GAGGAAGAGA NAAGAAAGNA GAAGGNAAGG GN14GGCGAGG AAAAAAGGAA AGGAGAANAA TAAAAGAAAA AGTGAGGAAG NAGAAAAAAZG NAAAGNGGAG ATAGNAGAT4A GGNCCGGNGG ANAAAAGANT NAGNTGAAAG AATAAAGAN4N PANGGCGANAA GG.AAAGAAGA NCGAGNATTA AGGAAAGA14N NGGGGGGAGG GAAN'GAGGCG AAflTCNNGAG ANCAGTNNAN AATNAGGAGN AGANANGAAG NNNAN4GANGA AGG.AGGGGAA AGAGIGGNACA GTA14AGTAAC CNACNNCNGC GAGNGNGCCA AATAGGTNGC GCCAGCNACA CCNGGGCGAG GGGGCATCA14 GAGCCAAGGG GAGCGGGTCC AGNCNTAGTT
ANANAAANGG
GAAGGAGTAN
AGATTAANGA
G.AAANAAGAG
AAGGCAAGAG
t3AAAAAACAA
NGGCCCGAGC
NTGAAPAGGAA
AGGGGAGGNG GGNAGATATT ATATGGTCGN
GCCCCCCCCN
AGGNGT GANN AGCAGGGCCN TNTT GGNTGN GGGATCGNGC GGACNTTCCG CNGNGCCTTC CGTAGGCCCA
NTGTCAP.ATG
ATGCCGGNGN TAGNGANTGA TGCGGGGGCC
NGCCCCCCCG
CNGTGGCCGC CATNACGGAG TTCCCAGTGG
TGAGNGTGCG
GCCGCCGGTC CCCGCAGACA GGAACGCGGA
GCGNNCCCTG
CTTGAAAGAC TNNACNAAAN GACGCNGATT
TGTAGAALAAG
INFORMATION FOR SEQ ID NO:16: SEQUENCE CHARACTERISTICS: LENGTH: 166 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic)
GTGTC:TCCGGT
ATGATCAGAG,
TATTCAAGCC
GNTTTCCG7CC
GAGNTGAGGC
CGCTNGAACG
GAAAAAAAAA
ACCNGAGGCC
GGTTNGAAGG
CCCGCAGCCN
CCCGCGGGTC
TA14GGGNCCA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: ATTCTTCAGC TTTTGCNTAG AGGAA.AAAGA ATGGATTGTT TCTAGGACAA
CCTGCTGAGG,
TGCTCACCNA GNGTTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC
TCTCTCTCTC
TNTGNCTCTC TCCTGAANNT CCCCANAGGN NCTTNGCAGN AAAA14G INFORMATION FOR SEQ ID NO:17: SEQUENCE
CHARACTERISTICS:
LENGTH: 162 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: CNTTTTNCTG CNAAGNNCCT NTGGGGANNT TCAGGAGAGA GNCANAGAGA
GAGAGAGAGA
GAGAGAGAGA GAGAGAGAGA GAGAGAGAGA GAACNCT1NGG TGAGCACCTC
AGCAGGTTGT
CCTAGAA-ACA ATCCATTCTT TTTCCTCTAN GCAAAAGCTG
AA
INFORMATION FOR SEQ ID NO:18: SEQUENCE
CHARACTERISTICS:
LENGTH: 871 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear 540 600 660 720 780 840 880 120 166 120 162 (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: GAATAAAACC CCAGAAAGGT TTTAAAACAT TCCCTATAGA AGTTGATNAA TThAAPLkTAAT
TGGAGGTGAA
MTTTTGGGG
TTN GTAT TAT
TAAAAGGTAG
TGGACAGAAA
CGATTTAGGC
TGCAGATNT G
CAGGCTTGGG
GGCGCCGGTG
TGGCAGAGAC
CGC CCACAGG
AAAAGGGGGG
CATCGGCCCG
ATACACAGAG
GTTTTAT GNA TTAAAAAC CC
GAGAATCGTA
TATTAAGAGT
CAAGTTATTT
ACGGCTTGGT
C CCAGCATGA
AGAAAGGACT
AACGCCAGAT
CCTTACCGCA
CCCAAAACCA
GCCCCAATCA
GGTTTTTCAA
NAAANGAATT
TTANGGATT C
TCAATAGCCC
TATTCTTAAG
CCACAGTATG
TCCTTAGGTT
GAAGAGAGGG
TCATCTTGCC
CTGCAGAGC
GCACAAAGCG
TATPAGCCNT
TGCCCCCCCC
TTAAT CAATA
GGACCGATCA
NGTTGATTTT
AAGATAACAG
ATCCNGCACT
GTATCAGAAG
ATTGCCACAG
GGAACCAAGT
ATGNTCANTC
ATTCCGGCCT
CGCCACCCGG
GGACCAGGCC
CCAGCATTCG
AAAAAATAAA
ATTTCCAACA
AAATCAAGCA
ACT CTAAAAG
TTGGAAAATT
GAGTAAAGAG
CA-ACGGTCTT
TCTTCAGGGA
AGCCAAACTG
TTAACCGCTT
AGGTCCCGCC
CCCGGCC C CT CCCCCCTC
TTACNTACNT
AATTTATTTT
GTAAATATAT
TTAAAAG-TAT
TAAAACCAAG
ACACCACACG
CC CC GCAAG
CCNCACGGGC
CAAACCCTTN
TCCCACACTC
AGT CAAAACA
CCCCCAGGA
CT CCCGCT CC
CGCGATGGGC CCTTATGCTC CCGATACGCA T INFORMATION FOR SEQ ID NO:19: SEQUENCE CHARACTERISTICS: LENGTH: 936 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: TCGGATTCAA AAATTCGAAC TTANTTTTTN AGGAAATTTN TTTTTAAAAT GCNNTNGCC ACCAATTAAA ANCNGTTTCA ATTNAAAANG ATTCCCGCCC TTTNCTCCAN GCAATTAACC AACTAATTTG GNTTGCNACC ACTNGTTTTG AGGCATTTTA A1AACAAATT AACAGGCNC- CCATNTTCAA CCGGNGNTAG TGAAACNGAC CNTTTTCCCG CCGCCCTTT CCNATTNGTT TCCTTTTTTA ATGCNGAAAAA AAATN4ATGGT TTTATATCAT CCTTNTTGGC ATCACCACAT TNTAATT CCC
GAAAAANCCA
CGCCTNTAAA
NTTCTTTTNA
GCAT TAACAG TGGCNATT CA
ATTAAAACAG
GGGTTTGATT
TTTCCTTTTC
TTTTNACATT
CTGAGGATAA
CTGCGGAGAA
CC CC GAGTGC
AGAAGCTGAG
TAGCCTGGGA
AAAGTNCCCA
ATCATTCATG
GC-CCTGACCC
GGACTGAAAA
TTTGTGCCNT
CAGTGAAGTG
AGTATGCGTG
AGCGTTAGT C
GCTGAGGATT
GTAAGATT CA CAN NGN CTTT
ATNGGCTTTT
GCCNACCTTC
CAGGCGAATG
GCT GTGCGCC
CTCTCTAGCA
CTCGATAAC
AAC CGGCCAA CAT CTCTCAG
GTCTTTGNTA
CCCAACTAAC
TCCCCATTAC
GCTTGCTTAG
AATCATTTCN
CCTCTCTCAT
TTCTTCTGCG
ATTACTCAC
TGTGCTGGGG
TAACAAGTTG
ACAGCCTGCA
AACTCC
CAT GNAAACA
CTGAGGTCCA
GTCCTCTCTT
TT CCCTCTTT
CACCAACCCG
ATGACACAGA.
GATTTTAAAT
T GAATCACGC
GGCATTAAGC
CAAAGACCCA
GCACTCCTT
GAGGGTCCAT
TAAGTGCCCCC
GAGATCTGCC
CCACCGTT GA
GGAATCACAC
TCCCAGGAGC
CACCGAACTC
420 480 540 600 660 720 780 840 900 936 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 888 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID
AGCNNGGGC
TAA.NAANCTT
TGCTNNCT CT
AAATATTTCA
NTC CACCTAG
GATTTCTTTT
CNCCTAACTC
AGANTAATTA
TTGAATATAC
NAACCATG-N
TAATCTTCAT
CACCATAACC
ACTAAATGNT
ATATATATAT
AAN CGAAAAA
ATNGNGAAA
AjAAAATTC
TTNTCAACAA
TACCAAACAT
TNT GCTTCTT C CAT CTTAAT ATCTTACTAAn
ACACCCCTTC
ATTTTACATT
ACACACCACC
GGGAC CATCA
ATATATATAT
CCTTTTANTT
AAGATTTATT
TCTAACAAAA
CNNANCCCNT
TNTAAACTCA
TATTTAAAAT
GAAACATAAT
AATATTANCC
TGTGATTCGN
CTTC CTNNGA
AGTNTAGGAA
CAATACCCCN
ATATATATAT
AANATGACCT
ATA.AGATTTT
CCNTTTTTGT
AGGGAAGGAC
ACGGTTAACAT
TCAATATTCA
TTCAATAATT
ATCCAAATAG
GGGACNT CCC
GCANCCTCCT
GTGAGGNTCT
TAACCNTCTC
ATATA.TATAT
TTTT GGGCGA
TTATAANATT
TTTTTGTTNT
ATCATATGCA
CACACTCAAN
GGATTTCATT
TCCAAACAAT
NTAATAAACA
CATAAGGCTT
CCCTCTTAAG
CTTTAATCTC
CANAGAACTC
ATATATNTAT
AANACAAA.NT
TTNGGGGGCC
CCAACNACTT
TATTTTCANA
TCCCANGACN
TATACTAACA
NTCATTTTTC
AATAGATANCG
CTTTCTATAA
AAAANCACTC
TTACCAAAGT
TAAAAGCNTC
ATAAAGAGC
GC-CAAACTTN TTTATNTCCA AANTTTTCT TTNCCCGCCN
AACCAGTTTT
120 180 240 300 360 420 480 540 600 660 720 780 840 AGTATTGA,A GACNTNCACC AATNGAGCTG GCNAGCTAGA AGAGGTCG INFORMATION FOR SEQ ID NO:21: SEQUENCE CHARACTERISTICS: LENGTH: 903 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
CTTGGAAGGT
TTTGGAAACT
TCAAAGTTCC
AAN'ATGTTTT
GAGGGAGAGA
GGTTTTCGGC
TGAINATNACC
CGGAGCGCCA
CGCGCCCGCC
CCTTGGCCCG
AGCACCALTGT
TTGTGTAGCA
ATAAAAATAA
GTCAGTTCCC
'I'CGGGAGTGC
C GT1
TTTTTTTTTG
CATGGNTTGG
TTTGGAT TNA
GAGCCGGATC
AGCCGCNGCG
CCGTTTCTTG
ATTACTGCCC
CGNGAGCTGC
CGCACCCCAG
GAT CAGGZLAG
GCCCGCCGCG
CCGGCATATT
GAAGCCGCCA
GGACACCGTG
GTTGAAGTTA
AAGTANAACT
CGGNGGNGGA
CGCANTCGGG
GGTTCGGAGN
TCGGTGATGT
CGATNTGGTG
GTTTTCCCTG
CGCAAGGGAG
TCTGGCTCCN
GCCACTGGTG
TAAGGCCGGA
GCAGCGCTCT
AAAGACCTTC
NTTGGGGATT
TTTATTCAGA
ATT GGGGAGN
GGTTTCTACC
TTTTAAGGTT
TTNGTACAAG
TTTAT GTTTG;
GCCGCGCGGC
GGGTCCCCTT
TCCATTTCC C GGAT GGCNTT
GCAGGP.ATCC
GCGCAGCGAG
ACCTATAGNG
GGGGGAAAAA
AGNGAAAGTT
GGAGAGAGAA
GGCAGAGCCA
TNTTAATCTT
CTTTCATTTC
CCCGTTCNTG
CCGAGGGGGT
CATTTTTTTT
NTCCCGACTG
CGCTGGCCTG
CGGCGCTCAC
CTGCTGCTGC
CNTGGCAAGC
TTAAAAGGAT
TTAATAATGAl
GAGAGAGAGAL
GGACGGAGAG
GGAAGGT GTC
TTCAGGATTT
CGCNTGGCCC
GGGTGGGGGG
CATTGACTTC
AAGGGAAACA
ANGTAGGGGG
ACGCGGCCTG
GCCAGCCAGN
TAGAAGAGGT
TTTTTTNNCA AAANCCNGGG NGGGTTTTTT TTAANAAANA GGNGAAAAGA INFORMATION FOR SEQ ID NO:22: SEQUENCE CHARACTERISTICS: LENGTH: 918 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
TCGGGGGCALG
CNTATTNGTT
AAGGGANAN4G
NGATTCTGTC
TGCCTGAGAA
CTAATNTGCA
CNAGGGGGAG
GTTGNTCTGG
TGAAGTCCGG
GGGCAAGGTT
CGGCGGCGGT
GCTTAGAGCA
ATTGGCTGGA
CTAACAAAAA
CT GGCAGTTT
GAAAANTTTG
TTNGGCCCNG
CAGGGAAGGG
AGTTTC GCTT
AATTTCAATG
TTTNGGGATN
ACCCAANTAA
TAATAGAGCA
CCGTGCGAAA
I AAC GT CCAT
AGCAACCAGC
CCGCAGGGCC
AGTGGTTAGT
GTAACCAGC G
AAATACAAAC
GGGTTTTCGN
AAAGTAAANA
N GGNATTTTA
TAAGCAAAGG
GGTGGCAATT
TGTCCCTGGG
CCCAAAGGAC
GATTGCTCAN
CGAT CAGAGC
GTTCTTTTGC
TGAATGAAAG
CAGAAAATTG
GACGGAAAAC
GAAATGCCCC
AAAAPAAAAAA
ATTTTTTTTT
TNTCCAJANTT
NGANGAAGGG
CTTAGGACTC
GTCCNTAAGN
T GAPATTAT C
AAACACGGTT
CC GGGAAGPA TT GGCGAGCT
ATGACAGCGG
GCCGCGGGCG
TGT GGGCTTT C CTAAACTAA
ANGGGCANAA
NAAAANATGG
TCNGGTTCCT
NNAGTTTCAG
AGGACAGGAT
TCCGGACCGG
ATGGCAG!CNA
GTTCCATTTG
ATCATCCCAG
TCGCCTTCGG
CTCNTTCGGA
GGTGTGTTGG
ACCAAATGTA
AGGTGGTGTC
ACCCGGTNAA
AAAAATTGAA
ACTTTTTTCC
AAGTTAGGCT
TCAGNGNGGA
GANAGATGTT
CNNACCAGTA
GATATATCCN
GCACGGAGCG
AATCCGGAGG
TTGGCTCTGC
TCTTT CTGTG
AAACGGAGTA
AGTAGTCTCT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 918
C NATCTCTTT TAGGCATTGT TTTGAAAGTC CCCACAAGGN TTTGCAAGTA AN4AAGT CG INFORMATION FOR SEQ ID NO:23: SEQUENCE CHARACTERISTICS: LENGTH: 309 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: AGAGAGGGTT TAGCACAGGC AGCNTATTCC CAGTTTGTGC TGTAGAACTG
GAACCTCAGG
C CTCATT CTG AAA.TNTGCAG CCNTCCCCAG CATCCTTCNT GGCACAGCNT
GGCACAGACN
TGNTAAGTGT CTATTAGTGA CTAATACAAA GGAGTATTTC AGAACGTTGG
CACATCTCAG
CACGTTGCAA CTGGCTGGAG CTGGTTGAGC TCTTGCTGCT TCCATATCCC
TTTGTAGCTG
CTCTCCACTT TTCTGAACCC CGGGTCCATG TGAAAGTCCC CACAAGGNNC
TTTGCAAGTA
GAGAAGNCG
INFORMATION FOR SEQ ID NO:24: SEQUENCE CHARACTERISTICS: LENGTH: 904 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: TTTCATTTAA AACNCGGGGG NTGAACCCAA TCTTNANGGT GGCAGTGNGG NNGATCTTAA
CGGTTTTTNA
TTTTATAN G TTCAGCC GAG GAT CGATTTG
AATCATTNGA
TGAGAAGNAC
TGCATAAGCC
CAT CTT CTCT
CTCCAAGCAA
GAGGGGTGGG
TGGGATCTGA
ACTGCCAAGG
ATTATTTCAC
GAAATT GAGA
GNCG
GAAAAAAAAR
AAAAAAGAT G
CATTACCTGA
AGTTTCAGGA
TGAGGATGAA
ACTATGATAA
ATGGGAGACA
GCCCCATTCT
ATACCAGAAC
GTAAGGGCAG
GGGCAAGAGA
rGATTTGGGAC
CGGACCAGAG
GCTAT GAGCT
TNCTTCGCTC
ATPACGAAAT
NAGTAATGAA
TCAATTCAGT
'FGGTGAGTGA
CAAGTGTCTC
AATTCTTTTC
GTTTTCCACC
TGGAGGAGAA
TGGCGCTCAT
ACCTGTAAGC
TTCTCCATCT
CTGTAGCAGA
AGGNGCGAAA
NCACCCCCAA
TTTAAAAAC C
GGTNTTCCGG
TACC GNT GAC
GTGATGATGA
ACT CCACATT
NNACACAATT
ACAGGTCTGC
AATTC CAGTC
TCCTNACATG
TTGATTTGAT
CTCTCTCTAA
GATGAGCTCC
GNCCCCACAA
GCCTCCCNTT
GTCGTTAGAG
AGGGTTGCCT
CATCCACCNN
TGATGATGAT
AAGGTTTGCC
AATAGTNTCT
AGCGGGCTAC
CACTGAGTCA
GTGTCTTCTC
TTCCACTGCT
CCTGAAATCC
AAGTTT GAA
AGNNTTTGCC
CTTANCAGCT
GAAATGAAGG
TCCAATCCCA
CCTCCNGTAT
GAY GAAGGGA
TGNAAATTAG
TANYCCTTC
AGCTTCCALGT
TGGGCAGGGG
TTGCCTAGCC
GACTGGAGTC
TTAGGATTCT
TGAGAAAGGG
AAGTAGAAAA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 904 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 883 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID GGGGGGGGAA ACTTNTTTAT NTGGAAAANT TTTGTTTNGG CGGGNAAGGA GTTTTTAANA ANGTTAAQGG AAAAAGCTTT TANTTAA.NAT GACCTTTOT GGGGAAANAC AAANTTGGTN NGTGTATTNG NGAAAPAGAT TTATTATAAkG ATTTTTTATA ANATTTTNGG GGGGGAAATA
TTTC-AAANAA
CGTACTTNTC
CTTTTTAGGA
AACTCTNTGC
AATTACCATC
TATACAT CTT
ATCGNACACC
TTCATATTTT
TAACCACGACA
ATCNTGCCAC
TATATATATA
TCAAACACNT
AATTCTGTAA
AACAACN NAN
AACATTNTAA
TTCTTTATTT
TTAATGAAAC
ACTAAAATAT
CCTTCTGTGA
ACATTCTTCC
CCACCACTNT
CAT CAGAATA
TATATATATA
NCAC CAATNG
CAAAACCNTT
CCCNTAGCCA
ACTCAACCTT
AAAATTCAAT
ATAATTTCAA
TAN CCAT GCA
TTCCNGGCAC
TNNCACCANC
ACGAAGT GAG
CCCCNTAAGC
TATATATATA
ACCTCGCNAG
TTTGTTTTTT
ACCACAT CAT
AACATGACAC
ATTCACCATT
TAATTTCCAA
AATAGNTAAT
NTGGCCCATAA
CTCCTCCCTG
CNTCTCTTTA
NTGTCCANAG
TATATATATA
CTAGAACAG
GTTNTCCAAC
ATCCATATTT
TCAANTCCCA
TCATTTATAC
ACAATNT CAT
AAACAAATAC
CGCTTCTTTC
TTAACARAn~k
ATGTCTTAC
AACTCTAAAA
TN4TATATAAA
TCG
NACTTNT OCA
TCANACATTT
NCACNCNCGT
TAACAACANTT
TTTT CT? GAA
ATANGNAACC
TATAATAATC
CACTCCACCA
AAACTACTAA
GCNT CATATA
CACGCACTAT
240 300 360 420 480 540 600 660 720 780 840 883
INFORMATION FOR SEQ ID NO:26: SEQUENCE CHARACTERISTICS: LENGTH: 924 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: TTTCGPACGN TTTTNACGAA AGAAAN4TCTN TTTNAGCCNA GCAACCCTA TTCCCACG
TTCCCCCAAA
TNT? NNAANA AGCAATC CT C
TCTTTCACCC
AAACCTCTAT
CCATCCTACG
ACCAGTCACA
ACAACCT CC?
TGTGACTCCT
CCNTCCCCCC
TCCCCAAAAA
AT TT TGG T?
AAAATAATAC
TNACCNTCAC
CACTTATTTT
TACTTTTGCC
CAATCCNTGC
ATAGATGAGA
CCCTACCATC
TACACCAAAC
NTCCCTTACC
'TTCNTTTTC
CACCCTTCCT
TAATACTACT
GAATTGG CC A
TTGTTTTGTC
ACACTTACTG
ACACAT? CAT
NTCACCCACC
TCCAATCCAC
TCATCCCATC
ATTNACATAC
GTTCCTATAC
TAAAAAGCC T
ACTAATACTA
ACTACTTT C?
CATCCCATCA
TCTNTCNCNT
ACTCCCCAC-A
C CCA CAT CA
ATTTCCAGC
AACACCCCCA
ACATAAT CAT
TCACTCCTAC
TNCCCTAAAA
TTAATAATAA
TAT? TA SCA
CTCCTNTCAA
TCCTTNCACC
CTTACACTAA
TCCACCCACA
GTCACATCAN
TTCACTAACC
T CAT TC CT C
CTCCCATCT
CCCCCCNANCG
TAATAATTC
CCACCTCTTT
TTCCTTTNAA
ACTTTCAACT
ATCTTC C AT CTAACAC
TCCCAACAC
AACCCT CATS
ACCACCCGAA
TTNTATAAAC
120 180 240 300 360 420 480 540 600 660 720 AAATTNTAAA GAAANTCATT GGTTCATACAi CGTAAGAAGp. CATCAAP.ACA GAACT GAGGC 780 AAGTTGGGAA GAGAAATGGG ATTAGTAGGA GAGGGTCAAG AAAAGGCAA GGTATGTGCA 840 CATGCATGAA TACATTGTAT ACATGTATGA AAG?4GCCACA ATGATGA2NTT ALCCCCANATG 900 GNNGTTTGGC AAGTAAAAGA GTCG 924 INFORMATION FOR SEQ ID NO:27: SEQUENCE CHARACTERISTICS: LENGTH: 482 base pairs (B) TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: TCTCTCCTGA GGGGGGTTTT NTGGA14GAAT AGAAGAANuAJ4 ACCNCCTC'T rGTTTCNTCC TGTGGNGNNC CCTGCTGNTJA AAGNNGATTT NCNCGGTGNT ATACA14NTAA GAAGGAGGAT 120 CTCTCCCCCC ATTGTNA14AG AACCCCGTGT GTGGGGAGGG GGTGTNGCCA CNANCCAGAN 180 NTGGCCCNNG GGTCNTCTCC CCACTCN'rNT GNATAACNTC TNNCCTCCAC kkANACCCCA 240 NANAAAANJCA. CCCCNCNTGT GAGNNCNGCA GANGCGCCCT NTNACAAGAN AAGAGNNCAT 300 GTGNTGTGGC CCTGTGCTNN GACANTNTAN ACTCTTCTNT NG14GGGGNGN GGNCTGTGGT 360 TTTATA.AGAG NGTGTNNCCG TGGGGGGGAG AGTANTCNTT TTATATAGAG AGANAGNGNC 420 CTGTGNAAAC TNCCTCTaLLG AAGAGCACCN TGGTGTCTC TCCCATCTNC TAGNAGGGGA 480 GG 482 INFORMATION FOR SEQ ID NO:28: SEQUENCE CHARACTERISTICS: LENGTH: 460 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: TAGCTTCTCT GTGAGcGGTA GAACTCAAGC TCCCCCATGA ACAGGCTTTG GGGTTCCTGC CATCCCCTGG GGCTGTTCAT TAGGTGCCCA CACAGACTTC TCATGC-CATG ACTCACACTT 120 GACGTCACAG AGCACACAJA GAGCACAAAA GCAGGCTGAC CACATCCGGC CATGCACACC 180 CCTTTAJCAG TCCCAAGCTT TCTCTCTCTC TTCTAAGTCA CTGCCCTGGG AAGACGGTTT 240 CATACCCAAG CTGATGTGC.A CTTATTTCT'T TGTGTTATTG CTCTGACAGT CTCACAGTGC TCTGCAAACA CTCTGCATTC GCCTTTACCA CACCAGAAGA AATTCCTCTT TGTGCAGGGA AAAATACATT CGTCTTAGTA GCTTCTACTT TCCAGCTTG'r CCCTAGTCTG TCTGATATGT GGTTACGTJAN TGTTAGGGGC CAC GGAAGGG GGGGGGGGGG INFORMATION FOR SEQ ID NO:29: SEQUENCE CHARACTERISTICS: LENGTH: 465 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) 
SEQUENCE DESCRIPTION: SEQ ID NO:29: TCCCAAGACA AGAGGGGCTG ?AGAACGGGG GGGGGAAGAA TCAGGAGTGT CCCACATAALA GACGGCACCT ANATCTGTCT CTCTCGGTGT CTCCTCCCCA GGTGAGCTCT CTAGACAAGA GAGAGACTGT CACAGAGAGA GAGAGATGTG GGAGATCAGA GNCNCCGAC:A CCTAGGGGAC AAATGGGGAT CTCTTTTTTT GAGACAGGGG GTCTCTGTGC AACACTTGCT GTTCTGGAGA TGTTCTGTAG CCCCCAACTC AGAGAGCCTC CTCCTTTNCA CAAC-TGTGTC GCCGCCGCCG CATCACCAGG CTATATTTAC TATTATCTCT ATTACTATTG TTGTGTGTTG GGATGCTCAC GC.ATAACCCT ANCTATCCTA GTGATAGACC CCACC INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 568 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic)
GTCGCTGCTT
CCTGGGGCAG
TCACCCCT GT
TTTCTCTCTC
ACCAGGGTGT
CCGCCGCCGC
TGTTGAGACA
120 180 240 300 360 420 465 (xi) SEQUENCE DESCRIPTION: SEQ ID TNNCNNTTNC CTGNGGCCGN GTANCTCTGA GNGANAGTNT CCCCGAGAGG GGGGGTCTCA CNNTAGNTNT ANANAGTATN GNGTGCTCGA GTTTNNAGAG AGCTCTCTCT NNNTCTCTCT CCCCNGAGCT ATNGN4T TAG GGNTATGGCA CNNCNCGT CT CTCNNCNCCN TATNGAGNGG TGNGNTATNG GGGNGAGAGT NTCTGCCCGA GACCCACATT CTCNGAGT14N GGN AGAGTNT GGGAGACACA CANCTCCGGG 14ANATCTNTC TCCNCCCCCC CAGGGGCGGT GGTNCANATN GNCNACAGAG CCNCNGNNTT NTATGTGGAG AGGGGATATC NCANCNCACN
CCCNGAGCAC
AGGNTCCACA CNCAGAGANG TGTCTCTCCC CANCACACAA GCACNTCTGG TGAGNTCTAJq GTTTTGNGAG AGACNNTGCC CTGTCTCCCT TTTCCCCGCT CTNACACACA
TGAGAGGGTG
TGCACATCTT CCCCATGTCC CTCTCTA.A. CCNCCCCAGA NTTTTGNGGT TNTGTGCAAjN ACCCTTTTCA CNCTCANGGG
AGATNTTT
INFORMATION FOR SEQ ID NO:31: SEQUENCE
CHARACTERISTICS:
LENGTH: 920 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) 360 420 480 540 568
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: GAGGGTTAN~T TGGCCCAANT CGGCAATCAT CCNGGGAAGA
AGA.NGNCAGC
ATCGGAAGAT CAAGGACGCA ATTCGNGGGG GGGGATGGAT
AGNNGCNAJAJ
AGNNGGATTG GNAGGNAAAA TTAAACGGGA GTTGTAATCC
AAAAGGACG;T
ACAAATCCGG NAGTAAGCAG GAAGCACAGT GAANTTGGGG GAGGCAGNG-1 AAAAATNGTT TTTTTAATCC CAATANGGTC AACANGTAGG
CAA.NTGGATN
TATATCTTAG CGCAAGNTTN TCACCCATTG GTCCAACCCA
TATAACATGG
TNTNTGAGCN TGGCACAATT TTT14ACCCAT TAGTTCCCA
GGCAGATCGC
GAANA)AAATC CCAATTCCAT GGTGGCCCAG TGTGTCCAGC
CACCAATANT
CAATTAAATC ACCACATGAA GGAATACATA ACACAATAC ATC TGATCCA TATPAATTTGC TCACNTAGAC ATACAAAATC CTGTACATTC
CATCTCTTA
AACAAACTAT AAATGTGTAG AGAGGAATTT TAATATCCAC
TTCCATGTTC
TCCTCTCTCC CAGTCTCCTC CTCCTCCTTT AAAACTTTTT
TCTCCCACCC
TTTGTCCNAA GGACGGGCCT TGTTNTATCC TGNACCTGCN
TTCGTCTGCA
ATCCCACAGG CAGGACTGGA GCAATGGCTC ATTGGTTAAG
AGCACTTGCT
PAGACCAGGG TGCAATTCTC AGAGCACTNC ACTGCTNCAC
ACTGAAAJGAC
GGTTTGGCA
GTAGAAGAGA
INFORMATION FOR SEQ ID NO:32: SEQUENCE
CHARACTERISTICS:
LENGTH: 176 base pairs TYPE: nucleic acid STRANDEDNESS: double 3GTTTNGGCA I GGGNACNGA kCAAGGCAAAJA
GGNGNAJNTA
TATTAGATAT
CGGTc3GTNAA
CACCATGCCA
TTCTTGAATT
ATTGATAAGA
GAATATT
CAT
TCTTGGCTGC
ATCATTTTTT
TAAGGCCATC
GATCTTGAAG
CCCACNNGTA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 920 TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: TTGACCATAT TATTTTTATT CACGTTGGGA CAAAAGAGCA AACGCAAAGG ATAGGAAACG AAAGGAATTA ATTTCCTTTC AATAGAGATA TCGGTTTTTT TTAGAGGGAA AAAATTGAGT 120 ATTAGAAAAT PAAAATAGGT TTCGGAATTT CCGGAAAGAC CACTAAATTG TAGGTT 176 INFORMATION FOR SEQ ID NO:33: (i) SEQUENCE CHARACTERISTICS: LENGTH: 336 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: AAAAGGGNTN CCGAAN4AAAA ANAATTNGGA. TCTTNTGGGG GCCCNGAGGN AAALrAAAANA NTAANCNGGG GGNGACCCAG NGAANAGACA AATTNTTTTN CCNGGAGTCC TTGGGGTGNN 120 ANGCCA.AACN GNCGTTTAN4N GNAAENNNGNC GNGNTACCNC TTCGGAGNGG GGGCGCTGN 180 AAAGAATNGT GAGAATN-CN4G TTACNNGTGT TGNTTNATCN GAGATAGTNG TNTGTAACPA 240 CCCCGATTCA GCCNGAAAGT TACGCATATG CGNANCGTTG TGTGAATCGA ACCTGGNNAA 300 AACAGACCCA TNGNCAAGNG GCAGACCNAA CGGAAC 336 INFORMATION FOR SEQ ID NO:34: SEQUENCE CHARACTERISTICS: LENGTH: 92 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: TGAATAAGGG TACAJAAGATT GTGTTTCAGA GGAGAGAGGT AACAAGAA.AA GACTCCTAAC GCAATGGCCA GAGGGCCAAG AAAAAGGGAA A 9 INFORMATION FOR SEQ ID SEQUENCE CHARACTERISTICS: LENGTH: 838 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID GGNGTNATTT TCTTCTNGTG AANTCTTTNC CAAATCCGNG GGTNTGNCCC ANNGCCCCNN
TTTATACACN
ANATATGNGA
NGACCCTCTN
AAAATATNNC
GCACAGNGCC
ATNNACANAC
TTTTTNCCCC
CCCNTTTNNA
TTTAAAAAAN
GTNTNCCTTN
CNCCCNCTGT
CCTCTTNTCC
NNATTACNCN
CTCAGTTTGA
NGGNGNTTTA
NNTTCGCSGGG
CTGTGTTNTN
CCCNADNAAAT
TCAGAAATNT
GNCGCCCCCT
AACANTTTTT
CC.ATAT NC CC
TTCCCCCTTT
N4GAGATTTT T
TNNNCCAAAA
GTNTCCCCAN
TTTATATATN
GN GGGAGATT
TCCCCCTCNC
AT NCC CCTTN
TTNTAATNTG
N NAAA CCC CC
GTTNGGGCTN
CCTNTTT GAG
TNNAAAAACN
TCCT CNTNNT
CNCTATATGT
NTTGGNGflG N GN CCC NATA
TCTCTCTGNN
CGAAAANAAT
TCTACCNCCC
GGNNAAAAAA
NCTNTTNANA
GGGTNTNCCA
ACNTTTAAA-N
TCNGGCCCCT
NNCTAATTCC
14NNCTNAATT NTCGAINAT GY
GGGTATNTGG
TAACNCAGAG
GTAGNGCNCT
TTTNTNCAPA
CTCAAAN4ACA
ATCTNNGNTG
GANAAATATG;
NCCCTTCACT
AACCCTCTCC
TNGCCCCCCT
NTTNTTCNAN
NTNGGGNAGG
CCCATNTTAA
GTAANACAN
ATCTGTGTAA
CNNCTGAGAN
AANANANAAT
CCNCNNT±
GNNTTNTCCC
TANACTCNTA
CTCTTTGTGG
CTAATTCCTC
TTTCTNACTC
TCTANATNNC
TTCCAACC
NNTGTTNCNA NCGCANGNTN NCCCCNCCTT INFORMATION FOR SEQ ID NO:36: SEQUENCE CHARACTERISTICS: LENGTH: 314 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: CAAACCAGAA ATGGCCCAAG GGTCATCTCC CCACTCAGTA TGAATAACAT CTAACCTCCA CAAPAAACCCC AAAAAAAAAC ACCCCAGATG TGAGAACAGC AGAAGCGCCC TATAACAAGA AAAGAGAACA TGTGATGTGG CCCTGTGCTA AGACAATATA AACTCTTCTA TAGAGGGGAG AGGACTGTGG TTTTATAAGA GAGTGTAACC GTGGGGGGGA GAGTAATCAT TTTTATATAG 120 180 240 AGAGAAAGAG ACCTGTGAAA ACTACCTCTG AGAAGAGCAC CATGGTGTTC TCTCCCATCT ACTAGAAGGG GAGG INFORMATION FOR SEQ ID NO:37: SEQUENCE CHARACTERISTICS: LENGTH: 226 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: AGGGGGGGAA ACCCCTTCGC CNCGGGCCTA TCGNAANTTT TNNTCCACCG TAAAANATTT NC CANGNGCN C CATGTANGG ATTGN GGGNG TAGTGGGGGG AAC GATTNT G GAGGGGCCTA AAAGGNANAT AGAGGACGTA TTGTATTTGG TTTTGCNGAG CCAGTACCTT NGAAAAAGGT TGGTATTTTT GATCCGGCAA CAACCACNGT GGTAGNGTGT TTTTTT INFORMATION FOR SEQ ID NO:38: SEQUENCE CHARACTERISTICS: LENGTH: 843 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: GAATTAAAAC GGGAPAGATT GGAATTCAAT TTCTTACAGC CAAAAGCTAG ACCGGGCATA
TAGGAGATTA
ATTATTTAA
CAGTCCTAGG
TTGATCAGGC
TCACTGAAAG
GGCCACAGAC
ACAGGAAACA
AACCAGCCTG
AGCACACT CC
TCTATCTGCT
TTTC GATTTA
AGCAGGTTCC
TTTCAGAAGA
CAGCAATCAT
GAGATTATTC
CCAGGGTAAG
CTGTCCAGGC
AGTTAATGAG
GGGCCATATC
TTGTTATCAC
GCACCTTCCA
GGGAAGTTCC
GTTCAAACAC
ACAACAGT GT
TAGGTTTGGA
CCCTGTAGCC
AGGACTGGCA
AAAAATTAAT
TCAACTAGGT
AGATATGTTT
AAGC CTGCCC
AAGATAGGCC
GGGTCTTCAG
TTGTTGTAGT
GATACAAAAT
AGC-ACTAGCA
AGCCATAAAG
GGGACGTCTG
GTCCTCCAGC
G.2ATGAGC CA
CAGATTTAAA
TAGAGGTAAT
GAkAGACGG
ATTACCTTTT
TAAAAGAATA
GGCCATAAAG
ATAAGGAAAA
GCAGGAAGAC
CCCTGACTTA
ATTGTATGTA
GTTTAGGGGT
GGTAT GCAAG
AAAGTGTAGA
CTAAT GGTTG
AACCCCAAAA
AAAAAGGAGC
GGARTGCAGG
ATCTCCCCCTA
TAGCACGTAC
ACCACGCCAA
AACCCCCTAG CTTTGTCTAT ATAACCGTCT GACTTTTGAG TTTCGTGTTC AACTCCTCTG TATCTTGGGT GAGACACGTG TTGGCCCGGA GCTT1CGTTAT TATTAAACGA CCTCTTGCTA TTACATCATG ACCAGTCTGG TCCTGTTGTA. AGACATTGGC AAAAGAGCCT GAAAACTAGA
AAA
INFORMATION FOR SEQ ID NO:39: SEQUENCE CHARACTERISTICS: LENGTH: 943 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: TTTTTTTTTT GGAAAAACGG GT'ITAATAAG GGGNANGN'JAT CCGAACCCCC: ACTCGGGNGA 720 780 840 843 AAGGAAAN4AA
CTAGTCCCAT
CCACCCCATT
AN\ATGTCCGT
AAGTTTGTGT
AGACCNhTGT
ATGGGTCTGC
GCATTNT GTC
GGGAAGTTAG
NTTTCAGNTT
CCCGGAGCAG
CNTTTGCGGT
ACTGCC.AACC
TTTGACTCTT
TACTTGTGAT
AANAATANGG
TNTTCGGGGG
TTTTCGGGGG
TTNATCCAAG
TCNNGAGNAG
TTGGGAGGGA
AGACAC-TNTG
CAAAGGAGGA
ACACACCCGG
TNACCCAATT
CTGACAT GGG
TAAAACCAGG
CTGAT GCCNT
CAAAAGTTGT
AGTCCCACAA
GGGGAANAAN1
GGGAAAGGGA
TAAGTCNGTT
GNGTTTTGGG
TTTGTAATT G
GATCCAATTT
AAGTNTAT GA
AA!PCCAGCAG
CCAGTTGCAG
TACTTTCGTT
AGGCTTTGAA
AGCNTGGGCC
CAGTTTAGTN
CCTGTAGCAG
GGANCTTNGC
GANTTGGN'GG
NGGCATGAAT
TTTTTITTGGT
TGTTNNAATT
GTTCAGCNGG
TNTAGTTCCC
GTT1GGTCCCT
C-AGACTAGA
CC TTT CCAC C
GGCTTAGCAT
ACTTCCATTA
AATGAGATGG
TT GGAATTCA C-GCAzGTrGGTG
TAATGCTTTA
AATGGGGTGA
ANATCAAAGT
AGNATTTNNG
TTTTTTTGTG
ATTTGGCTGT
TCTCNTATCA
TTTCAGTNTC
CCCAA-NGAGT
GCAGANTCTT
TCATAGAATG
NTCANTGAGC
CAGGGTAGAA
CCAC GACAAA
AGGCNGGCAC
TCCTTTCGGA
NGAGTTTCAA
NCAGGAAAGC
TT CCTTAGTLA
GCCCGGGGTG
CTTTNTAACA
GAACCCTGCC
TGGCTCCATG
GCAGGCAGGT
AAGGCGCTT
GTTGAAAA.CC
120 180 240 300 360 420 480 540 600 660 720 780 840 900 GTGCAIACNT TTAA.TTGNN G AAGTAAGAAG TCG INFORMATION FOR SEQ ID NO:40: SEQUENCE CHARACTERISTICS: LENGTH: 904 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID
ACTTCTCTAC
CTAAGACTTG
CTTrCTCAGG
AGCTACACAC
ACATGGCTGC
AATATTAATA
TTTACACACT
TGTTTTAATA
TTATTGTTTT
TTTTTAAAC C
AATATCTCCC
TCCAGAATCC
GTTTTTAAAT
CC CAACAT C
TGATTGAGGT
TTGCCATGGT CCTTGTGGAA TCTTTCAATC TGTGTCCTTA GAACGCTAAG
ACCTTGGCTC
CAGGT GTTTT
CCCTCCAAGC
CTCAGCACCT
AATGAGACTT
CTGTGGAGCT
TTTTATAT GT
GTTC-AAAPAA
TCTCTCCTGC
TTTTACCCTG
TAATTTTATT
CACCTGrCCTT
CTCCCCCCCT
GGTCCCCCCT
CCAGGGCGGG
CGTTTAAGAA
CAATCTGGAG
CCCTAAATGA
AACCT GATGG
GTTACAAGGT
TTGTCTTTTA
AAAAGTTTTA
CAAAGGAAAC
GAAACATTAA
TTT TAT TAAA CAAAACC CCC
AATAACACCT
CTTTTTTGCC
CTGGGACTTG
AATAAAC CAT
TGGCTCTGCC
AGGGAACAGA
CTCAAGGCTC
CAGTCAGTCA
AAAAACCTAA
CACAATGATC
CAAGCAAACT
AAATAAGGAT
AAAAAATAAA
CTGGAAATTT
GATTGATACC
GTTTGATTTC
GCCACCCCGT
CCAAGTCCGG
CAACCCCCAC
GTGTCTCCTG
TCAGGGGGCT
TTTGCATGGG
GAT CTATATC
AAAAAGTTCA
TTTTCCAGAA
CCCTGAATTA
ACCCCCTT.AA
TTAAAATTTT
CACCAATTTT
CCCCGTTAAA
GAAAAGGGCT
GCAGACTGAG
TGCTGGGAAA
TGGCCTTGAA
TTTTTTTGTT
ACAGACAATC
TTTTTACATT
AAT.GAAGTCT
ACCTGATAAG
AAAATTCTAT
CTGACGGGCG
TTTTTTGTTC
C CACTGTGGG
AAATTTAGAA
120 180 240 300 360 420 480 540 600 660 720 780 840 900 904
A.AAG
INFORMATION FOR SEQ ID NO:41:
SEQUENCE CHARACTERISTICS: LENGTH: 917 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: AAGGGGGGNG AAATTTAGNG GACNAAAATT ATTCCTTAAG GGCCNCCTTT CTTCAGGGAA NANGGGGGaDA GGAGATANTN CGGC CCTT OT CC GCCTTTTN GGANACGATA GGGNC GGTTC GGNTTGGAAA TTTTTCCTCC AAAATTNCCA ACAAAAATNG TTTTTCCCCT TCCTTCAAAA AGAAAATTGG TTTTT"TTGNN GGCTTNGGGG NGTCNGGAAG TCANAACCCN GNGTATTATT GCNTTCCAGC CCCACCCGTN AGTTCATTGG TAATTCCTAT TCGTTCGGNT CAANATAATT CGGNACTTCC GCTTCCNAAT GGATCCCTTC TCCcCNGGTT
CCCACCCCCA
CAGTGACNTA
TGTGTTANAA
CNNGCNTTCG
AC CCC CTCCN
CANNTNGTGT
AGACAGTC GG TccA7AAAGCT
NTCCAATCCG
NGACCACCNT
GATCCTTNTT
AAAACAN'NA
GGCGGGCNGT
'CCACGCCTT
CAATTCCNGA
CCNATCTC CA
CGCTGTCCTC
GAGCGCNTCG
TGGTTNTTTA
CGGTCTTTCC
NAANAANCT C
NTCTGCCTTC
CNTCCAGNTT
CCGCGGCGGG
TAGGCCGTTC
TTTCCGGGNC
AANGATTNGG
GATATTTCC G
GGTGGGTCTT
GGCTCATTTT
CGCCTCGCCC
TCCACGTGAC
CAGCTTNTGT
GGCCGGGCAG
CCTATNCTN C T'rCCATTNNG
GNTNTCCGTG
TGATCCGCTT
AGTCTC GAGT
TTCCGNTTCG
GNTTNTTCGG
GCTCGTCCCG
I4TGGGGNATN
CCTGATTTTT
GNGTNTrCCAN TTTTTCCGGA TTATCGCAAG
CNTTTCTAGC
CACGTTGCTT
TATTCTCAGC
GTTCTTTCCG
CNTCCCAGTN
GNTGTGCCGC
TAGGGCGGGC
TTAAACCATT
AAGGAAGNAA
420 480 540 600 660 720 780 840 900 917
GNCNAGTAAA
GGANCTC
INFORMATION FOR SEQ ID NO:42: (i) SEQUENCE CHARACTERISTICS: LENGTH: 835 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: GGNCCCCTAN NGATTGGCCN4 TTGATCAAGA NGGGACCATC CTGNACCTGG
NGGTNGNTGT
TTCCGCTTGG
NTPNTTTTTT
GATACCTCAG
TCCGAfl.CTTT
TACCAAGCTG
GCTTCTGTAA
CACACACACA
CAAAC.L~kTAA CTTTGIkTGA
AAAAGTGACA
TGATCTAATC
ATAGCTCATC
GACGGAGATG
TGTTNTGGTC
NTCCTT GTGA
TCAATTCCCN
GTTGGTAACC
CTTGTCCCCT
CACACACACA
AAGAAAAPAA
CAGCAAGATA
ATCCTTACTC
AGTTTTATTT
TGAGGATGAG
GTTGTTTTTG
CAGACCGTTT
ACCCAGGGTG
GACTAACCAT
TGAGTTCAGT
AACTACCCCC
CACACAGAGA
TAAAATCTCA
AAGTAAAC CA
CAGCCCTTCC
GAGGCAGGGG
TTTGAAC CTC
CGG.AGTAGTT
TGATTTAGCC
CAGNTGGTTC
TGATGTCAAG
CCCTGGAAC C
AATACACGCA
GAGAGAGAGA
TTTA.ATTTTC
AAGCACACTG
TGCTATGTTG
CTCATGTAGC
TGACCCTCCT
TCNGNGGGTT
GCNGCNGACA
AGCAGGATAG
TTGAGTGTTT
CACATGGGGA
TGCGCGCGCG
GAGAGAGAGA
ATTAGTATAA
TAGAAGGGAT
GCAGTCTTGC
CCAGGAGGAT
CATT CT CCAG
TGAGGCGCGG
GTA.ATGGGGC
ATGTACAGCC
AA.ATGCTTGC
GAGAGAACAT
CGCGCACACA
GAGAGAAGCA
TACCTTGATT
TACGCAACTG
TGGGAGCCAT
GGTCAAATCC
TTCTCCATAT
CCTGAGTGCT GGCACTGAAA GACNCCACNA GTAGCCTTGG CAGGCTAGAA ANGNT INFORMATION FOR SEQ ID NO:43: SEQUENCE CHARACTERISTICS: LENGTH: 924 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: GTNTTTTNGC C GNGGGAATT TAAGGGNGAT TTGGAGACTT TNGAATTTTC GAAJLNGTTC CA
AAATAGANNT
TAGATTTNTA
CTTGTTTATG
AGANGAGGAT
AC.ATAATTGG
GAAGATGGAG
ATAGCAAMCAT
ATGTANTCTT
GTCAAAT GAG ACAATC CTAT
GTGTGTGTGT
TGTGTGTATG
TCATTGTAAG
CTTCAGCTGT
ACAGGAGCTT
TNAGGNCAAT
TGGAAACCCT
GGAA1GGTGN
TGGGGAATAG
AACATTAAGG
AGTATTGTAT
TAAAAAAGAT
TGTTTATCAG
ACNGCATAGA
CCCAAATGTT
TTGTGTGTGT
TCTATTGCAT
ATATTGTGCT
TAAkAGGCTAG
AGCAAGNTAA
GGGNTTGGGG
GGGGGTTCCA
GATAGCAGCC
AACAATGAGA
AAATATATCC
TTCAGATAGA
AGTAAT CTAA
GTTTTACTTC
TCCCCAGAGA
TGCGTAGACT
GTGTGTCCCG
TAGTAGAGAT
GTAT GTGATA
ACTCACTACC
TAGO
CAGNGGNGCT
GTTTAATCCC
NGAAACAGAG
GTCTTGGTAA
ATGCATTCTG
GATANGACTA
TTTCACATAA
TCAGAAATTG
ACAGAGAGAC
CAAGCTCGTA
CACATGCTTG
GTTAAGGTTG
AGAATCAATG
AAAAATAGNG
TTTTTAAATC
TTCATCATCT
GTTTTTATTA
TATTNTTCNG
TACTT GCAAA
TACCTGTTAT
CCATTACTAC
CAGCATCTCC
TGGGAAATCA
TCAGCTCATA
AGTAT GCATG
AATGTATTTT
TAACAAGGCT
CNATCAGT GT
AN*ANAAGT.AT
TGAAATATNA.
TTACTGTTAG
GAAACAACNG
TTGCTCCAAG
TTTTTTCATT
TAAAGTATAT
TACAGAGC CT
TTGAAATTAC
AGATCAGTGT
TGTGCATGCA
CTGCTCATGG
GGAGAGAT GA
GAANTTCCCC
INFORMATION FOR SEQ ID NO:44: (i) SEQUENCE CHARACTERISTICS: LENGTH: 435 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: 6 GATTCCAGAG AGAGGAGTGA ACTGGCAGAT AP.G5 TGTGCTTTCG CTCACTATGC ACCCATGACA
CAAGJ
GCAGAGTATA CACTGGTTGG GTAAATGAAG
AGGA(
GGATAT GGAC TTCAAATTTG ATGAACAAGSC
AATT
TATGAAGACC C GTTTGCAAPA GCAGTGGTC.A
TAAG
GAGAGAGAGA GAGAGAG1NAA GAGAGAGAGN
GTGT
TTGGTTNATA ACAANATNTA CCTTTGGGCN
CTTTJ
NCAAGCTAGA
AAGGT
INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 919 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) 3
CAGTCA
nLTCACA
GAGACA
CAAATG
rkGAGAA
GTTGTT
NGAAAG
GCATAATGGC
GGGTACAGGC
GAGTGGGAAG
AGTATCGTGG
AAGAGAGAGA
GTTGTTGTTG
ACT1'ZTNCACA
TTAGATACC-A
CTGGACCATG
TCGGCTTAGT
GCTTGANTGG
GAGAGAGAGA
TTGTTGTTTA
AAGGAGCTTG
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: CCCCNGTTAC CCNGA.NGTTT ACNNGTTGGA TTAALANGGGN NNNAAAACGG GTGGGGNNAA
ACGAATTTTT
GAAA~NGGAAA
NTTNGTTCAG
NCCNGGAANG
TTGGGGNGAT
GTTTTCCTTC
NGTGAAACAA
TAGATTGAAC
NATAANNTN
TAAAGCTTAT
CCNTCGGGGA
AGP.ACCATCC
GGAGCTAGAG
TTAGCAGAGG
TTGGCAAGTT
TGTNCNCGAC
GGAAATAAAA
TAGGGTTCGG
TACCTTGGGN
TTTNNGCCCC
CAGAGAGAGG
ACCAGGNTNT
CTGCAGAGTT
TGNTGACCAT
TTCAGTNTCA
GAATGTGGGA
AGGGAACCTG
CT CCAA'TAG
TTGTNTTGAC
AATGAAGTC
CCNTCCCCGG
ANATT'1TTTTT
GCCCGGGAGG
AGGGA'ITACC
ACCTGGACCA
GTTAGGTTC
C
GAAGAGACCA
GC CTGTTACC
NTCAGCAAGT
CCCGCTGGGG
GGTGGCGATG
TGCGTTTGAAL
GAGCTGTGAT
C AC CCA G14CT
TTGGGGNTGG
TNAAGGAAGT
NAAGGCAAN-4
NTGNAATTTN
NTTTNGGGAA
TTCAGGGGNT
GNCGGGGGGG
TGAAGTTGTC
GTCACCTTCG
AGANACATTC
TGGGAGGGAT
GGTNTGAGTT
CAGGCTGTGT
ATTGAATTGN
NGAAATAAGT
TCCTTNCCAC
TTGAANTNCA
TTTAAGAAAA
ANGCAGAAAC
TCCAAGGACG
GGGGAGGGGG
ACCNTTTNAC
TTGCCAGGAC
AGGGCATGGG
TCGAGAGAAG
ACACACAGGC
GTGTGTGCTG
G(INTNNTCC C
TTTAAGGTGG
AAAAAA'JTNG
NTTAAJAAATT
NNTGGGTNTT
GTTCCAGNGN
GGGACCAA
CCGTTNTAGA
C14ACANACTT
ACAAGTTTCT
CGTCCCCCAG
AGAATGCTTA
TGCT CAGGAA GAAGGGC GAG
AAANGGANNT
INFORMATION FOR SEQ ID NO:46: SEQUENCE CHARACTERISTICS: LENGTH: 915 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: TTTTTTGGAA TNTTGGAACC NCGNTTTGGA AGAAGACCTT TNNNNTNCAA TTGGGGAANA ATAACC GGGG
GNNGGNAGGG
TTGTTThAAA
TTATTTTCCT
ATCAC CACGG ATTTC CAT
CATACC-TGCN
GATTC CTGAA
GTCAGTAGAC
AGGGTAGACA
ATTTTCCTGG
GTCCTAGCTG
CACATGGCTG
CCCATCTGGG
C CAAACC TT G G14GGAGGTTA AGAGGNTTG C
TTTAACNTTT
NGTTTAAAAN
AA-ACCTCN GT
TAGTTNTGGC
GGAGTGAAGG
A'NAAANAGC
GCTGACAGGC
CAGGAGTGGA
GGTGCTCCTC
CCTCAGNTCA
GACCCNTACA
GGAAGGGGGG
NTATNNCGGT
NGGGCNTGNT
GAAGGTGAAG
GTNTTTTTAT
GAAACCTTCT
CTTCCCTTTC
TTTGGGA.AAG
CGNAGGGCAG
CCGCCCACTT
AGAAGTTGGT
AGTTACATCT
AACC GGAAAC
AATTTANGGN
AAAi-ANATTC C
TGNGGAAGTT
CCCTTCAACC
CCGGGTTATT
TTCGNTTT1NA
TTGATCCTGC
CTTNTCGTCC
GGGGAGGGAC
CCCGGGGTGA
TGGCTCCTGC
ATCGAGTCTT
CCAAGTGTCT
CCAAGAGGCG
TTGTACTNAN
NGGGGGGAGG
TGGAATTGTC
ANGAGGTGGG
TNTTTGTC CT
TGGAGGNGAG
CTNGTGTTTC
TTCTTCCATT
AGAGTGTCCA
AACCACAAGG
NTTC GCTGTC
TGAGCCCTGA
CTCAGGGGTT
GAAACATGCT
GGATTNCCAC
TAATTTNTTG
CNAANGGATT
GCCNTTGCAT
TCGTACATTT
TTAAAT14TCN
CTGAGTGNGA
CCCTTCCGAA
GGGCTTGCGT
CAGAGGCCCC
TCACCCCAGA
CTCATTNTCT
CAGTGTTAGC
T CAT TTAJATT
AANGXNAAAG
GCNAGNTAGA NAGGT INFORMATION FOR SEQ ID NO:47: SEQUENCE CHARACTERISTICS: LENGTH: 849 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: GTTPLANANG A-DAAAGN4GGG GGTGACAGGG GGNGANACCC NTTGCGCCGG GCTATGGATT
NTNGGCACCG
GAGGGGCCAA
GTGTTNGATC
AAAAGATNTC
ACAATTCACA
GGTTTTCTCA
TCAAAGGGA
TAAGTAATCT
GAATCACAAT
CGCGTCCTTN
CCTCCTTNTA
AGGCCTTAAA
AAACCATGTC
AATATTAAC
ANAAGATTTN
AAGGANAAGG
GGN'APACAAC
AGGAGATCTT
AGATTTGTTC
ANAAAT GGH
NTTTGAAGGA
CCCGGANTAC
TTCCTAACCA
GTGATGGTTT
TCCCTCCCTT
CTTGTGATCC
CAGNNACTTC
CAGGNGACAN
AGGAN GATTG
CACGNGNAGN
GATTTTTTTC
ACAGGGAGNT
TCAGT CAGGT
GTGCTTTGTC
TGNNGANGCG
TANGANTNTT
CAAAGTCNGG
CCTTTTTTCC
TCCTGTCTCA
CTCCTAATCC
GGAAGGTGGN
ATTGGTTNGG
GNGTTTTTGT
GGGTCGAGCT
CNAGGAGGTG
GNTTGCCTAG
CTGTGGAGC.A
TTCCCAGAGA
GTTAATCTCA
AATATNTTTT
TTTCACAGGA
GCCTCCTAGG
CATCTTCAG-A
NGGOGANGG
GAGCAGTACT
7 GCAGCAGAG
ANGTTGGGGG
GTCCCANTAG
ATCTTTCATT
AT'?GACTCAA
GGTCCCCCGT
CCACATAAAC
CCTCCATCCC
TCTCANNATG
TOT TAAGATG TAT CCTTTA.A
GGAAAGTTTN
TGGAAAGAGT
ANRAAGNGAGA
ATGNGAGGGN
CCGGTAGGGG
AGTTCCTC CC
TCAATAAACN
AGTNACCAGT
CCACAATTCT
TCCTTTCCTT
CAGCCCAGTC
ACCCAAATGT
GACCAAATTA
120 180 240 300 360 420 480 540 600 660 720 780 840 849 INFORMATION FOR SEQ ID NO:48: SEQUENCE CHARACTERISTICS: LENGTH: 925 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: AAAAAAANAA ATNTTGGNGG ACCNAANACC ACC.AATGGGT TTTGGGGTCC GANCGNNCAA
ACNTGNTTTC
GTTCCAAACC
TNAGTATCTA
AGNTGTNTAA
GN7AAAAANNA
NNATTAATAA
CACACTGTGA
GTTCCGTTTT
NGAGATGNGG
ANT GTTNTT1C
CNATGTTTTC
GTTAAGAGGG
TNAACCCGTG
ATGTGGTANG
GCAGNCAATT
CAGTGGTTTA
GTTCGGNTGA
NTTGAGCNTT
TGGNTTTNTT
GCNCAATTTA
GCCATTTNGA
AAGGGGCTGA
GAGACAGTAG
GAGGAGGTTA
TGTNACANNA
AGGTAGSNCA
CAGACCNAGN
TGNNTAAACT
GGCGGGGNGG
GATTGACACC
GGGGN GTT GG
NNTANTCGGA
TCCACGACAG
TNTCGGGAGN
ANACTGGCAN
TNCA14GGNNN
TGC-GGTTTTA
GGAATCCNTT
TGAGTTAAAC
TTANGATNCT
NCAANTNCGC
NGANAGGTGC
GATGGNGCCA
AGGTGTTNGG
NGGACNANGG
AGGGTTNAAG
T GGC-GANGTT
TTCNGAACNN
CAATNNTAGG
ATCGGCCNTT
AGACCCCACG
CAC CNACTGA
GGGCNAGACG
TCCCCNGNGC
120 180 240 300 360 420 480 540 600 66 CNTTCTAGCC TNGAGCAGNT TCNAGAGAAN TATTCGNCGG GTATAGGTCG CCCCNANGAC GCNAAACGAC CGNGAGCGAG GGCGGAACAG CCAATCAGTT CGANTTATCG TGTNTGTTNG CGGGGTTTGA TCCCNGAGTT AGNTCAATGA GCCCANAACC CTGAGTGGAG GNACCGTCAT GGGAGGAGAG GNGAGTCACC NGGTACCTGG CATACNGATG GACCATCCAG TANTTGGATN GGAGGGCGAT ATNGTNANTC TTAGGGGNTC TCCTGAGGAG GGNATACCCG TGAGT'rCCGT AAGGGCGTTN GCAAGTAAINA AGTCG INFORMATION FOR SEQ ID NO:49: SEQUENCE CHARACTERISTICS: LENGTH: 827 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: GCCAGTTGCC CTCAGATGNC CNATACCCCA CNGGGGGNGT CTCNCCCCTC TCTCAANTGT
C
ACACACACTT
TGTGTGT GCA
GCCCTAAAAT
G14CCCNNATA
TTTNTAANAC
ATATACACNA
GCGCAANGAG
ATCT CGGGAC
TTGNGAAAAC
NANATNTNTC
TCTCCCCNTN
NGGANAATAT
CCCCATAGAC
CNTGTGNGTG
GTTNTNTGTT
CCCNGACANN
AT CTCTCCCC
GTGTGTNAGA
AGNGCAGNGT
TCNNCCTCAG
TCANNTGTGT
TCTCAAAACA
ACATCTCTCG
TNCCCCCCTG
ACNGGGGACC
TGTGTGNTGC
CNCCACTNGG
GAATGTGTGN
NNNATATCTN
CACAC CCC CA
GCTTACTCCT
CNCATTCTCT
ATAGTGCTCT
TGTGCATGNG
NGNNAANLANA
NGACCANTCC
ATAGCTCTAG
CCCAAACACA
NCCTCATNTN
NTNCCCATNN
TTNTTTNNTN
CACCCCAAAT
CGCCCCCTCT
ATCTCCCANA
GNGTGTNACC
CGTGTAACAC
AATATATCCC
CTCCCCGGAG
GGGGAAAACA
GGGGTNTCTC
NACATACCCC
GCGCTNTCAC
N GGGTCTCA
GNGCGGGGGG
AGAAAACTCA
AANACACAGA
CCNAGNCCAC
TCNCCATCTC
CTCNNTTANC
ACCNANCCCC
AAAT N TTAT N
TTCCCCAGNG
CCNNGNCTCN
CACCACAGNT
TGGAGACNAC
AGGGCTCTTA
CACTNTTNAG
GNNACCCTNT
ACCCCC.ATAA
TCGGGCNNGC
C CC CGTGTCC
CCCGTGGANA.
120
180
240 300 360 420 480 540 600 660 720 78 CCCCCCCCNG GNATCAACCC CCCCGGGTAN ACAACCCCCG GAACCCC INFORMATION FOR SEQ ID NO:50: SEQUENCE CHARACTERISTICS: LENGTH: 899 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:50: AAA.AATT GTA AGGAGTTGGG GGNATCCCCC ATAATTNAAA NAGGC-AACAA NCCNTAAAGG
GAGGGNNGGG
GGAATTTGTT
AGGAAGT SAT
TGAATTTGAG
GAGGAGCAAG
ACAGATGAGA
ATNTGGAAAA
TACGTTCAGC
GTCTTCAATG
GATTCGCTCA
GTT GGATTNT
GCAGGGGGGG
TCCCGGGGCT
AANGGCCAAN
ANAATTTTNN
ACCNGGGTTA
TGAGT GCAGG
GGNTGGGCAG
ACGTTATTGG
GTTCTGGNTT
CCC CCCACCC
GACCTGCCAA
GTAGTTAAGC
NTGAGGCATG
C GGG C GGAC C
GAGTTGGAAT
ATTGGNTTAA
TAATGGAAAT
TCAAGTNAAA
T GAAGTGAGA
TGTAGGTGGT
AGGACAGGCA
CAGGCTTGAT
TTACGGAAGT
T CAGAAAGCA
AGTCTTAACT
TTCAGGCAAG
GGCAGGGGAC
CCGCGGCTAC
AAANAGTA.N
NGGGCACTTC
CNTGATTCTT
CTTGGGAGNA
GNGGTGGTCC
CPAGTGTTAC
GC TTGGCC C
TNTCGTCACT
AGSCSGCTT
GGTTNTSGCT
CTCCAAAGTT
TGAGCASTGS
CC GTGAGST C
GGGC-TGCAAC
AATTGGSANS
CSNGNNGAGG
CAGGTCATC
TTCCTGSGGT
TGAAATGCAA
SCAACTGTGN
GAGANTAGTG
TTCCGGGTGC
SCTGTGCTCT
SCGACATGGT
GAGCTGGT ST
TTAGCCACTC
TCTTGAAAGA
ATAAAACCCC
GAAAGGATAT
CCACCCAAG
GSCGGGGAG
ATCCCTGTAG
ACTITTCC CT S
GCTAATCAGA
NTAGGTGTAS
CTGTCCTGCC
G.AGCACAGGG
GGTGGGTCTT
ACTAGACCCA
CCCCAC GAG TTTGSTTSAT CCANACACAA 120 180 240 300 360 420 480 540 600 660 720 780 840 899 GCGGCAGTTT CTGAATAACT TTCCTTSTAG INFORMATION FOR SEQ ID NO:51: SEQUENCE CHARACTERISTICS: LENGTH: 952 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: AAAACATTCG CNAGACTTGT AATAATTNCC NGTTNGGGGA AkANASNGGN GGGCNSGGGA NCCGAGSTTC CCCCCAAATT TCTTANNAAT TGAGSCANAT AACCCAN4NGN TCNNNAAGSN GGGGTTTTTC. CCNTTNGCCC CCTTSSGGNT ACCNTNAGTT AACSSSGANA ACCCGCCNTG TCCTNNGGGA GSQGGGTTCC NCSTNGTGGG TTTCAGTTCG GACCAGGTCG TTNACTCGAA AACNGCTCCG CCGCTNGSCN GNCTCTTCAN NSCTAACGNG GTAAGTATTT TCATGTGTCC
NTGNGCTTCG
TNANGGGGGG
TNACAANTTG
CTNCGGAGTT
CNGTATNCAC
GAACGT GTTA 120 180 240 300 360 CACTCCAAGT ATCCCCATCT GCANCAACCN CCCCTTACCN CCACCNTCTN CACGNCTCCA ACCNGCNANC NCAAGATNCC TCCNCACTGC NNCATCACAC, GGACNCACCT ACGCNGCCAA CCCGNANACC GACANNCNAC NCCCNCNCTN CTTTGCTATC NACAGTCGCA AACCNTCTCC CGACCNTCNG ATCATGTCNT CCGGTCGNTG TCCACTTCNC CACCCGGCAG NTAACCCACC CAGATCGACA GATNTGTTAG GCGCTCTCT GACGTTNACC ATAACANTNT CACACACAAT TTCACTCACG CTCAAAGACC TACCTGAALAT
CG
INFORMATION FOR SEQ ID NO:52: SEQUENCE CHARACTERISTICS: LENGTH: 967 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic)
ACACCCACAC
TCCACACTGC
ACAGACTTCN
TCNGCTACNA
ACATAACCNC
AGACAATTCN
NC GANAG CAN
CCANTTGTAA
CCTCATCNGN
CACNACCCAN
TGNATTGCAT
CCANCCZAGG
TCCTCACTAT
GATNTNTCCG
NNGAGNCC
NTCNC CAAC
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52: CCCNGCCCTT AAAANACATT ANGCCTTTTC CCNCCGGAAN CCCCNNCCNC AAAN CCT TC C
CGCCTTCGTA
CAAATNTTAG
NAGGCGTNT
CGAAAAACAT
NCNNAAAATT
TTNTCCAATT
NGCAACATTC
NTCAGAAACT
NCNCACATTC
ATAAATNNAA
NCCCCTGGAT
CCNTACCTGC
GCACAACCCG
TCTCATCGGT
ATTTGTC CCC NGCC CANAAC
TAAACGCAAN
TTCACGANAC
TNANACNCAT
TTCTTCTTTN
TCAG GCTTCT
CGGCTCAATC
TCACTCTACA
ACATNCTTAN
CACTCTTTAT
CACTACCAT
GCACATATCA
CATAAAGTCT
AACAA.AAATT
NAAAAACAN
AACACCCCCC
NTCAATTTTT
TATTGCTCCN
TNT CCATTT C
TCATTCTCAC
ATGTC CAGAC
ACCATGTCAC
GCCAGCCTCT
TTCCATTATC
CTCCTAGTTC
AAACTATCCT
AC CATAAAGA CC CGCCCCN AAT TCT TTN C
GCGCGNTTTT
TNCCGTCCAA
ACCTTTCTCC
CCCACCAGCA
TCTAACACCA
TTCCNCACTT
CCAATCCCCA
ATTTCTATCG
CCAT CTTTAA
TGACAATCAT
CNATCTAACA
CT CAT CAGAT
CAAAAAJNNAC
TTTTCCACNC
TNTTNCAACC
CTTCACTCCC
TTC CCNTCCC
CCGACTCACC
CNTCTTCNG
CCCACATAAC
TCAAATCTCT
ACAAACCACN
CACTAATCCT
CATTTTCT CC AAT CTTACTT
CTCACTAACC
C CCANTNCCC CACNNC CNAA
CGAAANC
GCATTCCCC
TNCCAAAATT
CACCTTNTC
TTCTNCCCNA
ACATAGTCT T
TCTCTCANCN
ACC CATATTT
CCTCTGCATT
GCATCAGT
CAAATGARCT
CAACTCTTTT
120 180 240 300 360 420 480 540 600 660 720 780 840 900 69 GGCCAGAACT CAATGAGGTN GTCCCATTTG ANTTACCCCA AAGGNJGCNTT AGCAAGTA.AA
AGGGNCG
INFORMATION FOR SEQ ID NO:53: SEQUENCE CHARACTERISTICS:
LENGTH: 700 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:53: GGNGTGCTGG GATTATAGAT GCACTCCCCC AAATCCAGCT TTTTACCTGA TACCGGAGGA
AGGAACGGAA
GTCTGTCTGT
TTTATTATTT
CATCTTTTGN
GGGTCTGGGG
CTNTCAGNGN
CAAGAGCCCC
NANCTNNAGG
C GATTACTTT
NNTTCATAAN
GTCCNCCGGC
CTCAGCTTCC
ATGTATGACT
CTCAGGCAGC
AGAAGTCTGT
ATCTAGGGNA
NNAGCNNNNN
GAACC CCCNA T14CAAACCNT CN4CN NCNC TC TTGCACCGrGA
TGAGCTGGTG
NGGGTCTNTC
TGCAACAGAA
NATGCAGGGA
GCATNTCCTN
AANTT14CCNT
NCAACCTNG
TGCCACNCCC
I-CNCTTN4CC
AGCAGTTTCA
TTAT GGCTGT
TGGGGGTCTG
AACAAC14GGC
GATCTNGAGT
TCNGCGTCTT
CGAGCAGCCC
CNACAATTGG
TCGCNCNATG
CAT GGGNMGC CC CCTN CCCC
CCCACTGAGC
GCACCACCAT
TTAGNCAGTC
TGTAAATN GT
TTATNCAGAG
GGTTTGGGNG
AGGGATTTTN
GGNNTTTCCC
C CNANCC CCC
ACACTCCCTT
CATCTCCCTG
AC-CTGGCTTC
TGTTAAZCTAC
TTTGACAAAT
GAAAGGTGT
AANGANGGAT
GCTTTCAACG
CCNCCCCCCC
AA.AAC GT CGT
CNCCCNCNTN
120 180 240 300 360 420 480 540 600 660 700 TNTTAACNGG NGGCGCAAGN CCTTTCTTNC INFORMATION FOR SEQ ID NO:54: SEQUENCE CHARACTERISTICS:
LENGTH: 229 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:54: NCNACGAGAN GTC.AANGTGN AANCTGNCGA TGATNAAAAN AACCGANCTT AGGGTGC4CAA NGGGTTACCC AGGANGGGGN CAAAGCAAGN TCCAGGCCCA TNANGGACCT GCTGGTNCAT NGCCNGNAAA NACCTACTTA TCCTNGAANA GCCCGAAANG TCCGCTNNGA CCANNTAAGT NCANNNCAAN ANGNACCACN CCNTTAACAC CACCGTATGA NCCCNAANT INFORMATION FOR SEQ ID NO:55: SEQUENCE CHARACTERISTICS: LENGTH: 465 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: CCCCTTTCGN NGGCCTCAAT NANTN4ATTGN CTACCCNANA GTGGCGGTCT CAAATAAJANC AGCCTTCATG AAATACGATG GCGGGGGGAT TAGAGGNNTT GCTGAAGOGG CTTGCAACCC CATAAGAACA ACAATGCCAA CCACCCAGAG ATTAAAACAC TACTGAAAGA CTATACATGG ACTGACCCTG GNCTCCAACT CAGAGCAAGA GCCTNGTTGG NGCACCAGTG GAAGGGGAAG CCCTTGNTCC GGNCTCCCAG NCCAGGGGTA ATNTNGGGGG CGGNGGAGCA GTAAGGGAGG GGGCTACCCA TATNGNGTGG CGGAGGAGAT CGNNGCTNAT GGACAGGAAA GGAATNACAT TGGANATCT C NATAAAGNNN NCATTTCTTA TTCNA INFORMATION FOR SEQ ID NO:56: SEQUENCE CHARACTERISTICS: LENGTH: 564 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: TTGGGGCCGN TNAACTCTGN GTNNNAGTAT NCCCNANAGG GGGGGTCTCA CACCNCATNT GNGGGN4GCCC NTTCNCNACA ACACATTTTG TCNGGNGGTT CACANATTTT GAGAGTC1NCC NGANAGGGGA GAGAGACNCA CACNAGTCTC GTTCGCGAGN GNACNCTTCT CTNCACATCT ANAGTATANC CCAGNGTCAC GGGGGGTNGT GTCAGNNACA GNGTTTCCCC CNCCNGTNTT TCCCCCTNCC GGGNAGACAA NGTNNTAGAG AGAACAGGGG TTATCCACAC ATCNCACTGN AGGANNANAN TTGTGCTNAG AGCCCCTGCN CTTCTGGTGG TANCTCTGGG TCTNCTCTGG GTCCCCCCCG GGGGGGTGTN NCCCTCNCCG GGAGAGAGTN
NNCATCATGA
TNTTGAAAGA
CTTCNAGGGC
GCATAT GTAG
TGCCAAGGTT
GTGGATGGCG
CTGGNAAACG
CAN4CGGGTCN
ATAGNGAGAG
TTCTCCCCGT
ATATGTGGCG
CCCCCCNCAG
GNGGCACAGG
GCC CATATTC
TTAGAGANAA
120 180 240 300 360 420 465 120 180 240 300 360 420 480 ATCTCCATCN CANATGANAkA AATNTGNGGG 14GAGAANCCC GGGGGATATC ACTNTTTTAN AANNGACCCC ACCCCCCCCC CCCT INFORMATION FOR SEQ ID NO:57: SEQUENCE CHARACTERISTICS: LENGTH: 822 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:57: GATTTGCNCT CATATNTCNT TTACCAAACA GNGGGNGTCT GCCCCCCTGT NATAN'ACCTC
TTGTTNTCGC
GGGGGATTTC
CCCCTCNCCG
ATAT CT C TNG N CT CTNC CCA
CCTCTCNTCT
CTCTAAAACA
P.PAACCCCCG
TGNGTTTATA
CNNCGAGNAG
CNTTNACNCC
TC CCC CCGC C
GGGGTGCTNN
TCTCTGNTGT
AAAAAGANAC
NNATCTT CTC
GAAATATNAT
NCCCCNAATA
CAN GNNNCTT
TGTTCTCCAC
ATTTCCAAGG
GGCTCTTTTN
CNGNCCCNCC
CNCCCNNACC
TN GGGGC CC C
AGANCTNTNC
C CCN'AAAAA
TCTAANCTCG
ACANNNNGNG
CTCTTCCNCC
NTCTGTGCCG
ATCNC CTCTN
AGAATGTNCN
TATATTTTTN
CAACNNCCCG
CCAATNCCCT
CCNTGTAGAA
NCTGAGACAC
AAAAAAAAAN
CTTTTAINTC C
TTCCCCTNCC
CC'ITNATTCT
CAATNTNTTN
TNATATCTCT
CAGGGGGGCC
NTCNAP.ACCN
ANCGGGGGAA.
TTTTTCGCGT
CCCCCCTTTT
AAAGAACA1N ACAGNGC CCT
AGACCGCGNG
T CAGAAAACC CAAAACCC CA
CNTATCTCTN
TGTNACANGG
GCCCCCTTCC
CCAATCTCC C
CCNTTGTCCT
ACGTTCCCCA
TCCGGGGGCC
TT
NGNTGTGGGN
GTGTGGGGTC
GGGNNGAAAA
CCACCCCNCC
AAGGGNNTCC
NGGACTCANA
CNCCCTGAAA
NCTATATCNC
CCCCTNGTTT
TTTAAATNGG
NTTTTCCNTT
CTGTTTCCCT
AANCCCGGAA TNAANTNCNT TNTTCAANCC INFORMATION FOR SEQ ID NO:58: SEQUENCE CHARACTERISTICS: LENGTH: 553 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: TTTGGGTGCG GTCTCCTCTG TGTTAGTGTA TCCCCCATAG GGGGGGTCTC ACAGGGAGCC CTTCTCTTTT GGGGGGTTAT ACACAGGGGA CACACATGTG AGTGGGAGAG TGGGGGGGTG GGTGGAAGTG AGAAACAGAG GTGGTGTAAA ATGTGTTGAA TCTCTGGTTT GATAAATTTT ATCCCTGATC TCTCTCCTAT CCCC.ATTCTC TTTCAGAGAT AGATTTTCTG CTCTCACATG TTTGGTCCCT TATGTTCTC.A GATACATGTG CTCTTCCCCC TTGGGTCTTC TCTCTGTCTC ATAGAGTGTG TTTTCTCCCC GGGGTTTCCC TTGTTCACAA TATCTTCTCA AGGGTATAGC CCCCCAGTCC CCAGGCCCTT GGTTCCCCAT TTT INFORMATION FOR SEQ ID NO:59: SEQUENCE CHARACTERISTICS: LENGTH: 904 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic)
ATATAGAGAG
AGAGAGACAC
ACACATT CCC GT CTCTCTCC
CTCTCTCTTC
TGTCTCCCCC
GAAGAGCTCT
TTTCTTCCAA
AACACATGAC
TTTATTTTTT
CTTTGTCTAC
ATTCTCAGAC
TTTATTCTCT
CC CAT GATAC
GGGGAATCTC
TTTTCGAGGC
120 180 240 300 360 420 480 540 553
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59: GCCATTTCCT CTCACATCCT ACTTTACGTA AACTCTCCCT GTCTTGCCTC
CATCTCCC
CCCCCCCGCG
TCTGCSCTC
GATCTTACTT
CTCT1TTTCTT
CTTTAGATTC
TTATTTCTCT
TCGCTCTCT
AAGCCCCTAA
TTTTCTTACT
CTTTTCAAAA
TTTTTCCGAA
CCTAAATAAA
CTTGCTAAAA
TTTCTCCCC
CT CCCCCCTC TGTGCC TG
TTTCCTCTCC
ACCCCCTCTC
TAT CCTTT CT
TCTCCCTCTG
TCTTAAATTT
ATCTTTTAAA
CCTCAGGCGC
AADATTTTTCA
CGCCCC CC C
AAGGCTCCCC
ATTTCGCCTT
CGTCCCTT
TGTTTTCTGT
GTCCCGGTCT
CCCTTCTCTC
TACTTTATAT
GCACTTTTTC
TCC.ACTGTCG
CATCTCTTCT
Al GTCTCAG
ATATAAACCC
ATCTAAATCC
CC CT CCTCTG
CCCTTCTCCC
TTGATGCTTT
TTCTC3TCCTC-
GCTCCTCCCC
CCTCCATACC
TTTCTTCCGT
TCACACTTAC
TATTCTCCTC
TCAAAAACAC
ACAGTCTCTC
ATCTCTTTTT
CCCTCTCCTT
AAATTTTTTT
CCCCCTCATT
C CC CTGGCC T TCC CCCC CC
CTCCTTCTTC
CAGATGCTCT
CTGCTTTTTT
ATACACCACA
TCTCTCTCTT
TAGATTTCTC
CCTTATTAAA
TCCTTTAC
TTTTTTTACA
TAATATTTCT
TTTTTTTTCC
GCCCCATTT
AALTTAATCAA
TTAACCCCTC
TCTCTCAAAA
ACTTCTTTCT
CCCTTCCGC
CTTTTCTCCA
TTCTCTCTCT
TTCTTTTTCT
CCCTTTTTCT
TTTACACTTC
ATATTTTTAC
CTCCTTTCTT
CACTCTCTTT
TCCCCCCTAA
TTTTAATTCC
CCATTTTACC
TTT-TTTTTTT
120 180 240 300 360 420 480 540 600 660 720 780 840 900
TTTT
INFORMATION FOR SEQ ID NO:60: SEQUENCE CHARACTERISTICS: LENGTH: 698 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:60: CYCACCACTO AAAGAGATAG ATTAARAACA AYXACAAAACA ACAACCAAAA AAATACAAAC
AAACAAACAA
AAAAT GAGTT TGCCAnAG.AAT
TTCTCTGTGT
CT CTGNCTCC
CATTTTTCAG
GAAAAACAGG
NTGTGGCCTT
AAATHANTTT
CC GGNGGGGA
TTTTTCCCGN
AAAAAAAC CC
AAGGTTAGGG
GTTTGAGGAC
AGCCTTTGNT
CAGT GNCAGA
CTTATAGTCT
CCACNGNGGG
CCN4GGGGGGT
TTTNGGCCGG
NAAACCCCCC
CC CT NAAAAG
CAAACAAGTC
TTAGGTTAGG
CTAAGTTTGN
ATAGACCAAG
ATTAALAGGCA
TTTGGCAAGG
GGGAACGCT G
CTTTCCCCTT
GTTTNGGGGN
GGACTAAAAA
NAAAAATTT T
GCTCAACTGT
GTTAGGGTAT
CTTTTTTCTT
GCTGGCTTCG
TGTGCCATCA
GATGCCAGGG
CTTCCCCGGG
TCAAAATTNT
CCCCCCNNTT
AAAAAGGGGG
TNTTTTCC
CTTGAGTCAA
AGCTCAGGCA
TCTTTCTTNT
A.ACT CAGAGO
CTGTCCAGCT
NACGAACCAG
TTATTTTCTT
TT GGGNTTGG
TGGNTTTTTT
GGANCCCCCC
TAGATTTTAA
GTAAGGTACT
GAAACAGGGT
ATCCACCTGC
CTTAGGTATT
AGGCAGGGTT
GGGTCANATC
GGN4GGGGTCC
TTTAGAAGGC
NGGGGNGGAA
904 120 180 240 300 360 420 480 540 600 660 698 INFORMATION FOR SEQ ID NO:61: (i) SEQUENCE CHARACTERISTICS: LENGTH: 851 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:61: GAAANAANTC GGGAGAAAAA NAAANNNCCN TTAAGAGCTT GCCCCCANAG AAAAANTANN AANTNAAAAA CTGNTAGACC ANNNGAAAAG GAAGCGCAGT NANAAAATGG TTCCTACGGG TTAANTAAGA AGCANGACNG AAAGANN4GNN TNNATNTAAC CGGGGNTAGN AAACGGCCCN CTTGTANNAG GACCNAATCG AANTAGTACG ATCATGNTAC ANAGGGAAGG GGACGTTACC
CNCGGANGAA
CAAGGAAATT
GANAAGGCAT
AGGNGGAATA
TCNGTANKNA
TTTC.AGATCA
GATTCAAGA
NNTATTCCCC
AACNATATGA
AATGCNTTTT
ACCCGGCACA
ACTGTGGANA
CGATA}JAANT
GTCATANAAC
ANAACNCCCG
CCCAGATCAT
NNGNTGACAT
CNGNATGNAN
TCCCATGAGG
TTTGTNTGNG
ALGATCTCNNA
CGGGAGGAAT
GAT OAT GGNT CAT GNAPAZkA
GTGGCCGTGA
CGNTGNAGAT
GGTGAAATGA
GGACNTCTTA
GNGGNNACCC
AACCCANTGC
AGGGAGAAGA
CNATNGTNAT
CAGGCGAAAG
ACNTTCAATA
TT CCTTTTTT 14CCATNGATG
TGTACAAATN
T GATGAANAC
AGGNAGTCAN
CC GAC CTNT C
TTCTGAACGN
NNAGNNNAGC
AGCATACGTA
AAAGATJJNCC
AACGGCAAAC
TTNTTGAAAC
ACAACANAGA.
CTTATACCAG
GAANAAATAC
AAA14AGAAGC 14ANNAANCCA
TGGNCACTTT
AAACCAAGCA
NGAATATTGA
AGCANNTTAG
TNANCTNGAG
NCGTCGAGAT
ACTCAAGTP.N
CNGAGAGTTA
ANAGCCCNAA
300 360 420 480 540 600 660 720 780 840
AATTAATCCA
A
INFORMATION FOR SEQ ID NO:62: SEQUENCE CHARACTERISTICS: LENGTH: 936 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
GGTTTTAGGA GGGAAAACCA ATAGGCCCTT GAGTTCT TAT TCTTAAGACA
CTAAGGAAAA
TTGTAAAGGA
CCGTTCTTCC
TAAATATATT
TGCTCCTTCC
CAGTCAATTT
AGTAAGTGAG
ACCAAGCACA
ATGCTAGGA
AGCAGCTGAG
GGGAGAGGTG
ATCATGGAGG
AAGGTTTAGG
ATAAAGGCCA
TTGAGGGGTT
CAGTGTTGGA
TAGGTTT TAT
TTCTTACAGA
CCAAGCAGCC
AATCTTCATG
ACCOPAACAGG
TOT GCTACAA
GGGACAGACG
GGAAAAATTA
GAGTTCACCA
CAT GGAATTG
TGCAGATATG
GGCAAZGCATT
GCAGAGAGAA
AAkTCCTTAGA
GCTCACGAAC
TATGGGTGGC
AGCCAGAGAG
ATTTGTCCCC
CCAGCCCGAT
TGAGTAACCA
GGTTOCCATT
CGCCCT OTTO
TATTCATCCC
GGAGCAATCT
AGGAAGAAGC
CTTGGGATTT
ATGTCGAGAC
AGGAACAGAT
AAGGAAAAGC-
CCATTAGGGT
GGATGTTTCT
T GGTAGTT GG GTTTTGAqTA
CACATTTTCT
GTGTTATCAA
AAACACTTG
CCCTGTCAGG
AGGAAAAGAA
AGGGAGGGGT
TCCCTTTATG
TCGGACCTTA
TAGCCTACCC
GTTTTGAGAT
GCCAGGGTGT
ATCAACTAGC
OTATCCTTCC
GTAGAATACA
CCTGTGTCTG
GTGCTGCACC
AGAGTTCTTA
120 180 240 300 360 420 480 540 600 660 720 CTGAATTTGG GAATGACATG GGAGACCAAG GGcCAAAGTC CAGATGAGCA
GAGTGGGGAG
GAGGGTTGGA A.AGTTCCAAG GAGAGAGGCG TGGGGGTAAG GGAAGCTCGC AGGGCTCCGC CTCTGCCAGT GACCTTGGAC CGCTTTCTCT GAGGATCAGA GTTATCTGTA GGGGAGATGA GGTTGAAAGA TACCCACAAT AACTTTGGCAk AGTAGA INFORMATION FOR SEQ ID NO:63: SEQUENCE CHARACTERISTICS: LENGTH: 911 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:63: GGGAATTTAA GGGNGATTTG GAGACTTTNG AATTTTCGAA NGTTCCAAAA TAGANNTTNA
GGNCAATGGG
AALACCCTGGG
AN GGTGNGAT
GGAATAGAAC
ATTAAGGAAA
ATTGTATTTC
AAAAGATAGT
TTATCAGGTT
GCATAGATCC
AAATGTTTGC
TGTGTGTGTG
ATT GCATTAG
TTGTGCTGTA
AGOCTAGACT
NTTGGGGCAG
GGTTCCAGTT
AGCAGCCNGA
AATGAZGAGTC
TATATCCATG
AGATAGAGAT
AATCTAATTT
TTACTTCT CA
CCAGAGAACA
GTAGACTCAA
TGTCCCGCAC
TAGAGATGTT
T GT GATAAGA
CACTACCAAA
NGGN4GCTTTT TA!ATCCCTT C
AACAGAGGTT
TTGGTAATAT
CATTCTGTAC
ANGACTATAC
CACATAACCA
GAAATTGCAG
GAGAGACTGG
GCTCGTATCA
ATGCTTGAGT
AAGGTTGAA.ZT
ATCAAT GTAA
AATAGNGCNA
TTAPATCANA
ATCATCTTGA
TTTATTATTA
TNTTCNGGAA
TY GCAPATTG
CTGTTATTTT
TTACTACTAA
CAT CTC CTAC
GAAATCATTG
GCTCATAAkGA, ATGCAT GT GT
GTATTTTCTG
CAAGGCTGGA
TCAGTGTGAA
NAAGTATTAG
AATATNACTT
CTGTTAGAGA
ACAACNGACA
CT CCAAGGAA
TTTCATTATA
AGTATATATG
AGAGCCTGTC
AAATTACACA
TCAGTGTGTG
GCATGCATGT
CTCATGGTCA
GAGATGACTT
14TTCCCCACA ATTTN4TATGG
GTTTATGGGA
NGAGGATT GG TA ATTGGAAC
CATGGAGAGT
GCAAZCATTAA
TANTCTTTGT
AAATGAGACN
ATCCTATCCC
TGTGTGTiTTG
GTGTATGTCT
TTGTAAGATA
CA'GCT GTTAA
GGAGCTTAGC
AAGN4TAATAG G INFORMATION FOR SEQ ID NO:64: SEQUENCE CHARACTERISTICS: LENGTH: 781 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:
TTCAGGGGTA
AAAAATAAAA
ACGGAAAAAA
ATTTAACTGG
ACACTAATTA
TATCCCAAGT
CTATTAACTC
ATGTGAGAAA
CTGTGTCTGA
CAATCTACTA
TTGGTTTGTT
AAAGGCAGAG
TATTTCTTTC
ATCCTAAGGT AAACGGACAA AGTAAAGGGG AGGTTGGACC PATAAAGGGG
GATTAACCGG
AT GAACAAGT
CCTTATATTT
GATCATGTGT
GAGTTTAACC
CT GATTATTG
GGGAAAGTTG
CTCTTATCCA
AGTTTGAATA
AGGTCATTAG
AGAATACACC
TTTTTATTAA
ATGTTCCCTG
'PT CCTGTAAA
ACAAAGTCTA
GTACACCCAC
TTCCTTCTCC
Ar GAAZCTTT
AGGGACTGAG
ACATTCCAAT
TGATTTGTGC
GGTAGGGcTC CAcCCTAAAC
CTATTTGGTG
GAACGACAAA
GCAGGTAGCC
AAZCATTTTAC
AGTCTGACAG
ACATTTATTO
ATGAGACATA
TGTAATAG AG
TCTTCAAGTC
TCCTGGTGTC
TTATGATAGA
ATTTCTGCCA
ATAGTGACAA
TTGCCTTGGA
GC-AACGTTTC
TGGGGCATTA
ACAGGGTATT
CCATGTGCAA
AGAATGTACT
ACT GATAAGA
TAAAGGTGAA
TACAGAGTAT
ATTCTTGTGG
TT GTGCAATA
ACAACTAGAC
AGTTT CCTAT
TAGGCTATAA
CAATTTTATA
TTTTCCTTCT
TGCC-TAGCTT
TGACAACAGC
AATGAATGGG
GGGTCATTTT
TAGGAAATGT
CTTTACATGG
CAGTAAGGTA
TTCATATGTG
120 180 240 300 360 420 480 540 600 660 720 780 INFORMATION FOR SEQ ID NO:65: SEQUENCE CHARACTERISTICS: LENGTH: 389 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:65: TTGCTCTTAG GAGTTTCCTA ATACATCCCA AACTCAAATA
TATAAAGCAT
CTATGCC CTA GGGGGCGGGG GGAAGCT.AAG CCAGCTTTTT
TTAACATTTA
TCCATTTTAA AT GCACAGAT GT TTTTATTT CATAAGGGTT T CARZTGT OCA AATATTCCTG TTACCAAAGC TAGTATAAAT AAAJAATAGAT
AAACGTGG.AA
GTTTCTGTCA TTAACGTTTC CTTCCTCAGT TGACAACATA
AATGCGCTGC
GTTTGCATCT GTCAGGATCA ATTTCCCATT ATGCCAGTCA
TATTAATTAC
GTTGATTTTT ATTTTTGACA
TATACATOT
INFORMATION FOR SEQ ID NO:66:
TTGACTTGTT
AAATGTTAAT
TGAATGCTGC
ATTACTTAGA
TGAGAAGCCA
TAGTCAATTA
SEQUENCE CHARACTERISTICS: LENGTH: 340 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:66: AAATCGGGNT TNCGCGATTC GGTAATGACG NCNNATCCGT AAANNCATNC GCCGNNATNC NATTNGAAAA TNCCGGGNGC AANNCC-ATGT CTNATTGAGG TNNCAGANCC ATCCGGCACA GGCAATANGN AAAAAA.NGGG AGTTTCAC.AA TGTNTNTGAA TNTGNANCCA TTGGGCCCNA A-AM14TCCTN CGNTNNATGA ACCTTNNCGT NCAAAA-NTTT GGTNCC-ACNC AGCNGCTTTG CNAGCNTTNA ATAAACACCG GNr4TCCANAA TGNNACCAGN GNTGTTTNTN TCNANTN4GCA TNNCNNTTTG GAANCCCNCT TTTCCCAAAA CNTTNAAAAA INFORMATION FOR SEQ ID NO:67: SEQUENCE CHARACTERISTICS: LENGTH: 557 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67: AGTC CGGGNA TGGTGGCANA TGCTTTTCAT NCCAGCACTT GGGAAGGCAA NACCTNAGGT TTANCCCAGN CTTTATTAGN ACCCCGTGTT CT14AAACACA NTTTGNGGGN NTTTAAGTGN AAACACTGTG TAAAACCTTG GCCCTGATGN TTTNGAACAG AAAATGTTTG AAGANTCCNA AAACATGTTG GGATGCCANA NGCATCCATC TCAACC-ANGT TTTGNGA.ATA AATGGCAGGT NAAACTAGTA TNGNANCCAC CGGGCNTGCA GATTTGTGGT GGGAACCAAG TCCTCCCATA CTGTGGTACN AACAGGGCTG GANCCACNGA ATCAGTGCAG NTCTGGACAC GGANGGNCTG GNCTAAGTNA ANNCAGGGGG C-GCAAGAGCA TNGGAN CNAA CGNCCCNCCC GGTGAGCTNT TCCATGCCTN4 NCCTCGNTTT ATTTGGCACT CAACTNAACT TAGGATG INFORMATION FOR SEQ ID NO:68: SEQUENCE CHARACTERISTICS:
AAAACAGTTA
AACNACAAAA
AGGGNTCTCC
CGN GTTNTTG
CATCATCAT,G
AAACAGGCTC
CTGTCTGGCC
CGNCAGAAAN
GGC-CATGT CC 120 180 240 300 360 420 480 540 557 LENGTH: 302 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68: GCCTATAAGT TTTGATTCCA TTCGTGAAAA TTTTTCCTAT ATCCCGAANA
GTCCACTTAT
TACTACTGCG GCCTATTTGG AAACTAACCG AAATTCAGTT AGTTCCCTAG
TAGCCTGCTC
TTGTA.ATATG TGTACTTTTC AATATTATAA AAAATTGGTC AGCAGATCTG
AGTAAA-ACAG
GTGkAATTCC GATCGGTAGT CCAATTTGGT TAAAGAACAG GATATCCAGT
GGTCCAAGGC
TCCAGTTTTG AACTCAAACA ATTATCAACC AGCTGNAAGC CCTATAGNAG
TACGNAGCCC
AT
INFORMATION FOR SEQ ID NO:69: SEQUENCE CHARACTERISTICS:
LENGTH: 820 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:69: 120 180 240 300 302
GACTGCCTTT
AATAGGCAGC
GGTCCCCGAG
ATCACTGACC
CCGAGGACCA
AGCTGGCACT
CTGACAGACT
CC CTCTTTC C C-ACTTCT GTT
TACTCAAAGT
C CTCCAAGGA
GGGANCNAGT
TTTTTCTTCC
TT GGAGAAGA
AGAACCAAGC
GCTTCTGCGC
TAGTGAAC CT
GGATTCAGTC
TGCGTGCGAC
CTTCCAGCAC
GGCCGGTGTC
TTAACAGCAT
ACAGCGGGCC
AATCGGGNCT
CAAGGATACC CTGCAGCACC CAACAGTAAA AGACTTCATA
AGGCATTACC
TGATGACATG
CT GGCTAATC CTTAAT GTCA
TTTCATCCTT
TGT CCTCACA
ACATCCATTC
CGCCAAACCT
CT GAAAGACC ACAAGN4GGTN
GGCCCCAANT
ACT GAAGCCA
ACCAGCTTTG
AATACCTGGA
TGGGTGAGGC
CGCAzCAAAGT TCTcGATAAC C CAG CTAT CT TT GAGTT GAG
CCGCTGACGG
.AACTNAANAG
AAGGGTTTGG
TATTAAATTT
ACTGGAGGGA
GGTAAGAGGC
TAGAGACCTG
GGTAAGGGTG
TTCATGACTC
CCGGGCTGCC
CTCATTGATT
GTAGNAATCA
GGTTATTGNT
GCTTTATTNN
CTTC CCTAAC
TATATTCAAC
AGCA.ATCCAC
TTAGCCAGTC
CCATGGCCAT
CT CTGGCTC C AT TGCTC TAAT
GTGGACACTT
CTCAGAGGAN
AACGGC-NN CC
CNGGGACAAA
120 180 240 300 360 420 480 540 600 660 720 AACCGCAAAA AA.ANNAAACG CCTTN4TTGTA TTAAAANGCA NGNTTTTAGC CTTGGCCTGA AATGGNGNTA AGNTACGGCC CNCNGTCAAT TCCTACTATA INFORMATION FOR SEQ ID NO:70: SEQUENCE CHARACTERISTICS: LENGTH: 955 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:70: AANCCGANAN TTTNAAAAAA CAANNANAAN S GGCCANGAN NTNAATANTT TCTNAAAAAA
NGANTACANG
TTTTGAGCAG
AGGGGGNGGG
NGCCNTGATT
TNTTTTNTT S
TCAGATACGG
CAGAANGGAT
GCNTAGCATN
TATACANGAT
GGAACANGCN
CTCCCGTANA
ANGTCCCCTA
NNNTTATNGN
GAAGNGCATA
N4ACACGGCAG
GGTTTATNGG
AAGGGGTTNG
CAN GNCTTTN
TNTTGMGNAC
ATGAGAGTTT
TCAGTCGNGA
TAATGNNNAG
NTNTCGNTAN
GCGANTTTCT
CTTGTAGGNC
TAG CAG CAAA
GGACACGATG
GTAGAC CAT?
GGNNGTTTAG
NCTACGTTGA
GNTTTCCACA
CCTCCTTATT
AGGTGCACAA
CCGGGGA14AG
GGAGNCAGGG
AGAACACACA
AGGGTGAAGA
TACAGACCNA
AAGNAAATAtJ
NAGTTGNCAG
TCATCA"AGAG
CCCGTGTTC
TCAGAATANA
CCCAAGTCAC
GCNTTNAAGT
TCCNANCNTC
GT? TA GNANA
TATGNGGGCGA
ANGGGGT SN?
TNTTTTGGAT
CTGAAGAAAG
GGTTTCATA
TGCNNATTAT
AAANTCNCAC
GGAGTNNTGN
ACCNACANTC
CCACGCCTAG
ATNNACNCNN
ANTGCNTA-NCA
CAGAA-NTNGG
NCATTAANAN
GAG GAGACAN
TTTTCAGTCA
GGAGTTNACA
TTNAGAGACG
TGATGTCTCC
NAGS GAAAGT
GAGNCCGTTG
AGAGNTCCCC
ACTS? GACTC
AGCCNCTACC
TTTGCAAGTA
AAC CATTGNC
GACATNANNG
AGACACATTT
NAGAAAAGAG
TGTNTACAGA
GNNCACTACC
CCGANAGAC
NCCAAANCGC
AN CCANACN
CTATTCAAGC.
TTNTCAXACC
CGTCAGATNG
CAGTCCTGTT
AGCN GAAAGA
ATGAG
120 180 240 300 360 420 480 540 600 720 780 840 900 955
GNAAAGGAGA GAGTTCGCAT ATGANAGACC INFORMATION FOR SEQ ID NO:71: SEQUENCE CHARACTERISTICS: LENGTH: 886 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:71: AAA14NCCNAA AACCTCCAAA TTTGCTACCA NTCTTCNACG
NTNGAAGNAN
GTNGACTTTT1
GTTCCCNAGG
TTATTTNACC
TTCCCGTTCA
ACAACTGTAT
CTTCCAGACA
GCGCGCTGTA
CACGGAGCAA
TTAAGGNt~T
C
CGATTGGGTT
TTTGGGGACA
GGGGACCGGA
GGCAGCCGGG
AAPJTNGNAA
AP$ACAAAAGG
cAATTGTTTC
CNTTGAGTTT
AGACNACCCG
TCCCGGTT'TT
GTTTTGCCAG
NTCTTAGAAG
CGAGGAGAGC
GCcGCCATTT
TTGAAGGCAT
CTGAGGGGAA
ACTAGAC GTG
CTGGGTGTCC
AGGGGGGGGT
TTNTTTCANC
CCTGGCCGGN
GCGGTTAGTG
TTAGTATTTC
TNACGTGATT
GGCATTCTTC
GACGNTNTCT
TGTCCGTTCN
GGGTAGTGGC
GCCGNTTCTT
CCGGGCTGCG
CGGCGCCTCA
CCAGACCGTG
TCTTNTTCAA
NT'rCAACGGT
GCCTAGGGAC.
GNCAT GGGGA
CAAGCTTCCC
CGGTTCC GAG
CGCCCCACNT
CCACAGCCGT
TN GAGTTATT
TTGTAGACGC
GGGGTGTGTC
GCGCCCAGCG
CTCACATTTT
ATGGGCCCCT
TTTTGGGTTC
CTCCTT'rTTA
GATGGCCCCA
GCCAATTTTT
GCC CCAGCAC CCC GGTNTAG
GGCTTTTTTA
GTGTTGAGGG
ATGGCAGGAG
CCCTNGACGC
TGGGAGGACT
CATCCAACTT
CNTGGGCCLG
TGANTCCAAG
CTTCCTTCCG
CATGGAGANT
CCNGALAGGCC
TGGTTGGCAC
CAAGATCTTA
TTGGGATTCG
TGTTGTGGGT
CGCGCG GGCT 120 180 240 300 360 420 480 540 600 660 720 780 840 TTGCCACGAT TGTCGCCTGG CACGAGGAGT AGAAGC TTTGATTTCC cACCAATccc INFORMATION FOR SEQ ID NO:72: SEQUENCE
CHARACTERISTICS:
LENGTH: 900 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (xi) SEQUENCE DESCRIPTION: SEQ ID NO:72: GGGNGTTNGC. TCTCAGATGC NAGNTACNNN TCAGGGGGNG
TCTCACGAGA
TGTGGGGGNT ANTNTGTATC CCCTNNNCTC NCTCGAGANC
CCNNNTCTCG
GACCNGGGGC CGGGGCCCAG ANACTCNCCA CCCCATATGG
NGACCCTNTA
CCAGGGNNTG TTTTGGGNAA AATATANCNN ANAGNGGT GT NTNTNANATC ACAGACCCNN ATTTTTTTTT ATAAAGACCC GGGGCATNTT
CTCNGCCCCN
TACANGNNAC CCACACACAG TGTGTCTCCT CTCAGCCCCC
TGGCACACTT
C14GNGGGGAT ATGAGATTCN CNAGACTGGG NCCGCNNTAN
TANNCNCCCC.
CTCATAGTGT NGTGTCCCCC CCTCACCCN14 TNTTGNGGTN
CCCTACACCC
AAANCTNATG
ANATTTTGGN
TAAGTGTCNN
TCGGGGGGTG
TCTCCTCNGC
TNTNTNGANT
C14T GT CTCCT
ACACAATNTA
120 180 240 300 360 420 450
GACTCTNCCC
T GT NCT CCTC
AGGGACNCTT
GTTNCCCCCC
TCCGGGGCCC
CTNCCCCTAA
TNGGN4TNTTT NCCNt-CNGCT TCTNt~TACNG
TTCTATACAC.
NCTTTATNAT
CAACCCCAAA
ANTTTTGAAC
TCNCTAAAAA
NTGNGACNCA
GGNGGTCNCC
NCTTANTTTN
NTTTNTTTTN
ATCCC-ANTNT
CCCCTTTAAT
ATTTTTTGTN
CANCTG14AAA
CNCNNNNGAC
CCTCCTTTGT
TTCCCCAAAC
T CTTTT~nNT
TCCCCCCCCC
GCCCTCCCTG
TCCCGNNNCN
TCT4AAANGT
NTNGCAAAAA
TAANCTTTTA
TGGTT GGGGT
GGNTNAAGGC
GGAAATCCCC
CAAAAAGGGC
CCCTCNCAAA
ANNANCCTGT
GGN'NTNANCT
GTCAAAATTC
CCIACTTCCC
GGTATTCCTC
C
C.
(2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1033 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:
CCTACGTTCA
CTGACCCCCA
AC CAGATCAC TAcTAACTCC
GACTCCATCA
GCTT CAT GTC
ACAGTAGGCC
TCATCTTTAG
CCCCTTTCTG
TCGTGCTGNA
AGCAGCTATA
T CAGTC CCTT
TCCCATCGC
AAT GTCCTCT
TGTTCCACAG
TATGTATTTT
AGCCGINACCA
CCTATGCGTA ACAGATCTGC TGTGTCAGGA GCCTCCTACC CTCGCGCATC
ACCACGTCCT
TCGTGGGGAT
TGCATCGTGG
GGAAkGCCACA CC CACAATAG
CTATGTGTTG,
NAAAGTTATC
AACTATT"TAT
GAGAAGACC C
CATAGGATGT
CATCTTTTCC
TGAACATCTT
GCTCCTCTTA
TGGCCCCACG
TTTTTTTCAG
CATTTCTTCC
CTTATCTGAT
CTCTAGGC CA
TAACCTCAAT
TGGGC-AGGTG
TGTCATCAAG
AAGACAGAAA
ACCAGAGATT
CACGGGCAGA
GAGTGGGCAG
CAGCAGCAAG
TAGAAGGGTT
TCTTCGAATG
ACCTCT GTGG
GTACTGGTTT
ACACTGTTCC
AGGTCAAAAA
C-ACTGGT CAT
CCTCCTGTGG
GGCTGATCTT
GCTGAATGCC
CANC GNTAT C CGTTCT14ATA
TCATCACATG
AAATNTACTG
CATGGNGATC
CCCTTCCCTG
T GTAATTTCT
TGACTCAAAG
CACACTCCTC
CAATATAGCT
TTTTGTATTC
CCATCTCTCC
CTTCCCAAGT
TACCCTAGGC
GAG GAT GCAG
ACAGGCACCT
TCCCTTTGTA
CTCAAAATAG
NCTNGGCTTA
ATTAT C CCT G
CAAGGAGACA
CCCACGTCAG
GTTGATTGTG
TGAGT GCAC C
CTAACACATG
TATGTATGAG
AACAACCTCC
AATTT GTTAT
CATACACCTC
CTTGGATCAC
TCTGGAGTTC
ACCACATAAT
CCT GNCTATC
CTACCTACTT
NGTATTTTAT
TATCATGACA
AGGGAAACCA
ACTAAACCCT
CACCAGCGCT
GAGTCTGGCT
TGTGTCGTCT
CAATAAGGGC
TCACATACT C
GAATTACTCC
120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020
TNCAAGTTCA
COT
1033
(2) INFORMATION FOR SEQ ID NO:74:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 883 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
NAATTTCCCA AAAANNONNC GNCCCNTTTT TTATCCACTT TNNC-OTTGAA
GCGCGGNNAA
NAT CT CNCCC
TTTGCCGGA
CAANTTTTTA
TTTCTAANAN
TCNCTAGCAO
AATAC CACAO
TTATAACAC
AAAAOAATAA
ACACACACAG
TTAAATAGT T
OCCACACCTT
NACACCPAAT
AT GTCCATAG
AATTAAAAAA
CGOTTTNAAA
CCCOOAAATT
COTTTCCCNN
OTTAAAAANA
ACCCACTTTT
CCAGCCTCCA
ACAGAGCAGC
GAACAGCAT N
CATTACCTAT
CTACOCCAAC
TCTTTCTOCT
TGCTTTACAA
TCACAOSACT
GAATTGTAAG
AC CCNCAAT C
TTTTTCCTTT
ANCT AATT TA ACGC-ANTT CC
CATTTTGACN
CCAGOCACAA
TTTACCAGOG
TCATCTCTCA
AAACTTCTTA
AGAFLATCAAC
TAOTTTCNTA
AATGCTAONT
TCCCTCCC-AG
GAGATCGACG
GCCAAAA'AGC
TTTTNTTTNN
TTT CAATCGA~
AANNTTNCTT
TGOTNCCNAA
TGOTATOTA
AC.AGGAAGCC
OCTCT CAC-AO
CATTAGITAT
AOAAATCACC
CTTCANTCTT
CTACTGTCCC
CTTCGCC-CTT
CACCGCGTCA
GCATTTTTC
CCATTTTT GO
TTCACTTTCC
AAGGNTT CCC
TTTAACACCA
AnAACAGCTG
TCGGTTTTA
TNTATTGCAG
TTATAACAC
ITCCACCOCAG
TCTCTATTCT
ATCTTTTGCA
GGG
CGTTCTCCCT
AGTTTCACCT
AACTATCTTC
TTTCAC CAAA
AATNTTAAAC
CACCACCACA
CATACAATAC
CACAOCTCTA
TCTGGT GAT
CAOCTTTGCA
GATCCATTGT
TACANCN CAT TNTTTATNGG
120 180 240 300 360 420 480 540 600 660 720 780 840 883
(2) INFORMATION FOR SEQ ID NO:75:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 892 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
CCCCCCCCCT CCACGTCGAC COTATCATA ACCTTGATAT COAAkTTCACC
TCTTAGCAAT
CTGACAC CCT
CGTACATAGA
GGGAGTGCGG
TATTTACATT
TATAGCTGAG
TTGAGAAATA
TTTCTGTCTC
GTCTGAATCC
AAACGTGTGT
GTANTGGGTC
AAAACNCANG
AAGGTGNAAT
TCCNTTCCGG
C TTCT GG CCT
TAGTCAAAAT
TCAAATGACC
ATGATTCATA
AGTCACCACA
CTGGTCTAGA
CACACCACTA
A CT GAG GCAC
TTGGCACTGA
TTGAAGCACA
NTGTTCAACA
NCTTTGGGCN
AGTCATCCTT
cTTCAGGCAC
CTAGAGCACT
CTATCACAGG
GTAGTACCAG
ACATGCATAA
GCCATTCCTT
GCAAATTTTT
GGTCTGACTC
CTGTGTGNCC
GAT NCT CTAA
TNGGGNNCCN
NNTCGGTTTA
NCACTGGNGC
CTGCATGGTT
GTTTCTATAC
GGTCTCAAAT
AATTACAGTT
CTGTATTAAA
GT GCTGATAA TCT CTATATA
CAGAACAAAG
CAGGTTNTCT
CCTTACCCTG
CcCNGAAACA
GGAATTTTAA
CCNCTGGACC
GGGCNAAANG
CCACAGGACT
CTGTGAGTTG
GAGATATCCT
ATGAAGTTAC
ATGTTACAGC
AGGTGGCAGT
TAAACATGTA
GATCGTATTC
TTCTGNACTC
GNNGCT CA CT
GNGNTGTNGG
ACANNAACTG
C GGNGNANN G
CCCCTNNNNT
GT CACACCCA
CAACCCCTTT
GCATATCAAA
AAAATAATTT
ATTAGCAAGG
GAGCATTATC
ATATGAGACA
CTGAAAAGCA
CTAGAGGTCT
AGNATGCCCC
ATTTGGNAGA
GCTTNCNACG
GGCCANTTCG
TC
CCAGTTCGTN CCCCTGGNAC CCNTCNCCGG
(2) INFORMATION FOR SEQ ID NO:76:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 884 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:
TGGGCCCCCC TCGAGGTCGA CGGTATCGAT AAGCTTGAC-G GACCCACGTG ATGGAAAGGG
AGAAGCAATT
GGACTGTACA
C CCCTCAAGT
GGAGCACATC
GGCTCTTTCA
TCCTTGCCTC
ACATTCAAAC
ACTGGGCCAG
TAGTGTCCTT
CACACACACA
AACCGT GGAA
AGGGTGGTGC
GAACCAGGAG
CTGTTTTCCT
TGTAGTAAGT
TGCTGGTAAC
TGTCCTCTGA
CACACACACA
TAAAGGTCCG
TAAGCAGCAG
GGCATCGCCC
TCTGCCTACC
GTTTTAATTT
AGCAGACTGG
CCTCCACAAG
CACACACACA
ACCAGAAACC
AT C C CCTGT
CTCCAGCCAG
TTCCTTTGGC
TCTACTAAAC
GTGGAGTATC
TGCTGTGGCA
CACACACGCA
AC GCTGGAAC
AACTGGCAGC
ACTCTCCAGC
CTCAAACCAT
AATAA-AAC CT
ACAGAGGGTG
TGGGGACACA
CGCACACACA
GGGAGATGCT
AGAGGGGTGT
TTTCTTCCCC
AATGTGCAAC
TTAGATTTTC
TGGAGCAAGC
TGGCTACCCA GGGCTGGGCA CACTCAACAC TCTGGCATTC TGTGGAAGTT CTGGGCAGTA,
AAAACAGAAG CATACGTCAC GCACAGGTTC CATAGTGTTA GGCATCTTAA TCTATCTAGA 660
ATACCTGGTG TTTAGTTTGT TTACAAAATT GATTGTTGTA CTTGGACAGT GGTGTTTTTT 720
TCCCAGGGCT TccAGGATTT AG-GGGTATAC CAGGCCCATT ACATTGGGTA AACGTGTGTG 780
TTAATTTTTT CTTTTTAP.AC CTCCTTG-.TT GACTACTTGT TTTCCTTTTT AATGGTCCCA 840
GTTCCCC?1'G GGGGGTTTGT TTTGAAA GGCTTTCCGG TTTC 884
(2) INFORMATION FOR SEQ ID NO:77:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 326 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:
AGCACACCAC AGAGAGGGGG TCTCCGTGCC CGAGAGGCAA AAGTCTCCCA CTGTGCTCC.T
CTCCCCCCCT GG'FGGGGGTT AAGAGATGGG GGCTCTGGGG GGTGATAGPA CCCCTGGCGG 120
GACACCCCCC CGCTCTCGTG GAGAGAGACA GAGGGGGGTG CCCCTGATAT CTCACTAGAG 180
*GGGAGAGGTG AGAGGGCTCC ACAGTGTGGT GTGGTGGTGA GTGCTCTATC TCCAGGTGTC 240
TCACATATTT TCACAGCTCT TGACCACAGA. GAGATCTTGT TGACTCTGTG- CTCGCGGAAzT 300
CTAATGTGCC CCACATCATA TACACA 326
(2) INFORMATION FOR SEQ ID NO:78:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 557 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:
GGGGGGGTCT CACNNTANAN CACTCNGGNG TCTCCCATGT CTAGATCTCC CCCCNGCNCN
NGNGANGAGT GTGNGGAGAT CCCTCTCTGN TCTCTACACT CTPAAGGGTA NGCGGGGAGA 120
GAGAGAGAGC ACANTCTATA GANCACANAG CACACNCGCT CNANGTGCCC NAN~TNACANG 180
N4NAGAGAGAN CCCCTCTCNC AGTATATNGG GGAGAGAGTN TGAGGGACNC TCCTCTTTTC 240
TCTCAACNCT GNGGGGGGAG N4GNGAGTGTT CTCTCTGNGG GGNGGAGNGG NACACTCNGN 300
TCTNCGTNTG NGTGCNCNNG TNTTCTGGGG GTCACANAGA AATCNCCTNT CTCAACACAA 360
CAACAACAAC CCCCCGCACG NGCACACACC ACAACAACAA NGGGAC.ANCG
CGNGGGGGNT
N4GNGCACACC CAGNGGAGAC ACTGTTTTCT GTTTNACACA CACACACACA
CACACACACA
CNCNCcCCCC ACANAGTTTT TNGGAAAANC GCNGGGGGGG GNGGGNCTTT TTGCCNCAAG CCTTTTTTNA NCN4CCCA
(2) INFORMATION FOR SEQ ID NO:79:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 376 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:
CTCTCCCCCA AAGGGGGGGT CTCACCCTCC CGGACACCAC ACATCTCTCT TCTCTGACAC CCCACAGAGA TATATATAGO GACAACGCCG
CTGTCCCCAT
GAAGCGAGAC AAACTCTCAG GTACACATGA CACATGATCC
CCATGATCCC
TTCTAATATA GTT SAGAGAG TTGTGTCTCT CAAGTGTCTC
TGGTATTTTC
TTTTCTCTCA CAATGTCACA CGGGOGAGCT CGGAC5CGGT
GCACATGGGG
GTCTATGACA CACTAGTCTT GCCCCCGAAC CACAGAGACC
TCGACTCGGG
TCTGCCCCCC
CAGCTC-
(2) INFORMATION FOR SEQ ID NO:80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 533 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:
ATN4NCCCAAN ATCANATGNG GAANNNCCCA CATTTTNTAT
NTAC-AAANGN
TGTGNGTNNA ATTTGAGNTT TCACAGAGNT NACATTCTCT
GTGTCACAAN
CTACACTCCA CAGTGTGGTG NGAGATATAC TNTGA1NACAN ATGNGCTCTC CCNqNCATC-TT NTNCCCCACA GTNTACNNCN NCNATATATN GNNCNCNGNA NGNGNTGTNT TTNTTTAAAA AGATNTNANA NAGNGGGTAT
GCGTGNGGGG
CATATATGTN NNAGAGGGTC TCTCTGNGGC CCNATGGAGG
CA-NATCCC
GTCTCT CT GA
GATATAGAGA
CGGCACACTC
TAACC CCATG
GAGAGTTCGT
TTTAGTCTC C
GTTTTC-TGTG
CCCTTTCTCT
TCCTCNCCCC
GANNGGTATG
TATGTNNANA
CCNCTCNGAG
120 180 240 300 360 376
120 180 240 300 360
NNATATAGAA AAGAGTNTTT NANGGTGTTT GTGC-ACACAG ATAAGGGGAG
AGAGAGAGAG
AGAGANAGAG AGAGANAGAG AGAGAGAGAG, AGAGAGANAN GGNGTNTTNG GNTT CNTCCC CCCCNATATA CAGAAAAANC GGGGGGGGGT TAGGNGGNNG GGGGTTTNCT
TTA
(2) INFORMATION FOR SEQ ID NO:81:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 346 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
TTTCACACGA GAkTGTCGCGA. CTCTCGCGAG ACTCTCAGCG CGGAGATATA
GACCCACAAG
GGGAATCCCC CGGGTTTTTT GCCACAGGAG AGCGCGAGGA GAGAGATATT
CTTATTATGG
CTATAGACAC CCCCGTGGGT GGGGGACATT TGTGGTGTTT CCACAGGGGG
GGGGATGTAC
CCCGGATAT-C AGAGTATTCT CTAAAAAAGG TGAGAAGAGG TCTTCTCTTT
TGALGAGTPATG
GGGACACTCG AGGAGAGCTC TCTATCTATC TCTCACAGCG CCCCTGTGTG
GGCGGATCCT
CCACACCAGA TGTTAGTGTG NAGATCTCCC CATCTTCTAT
ATTGAA
(2) INFORMATION FOR SEQ ID NO:82:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 461 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
GAANACCCAA AATTGNGCTN GTGGGCAAAN NTTTTNCCGT
TTCTTGTGCT
AGNNAAVAAAT TCAAAACCAA NACCACANAA GCGCGTTATC
CTGNCTNTCT
TGTCACACTG NGGCTGTACA GACATCNANC GCTTTCTAGA
GAGACGNGAG
CTCTTTCCCC CANNCGCATT ATANCCACAT ATTAGNGTAN
NANATTCAGC
TGGGNGTGT C TC CNTAGTGT GAAGCAACAC AGGGAkAACTN
TTCGCNCACA
TGTTCACAGA NATAAGNAGG CTCCTAGACC NNTATNACTG
TGGGNAGAGN
CCTATANNTC GGGGTCTATC TCTGTGAGAN AGAGN TTCCT TTCTCCCATN TGGGGTGNTA TNTACATCNC AGAGAGCAGA NAACTGTGAG
C
TGNGCGGCNA
GCCNTTNCCC
AGTCAGGGGA
TGTGNTNCAC
TGTCCTCTGG
ATGTTACCTC
CCTACCTCAG
120 240 300 360 420 461
(2) INFORMATION FOR SEQ ID NO:83:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 367 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:
GGGGTNTCAC AGAGANAGGG CACANCTCTC CCNAGAANGG GNCNNCCCTC.
GTAACACcTC TCNCCGT GTC TCTTTCTTTC TTTTTTNTTT TTT GGGGGGC GGAGGNGGAG NNCGNCCGAG GGTCGGGCNN NNCNGNGGAN AGCTCTNTCN TCNCCNNANC CCCCCTGTNT CTTATAANNN ACATCTCTTC NTCNCAGGGT NTCTCNTTTC TACAACAACC CCCACACGCN AAAGCTCCCC ACNNNC-NGNG AAGAAKATCT CNGCGGAGAG GTGGNGGAGA GAGTGANATC TGNATNTCTG ANT GCCC
TTTTTNNGGN
T CTT TTCGN CANN GATATA CACAC CNAGA
GGGGTCTCNC
GNTTC CCC NC
Claims (22)
1. An isolated nucleic acid comprising a nucleotide sequence set forth in SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, or SEQ ID NO:83.
2. An allelic variant or homolog of the nucleic acid of claim 1.
3. An isolated nucleic acid encoding the protein encoded by the gene comprising the nucleotide sequence set forth in SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82 or SEQ ID NO:83.
4. A host cell containing the nucleic acid of claim 1, 2 or 3.
5. A nucleic acid that selectively hybridizes under stringent conditions with the nucleic acid of claim 1, 2 or 3.
6. A nucleic acid having a region within an exon wherein the region has at least homology with the nucleic acid of claim 1, 2, or 3.
7. A nucleic acid having a region within an exon wherein the region has at least homology with the nucleic acid of claim 1, 2, or 3.
8. A nucleic acid having a region within an exon wherein the region has at least homology with the nucleic acid of claim 1, 2 or 3.
9. A nucleic acid having a region within an exon wherein the region has at least homology with the nucleic acid of claim 1, 2 or 3.
10. A nucleic acid having a region within an exon wherein the region has at least homology with the nucleic acid of claim 1, 2 or 3.
11. A nucleic acid having a region within an exon wherein the region has at least homology with the nucleic acid of claim 1, 2 or 3.
12. A protein encoded by the nucleic acid of claims 1, 2, 3, 5, 6, 7, 8, 9, 10 or 11.
13. A nucleic acid comprising a regulatory region of a gene comprising the nucleotide sequence set forth in SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, or SEQ ID NO:83.
14. A construct comprising a regulatory region of claim 13, wherein the regulatory region is functionally linked to a reporter gene.
15. A method of identifying a cellular gene that can suppress a malignant phenotype in a cell, comprising: (a) transferring into a cell culture incapable of growing well in soft agar a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, and (c) isolating from selected cells which are capable of growing in agar a cellular gene within which the marker gene is inserted, thereby identifying a gene that can suppress a malignant phenotype in a cell.
16. A method of identifying a cellular gene that can suppress a malignant phenotype in a cell, comprising: (a) transferring into a cell culture of non-transformed cells a vector encoding a selective marker gene lacking a functional promoter, (b) selecting cells expressing the marker gene, and (c) isolating from selected and transformed cells a cellular gene within which the marker gene is inserted, thereby identifying a gene that can suppress a malignant phenotype in a cell.
17. A method of screening for a compound suppressing a malignant phenotype in a cell comprising administering the compound to a cell containing a cellular gene functionally encoding a gene product involved in establishment of a malignant phenotype in the cell and detecting the level of the gene product produced, a decrease or elimination of the gene product indicating a compound effective for suppressing the malignant phenotype.
18. A method of suppressing a malignant phenotype in a cell in a subject, comprising administering to the subject an amount of a composition that inhibits expression or functioning of a gene product encoded by a gene comprising the nucleic acid set forth in SEQ ID NO:76, SEQ ID NO:77, SEQ ID NO:78, SEQ ID NO:79, SEQ ID NO:80, SEQ ID NO:81, SEQ ID NO:82, SEQ ID NO:83, or a homolog thereof, thereby suppressing a malignant phenotype.
19. The method of claim 18, wherein the composition comprises an antibody that binds a protein encoded by the gene.
20. The method of claim 18, wherein the composition comprises an antibody that binds a receptor for a protein encoded by the gene.
21. The method of claim 18, wherein the composition comprises an antisense RNA that binds an RNA encoded by the gene.
22. The method of claim 18, wherein the composition comprises a nucleic acid functionally encoding an antisense RNA that binds an RNA encoded by the gene.
Dated this TWENTIETH day of DECEMBER 2004.
Vanderbilt University
Applicant
Wray Associates
Perth, Western Australia
Patent Attorneys for the Applicant
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU27484/02A AU780210B2 (en) | 1996-04-15 | 2002-03-20 | Mammalian genes involved in viral infection and tumor suppression |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US60/015334 | 1996-04-15 | ||
AU45105/97A AU742243B2 (en) | 1996-04-15 | 1997-04-11 | Mammalian genes involved in viral infection and tumor suppression |
AU27484/02A AU780210B2 (en) | 1996-04-15 | 2002-03-20 | Mammalian genes involved in viral infection and tumor suppression |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU45105/97A Division AU742243B2 (en) | 1996-04-15 | 1997-04-11 | Mammalian genes involved in viral infection and tumor suppression |
Publications (2)
Publication Number | Publication Date |
---|---|
AU2748402A (en) | 2002-05-16 |
AU780210B2 (en) | 2005-03-10 |
Family
ID=34427344
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU27484/02A Ceased AU780210B2 (en) | 1996-04-15 | 2002-03-20 | Mammalian genes involved in viral infection and tumor suppression |
Title |
---|
"GENE TARGETING - A PRACTICAL APPROACH", EDITED BY JOYNER, A.L., * |
16-20 BY HASTY, P. AND BRADLEY, A. * |
IRL PRESS AT OXFORD UNIVERSITY PRESS (1993) CHAPTER 1, PGS * |