
EP0941366A2 - Biallelic markers - Google Patents

Biallelic markers

Info

Publication number
EP0941366A2
Authority
EP
European Patent Office
Prior art keywords
polymorphic
segment
allele
column
nucleic acid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP97946582A
Other languages
German (de)
French (fr)
Inventor
Eric S. Lander
David Wang
Thomas Hudson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Whitehead Institute for Biomedical Research
Original Assignee
Whitehead Institute for Biomedical Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Whitehead Institute for Biomedical Research
Publication of EP0941366A2
Legal status: Withdrawn

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Definitions

  • the variant form may confer an evolutionary advantage or disadvantage relative to a progenitor form or may be neutral.
  • a variant form confers a lethal disadvantage and is not transmitted to subsequent generations of the organism.
  • a variant form confers an evolutionary advantage to the species and is eventually incorporated into the DNA of many or most members of the species and effectively becomes the progenitor form.
  • a restriction fragment length polymorphism is a variation in DNA sequence that alters the length of a restriction fragment (Botstein et al., Am. J. Hum. Genet. 32, 314-331 (1980)).
  • the restriction fragment length polymorphism may create or delete a restriction site, thus changing the length of the restriction fragment.
  • RFLPs have been widely used in human and animal genetic analyses (see WO 90/13668; WO 90/11369; Donis-Keller, Cell 51, 319-337 (1987); Lander et al., Genetics 121, 85-99 (1989)).
  • the presence of the RFLP in an individual can be used to predict the likelihood that the animal will also exhibit the trait.
  • VNTR: variable number tandem repeat
  • STRs: short tandem repeats
  • VNTRs have been used in identity and paternity analysis (US 5,075,217; Armour et al., FEBS Lett. 307, 113-115 (1992); Horn et al., WO 91/14003; Jeffreys, EP 370,719), and in a large number of genetic mapping studies.
  • Other polymorphisms take the form of single nucleotide variations between individuals of the same species.
  • Such polymorphisms are far more frequent than RFLPs, STRs and VNTRs.
  • Some single nucleotide polymorphisms occur in protein-coding sequences, in which case one of the polymorphic forms may give rise to the expression of a defective or other variant protein and, potentially, a genetic disease. Examples of genes in which polymorphisms within coding sequences give rise to genetic disease include β-globin (sickle cell anemia) and CFTR (cystic fibrosis).
  • Other single nucleotide polymorphisms occur in noncoding regions. Some of these polymorphisms may also result in defective protein expression (e.g., as a result of defective splicing). Other single nucleotide polymorphisms have no phenotypic effects.
  • Single nucleotide polymorphisms can be used in the same manner as RFLPs and VNTRs, but offer several advantages. Single nucleotide polymorphisms occur with greater frequency and are spaced more uniformly throughout the genome than other forms of polymorphism. The greater frequency and uniformity of single nucleotide polymorphisms mean that there is a greater probability that such a polymorphism will be found in close proximity to a genetic locus of interest than would be the case for other polymorphisms. The different forms of characterized single nucleotide polymorphisms are often easier to distinguish than other types of polymorphism (e.g., by use of assays employing allele-specific hybridization probes or primers).
  • the invention provides nucleic acid sequences comprising nucleic acid segments of from about 10 to about 200 bases as shown in the Table, column 7, including a polymorphic site. Complements of these segments are also included.
  • the segments can be DNA or RNA, and can be double- or single-stranded. Segments can be, for example, 10-20, 10-50 or 10-100 bases long. Preferred segments include a biallelic polymorphic site. The base occupying the polymorphic site in the segments can be the reference
  • the invention further provides allele-specific oligonucleotides that hybridize to a segment of a fragment shown in the Table, column 7, or its complement. These oligonucleotides can be probes or primers. Also provided are isolated nucleic acids comprising a sequence shown in the Table, column 7, or the complement thereto, in which the polymorphic site within the sequence is occupied by a base other than the reference base shown in the Table, column 3.
  • the invention further provides a method of analyzing a nucleic acid from an individual.
  • the method determines which base is present at any one of the polymorphic sites shown in the Table.
  • a set of bases occupying a set of the polymorphic sites shown in the Table is determined. This type of analysis can be performed on a number of individuals, who are tested for the presence of a disease phenotype. The presence or absence of disease phenotype is then correlated with a base or set of bases present at the polymorphic sites in the individuals tested.
  • An oligonucleotide can be DNA or RNA, and single- or double-stranded. Oligonucleotides can be naturally occurring or synthetic, but are typically prepared by synthetic means.
  • the oligonucleotides of the present invention can comprise all of an oligonucleotide sequence presented in column 7 of the Table or a segment of such an oligonucleotide which includes a polymorphic site.
  • Oligonucleotides can be all of a nucleic acid segment as represented in column 7 of the Table; a nucleic acid sequence which comprises a nucleic acid segment represented in column 7 of the Table and additional nucleic acids (present at either or both ends of a nucleic acid segment of column 7); or a portion (fragment) of a nucleic acid segment represented in column 7 of the Table which includes a polymorphic site.
  • Preferred oligonucleotides of the invention include segments of DNA, or their complements, which include any one of the polymorphic sites shown in the Table. The segments can be between 5 and 250 bases, and, in specific embodiments, are between 5-10, 5-20, 10-20, 10-50, 20-50 or 10-100 bases.
  • the polymorphic site can occur within any position of the segment.
  • the segments can be from any of the allelic forms of DNA shown in the Table.
  • Hybridization probes are oligonucleotides which bind in a base-specific manner to a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as described in Nielsen et al., Science 254, 1497-1500 (1991).
  • primer refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis under appropriate conditions (e.g.
  • primer site refers to the area of the target DNA to which a primer hybridizes.
  • primer pair refers to a set of primers including a 5' (upstream) primer that hybridizes with the 5' end of the DNA sequence to be amplified and a 3' (downstream) primer that hybridizes with the complement of the 3' end of the sequence to be amplified.
  • linkage describes the tendency of genes, alleles, loci or genetic markers to be inherited together as a result of their location on the same chromosome. It can be measured by percent recombination between the two genes, alleles, loci or genetic markers.
  • polymorphism refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population.
  • a polymorphic marker or site is the locus at which divergence occurs. Preferred markers have at least two alleles, each occurring at frequency of greater than 1%, and more preferably greater than 10% or 20% of a selected population.
  • a polymorphic locus may be as small as one base pair.
  • Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTR's), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu.
  • allelic form is arbitrarily designated as the reference form and other allelic forms are designated as alternative or variant alleles.
  • allelic form occurring most frequently in a selected population is sometimes referred to as the wildtype form. Diploid organisms may be homozygous or heterozygous for allelic forms.
  • a diallelic or biallelic polymorphism has two forms.
  • a triallelic polymorphism has three forms.
  • a single nucleotide polymorphism occurs at a polymorphic site occupied by a single nucleotide, which is the site of variation between allelic sequences. The site is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the populations).
  • a single nucleotide polymorphism usually arises due to substitution of one nucleotide for another at the polymorphic site.
  • a transition is the replacement of one purine by another purine or one pyrimidine by another pyrimidine.
  • a transversion is the replacement of a purine by a pyrimidine or vice versa.
  • Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele.
  • the polymorphic site is occupied by a base other than the reference base. For example, where the reference allele contains the base "T" at the polymorphic site, the altered allele can contain a "C", "G" or "A" at the polymorphic site.
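The transition and transversion definitions above can be illustrated with a minimal sketch. The function name `classify_substitution` is hypothetical and not part of the patent; it only encodes the purine/pyrimidine rule stated in the text.

```python
# Purines and pyrimidines as used in the transition/transversion definitions.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T or U")
    if ref == alt:
        raise ValueError("reference and alternative bases are identical")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

For the example in the text, a reference "T" replaced by "C" is a transition, while "T" replaced by "G" or "A" is a transversion.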
  • Hybridizations are usually performed under stringent conditions, for example, at a salt concentration of no more than 1 M and a temperature of at least 25 °C.
  • For example, conditions of 5X SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30°C, or equivalent conditions, are suitable for allele-specific probe hybridizations.
  • Equivalent conditions can be determined by varying one or more of the parameters given as an example, as known in the art, while maintaining a similar degree of identity or similarity between the target nucleotide sequence and the primer or probe used.
  • an isolated nucleic acid of the invention may be substantially isolated with respect to the complex cellular milieu in which it naturally occurs.
  • the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix.
  • the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC.
  • an isolated nucleic acid comprises at least about 50, 80 or 90 percent (on a molar basis) of all macromolecular species present.
  • the novel polymorphisms of the invention are listed in the Table.
  • the first column of the Table lists the names assigned to the fragments in which the polymorphisms occur.
  • the fragments are all human genomic fragments.
  • the sequence of one allelic form of each of the fragments (arbitrarily referred to as the prototypical or reference form) has been previously published. These sequences are listed at http://www-genome.wi.mit.edu/ (all STS's (sequence tag sites)); http://shgc.stanford.edu (Stanford STS's); and http://ww.tigr.org/ (TIGR STS's).
  • the Web sites also list primers for amplification of the fragments, and the genomic location of fragments. Some fragments are expressed sequence tags, and some are random genomic fragments. All information in the websites concerning the fragments listed in the Table is incorporated by reference in its entirety for all purposes.
  • the second column lists the position in the fragment in which a polymorphic site has been found. Positions are numbered consecutively with the first base of the fragment sequence as listed in one of the above databases being assigned the number one.
  • the third column lists the base occupying the polymorphic site in the sequence in the database. This base is arbitrarily designated the reference or prototypical form, but it is not necessarily the most frequently occurring form.
  • the fourth column in the Table lists the alternative base(s) at the polymorphic site.
  • the fifth column of the Table lists a 5' (upstream or forward) primer that hybridizes with the 5' end of the DNA sequence to be amplified.
  • the sixth column of the Table lists a 3' (downstream or reverse) primer that hybridizes with the complement of the 3' end of the sequence to be amplified.
  • the seventh column of the Table lists a number of bases of sequence on either side of the polymorphic site in each fragment.
  • the indicated sequences can be either DNA or RNA. In the latter, the T's shown in the Table are replaced by U's.
  • the base occupying the polymorphic site is indicated in IUPAC-IUB ambiguity code.
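As an illustration of how an ambiguity code in column 7 relates to the reference base (column 3) and the alternative bases (column 4), here is a minimal sketch. The mapping is the standard IUPAC-IUB nucleotide code; the function names are hypothetical, and the Table itself is not reproduced here.

```python
# Standard IUPAC-IUB nucleotide ambiguity codes.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def alleles_for_code(code: str) -> set:
    """Return the set of bases represented by one ambiguity code."""
    return set(IUPAC_CODES[code.upper()])


def alternative_bases(code: str, reference_base: str) -> set:
    """Return the bases allowed by the code other than the reference base."""
    alleles = alleles_for_code(code)
    ref = reference_base.upper()
    if ref not in alleles:
        raise ValueError("reference base is not covered by the ambiguity code")
    return alleles - {ref}


def to_rna(segment: str) -> str:
    """Write a DNA segment as RNA by replacing T with U."""
    return segment.upper().replace("T", "U")
```

For instance, a site written as "R" with reference base "A" has "G" as its only alternative base.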
  • tissue samples include whole blood, semen, saliva, tears, urine, fecal material, sweat, buccal, skin and hair.
  • tissue sample For assay of cDNA or mRNA, the tissue sample must be obtained from an organ in which the target nucleic acid is expressed. For example, if the target nucleic acid is a cytochrome P450, the liver is a suitable source.
  • PCR DNA Amplification
  • PCR Protocols: A Guide to Methods and Applications (eds. Innis et al., Academic Press, San Diego, CA, 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Patent 4,683,202.
  • LCR: ligase chain reaction
  • NASBA: nucleic acid based sequence amplification
  • the latter two amplification methods involve isothermal reactions based on isothermal transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or 100 to 1, respectively.
  • ssRNA: single stranded RNA
  • dsDNA: double stranded DNA
  • the first type of analysis is carried out to identify polymorphic sites not previously characterized (i.e., to identify new polymorphisms).
  • This analysis compares target sequences in different individuals to identify points of variation, i.e., polymorphic sites.
  • By analyzing groups of individuals representing the greatest ethnic diversity among humans and greatest breed and species variety in plants and animals, patterns characteristic of the most common alleles/haplotypes of the locus can be identified, and the frequencies of such alleles/haplotypes in the population can be determined. Additional allelic frequencies can be determined for subpopulations characterized by criteria such as geography, race, or gender.
  • the de novo identification of polymorphisms of the invention is described in the Examples section.
  • the second type of analysis determines which form(s) of a characterized (known) polymorphism are present in individuals under test. There are a variety of suitable procedures, which are discussed in turn.
  • The use of allele-specific probes for analyzing polymorphisms is described by, e.g., Saiki et al., Nature 324, 163-166 (1986); Dattagupta, EP 235,726; Saiki, WO 89/11548. Allele-specific probes can be designed that hybridize to a segment of target DNA from one individual but do not hybridize to the corresponding segment from another individual due to the presence of different polymorphic forms in the respective segments from the two individuals. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles.
  • Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15-mer at the 7 position; in a 16-mer, at either the 8 or 9 position) of the probe. This design of probe achieves good discrimination in hybridization between different allelic forms.
  • Allele-specific probes are often used in pairs, one member of a pair showing a perfect match to a reference form of a target sequence and the other member showing a perfect match to a variant form. Several pairs of probes can then be immobilized on the same support for simultaneous analysis of multiple polymorphisms within the same target sequence.
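The probe-pair design described above (polymorphic site aligned with a central probe position, one probe matching the reference form and one matching the variant form) can be sketched as follows. This is an illustration only: the function name, the 1-based coordinate convention and the default of placing the site at position 7 of a 15-mer are assumptions taken from the example in the text, and the probes are written in the same sense as the input strand.

```python
def make_probe_pair(seq, site_pos, ref, alt, length=15, site_offset=7):
    """Return (reference_probe, variant_probe) centred on a polymorphic site.

    seq         -- fragment sequence containing the reference base
    site_pos    -- 1-based position of the polymorphic site in seq
    site_offset -- 1-based position of the site within the probe (assumed)
    """
    seq = seq.upper()
    ref, alt = ref.upper(), alt.upper()
    if seq[site_pos - 1] != ref:
        raise ValueError("reference base does not match the sequence")
    start = site_pos - site_offset          # 0-based start of the probe
    end = start + length
    if start < 0 or end > len(seq):
        raise ValueError("not enough flanking sequence for this probe")
    ref_probe = seq[start:end]
    # Swap in the variant base at the polymorphic position only.
    i = site_offset - 1
    alt_probe = ref_probe[:i] + alt + ref_probe[i + 1:]
    return ref_probe, alt_probe
```

Calling `make_probe_pair("ACGTACGTACGTACGTACGTACGTA", 13, "A", "G")` yields two 15-mers that differ only at the seventh base.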
  • the polymorphisms can also be identified by hybridization to nucleic acid arrays, some examples of which are described in WO 95/11995.
  • One form of such arrays is described in the Examples section in connection with de novo identification of polymorphisms.
  • the same array or a different array can be used for analysis of characterized polymorphisms.
  • WO 95/11995 also describes subarrays that are optimized for detection of a variant form of a precharacterized polymorphism.
  • Such a subarray contains probes designed to be complementary to a second reference sequence, which is an allelic variant of the first reference sequence.
  • the second group of probes is designed by the same principles as described in the Examples, except that the probes exhibit complementarity to the second reference sequence.
  • a second group can be particularly useful for analyzing short subsequences of the primary reference sequence in which multiple mutations are expected to occur within a short distance commensurate with the length of the probes (e.g., two or more mutations within 9 to 21 bases).
  • Allele-Specific Primers: An allele-specific primer hybridizes to a site on target DNA overlapping a polymorphism and only primes amplification of an allelic form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acid Res. 17, 2427-2448 (1989). This primer is used in conjunction with a second primer which hybridizes at a distal site. Amplification proceeds from the two primers, resulting in a detectable product which indicates the particular allelic form is present. A control is usually performed with a second pair of primers, one of which shows a single base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site.
  • the single-base mismatch prevents amplification and no detectable product is formed.
  • the method works best when the mismatch is included in the 3'-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456).
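A minimal sketch of the allele-specific primer layout described above, with the polymorphic base placed at the 3'-most position of a forward primer. The function name, the primer length and the choice of the forward strand are illustrative assumptions; real primer design would also consider melting temperature and secondary structure, which are not modelled here.

```python
def allele_specific_primers(seq, site_pos, alleles, length=20):
    """Return {allele: primer} with each allele at the primer's 3' end.

    seq      -- fragment sequence (5' to 3')
    site_pos -- 1-based position of the polymorphic site in seq
    alleles  -- iterable of bases that may occupy the site
    """
    seq = seq.upper()
    start = site_pos - length               # 0-based start of the primer
    if start < 0 or site_pos > len(seq):
        raise ValueError("not enough upstream sequence for this primer")
    # Shared 5' portion: everything up to, but excluding, the polymorphic site.
    upstream = seq[start:site_pos - 1]
    return {allele.upper(): upstream + allele.upper() for allele in alleles}
```

Each primer in the returned mapping perfectly matches only one allelic form at its last base, which is the position the text identifies as most destabilizing to elongation when mismatched.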
  • the direct analysis of the sequence of polymorphisms of the present invention can be accomplished using either the dideoxy chain termination method or the Maxam Gilbert method (see Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual (Acad. Press, 1988)).
  • 5. Denaturing Gradient Gel Electrophoresis: Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution. Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification (W.H. Freeman and Co, New York, 1992), Chapter 7.
  • Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described in Orita et al., Proc. Nat. Acad. Sci. 86,
  • Amplified PCR products can be generated as described above, and heated or otherwise denatured, to form single stranded amplification products.
  • Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence.
  • the different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence differences between alleles of target sequences.
  • polymorphisms of the invention are often used in conjunction with polymorphisms in distal genes.
  • Preferred polymorphisms for use in forensics are biallelic because the population frequencies of two polymorphic forms can usually be determined with greater accuracy than those of multiple polymorphic forms at multi-allelic loci.
  • the capacity to identify a distinguishing or unique set of forensic markers in an individual is useful for forensic analysis. For example, one can determine whether a blood sample from a suspect matches a blood or other tissue sample from a crime scene by determining whether the set of polymorphic forms occupying selected polymorphic sites is the same in the suspect and the sample. If the set of polymorphic markers does not match between a suspect and a sample, it can be concluded (barring experimental error) that the suspect was not the source of the sample. If the set of markers does match, one can conclude that the DNA from the suspect is consistent with that found at the crime scene.
  • If frequencies of the polymorphic forms at the loci tested have been determined (e.g., by analysis of a suitable population of individuals), one can perform a statistical analysis to determine the probability that a match of suspect and crime scene sample would occur by chance.
  • p(ID) is the probability that two random individuals have the same polymorphic or allelic form at a given polymorphic site. In biallelic loci, four genotypes are possible: AA, AB, BA, and BB. If alleles A and B occur in a haploid genome of the organism with frequencies x and y, the probability of each genotype in a diploid organism is
  • the cumulative probability of identity (cum p(ID)) for each of multiple unlinked loci is determined by multiplying the probabilities provided by each locus.
  • cum p(ID) = p(ID1)p(ID2)p(ID3) ...
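The probability-of-identity calculation can be sketched numerically. The genotype probabilities are elided in the text above, so this sketch assumes Hardy-Weinberg proportions (x² for AA, 2xy for the heterozygote, y² for BB, with y = 1 - x); that assumption and the function names are ours, not the source's. The cumulative value follows the product rule stated for unlinked loci.

```python
import math


def p_id(x):
    """Probability that two random individuals share a genotype at one
    biallelic locus, assuming Hardy-Weinberg genotype frequencies."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    y = 1.0 - x
    # Sum of squared genotype frequencies: AA, heterozygote, BB.
    return (x * x) ** 2 + (2 * x * y) ** 2 + (y * y) ** 2


def cum_p_id(allele_freqs):
    """cum p(ID) = p(ID1) * p(ID2) * ... for unlinked loci."""
    return math.prod(p_id(x) for x in allele_freqs)
```

Under these assumptions a locus with x = y = 0.5 gives p(ID) = 0.375, so each additional unlinked locus of that kind multiplies the chance-match probability by 0.375.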
  • the object of paternity testing is usually to determine whether a male is the father of a child. In most cases, the mother of the child is known and thus, the mother's contribution to the child's genotype can be traced. Paternity testing investigates whether the part of the child's genotype not attributable to the mother is consistent with that of the putative father. Paternity testing can be performed by analyzing sets of polymorphisms in the putative father and the child. If the set of polymorphisms in the child attributable to the father does not match the set of polymorphisms of the putative father, it can be concluded, barring experimental error, that the putative father is not the real father. If the set of polymorphisms in the child attributable to the father does match the set of polymorphisms of the putative father, a statistical calculation can be performed to determine the probability of coincidental match.
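The exclusion logic in the paternity paragraph can be sketched as below. The data layout (one genotype pair per person per locus) and the function names are hypothetical; the sketch ignores mutation and genotyping error, which the text covers with "barring experimental error".

```python
def paternal_candidates(child, mother):
    """Alleles in the child's genotype that could have come from the father."""
    a, b = child
    candidates = set()
    if a in mother:          # a may be maternal, so b may be paternal
        candidates.add(b)
    if b in mother:          # b may be maternal, so a may be paternal
        candidates.add(a)
    if not candidates:
        raise ValueError("child shares no allele with the mother at this locus")
    return candidates


def excluded_loci(trios):
    """Return loci at which the putative father cannot be the source.

    trios -- dict mapping locus name to (child, mother, father) genotypes,
             each genotype being a pair of allele labels.
    """
    excluded = []
    for locus, (child, mother, father) in trios.items():
        if not paternal_candidates(child, mother) & set(father):
            excluded.append(locus)
    return excluded
```

An empty result means the putative father is not excluded at any tested locus, at which point the statistical calculation of coincidental match mentioned in the text becomes relevant.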
  • polymorphisms of the invention may contribute to the phenotype of an organism in different ways. Some polymorphisms occur within a protein coding sequence and contribute to phenotype by affecting protein structure.
  • the effect may be neutral, beneficial or detrimental, or both beneficial and detrimental, depending on the circumstances.
  • a heterozygous sickle cell mutation confers resistance to malaria, but a homozygous sickle cell mutation is usually lethal.
  • Other polymorphisms occur in noncoding regions but may exert phenotypic effects indirectly via influence on replication, transcription, and translation.
  • a single polymorphism may affect more than one phenotypic trait.
  • a single phenotypic trait may be affected by polymorphisms in different genes. Further, some polymorphisms predispose an individual to a distinct mutation that is causally related to a certain phenotype.
  • Phenotypic traits include diseases that have known but hitherto unmapped genetic components (e.g., agammaglobulinemia, diabetes insipidus, Lesch-Nyhan syndrome, muscular dystrophy, Wiskott-Aldrich syndrome,
  • Phenotypic traits also include symptoms of, or susceptibility to, multifactorial diseases of which a component is or may be genetic, such as autoimmune diseases, inflammation, cancer, diseases of the nervous system, and infection by pathogenic microorganisms.
  • autoimmune diseases include rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent and non-insulin-dependent), systemic lupus erythematosus and Graves disease.
  • cancers include cancers of the bladder, brain, breast, colon, esophagus, kidney, leukemia, liver, lung, oral cavity, ovary, pancreas, prostate, skin, stomach and uterus.
  • Phenotypic traits also include characteristics such as longevity, appearance (e.g., baldness, obesity), strength, speed, endurance, fertility, and susceptibility or receptivity to particular drugs or therapeutic treatments.
  • Correlation is performed for a population of individuals who have been tested for the presence or absence of a phenotypic trait of interest and for polymorphic marker sets.
  • a set of polymorphisms (i.e., a polymorphic set)
  • the alleles of each polymorphism of the set are then reviewed to determine whether the presence or absence of a particular allele is associated with the trait of interest.
  • Correlation can be performed by standard statistical methods such as a chi-squared test, and statistically significant correlations between polymorphic form(s) and phenotypic characteristics are noted.
  • allele A1 at polymorphism A correlates with heart disease.
  • allele B1 at polymorphism B correlates with increased milk production of a farm animal.
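The chi-squared correlation mentioned above can be sketched for the simplest case, a 2x2 table of allele presence against trait presence. This is an illustration with made-up counts, not data from the patent; it uses the uncorrected Pearson statistic (no Yates continuity correction) and the closed-form p-value for one degree of freedom.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value (1 d.f.) for a 2x2 table.

    table -- [[carriers_with_trait, carriers_without_trait],
              [noncarriers_with_trait, noncarriers_without_trait]]
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-squared distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With hypothetical counts such as `[[30, 10], [20, 40]]`, the statistic is about 16.7 and the p-value is well below 0.001, which would be noted as a statistically significant correlation between the allele and the trait.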
  • Such correlations can be exploited in several ways.
  • detection of the polymorphic form set in a human or animal patient may justify immediate administration of treatment, or at least the institution of regular monitoring of the patient.
  • Detection of a polymorphic form correlated with serious disease in a couple contemplating a family may also be valuable to the couple in their reproductive decisions.
  • the female partner might elect to undergo in vitro fertilization to avoid the possibility of transmitting such a polymorphism from her husband to her offspring.
  • immediate therapeutic intervention or monitoring may not be justified.
  • the patient can be motivated to begin simple life-style changes (e.g., diet, exercise) that can be accomplished at little cost to the patient but confer potential benefits in reducing the risk of conditions to which the patient may have increased susceptibility by virtue of variant alleles.
  • Identification of a polymorphic set in a patient correlated with enhanced receptiveness to one of several treatment regimes for a disease indicates that this treatment regime should be followed.
  • $Y_{ijkpn} = \mu + YS_i + P_j + X_k + \beta_1 + \dots + \beta_{17} + PE_n + a_n + e_p$
  • $Y_{ijkpn}$ is the milk, fat, fat percentage, SNF, SNF percentage, energy concentration, or lactation energy record
  • $\mu$ is an overall mean
  • $YS_i$ is the effect common to all cows calving in year-season
  • $X_k$ is the effect common to cows in either the high or average selection line
  • $\beta_1$ to $\beta_{17}$ are the binomial regressions of production record on mtDNA D-loop sequence polymorphisms
  • $PE_n$ is permanent environmental effect common to all records of cow n
  • $a_n$ is effect of animal n and is composed of the additive genetic contribution of sire and dam breeding values and a Mendelian sampling effect
  • $e_p$ is a random residual. It was found that eleven of seventeen polymorphisms tested influenced at least one production trait. Bovines having the best
  • D. Genetic Mapping of Phenotypic Traits: The previous section concerns identifying correlations between phenotypic traits and polymorphisms that directly or indirectly contribute to those traits.
  • the present section describes identification of a physical linkage between a genetic locus associated with a trait of interest and polymorphic markers that are not associated with the trait, but are in physical proximity with the genetic locus responsible for the trait and co-segregate with it.
  • Such analysis is useful for mapping a genetic locus associated with a phenotypic trait to a chromosomal position, and thereby cloning gene(s) responsible for the trait. See Lander et al., Proc. Natl. Acad. Sci.
  • Linkage studies are typically performed on members of a family. Available members of the family are characterized for the presence or absence of a phenotypic trait and for a set of polymorphic markers. The distribution of polymorphic markers in an informative meiosis is then analyzed to determine which polymorphic markers co-segregate with a phenotypic trait. See, e.g., Kerem et al., Science 245, 1073-1080 (1989); Monaco et al., Nature 316, 842 (1985); Yamoka et al., Neurology 40, 222-226 (1990); Rossiter et al., FASEB Journal 5, 21-27 (1991).
  • LOD: log of the odds
  • the likelihood ratio at a given value of θ is: probability of data if loci linked at θ to probability of data if loci unlinked.
  • the computed likelihoods are usually expressed as the log10 of this ratio (i.e., a lod score).
  • a lod score of 3 indicates 1000:1 odds against an apparent observed linkage being a coincidence.
  • the use of logarithms allows data collected from different families to be combined by simple addition. Computer programs are available for the calculation of lod scores for differing values of θ (e.g., LIPED, MLINK (Lathrop, Proc. Nat. Acad. Sci. (USA) 81, 3443-3446 (1984))).
  • a recombination fraction may be determined from mathematical tables. See Smith et al., Mathematical tables for research workers in human genetics (Churchill, London, 1961); Smith, Ann. Hum. Genet. 32, 127-150 (1968). The value of θ at which the lod score is the highest is considered to be the best estimate of the recombination fraction.
  • Positive lod score values suggest that the two loci are linked, whereas negative values suggest that linkage is less likely (at that value of θ) than the possibility that the two loci are unlinked.
  • a combined lod score of +3 or greater is considered definitive evidence that two loci are linked.
  • a negative lod score of -2 or less is taken as definitive evidence against linkage of the two loci being compared.
  • Negative linkage data are useful in excluding a chromosome or a segment thereof from consideration. The search focuses on the remaining non-excluded chromosomal locations.
  • the invention further provides variant forms of nucleic acids and corresponding proteins.
  • the nucleic acids comprise one of the sequences described in the Table, column 8, in which the polymorphic position is occupied by one of the alternative bases for that position. Some nucleic acids encode full-length variant forms of proteins.
  • variant proteins have the prototypical amino acid sequences encoded by nucleic acid sequences shown in the Table, column 8 (read so as to be in-frame with the full-length coding sequence of which it is a component), except at an amino acid encoded by a codon including one of the polymorphic positions shown in the Table. That position is occupied by the amino acid coded by the corresponding codon in any of the alternative forms shown in the Table.
  • Variant genes can be expressed in an expression vector in which a variant gene is operably linked to a native or other promoter.
  • the promoter is a eukaryotic promoter for expression in a mammalian cell.
  • the transcription regulation sequences typically include a heterologous promoter and optionally an enhancer which is recognized by the host.
  • the selection of an appropriate promoter for example trp, lac, phage promoters, glycolytic enzyme promoters and tRNA promoters, depends on the host selected.
  • Commercially available expression vectors can be used. Vectors can include host-recognized replication systems, amplifiable genes, selectable markers, host sequences useful for insertion into the host genome, and the like.
  • the means of introducing the expression construct into a host cell varies depending upon the particular construction and the target host. Suitable means include fusion, conjugation, transfection, transduction, electroporation or injection, as described in Sambrook, supra.
  • a wide variety of host cells can be employed for expression of the variant gene, both prokaryotic and eukaryotic. Suitable host cells include bacteria such as E. coli, yeast, filamentous fungi, insect cells, and mammalian cells, typically immortalized, e.g., mouse, CHO, human and monkey cell lines and derivatives thereof. Preferred host cells are able to process the variant gene product to produce an appropriate mature polypeptide.
  • the protein may be isolated by conventional means of protein biochemistry and purification to obtain a substantially pure product, i.e., 80, 95 or 99% free of cell component contaminants, as described in Jacoby, Methods in Enzymology Volume 104, Academic Press, New York (1984); Scopes, Protein Purification, Principles and Practice, 2nd Edition, Springer-Verlag, New York (1987); and Deutscher (ed.), Guide to Protein Purification, Methods in Enzymology, Vol. 182 (1990). If the protein is secreted, it can be isolated from the supernatant in which the host cell is grown. If not secreted, the protein can be isolated from a lysate of the host cells.
  • the invention further provides transgenic nonhuman animals capable of expressing an exogenous variant gene and/or having one or both alleles of an endogenous variant gene inactivated.
  • Expression of an exogenous variant gene is usually achieved by operably linking the gene to a promoter and optionally an enhancer, and microinjecting the construct into a zygote.
  • Inactivation of endogenous variant genes can be achieved by forming a transgene in which a cloned variant gene is inactivated by insertion of a positive selection marker. See Capecchi, Science 244, 1288-1292 (1989).
  • the transgene is then introduced into an embryonic stem cell, where it undergoes homologous recombination with an endogenous variant gene. Mice and other rodents are preferred animals. Such animals provide useful drug screening systems.
  • the present invention includes biologically active fragments of the polypeptides, or analogs thereof, including organic molecules which simulate the interactions of the peptides.
  • Biologically active fragments include any portion of the full-length polypeptide which confers a biological function on the variant gene product, including ligand binding, and antibody binding.
  • Ligand binding includes binding by nucleic acids, proteins or polypeptides, small biologically active molecules, or large cellular structures.
  • Antibodies that specifically bind to variant gene products but not to corresponding prototypical gene products are also provided.
  • Antibodies can be made by injecting mice or other animals with the variant gene product or synthetic peptide fragments thereof. Monoclonal antibodies are screened as described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Press, New York (1988); Goding, Monoclonal antibodies, Principles and Practice (2d ed.) Academic Press, New York (1986). Monoclonal antibodies are tested for specific immunoreactivity with a variant gene product and lack of immunoreactivity to the corresponding prototypical gene product. These antibodies are useful in diagnostic assays for detection of the variant form, or as an active ingredient in a pharmaceutical composition.
  • kits comprising at least one allele-specific oligonucleotide as described above. Often, the kits contain one or more pairs of allele-specific oligonucleotides hybridizing to different forms of a polymorphism. In some kits, the allele-specific oligonucleotides are provided immobilized to a substrate.
  • the same substrate can comprise allele-specific oligonucleotide probes for detecting at least 10, 100 or all of the polymorphisms shown in the Table.
  • kits include, for example, restriction enzymes, reverse transcriptase or polymerase, the substrate nucleoside triphosphates, means used to label (for example, an avidin-enzyme conjugate and enzyme substrate and chromogen if the label is biotin), and the appropriate buffers for reverse transcription, PCR, or hybridization reactions.
  • the kit also contains instructions for carrying out the methods.
  • the polymorphisms shown in the Table were identified by resequencing of target sequences from three to ten unrelated individuals of diverse ethnic and geographic backgrounds by hybridization to probes immobilized to microfabricated arrays or conventional sequencing.
  • the strategy and principles for design and use of such arrays are generally described in WO 95/11995.
  • the strategy provides arrays of probes for analysis of target sequences showing a high degree of sequence identity to the reference sequences of the fragments shown in the Table, column 1.
  • the reference sequences were sequence-tagged sites (STSs) developed in the course of the Human Genome Project (see, e.g., Science 270, 1945-1954 (1995); Nature 380, 152-154 (1996)).
  • a typical probe array used in this analysis has two groups of four sets of probes that respectively tile both strands of a reference sequence.
  • a first probe set comprises a plurality of probes exhibiting perfect complementarity with one of the reference sequences.
  • Each probe in the first probe set has an interrogation position that corresponds to a nucleotide in the reference sequence. That is, the interrogation position is aligned with the corresponding nucleotide in the reference sequence when the probe and reference sequence are aligned to maximize complementarity between the two.
  • For each probe in the first set there are three corresponding probes from three additional probe sets. Thus, there are four probes corresponding to each nucleotide in the reference sequence.
  • probes from the three additional probe sets are identical to the corresponding probe from the first probe set except at the interrogation position, which occurs in the same position in each of the four corresponding probes from the four probe sets, and is occupied by a different nucleotide in the four probe sets.
  • probes were 25 nucleotides long. Arrays tiled for multiple different reference sequences were included on the same substrate.
  • target sequences from an individual were amplified from human genomic DNA using primers for the fragments indicated in the listed Web sites.
  • the amplified target sequences were fluorescently labelled during or after PCR.
  • the labelled target sequences were hybridized with a substrate bearing immobilized arrays of probes. The amount of label bound to probes was measured. Analysis of the pattern of label revealed the nature and position of differences between the target and reference sequence. For example, comparison of the intensities of four corresponding probes reveals the identity of a corresponding nucleotide in the target sequences aligned with the interrogation position of the probes.
  • the corresponding nucleotide is the complement of the nucleotide occupying the interrogation position of the probe showing the highest intensity (see WO 95/11995).
  • the existence of a polymorphism is also manifested by differences in normalized hybridization intensities of probes flanking the polymorphism when the probes are hybridized to corresponding targets from different individuals. For example, relative loss of hybridization intensity in a "footprint" of probes flanking a polymorphism signals a difference between the target and reference (i.e., a polymorphism) (see EP 717,113).
  • hybridization intensities for corresponding targets from different individuals can be classified into groups or clusters suggested by the data, not defined a priori, such that isolates in a given cluster tend to be similar and isolates in different clusters tend to be dissimilar. Hybridizations to samples from different individuals were performed separately. The Table summarizes the data obtained for target sequences in comparison with a reference sequence for the individuals tested.
  • the invention includes a number of general uses that can be expressed concisely as follows.
  • the invention provides for the use of any of the nucleic acid segments described above in the diagnosis or monitoring of diseases, such as cancer, inflammation, heart disease, diseases of the CNS, and susceptibility to infection by microorganisms.
  • the invention further provides for the use of any of the nucleic acid segments in the manufacture of a medicament for the treatment or prophylaxis of such diseases.
  • the invention further provides for the use of any of the DNA segments as a pharmaceutical.
  • Wl-7718b 248 AGGAACAAAAAATTACAAAGAACCATGCAGGAAGGAAAACTATGTATT[A/G1AT
  • ATrGCACTG GTTTTTGAAATACCTTTGTAGTTACTCAAGC[A/C, ⁇ GTTACTCCCTACACTGATGC AAGGATTACAGAAACTGATGCCAAGGGGCTGAGTGAGTTCAACTACATGTTCTGGGGGCCCGGAGAT AGATGACTTTGCAGATGGAMGAGGTGAAAATGAAGAAGGAAGCTGTGTTGAAACAGAAAAATAAG
  • Wl-7718a 42 TCAAAAGGAACAAAAATTACAAAGAACCATGCAGGAAGGAAAACTATGTATTA
  • WI-7227C 291 TTATTATGGGAAAGGAAATGGCATTGCTGCTTTCAACCAGCGACTAATGCAAT
  • Wl-7227b 93 GTGTTATTATGGGAAAGGAAATGGCATTGCTGCTTTCAACCAGCGACTAATG
  • Wl-7227a 24 G GTGTTATTATGGGAAAGGAAATGGCATTGCTGCTTTCAACCAGCGACTAATG j CCACAATGCCTCTCCCACGATGTCAAGGACTCCTGTCTGTCCTGGAGGTGGGAGACAAGGAACCTCCG
  • Wl-1 95b 1 30 AGTGAGCTGGGGAAGGCAGGATTT
  • Wl-1126b 230 AAAATGCAAATCCAGCTGT CTTTTT[T/C
  • Wl-3429b 64 TCCTGACTGTTAACAAGCACTCCAGGCAATTCTTAAGACCAAGCACGGAGC
  • Wl-3429a 62 TCCTGACTGTTAACAAGCACTCCAGGCAATTCTTAAGACCAAGCACGGAGC
  • Wl-6786b 1 1 1 TTTTTGGCAGGGGACACTCCTTCTGGGTGCTCTATTGCTCAGTTTCATCATT
  • Wl-6786a 1 06 TTTTTGGCAGGGGACACTCCTTCTGGGTGCTCTATTGCTCAGTTTCATCATT
  • AAAAGGACAG TTTCCATCTTA CCAGATATCA TTTCATTTCTG CAACATTTATCAAACATGGTAGGGAAMGTTCTCACTCTGCACTATAAAAAGGACAGCCAGATATCA
  • CAGAAMTCA ATGAGACCCTGCTTTGMCGTTAMCGTTTTGGMTMTGGAAMGGAGCTAGGACMTTCTTGCTT
  • AAAAATTAAC CAGGGTCTTGCTCTGTCTCCCAGGCTAGAGTGAGGTGACACMTCMGACTCACAGTAGCCTCMCCT
  • WI-7079 293 TTTTACAGCTCTTGGCAT ⁇ TCCTCGCCTAGGCCTGTGAGGTMCTGGGAT
  • Wl-7104b 249 GTGAGGCCTTGCACCAGGTGGGGGCCACAGCACCAGCAGCATCTTTG[CtFJF
  • WI-9161 61 1 CCTGGC GGM CTGTCTAGTCTCTCCTGTMGCCAMGMATGMCATTCCA
  • Wl-7023b 206 A[C/A]ACACACATTCTTGCTCTACCCAMGCTCTGGCTGGCAGCACTM
  • WI-7093 54 GGGAGAGCTCTTGTTATFATTMTATTGTTGCCGCTGTTGTGTTGTTGTTA
  • ACTTCTCCC TCTGACCTAGG MAGMCTACAGAGGACGATGTCCAAMCMAAMTGGCATCACCTGTCAAAMTGGAGTTCCACT
  • WI-205C 1 46 ATCTTACTTTGTTTAAMMCTGCATATGCCTTTA I I I I I GTTTTAGTTCCC
  • Wl-205b 1 46 ATCTTACTTTGTTTAAAAMCTGCATATGCCTTTATTTTTGTTTTAGTTCCC
  • WI-1943C 1 65 TACAGGGCACCGNTGAGCATTCCAGATGACTCCAMGCCCCGGCTGGAGTAT
  • Wl-1943b 1 65 TACAGGGCACCGNTGAGCATTCCAGATGACTCCAMGCCCCGGCTGGAGTAT
  • Wi-6336b 234 GTACCCCAGTGCATTATGTCTTGGTAGAGCC[C/T]TGAGGACACTGACAGT
  • Wl-6564b 54 GTTCCTTGGCAGGAGMCATGCATATGACTTTAAMTMAGACCMCA
  • Wl-6817 1 45 MGATGTTGGACACCTTGTGTTCAMTCTTGGTTCAGGTGCGGCCTGTGCAG
  • Wl-6826b 1 54 TMGCTGMTTGCAMTTATGGCMCACACACTGGACTGGGGTATACGTTG
  • WI-6826 1 54 TMGCTGMTTGCAMTTATGGCMCACACACTGGACTGGGGTATACGTTG
  • Wl-7056b 1 8
  • WI-7136 58 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGTAGCTTTCTATATATG
  • WI-7146C 21 0 MCGC[A/G]GTTCATGTACMGGCCCCTCTGCMCTGGAGAGAAMTTA
  • WI-7146 202 ICCMCGCAGTTCATGTACMGGCCCCTCTGCMCTGGAGAGAAMTTA
  • WI-7153 1 61 AGTACCTATCTTTAMGTATAGTACATTTTACATATGTAAATGGTATGTTT
  • Wl-7169b 1 61 TTTCMGTCATCTTAGCAGCTAGGATTCTCAMTGGMGTGTTATATATA
  • Wl-7464b 1 68 GAMGMAGCCCTACAMTAGGCCCAGGAGMGCMCGTTCACCMCMTTAT
  • Wl-7464a 1 03 GAMGMAGCCCTACAMTAGGCCCAGGAGMGCAACGTTCACCMCMTTAT
  • Wl-7506b 1 1 8 GMGMMTATTTTAAMTATTGGACCACTCTTGTTCTACCATCCCTACCCACT
  • Wl-7534b 1 43 AGAGTGCTGCTAAM ⁇ GGATTGGTGTGATCTTTTTGGTAGTTGTMTTT
  • WI-7534 1 35 AGAGTGCTGCTAAMTTGGATTGGTGTGATCTTTTTGGTAGTTGTMTTT
  • Wl-7543b 1 62 CTCTGCAGCCCTCAGATFATTTTTCCTCTGGCTCCTTGGATGTAGTCAGTTA
  • Wl-7577g 1 57 ATTGTATMTGTGGCCTGTTATACATGACACTCTTCTGMTTGACTGTATTTC
  • WI-7743C 1 06 GAGGGGCAGMCAGCCGCTCCTGTCTGCCAGCCAGCAGCCAGCTCTCAGCC
  • Wl-7743 1 06 GAGGGGCAGMCAGCCGCTCCTGTCTGCCAGCCAGCAGCCAGCTCTCAGCC
  • Wl-7765b 1 26 ACTCAMCCAMTCACTGMCTTTGCTGAGCCTGTAMATAAMGGTCGGA
  • Wl-7774b 1 70 ATGATTGAAMTMTGCTGTCCTTTAGTAGCMGTAAMTGTGTCTTGCT
  • Wl-7785b 1 65 TAATTIATTTTGTCCATTGATGTATTTATTTTGTAMTGTATCTTGGTGCTGC
  • WI-7789C 84 GCCCTCCTGGTGACTCGGGGGCTGTCTCAGACGACTAGCCCAGGACCCATCT _
  • Wl-7830d 1 50 T AGGTTGATCGTTGTGTTGTTRTGCTGCACTTTTTACI I I I I IGCGTGTGGA
  • WI-7830C 54 AGGTTGATCGTTGTGTTGTTTFGCTGCACTTTTTAC I I I I I GCGTGTGGA
  • Wl-7830b 1 34 AGGTTGATCGTTGTGTTGTTFTGCTGCACTTTTTAC I I I I I GCGTGTGGA
  • Wl-7900d 1 28 TATGATGTATTTCTGAGCTAAMCTCMCTATAGMGACATTAAAAGAAATC
  • WI-7900C 84 TATGATGTATTTCTGAGCTAAMCTCMCTATAGMGACATTAMAGAAATC
  • WI-7900 84 TATGATGTATTTCTGAGCTAAAACTCAACTATAGAAGACATTAAAAGAAATC
  • WI-8024C 206 TTCCC[A/G]CTCTAGMCAGCTGGCCCTGGTCGTCAGTACACMGGAMGAGC
  • Wl-8024b 206 TTCCC[A/G]CTCTAGMCAGCTGGCCCTGGTCGTCAGTACACMGGMAGAGC
  • WI-8321 1 78 TTTTGCTATGGTTCTAGTTFATCMCCTACTTTATTAGCTGMCTGTTGGC
  • WI-8321 1 78 TTTFGCTATGGTTCTAGTTTATCMCCTACTTTATTAGCTGMCTGTTGGC
  • Wl-8332b 123 AGGTGGAGGGTNTCCGGGGMGCAGTTAGATGAGTTMGTGTGATGCACA
  • Wl-8378b 31 1 MCTGCCCCCATGATCCMTCACCTNTCACCAGGCCCCTCCTCCMCACGTGGGG
  • WI-8378 308 MCTGCCCCCATGATCCMTCACCTNTCACCAGGCCCCTCCTCCMCACGTGGGG
  • WI-8426 1 84 G AGGCTGGGAGTATGGANGGNCCCGGGGCCCTTGGCNATNGNATFCAGTGAG
  • Wl-9676h 1 34 AGGCCAGGGTCTCAGCTTTAMGCCTTGGMTCCTATGCATTGTTTGTTT
  • Wl-9676d 1 34 AGGCCAGGGTCTCAGCTTTAMGCCTTGGMTCCTATGCATTGTTTGTTT
  • WI-9832 1 1 6 A TTTGTMGTGGACTAMGTTTGAGGACCAGACATGGMGGTTGGCTTTGGC
  • AAAGCATGAC CGCTTATGTTA AATAAAATGA ATAGTMTTCC CMGTGAATATTGATACATGGCTGACMAGCATGACMTMMTGMCAC[A/G]TACGGGMTTAC
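
The four-probe tiling scheme described in the list above (one perfect-match probe plus three probes that differ only at the interrogation position, with the target base read as the complement of the brightest probe's interrogation base) can be sketched in Python. This is only an illustrative sketch and not the procedure of WO 95/11995: the function names, the centred interrogation position, and the intensity values in the usage note are assumptions made for the example.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
BASES = "ACGT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def probe_sets(reference, position, length=25):
    """Return the four probes interrogating `position` (0-based) of `reference`.

    Each probe is written 5'->3' and is complementary to the reference
    window centred on `position` (assumed layout), except that the four
    probes carry four different bases at the interrogation position.
    The result is keyed by the base at the probe's interrogation position.
    """
    half = length // 2
    start, end = position - half, position + half + 1
    if start < 0 or end > len(reference):
        raise ValueError("probe window does not fit inside the reference")
    window = reference[start:end]
    probes = {}
    for target_base in BASES:
        # Put each possible target base at the centre, then take the
        # reverse complement so the probe pairs with that target.
        variant = window[:half] + target_base + window[half + 1:]
        probes[COMPLEMENT[target_base]] = reverse_complement(variant)
    return probes


def call_base(intensities):
    """Call the target base from four intensities.

    `intensities` maps the interrogation base of each probe to its signal.
    The called base is the complement of the brightest probe's base.
    """
    brightest = max(intensities, key=intensities.get)
    return COMPLEMENT[brightest]
```

For instance, with hypothetical intensities `{"A": 10, "C": 95, "G": 12, "T": 8}`, the probe carrying C at the interrogation position is brightest, so `call_base` reports G in the target.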

Abstract

The invention provides nucleic acid segments of the human genome including polymorphic sites. Allele-specific primers and probes hybridizing to regions flanking these sites are also provided. The nucleic acids, primers and probes are used in applications such as forensics, paternity testing, medicine and genetic analysis.

Description

BIALLELIC MARKERS
RELATED APPLICATIONS
This application claims priority to U.S. provisional application Serial No. 60/030,455, filed November 6, 1996, the entire teachings of which are incorporated herein by reference.
BACKGROUND OF THE INVENTION
The genomes of all organisms undergo spontaneous mutation in the course of their continuing evolution, generating variant forms of progenitor sequences (Gusella, Ann. Rev. Biochem. 55, 831-854 (1986)). The variant form may confer an evolutionary advantage or disadvantage relative to a progenitor form or may be neutral. In some instances, a variant form confers a lethal disadvantage and is not transmitted to subsequent generations of the organism. In other instances, a variant form confers an evolutionary advantage to the species and is eventually incorporated into the DNA of many or most members of the species and effectively becomes the progenitor form. In many instances, both progenitor and variant form(s) survive and co-exist in a species population. The coexistence of multiple forms of a sequence gives rise to polymorphisms.
Several different types of polymorphism have been reported. A restriction fragment length polymorphism (RFLP) is a variation in DNA sequence that alters the length of a restriction fragment (Botstein et al., Am. J. Hum. Genet. 32, 314-331 (1980)). The restriction fragment length polymorphism may create or delete a restriction site, thus changing the length of the restriction fragment. RFLPs have been widely used in human and animal genetic analyses (see WO 90/13668; WO 90/11369; Donis-Keller, Cell 51, 319-337 (1987); Lander et al., Genetics 121, 85-99 (1989)). When a heritable trait can be linked to a particular RFLP, the presence of the RFLP in an individual can be used to predict the likelihood that the animal will also exhibit the trait.
Other polymorphisms take the form of short tandem repeats (STRs) that include tandem di-, tri- and tetra-nucleotide repeated motifs. These tandem repeats are also referred to as variable number tandem repeat (VNTR) polymorphisms. VNTRs have been used in identity and paternity analysis (US 5,075,217; Armour et al., FEBS Lett. 307, 113-115 (1992); Horn et al., WO 91/14003; Jeffreys, EP 370,719), and in a large number of genetic mapping studies.
Other polymorphisms take the form of single nucleotide variations between individuals of the same species. Such polymorphisms are far more frequent than RFLPs, STRs and VNTRs. Some single nucleotide polymorphisms occur in protein-coding sequences, in which case one of the polymorphic forms may give rise to the expression of a defective or other variant protein and, potentially, a genetic disease. Examples of genes in which polymorphisms within coding sequences give rise to genetic disease include β-globin (sickle cell anemia) and CFTR (cystic fibrosis). Other single nucleotide polymorphisms occur in noncoding regions. Some of these polymorphisms may also result in defective protein expression (e.g., as a result of defective splicing). Other single nucleotide polymorphisms have no phenotypic effects.
Single nucleotide polymorphisms can be used in the same manner as RFLPs and VNTRs, but offer several advantages. Single nucleotide polymorphisms occur with greater frequency and are spaced more uniformly throughout the genome than other forms of polymorphism. The greater frequency and uniformity of single nucleotide polymorphisms mean that there is a greater probability that such a polymorphism will be found in close proximity to a genetic locus of interest than would be the case for other polymorphisms. The different forms of characterized single nucleotide polymorphisms are often easier to distinguish than other types of polymorphism (e.g., by use of assays employing allele-specific hybridization probes or primers).
Only a small percentage of the total repository of polymorphisms in humans and other organisms has been identified. The limited number of polymorphisms identified to date is due to the large amount of work required for their detection by conventional methods. For example, a conventional approach to identifying polymorphisms might be to sequence the same stretch of DNA in a population of individuals by dideoxy sequencing. In this type of approach, the amount of work increases in proportion to both the length of sequence and the number of individuals in a population and becomes impractical for large stretches of DNA or large numbers of persons.
SUMMARY OF THE INVENTION
The invention provides nucleic acid sequences comprising nucleic acid segments of from about 10 to about 200 bases as shown in the Table, column 7, including a polymorphic site. Complements of these segments are also included. The segments can be DNA or RNA, and can be double- or single-stranded. Segments can be, for example, 10-20, 10-50 or 10-100 bases long. Preferred segments include a biallelic polymorphic site. The base occupying the polymorphic site in the segments can be the reference (Table, column 3) or an alternative base (Table, column 4).
The invention further provides allele-specific oligonucleotides that hybridize to a segment of a fragment shown in the Table, column 7, or its complement. These oligonucleotides can be probes or primers. Also provided are isolated nucleic acids comprising a sequence shown in the Table, column 7, or the complement thereto, in which the polymorphic site within the sequence is occupied by a base other than the reference base shown in the Table, column 3.
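The requirement that a segment of a given length still include the polymorphic site can be illustrated with a short Python sketch. The function names and the 0-based indexing are assumptions made for the example and do not appear in the Table.

```python
def segments_containing_site(sequence, site, length):
    """Yield (start, segment) for every `length`-base segment of `sequence`
    that includes the polymorphic site at 0-based index `site`."""
    if not 0 <= site < len(sequence):
        raise ValueError("site lies outside the sequence")
    first = max(0, site - length + 1)          # leftmost window still covering the site
    last = min(site, len(sequence) - length)   # rightmost window that fits in the sequence
    for start in range(first, last + 1):
        yield start, sequence[start:start + length]


def with_allele(sequence, site, base):
    """Return `sequence` with the polymorphic site occupied by `base`."""
    return sequence[:site] + base + sequence[site + 1:]
```

Applying `with_allele` before `segments_containing_site` would give the corresponding segments for an alternative base rather than the reference base.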
The invention further provides a method of analyzing a nucleic acid from an individual. The method determines which base is present at any one of the polymorphic sites shown in the Table. Optionally, a set of bases occupying a set of the polymorphic sites shown in the Table is determined. This type of analysis can be performed on a number of individuals, who are tested for the presence of a disease phenotype. The presence or absence of disease phenotype is then correlated with a base or set of bases present at the polymorphic sites in the individuals tested.
DETAILED DESCRIPTION OF THE INVENTION
DEFINITIONS
An oligonucleotide can be DNA or RNA, and single- or double-stranded. Oligonucleotides can be naturally occurring or synthetic, but are typically prepared by synthetic means. The oligonucleotides of the present invention can comprise all of an oligonucleotide sequence presented in column 7 of the Table or a segment of such an oligonucleotide which includes a polymorphic site. Oligonucleotides can be all of a nucleic acid segment as represented in column 7 of the Table; a nucleic acid sequence which comprises a nucleic acid segment represented in column 7 of the Table and additional nucleic acids (present at either or both ends of a nucleic acid segment of column 7); or a portion (fragment) of a nucleic acid segment represented in column 7 of the Table which includes a polymorphic site. Preferred oligonucleotides of the invention include segments of DNA, or their complements, which include any one of the polymorphic sites shown in the Table. The segments can be between 5 and 250 bases, and, in specific embodiments, are between 5-10, 5-20, 10-20, 10-50, 20-50 or 10-100 bases. The polymorphic site can occur within any position of the segment. The segments can be from any of the allelic forms of DNA shown in the Table.
Hybridization probes are oligonucleotides which bind in a base-specific manner to a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as described in Nielsen et al., Science 254, 1497-1500 (1991).
As used herein, the term primer refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis under appropriate conditions (e.g., in the presence of four different nucleoside triphosphates and an agent for polymerization, such as DNA or RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. The appropriate length of a primer depends on the intended use of the primer, but typically ranges from 15 to 30 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template, but must be sufficiently complementary to hybridize with a template. The term primer site refers to the area of the target DNA to which a primer hybridizes. The term primer pair refers to a set of primers including a 5' (upstream) primer that hybridizes with the 5' end of the DNA sequence to be amplified and a 3' (downstream) primer that hybridizes with the complement of the 3' end of the sequence to be amplified.
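The geometry of a primer pair can be made concrete with a small Python sketch. It assumes the usual convention that both primers are written 5' to 3', that the upstream primer has the same sequence as the 5' end of the strand shown, and that the downstream primer is the reverse complement of the 3' end of that strand. Exact matching is a simplification, since a primer need only be sufficiently complementary; the function names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon(template, forward, reverse):
    """Return the stretch of `template` bounded by a primer pair, or None.

    `template` is the strand shown 5'->3'. `forward` must occur in it as
    written; `reverse` anneals to the shown strand, so its reverse
    complement must occur downstream of the forward primer.
    """
    start = template.find(forward)
    if start < 0:
        return None
    downstream_site = reverse_complement(reverse)
    end = template.find(downstream_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(downstream_site)]
```

A result of `None` indicates only that no exact primer site was found under these simplifying assumptions, not that amplification would fail in practice.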
As used herein, linkage describes the tendency of genes, alleles, loci or genetic markers to be inherited together as a result of their location on the same chromosome. It can be measured by percent recombination between the two genes, alleles, loci or genetic markers.
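Evidence for linkage at a given recombination fraction is commonly summarized as a lod score, the log10 of the ratio of the likelihood of the data if the loci are linked at that recombination fraction to the likelihood if they are unlinked. The Python sketch below assumes the simplest textbook case of fully informative, phase-known meioses, in which the likelihood is θ^r (1 − θ)^(n − r) for r recombinants among n meioses. It does not stand in for pedigree programs such as LIPED or MLINK, and the function names are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Lod score for `recombinants` out of `meioses` at recombination fraction `theta`.

    Assumes phase-known, fully informative meioses; theta must lie in (0, 0.5].
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1 - theta))
    log_unlinked = meioses * math.log10(0.5)   # theta = 0.5 means no linkage
    return log_linked - log_unlinked


def best_theta(recombinants, meioses, step=0.01):
    """Grid value of theta in (0, 0.5] at which the lod score is highest."""
    grid = [i * step for i in range(1, int(round(0.5 / step)) + 1)]
    return max(grid, key=lambda t: lod_score(recombinants, meioses, t))
```

Because the scores are logarithms, lod scores from independent families evaluated at the same θ can be combined by addition, for example `lod_score(1, 10, 0.1) + lod_score(0, 10, 0.1)`.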
As used herein, polymorphism refers to the occurrence of two or more genetically determined alternative sequences or alleles in a population. A polymorphic marker or site is the locus at which divergence occurs. Preferred markers have at least two alleles, each occurring at a frequency of greater than 1%, and more preferably greater than 10% or 20% of a selected population. A polymorphic locus may be as small as one base pair. Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTR's), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu. The first identified allelic form is arbitrarily designated as the reference form and other allelic forms are designated as alternative or variant alleles. The allelic form occurring most frequently in a selected population is sometimes referred to as the wildtype form. Diploid organisms may be homozygous or heterozygous for allelic forms. A diallelic or biallelic polymorphism has two forms. A triallelic polymorphism has three forms. A single nucleotide polymorphism occurs at a polymorphic site occupied by a single nucleotide, which is the site of variation between allelic sequences. The site is usually preceded by and followed by highly conserved sequences of the allele (e.g., sequences that vary in less than 1/100 or 1/1000 members of the populations).
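The frequency criteria in this definition can be checked with a few lines of Python. The sketch assumes that each diploid individual is given as a two-character genotype string such as "AG"; the function names and the sample data in the usage note are hypothetical.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotypes such as "AA" or "AG"."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def is_biallelic(frequencies):
    """True if exactly two allelic forms were observed."""
    return len(frequencies) == 2


def is_preferred_marker(frequencies, threshold=0.01):
    """True if at least two alleles each occur at a frequency above `threshold`."""
    return sum(1 for f in frequencies.values() if f > threshold) >= 2
```

For the genotypes `["AA", "AG", "GG", "AA", "AG"]` the frequencies are 0.6 for A and 0.4 for G, so the site is biallelic and also passes the stricter 20% threshold.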
A single nucleotide polymorphism usually arises due to substitution of one nucleotide for another at the polymorphic site. A transition is the replacement of one purine by another purine or one pyrimidine by another pyrimidine. A transversion is the replacement of a purine by a pyrimidine or vice versa. Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele. Typically the polymorphic site is occupied by a base other than the reference base. For example, where the reference allele contains the base "T" at the polymorphic site, the altered allele can contain a "C", "G" or "A" at the polymorphic site.
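The distinction between transitions and transversions follows directly from the purine and pyrimidine classes, as the following Python sketch shows. The function name is hypothetical, and only the four DNA bases are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(reference_base, alternative_base):
    """Return "transition" or "transversion" for a single-base substitution."""
    ref, alt = reference_base.upper(), alternative_base.upper()
    if ref == alt:
        raise ValueError("the two bases are identical; no substitution")
    if {ref, alt} - (PURINES | PYRIMIDINES):
        raise ValueError("bases must be one of A, C, G, T")
    # A transition stays within one chemical class; a transversion crosses classes.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

Under this classification an A/G or C/T polymorphism is a transition, and any other pair of bases is a transversion.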
Hybridizations are usually performed under stringent conditions, for example, at a salt concentration of no more than 1 M and a temperature of at least 25 °C. For example, conditions of 5X SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30°C, or equivalent conditions, are suitable for allele-specific probe hybridizations. Equivalent conditions can be determined by varying one or more of the parameters given as an example, as known in the art, while maintaining a similar degree of identity or similarity between the target nucleotide sequence and the primer or probe used.
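The buffer arithmetic behind the 5X SSPE example can be written out as a short Python sketch. The 1X concentrations below are obtained simply by dividing the quoted 5X figures by five, and the check treats the NaCl concentration as "the salt concentration", which is an assumption made for the example. The function names are hypothetical.

```python
# Concentrations at 1X in mM, derived from the quoted 5X SSPE figures
# (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA) divided by five.
SSPE_1X_MM = {"NaCl": 150.0, "NaPhosphate": 10.0, "EDTA": 1.0}


def sspe_concentrations(fold):
    """Return component concentrations in mM for SSPE at the given fold strength."""
    return {name: conc * fold for name, conc in SSPE_1X_MM.items()}


def within_stated_limits(nacl_mm, temperature_c,
                         max_salt_mm=1000.0, min_temperature_c=25.0):
    """True if salt is no more than 1 M and temperature is at least 25 degrees C.

    Assumption: only NaCl is counted towards the salt concentration.
    """
    return nacl_mm <= max_salt_mm and temperature_c >= min_temperature_c
```

With these figures, `sspe_concentrations(5)` reproduces the quoted 750, 50 and 5 mM values, and 5X SSPE at 30 °C falls within the stated limits.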
The term "isolated" is used herein to indicate that the material in question exists in a physical milieu distinct from that in which it occurs in nature. For example, an isolated nucleic acid of the invention may be substantially isolated with respect to the complex cellular milieu in which it naturally occurs. In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid comprises at least about 50, 80 or 90 percent (on a molar basis) of all macromolecular species present.
I. Novel Polymorphisms of the Invention
The novel polymorphisms of the invention are listed in the Table. The first column of the Table lists the names assigned to the fragments in which the polymorphisms occur. The fragments are all human genomic fragments. The sequence of one allelic form of each of the fragments (arbitrarily referred to as the prototypical or reference form) has been previously published. These sequences are listed at http://www-genome.wi.mit.edu/ (all STS's (sequence tag sites)); http://shgc.stanford.edu (Stanford STS's); and http://ww.tigr.org/ (TIGR STS's). The Web sites also list primers for amplification of the fragments, and the genomic location of fragments. Some fragments are expressed sequence tags, and some are random genomic fragments. All information in the websites concerning the fragments listed in the Table is incorporated by reference in its entirety for all purposes.
The second column lists the position in the fragment in which a polymorphic site has been found. Positions are numbered consecutively with the first base of the fragment sequence as listed in one of the above databases being assigned the number one. The third column lists the base occupying the polymorphic site in the sequence in the database. This base is arbitrarily designated the reference or prototypical form, but it is not necessarily the most frequently occurring form. The fourth column in the Table lists the alternative base(s) at the polymorphic site. The fifth column of the Table lists a 5' (upstream or forward) primer that hybridizes with the 5' end of the DNA sequence to be amplified. The sixth column of the Table lists a 3' (downstream or reverse) primer that hybridizes with the complement of the 3' end of the sequence to be amplified.
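One way to hold the first six columns of a Table row in a program is sketched below in Python. The class name, field names and example values are hypothetical. The one point taken from the text is that positions are counted from one, so a program using 0-based indexing must subtract one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolymorphismRecord:
    fragment: str             # column 1: fragment name
    position: int             # column 2: 1-based position of the polymorphic site
    reference_base: str       # column 3: base in the database sequence
    alternative_bases: tuple  # column 4: alternative base(s)
    forward_primer: str       # column 5: 5' (upstream) primer
    reverse_primer: str       # column 6: 3' (downstream) primer

    def check_against(self, fragment_sequence):
        """True if `fragment_sequence` carries the reference base at the position."""
        if not 1 <= self.position <= len(fragment_sequence):
            raise ValueError("position lies outside the fragment sequence")
        # Table positions start at one; Python indices start at zero.
        return fragment_sequence[self.position - 1] == self.reference_base

    def alleles(self):
        """Reference base followed by the alternative base(s)."""
        return (self.reference_base, *self.alternative_bases)
```

A record built with made-up values, for example `PolymorphismRecord("EXAMPLE-1", 3, "G", ("A",), "ACGT", "TTGA")`, can then be checked against a fragment sequence retrieved from one of the databases.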
The seventh column of the Table lists a number of bases of sequence on either side of the polymorphic site in each fragment. The indicated sequences can be either DNA or RNA. In the latter, the T's shown in the Table are replaced by U's. The base occupying the polymorphic site is indicated in IUPAC-IUB ambiguity code.
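The IUPAC-IUB ambiguity codes and the T-to-U substitution for RNA can be illustrated with a short Python sketch. The code table is the standard one; the function names and the example sequence in the usage note are chosen for illustration only.

```python
from itertools import product

# Standard IUPAC-IUB nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
_BASES_TO_CODE = {frozenset(bases): code for code, bases in IUPAC.items()}


def ambiguity_code(bases):
    """Return the single-letter code for a set of bases, e.g. "AG" -> "R"."""
    return _BASES_TO_CODE[frozenset(bases)]


def allelic_forms(sequence):
    """Expand a sequence containing ambiguity codes into every specified form."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in sequence))]


def to_rna(sequence):
    """Replace every T with U to obtain the RNA form of a DNA sequence."""
    return sequence.replace("T", "U")
```

For example, `allelic_forms("TTCCCRCT")` yields the two forms with A and with G at the ambiguous position. Each N in a flanking sequence multiplies the number of expanded forms by four, so the expansion is best applied to short windows around the polymorphic site.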
II. Analysis of Polymorphisms
A. Preparation of Samples
Polymorphisms are detected in a target nucleic acid from an individual being analyzed. For assay of genomic DNA, virtually any biological sample (other than pure red blood cells) is suitable. For example, convenient tissue samples include whole blood, semen, saliva, tears, urine, fecal material, sweat, buccal, skin and hair. For assay of cDNA or mRNA, the tissue sample must be obtained from an organ in which the target nucleic acid is expressed. For example, if the target nucleic acid is a cytochrome P450, the liver is a suitable source.
Many of the methods described below require amplification of DNA from target samples. This can be accomplished by, e.g., PCR. See generally PCR Technology: Principles and Applications for DNA Amplification (ed. H.A. Erlich, Freeman Press, NY, NY, 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis et al., Academic Press, San Diego, CA, 1990); Mattila et al., Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Patent 4,683,202.
Other suitable amplification methods include the ligase chain reaction (LCR) (see Wu and Wallace, Genomics 4, 560 (1989); Landegren et al., Science 241, 1077 (1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA 87, 1874 (1990)) and nucleic acid based sequence amplification (NASBA). The latter two amplification methods involve isothermal reactions based on isothermal transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or 100 to 1, respectively.
B. Detection of Polymorphisms in Target DNA
There are two distinct types of analysis of target DNA for detecting polymorphisms. The first type of analysis, sometimes referred to as de novo characterization, is carried out to identify polymorphic sites not previously characterized (i.e., to identify new polymorphisms). This analysis compares target sequences in different individuals to identify points of variation, i.e., polymorphic sites. By analyzing groups of individuals representing the greatest ethnic diversity among humans and greatest breed and species variety in plants and animals, patterns characteristic of the most common alleles/haplotypes of the locus can be identified, and the frequencies of such alleles/haplotypes in the population can be determined. Additional allelic frequencies can be determined for subpopulations characterized by criteria such as geography, race, or gender. The de novo identification of polymorphisms of the invention is described in the Examples section. The second type of analysis determines which form(s) of a characterized (known) polymorphism are present in individuals under test. There are a variety of suitable procedures, which are discussed in turn.
1. Allele-Specific Probes
The design and use of allele-specific probes for analyzing polymorphisms is described by e.g., Saiki et al . , Nature 324, 163-166 (1986); Dattagupta, EP 235,726, Saiki, WO 89/11548. Allele-specific probes can be designed that hybridize to a segment of target DNA from one individual but do not hybridize to the corresponding segment from another individual due to the presence of different polymorphic forms in the respective segments from the two individuals. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles. Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15-mer at the 7 position; in a 16-mer, at either the 8 or 9 position) of the probe. This design of probe achieves good discrimination in hybridization between different allelic forms.
Allele-specific probes are often used in pairs, one member of a pair showing a perfect match to a reference form of a target sequence and the other member showing a perfect match to a variant form. Several pairs of probes can then be immobilized on the same support for simultaneous analysis of multiple polymorphisms within the same target sequence .
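The pairing of a reference-matched probe with a variant-matched probe, with the polymorphic site at a chosen position within the probe, can be sketched as follows. This is a minimal illustration, not a probe-design procedure from the cited references: the function name `probe_pair`, its parameters, and the example sequence are hypothetical. Probes are written in the same sense as the reference strand (so they would hybridize to the complementary strand of the target), and thermodynamic considerations are ignored.

```python
def probe_pair(reference, site, alt_base, length=15, probe_pos=7):
    """Return (reference_probe, variant_probe) for one polymorphic site.

    reference : fragment sequence, 5'->3', reference form
    site      : 1-based position of the polymorphic site in the fragment
    alt_base  : alternative base at the site
    length    : probe length (15 here, following the 15-mer example)
    probe_pos : 1-based position within the probe that aligns with the site
    """
    start = site - probe_pos          # 0-based start of the probe window
    end = start + length
    if start < 0 or end > len(reference):
        raise ValueError("probe window runs off the end of the fragment")
    ref_probe = reference[start:end]
    # Same window, with the alternative base substituted at the site.
    var_probe = ref_probe[:probe_pos - 1] + alt_base + ref_probe[probe_pos:]
    return ref_probe, var_probe
```

With a hypothetical 20-base fragment ACGTACGTACGTACGTACGT and a C-to-T polymorphism at position 10, the two probes differ only at probe position 7, which is the property relied on for discrimination between allelic forms.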
2. Tiling Arrays
The polymorphisms can also be identified by hybridization to nucleic acid arrays, some examples of which are described in WO 95/11995. One form of such arrays is described in the Examples section in connection with de novo identification of polymorphisms. The same array or a different array can be used for analysis of characterized polymorphisms. WO 95/11995 also describes subarrays that are optimized for detection of a variant form of a precharacterized polymorphism. Such a subarray contains probes designed to be complementary to a second reference sequence, which is an allelic variant of the first reference sequence. The second group of probes is designed by the same principles as described in the Examples, except that the probes exhibit complementarity to the second reference sequence. The inclusion of a second group (or further groups) can be particularly useful for analyzing short subsequences of the primary reference sequence in which multiple mutations are expected to occur within a short distance commensurate with the length of the probes (e.g., two or more mutations within 9 to 21 bases) .
3. Allele-Specific Primers
An allele-specific primer hybridizes to a site on target DNA overlapping a polymorphism and only primes amplification of an allelic form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acid Res. 17, 2427-2448 (1989). This primer is used in conjunction with a second primer which hybridizes at a distal site. Amplification proceeds from the two primers, resulting in a detectable product which indicates the particular allelic form is present. A control is usually performed with a second pair of primers, one of which shows a single base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site. The single-base mismatch prevents amplification and no detectable product is formed. The method works best when the mismatch is included in the 3'-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456).
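The placement of the polymorphic base at the 3'-most position of the primer can be sketched as below. This is an idealized model under stated assumptions: the names `allele_specific_primers` and `extends` are hypothetical, primers are written in the sense of the reference strand, and extension is treated as all-or-nothing on a perfect match, whereas real discrimination depends on polymerase and reaction conditions.

```python
def allele_specific_primers(reference, site, alleles, length=20):
    """Build one forward primer per allele, each ending on the polymorphic site.

    reference : fragment sequence, 5'->3'
    site      : 1-based position of the polymorphic site
    alleles   : iterable of bases to design primers for
    length    : primer length (hypothetical default)
    """
    start = site - length
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    upstream = reference[start:site - 1]   # bases preceding the site
    return {allele: upstream + allele for allele in alleles}


def extends(primer, template, site):
    """Idealized rule: the primer is extended only on a perfect match,
    including its 3'-most base at the polymorphic site."""
    start = site - len(primer)
    return start >= 0 and template[start:site] == primer
```

In this model, a primer ending in the reference base yields a product on the reference template and none on the variant template, and the reverse holds for the primer ending in the variant base.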
4. Direct-Sequencing
The direct analysis of the sequence of polymorphisms of the present invention can be accomplished using either the dideoxy chain termination method or the Maxam-Gilbert method (see Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual (Acad. Press, 1988)).
5. Denaturing Gradient Gel Electrophoresis
Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution. See Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification (W.H. Freeman and Co, New York, 1992), Chapter 7.
6. Single-Strand Conformation Polymorphism Analysis
Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described in Orita et al., Proc. Natl. Acad. Sci. 86, 2766-2770 (1989). Amplified PCR products can be generated as described above, and heated or otherwise denatured, to form single stranded amplification products. Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence. The different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence differences between alleles of target sequences.
III. Methods of Use
After determining polymorphic form(s) present in an individual at one or more polymorphic sites, this information can be used in a number of methods.
A. Forensics
Determination of which polymorphic forms occupy a set of polymorphic sites in an individual identifies a set of polymorphic forms that distinguishes the individual. See generally National Research Council, The Evaluation of Forensic DNA Evidence (Eds. Pollard et al., National Academy Press, DC, 1996). The more sites that are analyzed, the lower the probability that the set of polymorphic forms in one individual is the same as that in an unrelated individual. Preferably, if multiple sites are analyzed, the sites are unlinked. Thus, polymorphisms of the invention are often used in conjunction with polymorphisms in distal genes. Preferred polymorphisms for use in forensics are biallelic because the population frequencies of two polymorphic forms can usually be determined with greater accuracy than those of multiple polymorphic forms at multi-allelic loci.
The capacity to identify a distinguishing or unique set of forensic markers in an individual is useful for forensic analysis. For example, one can determine whether a blood sample from a suspect matches a blood or other tissue sample from a crime scene by determining whether the set of polymorphic forms occupying selected polymorphic sites is the same in the suspect and the sample. If the set of polymorphic markers does not match between a suspect and a sample, it can be concluded (barring experimental error) that the suspect was not the source of the sample. If the set of markers does match, one can conclude that the DNA from the suspect is consistent with that found at the crime scene. If frequencies of the polymorphic forms at the loci tested have been determined (e.g., by analysis of a suitable population of individuals), one can perform a statistical analysis to determine the probability that a match of suspect and crime scene sample would occur by chance.
p(ID) is the probability that two random individuals have the same polymorphic or allelic form at a given polymorphic site. In biallelic loci, four genotypes are possible: AA, AB, BA, and BB. If alleles A and B occur in a haploid genome of the organism with frequencies x and y, the probability of each genotype in a diploid organism is (see WO 95/12607):
Homozygote: $p(AA) = x^2$
Homozygote: $p(BB) = y^2 = (1-x)^2$
Single heterozygote: $p(AB) = p(BA) = xy = x(1-x)$
Both heterozygotes: $p(AB+BA) = 2xy = 2x(1-x)$
The probability of identity at one locus (i.e., the probability that two individuals, picked at random from a population, will have identical polymorphic forms at a given locus) is given by the equation:
$p(ID) = (x^2)^2 + (2xy)^2 + (y^2)^2$
These calculations can be extended for any number of polymorphic forms at a given locus. For example, the probability of identity p(ID) for a 3-allele system where the alleles have the frequencies in the population of x, y and z, respectively, is equal to the sum of the squares of the genotype frequencies:
$p(ID) = x^4 + (2xy)^2 + (2yz)^2 + (2xz)^2 + z^4 + y^4$
In a locus of n alleles, the appropriate binomial expansion is used to calculate p(ID) and p(exc) .
The cumulative probability of identity (cum p(ID)) for each of multiple unlinked loci is determined by multiplying the probabilities provided by each locus:
$\text{cum } p(ID) = p(ID_1)\,p(ID_2)\,p(ID_3) \cdots p(ID_n)$
The cumulative probability of non-identity for n loci (i.e., the probability that two random individuals will be different at 1 or more loci) is given by the equation:
$\text{cum } p(\text{nonID}) = 1 - \text{cum } p(ID)$
If several polymorphic loci are tested, the cumulative probability of non-identity for random individuals becomes very high (e.g., one billion to one). Such probabilities can be taken into account together with other evidence in determining the guilt or innocence of the suspect.
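The probability-of-identity calculations above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the genotype frequencies are taken to be the squares and pairwise products of allele frequencies, which is the assumption already implicit in the formulas above.

```python
from itertools import combinations
from math import prod


def p_identity(freqs):
    """Probability of identity at one locus: the sum of the squared
    genotype frequencies, for any number of alleles.

    freqs : allele frequencies at the locus (must sum to 1)
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygotes = sum((f * f) ** 2 for f in freqs)
    heterozygotes = sum((2 * a * b) ** 2 for a, b in combinations(freqs, 2))
    return homozygotes + heterozygotes


def cumulative_p_identity(loci):
    """Product of the per-locus probabilities of identity (unlinked loci)."""
    return prod(p_identity(freqs) for freqs in loci)


def cumulative_p_nonidentity(loci):
    """Probability that two random individuals differ at one or more loci."""
    return 1.0 - cumulative_p_identity(loci)
```

For a biallelic locus with x = y = 0.5 the per-locus value is 0.375, so each additional unlinked locus of this kind multiplies the cumulative probability of identity by 0.375.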
B. Paternity Testing
The object of paternity testing is usually to determine whether a male is the father of a child. In most cases, the mother of the child is known and thus, the mother's contribution to the child's genotype can be traced. Paternity testing investigates whether the part of the child's genotype not attributable to the mother is consistent with that of the putative father. Paternity testing can be performed by analyzing sets of polymorphisms in the putative father and the child. If the set of polymorphisms in the child attributable to the father does not match the set of polymorphisms of the putative father, it can be concluded, barring experimental error, that the putative father is not the real father. If the set of polymorphisms in the child attributable to the father does match the set of polymorphisms of the putative father, a statistical calculation can be performed to determine the probability of coincidental match.
The probability of parentage exclusion (representing the probability that a random male will have a polymorphic form at a given polymorphic site that makes him incompatible as the father) is given by the equation (see WO 95/12607):
$p(exc) = xy(1-xy)$
where x and y are the population frequencies of alleles A and B of a biallelic polymorphic site.
(At a triallelic site, $p(exc) = xy(1-xy) + yz(1-yz) + xz(1-xz) + 3xyz(1-xyz)$, where x, y and z are the respective population frequencies of alleles A, B and C.)
The probability of non-exclusion is:
$p(\text{non-exc}) = 1 - p(exc)$
The cumulative probability of non-exclusion (representing the value obtained when n loci are used) is thus:
$\text{cum } p(\text{non-exc}) = p(\text{non-exc}_1)\,p(\text{non-exc}_2)\,p(\text{non-exc}_3) \cdots p(\text{non-exc}_n)$
The cumulative probability of exclusion for n loci (representing the probability that a random male will be excluded) is:
$\text{cum } p(exc) = 1 - \text{cum } p(\text{non-exc})$
If several polymorphic loci are included in the analysis, the cumulative probability of exclusion of a random male is very high. This probability can be taken into account in assessing the liability of a putative father whose polymorphic marker set matches the child's polymorphic marker set attributable to his/her father.
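The exclusion calculation for biallelic sites can be sketched as follows. The function names are hypothetical, and only the biallelic form of p(exc) is implemented; the triallelic expression would be handled analogously.

```python
from math import prod


def p_exclusion_biallelic(x):
    """p(exc) = xy(1 - xy) for a biallelic site, with y = 1 - x."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    xy = x * (1.0 - x)
    return xy * (1.0 - xy)


def cumulative_p_exclusion(allele_freqs):
    """Cumulative probability that a random male is excluded.

    allele_freqs : frequency of allele A at each biallelic locus tested
    """
    # Multiply the per-locus probabilities of non-exclusion, then
    # subtract the product from one.
    cum_non_exc = prod(1.0 - p_exclusion_biallelic(x) for x in allele_freqs)
    return 1.0 - cum_non_exc
```

At x = 0.5 a single locus gives p(exc) = 0.1875, so the cumulative probability of exclusion rises toward one as more loci are added to the analysis.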
C. Correlation of Polymorphisms with Phenotypic Traits
The polymorphisms of the invention may contribute to the phenotype of an organism in different ways. Some polymorphisms occur within a protein coding sequence and contribute to phenotype by affecting protein structure. The effect may be neutral, beneficial or detrimental, or both beneficial and detrimental, depending on the circumstances. For example, a heterozygous sickle cell mutation confers resistance to malaria, but a homozygous sickle cell mutation is usually lethal. Other polymorphisms occur in noncoding regions but may exert phenotypic effects indirectly via influence on replication, transcription, and translation. A single polymorphism may affect more than one phenotypic trait. Likewise, a single phenotypic trait may be affected by polymorphisms in different genes. Further, some polymorphisms predispose an individual to a distinct mutation that is causally related to a certain phenotype.
Phenotypic traits include diseases that have known but hitherto unmapped genetic components (e.g., agammaglobulinemia, diabetes insipidus, Lesch-Nyhan syndrome, muscular dystrophy, Wiskott-Aldrich syndrome, Fabry's disease, familial hypercholesterolemia, polycystic kidney disease, hereditary spherocytosis, von Willebrand's disease, tuberous sclerosis, hereditary hemorrhagic telangiectasia, familial colonic polyposis, Ehlers-Danlos syndrome, osteogenesis imperfecta, and acute intermittent porphyria). Phenotypic traits also include symptoms of, or susceptibility to, multifactorial diseases of which a component is or may be genetic, such as autoimmune diseases, inflammation, cancer, diseases of the nervous system, and infection by pathogenic microorganisms. Some examples of autoimmune diseases include rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent and non-insulin-dependent), systemic lupus erythematosus and Graves disease. Some examples of cancers include cancers of the bladder, brain, breast, colon, esophagus, kidney, leukemia, liver, lung, oral cavity, ovary, pancreas, prostate, skin, stomach and uterus. Phenotypic traits also include characteristics such as longevity, appearance (e.g., baldness, obesity), strength, speed, endurance, fertility, and susceptibility or receptivity to particular drugs or therapeutic treatments.
Correlation is performed for a population of individuals who have been tested for the presence or absence of a phenotypic trait of interest and for polymorphic marker sets. To perform such analysis, the presence or absence of a set of polymorphisms (i.e., a polymorphic set) is determined for a set of the individuals, some of whom exhibit a particular trait, and some of whom exhibit lack of the trait. The alleles of each polymorphism of the set are then reviewed to determine whether the presence or absence of a particular allele is associated with the trait of interest. Correlation can be performed by standard statistical methods such as a chi-squared test, and statistically significant correlations between polymorphic form(s) and phenotypic characteristics are noted. For example, it might be found that the presence of allele A1 at polymorphism A correlates with heart disease. As a further example, it might be found that the combined presence of allele A1 at polymorphism A and allele B1 at polymorphism B correlates with increased milk production of a farm animal.
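A test of association between the presence of an allele and a trait can be sketched with a 2x2 contingency table. This is a minimal illustration under stated assumptions: the function name and the counts in the usage note are hypothetical, the statistic is the Pearson chi-squared statistic without continuity correction, and the p-value uses the closed form for one degree of freedom.

```python
from math import erfc, sqrt


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test of association for a 2x2 table.

    a : individuals carrying the allele and showing the trait
    b : individuals carrying the allele and lacking the trait
    c : individuals lacking the allele and showing the trait
    d : individuals lacking the allele and lacking the trait
    Returns (chi-squared statistic, p-value for 1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain observations")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value
```

With hypothetical counts of 30 carriers and 10 non-carriers among affected individuals, and 10 carriers and 30 non-carriers among unaffected individuals, the statistic is 20.0 and the association would be noted as statistically significant at conventional thresholds.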
Such correlations can be exploited in several ways. In the case of a strong correlation between a set of one or more polymorphic forms and a disease for which treatment is available, detection of the polymorphic form set in a human or animal patient may justify immediate administration of treatment, or at least the institution of regular monitoring of the patient. Detection of a polymorphic form correlated with serious disease in a couple contemplating a family may also be valuable to the couple in their reproductive decisions. For example, the female partner might elect to undergo in vitro fertilization to avoid the possibility of transmitting such a polymorphism from her husband to her offspring. In the case of a weaker, but still statistically significant correlation between a polymorphic set and human disease, immediate therapeutic intervention or monitoring may not be justified. Nevertheless, the patient can be motivated to begin simple life-style changes (e.g., diet, exercise) that can be accomplished at little cost to the patient but confer potential benefits in reducing the risk of conditions to which the patient may have increased susceptibility by virtue of variant alleles. Identification of a polymorphic set in a patient correlated with enhanced receptiveness to one of several treatment regimes for a disease indicates that this treatment regime should be followed.
For animals and plants, correlations between characteristics and phenotype are useful for breeding for desired characteristics. For example, Beitz et al., US 5,292,639 discuss use of bovine mitochondrial polymorphisms in a breeding program to improve milk production in cows. To evaluate the effect of mtDNA D-loop sequence polymorphism on milk production, each cow was assigned a value of 1 if variant or 0 if wildtype with respect to a prototypical mitochondrial DNA sequence at each of 17 locations considered. Each production trait was analyzed individually with the following animal model:
$Y_{ijkpn} = \mu + YS_i + P_j + X_k + \beta_1 + \dots + \beta_{17} + PE_n + a_n + e_p$
where $Y_{ijkpn}$ is the milk, fat, fat percentage, SNF, SNF percentage, energy concentration, or lactation energy record; $\mu$ is an overall mean; $YS_i$ is the effect common to all cows calving in year-season; $X_k$ is the effect common to cows in either the high or average selection line; $\beta_1$ to $\beta_{17}$ are the binomial regressions of production record on mtDNA D-loop sequence polymorphisms; $PE_n$ is permanent environmental effect common to all records of cow n; $a_n$ is effect of animal n and is composed of the additive genetic contribution of sire and dam breeding values and a Mendelian sampling effect; and $e_p$ is a random residual. It was found that eleven of seventeen polymorphisms tested influenced at least one production trait. Bovines having the best polymorphic forms for milk production at these eleven loci are used as parents for breeding the next generation of the herd.
D. Genetic Mapping of Phenotypic Traits
The previous section concerns identifying correlations between phenotypic traits and polymorphisms that directly or indirectly contribute to those traits. The present section describes identification of a physical linkage between a genetic locus associated with a trait of interest and polymorphic markers that are not associated with the trait, but are in physical proximity with the genetic locus responsible for the trait and co-segregate with it. Such analysis is useful for mapping a genetic locus associated with a phenotypic trait to a chromosomal position, and thereby cloning gene(s) responsible for the trait. See Lander et al., Proc. Natl. Acad. Sci. (USA) 83, 7353-7357 (1986); Lander et al., Proc. Natl. Acad. Sci. (USA) 84, 2363-2367 (1987); Donis-Keller et al., Cell 51, 319-337 (1987); Lander et al., Genetics 121, 185-199 (1989). Genes localized by linkage can be cloned by a process known as positional cloning. See Wainwright, Med. J. Australia 159, 170-174 (1993); Collins, Nature Genetics 1, 3-6 (1992).
Linkage studies are typically performed on members of a family. Available members of the family are characterized for the presence or absence of a phenotypic trait and for a set of polymorphic markers. The distribution of polymorphic markers in an informative meiosis is then analyzed to determine which polymorphic markers co-segregate with a phenotypic trait. See, e.g., Kerem et al., Science 245, 1073-1080 (1989); Monaco et al., Nature 316, 842 (1985); Yamoka et al., Neurology 40, 222-226 (1990); Rossiter et al., FASEB Journal 5, 21-27 (1991). Linkage is analyzed by calculation of LOD (log of the odds) values. A lod value is the relative likelihood of obtaining observed segregation data for a marker and a genetic locus when the two are located at a recombination fraction θ, versus the situation in which the two are not linked, and thus segregating independently (Thompson & Thompson, Genetics in Medicine (5th ed, W.B. Saunders Company, Philadelphia, 1991); Strachan, "Mapping the human genome" in The Human Genome (BIOS Scientific Publishers Ltd, Oxford), Chapter 4). A series of likelihood ratios are calculated at various recombination fractions (θ), ranging from θ = 0.0 (coincident loci) to θ = 0.50 (unlinked). Thus, the likelihood ratio at a given value of θ is the ratio of the probability of the data if the loci are linked at θ to the probability of the data if the loci are unlinked. The computed likelihoods are usually expressed as the log10 of this ratio (i.e., a lod score). For example, a lod score of 3 indicates 1000:1 odds against an apparent observed linkage being a coincidence. The use of logarithms allows data collected from different families to be combined by simple addition. Computer programs are available for the calculation of lod scores for differing values of θ (e.g., LIPED, MLINK (Lathrop, Proc. Natl. Acad. Sci. (USA) 81, 3443-3446 (1984))). For any particular lod score, a recombination fraction may be determined from mathematical tables. See Smith et al., Mathematical tables for research workers in human genetics (Churchill, London, 1961); Smith, Ann. Hum. Genet. 32, 127-150 (1968). The value of θ at which the lod score is the highest is considered to be the best estimate of the recombination fraction.
Positive lod score values suggest that the two loci are linked, whereas negative values suggest that linkage is less likely (at that value of θ ) than the possibility that the two loci are unlinked. By convention, a combined lod score of +3 or greater (equivalent to greater than 1000:1 odds in favor of linkage) is considered definitive evidence that two loci are linked. Similarly, by convention, a negative lod score of -2 or less is taken as definitive evidence against linkage of the two loci being compared. Negative linkage data are useful in excluding a chromosome or a segment thereof from consideration. The search focuses on the remaining non-excluded chromosomal locations .
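The lod-score calculation can be illustrated for the simplest case. The sketch below assumes phase-known, fully informative meioses, so that the likelihood of r recombinants in n meioses is proportional to θ^r (1 - θ)^(n - r); this simplification, the function names, and the grid search are assumptions for illustration and do not reproduce the pedigree likelihoods computed by programs such as LIPED or MLINK.

```python
from math import log10


def lod(theta, recombinants, meioses):
    """log10 of L(theta) / L(0.5) for phase-known, informative meioses."""
    nonrecombinants = meioses - recombinants
    if theta == 0.0:
        # Any observed recombinant is impossible if the loci coincide.
        if recombinants > 0:
            return float("-inf")
        return meioses * log10(2.0)
    return (recombinants * log10(theta)
            + nonrecombinants * log10(1.0 - theta)
            - meioses * log10(0.5))


def combined_lod(theta, families):
    """Lod scores from different families are combined by simple addition.

    families : iterable of (recombinants, meioses) pairs
    """
    return sum(lod(theta, r, n) for r, n in families)


def best_theta(recombinants, meioses, steps=50):
    """Scan theta from 0.0 to 0.5 and return (theta, lod) at the maximum."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    scored = [(t, lod(t, recombinants, meioses)) for t in grid]
    return max(scored, key=lambda pair: pair[1])
```

Under these assumptions, ten meioses with no recombinants give a lod score of about 3.01 at θ = 0, which just meets the conventional threshold of +3, while one recombinant in ten meioses gives its maximum lod score at θ = 0.1.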
IV. Modified Polypeptides and Gene Sequences
The invention further provides variant forms of nucleic acids and corresponding proteins. The nucleic acids comprise one of the sequences described in the Table, column 8, in which the polymorphic position is occupied by one of the alternative bases for that position. Some nucleic acids encode full-length variant forms of proteins. Similarly, variant proteins have the prototypical amino acid sequences encoded by nucleic acid sequences shown in the Table, column 8 (read so as to be in-frame with the full-length coding sequence of which it is a component), except at an amino acid encoded by a codon including one of the polymorphic positions shown in the Table. That position is occupied by the amino acid coded by the corresponding codon in any of the alternative forms shown in the Table.
Variant genes can be expressed in an expression vector in which a variant gene is operably linked to a native or other promoter. Usually, the promoter is a eukaryotic promoter for expression in a mammalian cell. The transcription regulation sequences typically include a heterologous promoter and optionally an enhancer which is recognized by the host. The selection of an appropriate promoter, for example trp, lac, phage promoters, glycolytic enzyme promoters and tRNA promoters, depends on the host selected. Commercially available expression vectors can be used. Vectors can include host-recognized replication systems, amplifiable genes, selectable markers, host sequences useful for insertion into the host genome, and the like.
The means of introducing the expression construct into a host cell varies depending upon the particular construction and the target host. Suitable means include fusion, conjugation, transfection, transduction, electroporation or injection, as described in Sambrook, supra. A wide variety of host cells can be employed for expression of the variant gene, both prokaryotic and eukaryotic. Suitable host cells include bacteria such as E. coli, yeast, filamentous fungi, insect cells, mammalian cells, typically immortalized, e.g., mouse, CHO, human and monkey cell lines and derivatives thereof. Preferred host cells are able to process the variant gene product to produce an appropriate mature polypeptide. Processing includes glycosylation, ubiquitination, disulfide bond formation, general post-translational modification, and the like. The protein may be isolated by conventional means of protein biochemistry and purification to obtain a substantially pure product, i.e., 80, 95 or 99% free of cell component contaminants, as described in Jacoby, Methods in Enzymology Volume 104, Academic Press, New York (1984); Scopes, Protein Purification, Principles and Practice, 2nd Edition, Springer-Verlag, New York (1987); and Deutscher (ed), Guide to Protein Purification, Methods in Enzymology, Vol. 182 (1990). If the protein is secreted, it can be isolated from the supernatant in which the host cell is grown. If not secreted, the protein can be isolated from a lysate of the host cells.
The invention further provides transgenic nonhuman animals capable of expressing an exogenous variant gene and/or having one or both alleles of an endogenous variant gene inactivated. Expression of an exogenous variant gene is usually achieved by operably linking the gene to a promoter and optionally an enhancer, and microinjecting the construct into a zygote. See Hogan et al., "Manipulating the Mouse Embryo, A Laboratory Manual," Cold Spring Harbor Laboratory. Inactivation of endogenous variant genes can be achieved by forming a transgene in which a cloned variant gene is inactivated by insertion of a positive selection marker. See Capecchi, Science 244, 1288-1292 (1989). The transgene is then introduced into an embryonic stem cell, where it undergoes homologous recombination with an endogenous variant gene. Mice and other rodents are preferred animals. Such animals provide useful drug screening systems.
In addition to substantially full-length polypeptides expressed by variant genes, the present invention includes biologically active fragments of the polypeptides, or analogs thereof, including organic molecules which simulate the interactions of the peptides. Biologically active fragments include any portion of the full-length polypeptide which confers a biological function on the variant gene product, including ligand binding, and antibody binding. Ligand binding includes binding by nucleic acids, proteins or polypeptides, small biologically active molecules, or large cellular structures.
Polyclonal and/or monoclonal antibodies that specifically bind to variant gene products but not to corresponding prototypical gene products are also provided. Antibodies can be made by injecting mice or other animals with the variant gene product or synthetic peptide fragments thereof. Monoclonal antibodies are screened as are described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Press, New York (1988); Goding, Monoclonal antibodies, Principles and Practice (2d ed.) Academic Press, New York (1986). Monoclonal antibodies are tested for specific immunoreactivity with a variant gene product and lack of immunoreactivity to the corresponding prototypical gene product. These antibodies are useful in diagnostic assays for detection of the variant form, or as an active ingredient in a pharmaceutical composition.
V. Kits
The invention further provides kits comprising at least one allele-specific oligonucleotide as described above. Often, the kits contain one or more pairs of allele-specific oligonucleotides hybridizing to different forms of a polymorphism. In some kits, the allele-specific oligonucleotides are provided immobilized to a substrate. For example, the same substrate can comprise allele-specific oligonucleotide probes for detecting at least 10, 100 or all of the polymorphisms shown in the Table. Optional additional components of the kit include, for example, restriction enzymes, reverse-transcriptase or polymerase, the substrate nucleoside triphosphates, means used to label (for example, an avidin-enzyme conjugate and enzyme substrate and chromogen if the label is biotin), and the appropriate buffers for reverse transcription, PCR, or hybridization reactions. Usually, the kit also contains instructions for carrying out the methods.
The following Examples are offered for the purpose of illustrating the present invention and are not to be construed to limit the scope of this invention. The teachings of all references cited herein are hereby incorporated herein by reference.
EXAMPLES
The polymorphisms shown in the Table were identified by resequencing of target sequences from three to ten unrelated individuals of diverse ethnic and geographic backgrounds, either by hybridization to probes immobilized to microfabricated arrays or by conventional sequencing. The strategy and principles for design and use of such arrays are generally described in WO 95/11995. The strategy provides arrays of probes for analysis of target sequences showing a high degree of sequence identity to the reference sequences of the fragments shown in the Table, column 1. The reference sequences were sequence-tagged sites (STSs) developed in the course of the Human Genome Project (see, e.g., Science 270, 1945-1954 (1995); Nature 380, 152-154 (1996)). Most STSs ranged from 100 bp to 300 bp in size. A typical probe array used in this analysis has two groups of four sets of probes that respectively tile both strands of a reference sequence. A first probe set comprises a plurality of probes exhibiting perfect complementarity with one of the reference sequences. Each probe in the first probe set has an interrogation position that corresponds to a nucleotide in the reference sequence. That is, the interrogation position is aligned with the corresponding nucleotide in the reference sequence, when the probe and reference sequence are aligned to maximize complementarity between the two. For each probe in the first set, there are three corresponding probes from three additional probe sets. Thus, there are four probes corresponding to each nucleotide in the reference sequence. The probes from the three additional probe sets are identical to the corresponding probe from the first probe set except at the interrogation position, which occurs in the same position in each of the four corresponding probes from the four probe sets, and is occupied by a different nucleotide in the four probe sets. In the present analysis, probes were 25 nucleotides long.
Arrays tiled for multiple different reference sequences were included on the same substrate.
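The tiling scheme described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the array design procedure of WO 95/11995. The names `tile_probes` and `reverse_complement`, the example reference string, and the choice of the central base (index 12 of a 25-mer) as the interrogation position are assumptions made for the example; the description above fixes only the probe length of 25 nucleotides and the rule that the four corresponding probes differ solely at the interrogation position.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def tile_probes(reference, probe_len=25, interrogation_index=12):
    """Build four corresponding probes for each tiled window of a reference.

    interrogation_index is a hypothetical choice (the probe centre); the
    source text does not state where the interrogation position lies.
    """
    tiles = []
    for start in range(len(reference) - probe_len + 1):
        window = reference[start:start + probe_len]
        perfect = reverse_complement(window)  # first probe set: perfect match
        probes = {}
        for base in "ACGT":
            # Same probe, with `base` substituted at the interrogation position.
            probes[base] = (perfect[:interrogation_index] + base
                            + perfect[interrogation_index + 1:])
        # Reference coordinate (0-based) aligned with the interrogation position.
        ref_pos = start + (probe_len - 1 - interrogation_index)
        tiles.append({
            "ref_pos": ref_pos,
            "perfect_base": perfect[interrogation_index],
            "probes": probes,
        })
    return tiles


# Hypothetical 31-nt reference, used only to exercise the function.
example_reference = "ACGTTGCAAGCTTAGGCATCGATCGGATCCA"
example_tiles = tile_probes(example_reference)
```

Calling `tile_probes(reverse_complement(example_reference))` would give the corresponding probes for the opposite strand, mirroring the two groups of probe sets mentioned above.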
Multiple target sequences from an individual were amplified from human genomic DNA using primers for the fragments indicated in the listed Web sites. The amplified target sequences were fluorescently labelled during or after PCR. The labelled target sequences were hybridized with a substrate bearing immobilized arrays of probes. The amount of label bound to probes was measured. Analysis of the pattern of label revealed the nature and position of differences between the target and reference sequence. For example, comparison of the intensities of four corresponding probes reveals the identity of a corresponding nucleotide in the target sequences aligned with the interrogation position of the probes. The corresponding nucleotide is the complement of the nucleotide occupying the interrogation position of the probe showing the highest intensity (see WO 95/11995). The existence of a polymorphism is also manifested by differences in normalized hybridization intensities of probes flanking the polymorphism when the probes are hybridized to corresponding targets from different individuals. For example, relative loss of hybridization intensity in a "footprint" of probes flanking a polymorphism signals a difference between the target and reference (i.e., a polymorphism) (see EP 717,113). Additionally, hybridization intensities for corresponding targets from different individuals can be classified into groups or clusters suggested by the data, not defined a priori, such that isolates in a given cluster tend to be similar and isolates in different clusters tend to be dissimilar. Hybridizations to samples from different individuals were performed separately. The Table summarizes the data obtained for target sequences in comparison with a reference sequence for the individuals tested.
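The base-calling rule stated above (the target nucleotide is the complement of the interrogation base of the brightest of four corresponding probes) can be illustrated with a minimal Python sketch. The function names and the intensity values are hypothetical, and the sketch deliberately ignores normalization, footprint analysis and clustering; it is not the analysis software used for the Table.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def call_base(intensities):
    """Call one target base from four corresponding probe intensities.

    `intensities` maps the base at each probe's interrogation position to
    its measured signal. The called base is the complement of the base in
    the brightest probe. Ties are resolved by dictionary order, which is an
    arbitrary choice of this sketch.
    """
    brightest = max(intensities, key=intensities.get)
    return COMPLEMENT[brightest]


def call_sequence(per_position_intensities):
    """Call a target sequence from a list of per-position intensity dicts."""
    return "".join(call_base(i) for i in per_position_intensities)


def differences(reference, called):
    """List (0-based position, reference base, called base) where they differ."""
    return [(pos, ref, obs)
            for pos, (ref, obs) in enumerate(zip(reference, called))
            if ref != obs]


# Hypothetical readings for three consecutive positions.
readings = [
    {"A": 100, "C": 80, "G": 90, "T": 950},   # brightest probe carries T
    {"A": 60, "C": 870, "G": 75, "T": 90},    # brightest probe carries C
    {"A": 910, "C": 70, "G": 85, "T": 65},    # brightest probe carries A
]
called = call_sequence(readings)
variants = differences("AGA", called)
```

Comparing the called sequence with the reference in this way gives the position and nature of each difference for one individual; repeating it per individual corresponds to the separate hybridizations described above.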
From the foregoing, it is apparent that the invention includes a number of general uses that can be expressed concisely as follows. The invention provides for the use of any of the nucleic acid segments described above in the diagnosis or monitoring of diseases, such as cancer, inflammation, heart disease, diseases of the CNS, and susceptibility to infection by microorganisms. The invention further provides for the use of any of the nucleic acid segments in the manufacture of a medicament for the treatment or prophylaxis of such diseases. The invention further provides for the use of any of the DNA segments as a pharmaceutical. All publications and patent applications cited above are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication or patent application were specifically and individually indicated to be so incorporated by reference.
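Entries in the Table give each polymorphic site in bracket notation (for example `[A/G]`) inside its flanking sequence, followed by a fragment identifier and a position. A small parser makes that layout easier to work with. The sketch below is an illustration under stated assumptions: the function name `parse_entry` is hypothetical, the first allele inside the brackets is treated as the reference form, and the position is taken to be the number of bases preceding the bracket, which agrees with the rows that could be checked by counting (for instance position 24 for fragment WI-7227a). It expects intact brackets and does not attempt to repair damaged ones.

```python
import re

# One polymorphic site: [X/Y], where Y may be "-" to denote a deletion.
SITE = re.compile(r"\[([ACGT])/([ACGT-])\]")


def parse_entry(entry):
    """Parse a flanking sequence containing one bracketed polymorphic site.

    Returns the 0-based offset of the site (the count of bases before the
    bracket), both alleles, and the sequence with the first allele in place.
    """
    cleaned = "".join(entry.split()).upper()  # drop line-wrap whitespace
    match = SITE.search(cleaned)
    if match is None:
        raise ValueError("no [X/Y] site found in entry")
    first_allele, second_allele = match.group(1), match.group(2)
    reference = cleaned[:match.start()] + first_allele + cleaned[match.end():]
    return {
        "offset": match.start(),
        "first_allele": first_allele,
        "second_allele": second_allele,
        "reference": reference,
    }


# Leading part of the WI-7227a entry, truncated for brevity.
site = parse_entry("AGGGAATTGTGTTGCTCCTGGAGG[A/G]AGCCCAGGCATCATT")
```

Here `site["offset"]` can be compared with the position column of the corresponding row as a quick consistency check on a transcribed entry.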
ATTGCACTGAAGTTTTTGAAATACCTTTGTAGTTACTCAAGCAGTTACTCCCTACACTGATGCAAGGA TTACAGAAACTGATGCCAAGGGG[C/G]TGAGTGAGTTCAACTACATGTTCTGGGGGCCCGGAGATAG ATGACTTTGCAGATGGAAAGAGGTGAAAATGAAGAAGGAAGCTGTGTTGAAACAGAAAAATAAGTC
WI-7718c 91 G AAAAGGAACAAAAATTACAAAGAACCATGCAGGAAGGAAAACTATGTATTAAT
ATTGCACTGAAGTTTTTGAAATACCTTTGTAGTTACTCAAGCAGTTACTCCCTACACTGATGCAAGGA TTACAGAAACTGATGCCAAGGGGCTGAGTGAGTTCAACTACATGTTCTGGGGGCCCGGAGATAGATG ACTTTGCAGATGGAAAGAGGTGAAAATGAAGAAGGAAGCTGTGTTGAAACAGAAAAATAAGTCAAA
WI-7718b 248 AGGAACAAAAATTACAAAGAACCATGCAGGAAGGAAAACTATGTATT[A/G]AT
ATTGCACTGAAGTTTTTGAAATACCTTTGTAGTTACTCAAGC[A/C,T]GTTACTCCCTACACTGATGC AAGGATTACAGAAACTGATGCCAAGGGGCTGAGTGAGTTCAACTACATGTTCTGGGGGCCCGGAGAT AGATGACTTTGCAGATGGAAAGAGGTGAAAATGAAGAAGGAAGCTGTGTTGAAACAGAAAAATAAG
WI-7718a 42 TCAAAAGGAACAAAAATTACAAAGAACCATGCAGGAAGGAAAACTATGTATTA
AGGGAATTGTGTTGCTCCTGGAGGAAGCCCAGGCATCATTAAACAAGCCAGTAGGTCACCTGGCTTC CGTGGACCAATTCATCTTTCAGACAAGCTTTA[G/C]AGAAATGGACTCAGGGAAGAGACTCACATGC TTTGGTTAGTATCTGTGTTTCCGGTGGGTGTAATAGGGGATTAGCCCCAGAAGGGACTGAGCTAAACA
Wl-7227d 99 GTGTTATTATGGGAAAGGAAATGGCATTGCTGCTTTCAACCAGCGACTAATG Ul
AGGGAATTGTGTTGCTCCTGGAGGAAGCCCAGGCATCATTAAACAAGCCAGTAGGTCACCTGGCTTC CGTGGACCAATTCATCTTTCAGACAAGCTTTAGAGAAATGGACTCAGGGAAGAGACTCACATGCTTT GGTTAGTATCTGTGTTTCCGGTGGGTGTAATAGGGGATTAGCCCCAGAAGGGACTGAGCTAAACAGTG
WI-7227C 291 TTATTATGGGAAAGGAAATGGCATTGCTGCTTTCAACCAGCGACTAATGCAAT
AGGGAATTGTGTTGCTCCTGGAGGAAGCCCAGGCATCATTAAACAAGCCAGTAGGTCACCTGGCTTC CGTGGACCAATTCATCTTTCAGACAA[G/T]CTTTAGAGAAATGGACTCAGGGAAGAGACTCACATGC TTTGGTTAGTATCTGTGTTTCCGGTGGGTGTAATAGGGGATTAGCCCCAGAAGGGACTGAGCTAAACA
WI-7227b 93 GTGTTATTATGGGAAAGGAAATGGCATTGCTGCTTTCAACCAGCGACTAATG
AGGGAATTGTGTTGCTCCTGGAGG[A/G]AGCCCAGGCATCATTAAACAAGCCAGTAGGTCACCTGGC TTCCGTGGACCAATTCATCTTTCAGACAAGCTTTAGAGAAATGGACTCAGGGAAGAGACTCACATGC TTTGGTTAGTATCTGTGTTTCCGGTGGGTGTAATAGGGGATTAGCCCCAGAAGGGACTGAGCTAAACA
WI-7227a 24 G GTGTTATTATGGGAAAGGAAATGGCATTGCTGCTTTCAACCAGCGACTAATG
CCACAATGCCTCTCCCACGATGTCAAGGACTCCTGTCTGTCCTGGAGGTGGGAGACAAGGAACCTCCG AAGAGGAAGCAAGAAAGCCGTACTGTCTATGTTGTGATCCTTCATCGAACAAACTGATGCGAAAACT TGAATCTGTTACTGAAATGAGGAGAGAAGGACATGTGCTATTGAACTGAGCCAAACACACTGTAAAT
WI-7310b 234 A ATCCACAGACTCCCTCCCCTGCCCCCATCCCA[A/C]ATGATCTTGAGATTTC
GAAGCAACCAGAAAGTATCTTTATCCCCATCTAGATTATGTCTGGGTTCTTCCAGACTCCTACGATTA AATTGTATGCATGTGAACAACTGATGAGGTACTTAGATCTCAGTGCTTTGCAGAAAGAAAAG[T/C]C GTCTACCATTTTCACCAAATTTCGTAGTACAATTTAAGTATCTCTTGTTATCTCCCCTAGGAGTCTAA
Wl-1 95b 1 30 AGTGAGCTGGGGAAGGCAGGATTT
GAAGCAACCAGAAAGTATCTTTATCCCCATCTAGATTATGTCTGGGT[T/C]CTTCCAGACTCCTACGA TTAAATTGTATGCATGTGAACAACTGATGAGGTACTTAGATCTCAGTGCTTTGCAGAAAGAAAAGTC GTCTACCATTTTCACCAAATTTCGTAGTACAATTTAAGTATCTCTTGTTATCTCCCCTAGGAGTCTAA
Wl-1795a 47 AGTGAGCTGGGGAAGGCAGGATTT
CACACAATTTGCAAACACTTCAAAGTGAACGCCCGACATCATCAGCCCGTTAACGTCCAGGCCATGT CCCACATAGAGAACGCTTTACTTCCACGTCTCTCCATACGTAGGTCCTGGTCTCCTATCACATTGCCA C[G/A]TAGCCCTCCCTTCCCTTCCCCCTACAGGCCCTCTTCAGGGCCCCAGTCCCCCTCTGAGACTCCC
1 36 ATGGATCATTCCTGTTTCTGTATCAGGCAGTGATTTAACTCC I I I I I I GT
CACACAATTTGCAAACACTTCAAAGTGAACGCCCGACATCATCAGCCCGTTAACGTCCAGGCCATGT CCCACATAGAGAACGCTTTACTTCCACGTCTCTCCATACGTAGGTCCTGGTCTCCTATCACATTGCCA C[G/A]TAGCCCTCCCTTCCCTTCCCCCTACAGGCCCTCTTCAGGGCCCCAGTCCCCCTCTGAGACTCCC
1 36 ATGGATCATTCCTGTTTCTGTATCAGGCAGTGATTTAACTCC I I I I I I GT
CACACAATTTGCAAACACTTCAAAGTGAACGCCCGACATCATCAGCCCGTTAACGTCCAGGCCATGT CCCACATAGAGAACGCTTTACTTCCACGTCTCTCCATACGTAGGTCCTGGTCTCCTATCACATTGCCA CGTAGC[C/ηCTCCCTTCCCTTCCCCCTACAGGCCCTCTTCAGGGCCCCAGTCCCCCTCTGAGACTCCC
1 41 ATGGATCATTCCTGTTTCTGTATCAGGCAGTGATTTAACTCC I I I I I I GT
CACACAATTTGCAAACACTTCAAAGTGAACGCCCGACATCATCAGCCCGTTAACGTCCAGGCCATGT CCCACATAGAGAACGCTTTACTTCCACGTCTCTCCATACGTAGGTCCTG[G/CJTCTCCTATCACATTG CCACGTAGCCCTCCCTTCCCTTCCCCCTACAGGCCCTCTTCAGGGCCCCAGTCCCCCTCTGAGACTCCC 1 1 6 ATGGATCATTCCTGTTTCTGTATCAGGCAGTGATTTAACTCC I I I I I I GT
CTCTTATTTCTCTGGGCACTGCTTTCTTTGGGGGCAAACTTCCAGTATCACT[G/A]ATACTAATATAA AAACCCTGT GTCTGCTTGCATTTTCAAGATTCAATATATATCCAGATTGTTTTCCCAGCAAAGAA TTTTATTTCTCAAGATATAAAAAATMATATTTAATTTCAGTTTCCTCAAAAGGAATATGAAATT
WI-1126C 52 G TGTTAAAATGCAAATCCAGCTGTAAC I I I I I I GGACTTGTCTTTTATTTCTT
CTCTTATTTCTCTGGGCACTGCTTTCTTTGGGGGCAAACTTCCAGTATCACTGATACTAATATAAAAA CCCTGTMGTCTGCTTGCATTTTCAAGATTCAATATATATCCAGATTGTTTTCCCAGCAAAGAAAATT TTATTTCTCAAGATATAAMMTMATATTTMTTTCAGTTTCCTCMAAGGAATATGMATTTGTT
Wl-1126b 230 AAAATGCAAATCCAGCTGT CTTTTT[T/C|GGACTTGTCTTTTATTTCTT
CGAGCTTGGGATAAAGCAAGGGGACCTTGGC[G/A]CTCTCAGCTTTCCCTGCCACATCCAGCTTGTTG TCCCAATGAAATACTGAGATGCTGGGCTGTCTCTCCCTTCCAGGAATGCTGGGCCCCCAGCCTGGCCA GACMGMGACTGTCAGGMGGGTCGGAGTCTGTAAMCCAGCATACAGTTTGGCTTTTTTCACATT
Wl-7038a 31 G . GATCA I I I I l ATATGAAATAAAAAGATCCTGCATTTATGGTGTAGTTCTGA
ATACGCTTTCTGTCTGTCCCACAGTGGAACCAGCACCCAGGTGGCCAGGGTCGGGCTCCACACA[G η CCCTCAGCCCCTTCAGCTTTGCATGTGTCCATCGGTGACTCAGCACAGAGTTTTCCAACCTCATGTGA CAAAAATACAGATTCCCAGTCTCCTCTCCTGGATTTGGATCTAGCAAGACCAGAGACGGTCCTAGAA
Wl-3429b 64 TCCTGACTGTTAACAAGCACTCCAGGCAATTCTTAAGACCAAGCACGGAGC
ATACGCTTTCTGTCTGTCCCACAGTGGAACCAGCACCCAGGTGGCCAGGGTCGGGCTCCACA[C/ηAG CCCTCAGCCCCTTCAGCTTTGCATGTGTCCATCGGTGACTCAGCACAGAGTTTTCCAACCTCATGTGA CAAAAATACAGATTCCCAGTCTCCTCTCCTGGATTTGGATCTAGCAAGACCAGAGACGGTCCTAGAA
Wl-3429a 62 TCCTGACTGTTAACAAGCACTCCAGGCAATTCTTAAGACCAAGCACGGAGC
ATTTTAGGACAGTGAAAAAAAGGGATTTATAAATAAAATCTATGCCATCCAGGAGGTATGTGTCAGT GTCCAGAACATCCTAGATGAAGTGGCTTCCTTTGGCGAAAGGATAAAGAAGTGAGTGACGGTGACCT GTGAGCCCCATTCTTCT[G/A]TGGGATAAGGTGTCCATTTGTTTCTTGGAGGGTGAAATGCCACATTC
WI-6786C 1 51 TTTTTGGCAGGGGACACTCCTTCTGGGTGCTCTATTGCTCAGTTTCATCATT 4-.
ATTTTAGGACAGTGAAAAAMGGGATTTATAAATAAAATCTATGCCATCCAGGAGGTATGTGTCAGT GTCCAGAACATCCTAGATGAAGTGGCTTCCTTTGGCGAAAGGAT[A/ηAAGAAGTGAGTGACGGTGA CCTGTGAGCCCCATTCTTCTGTGGGATAAGGTGTCCATTTGTTTCTTGGAGGGTGAAATGCCACATTC
Wl-6786b 1 1 1 TTTTTGGCAGGGGACACTCCTTCTGGGTGCTCTATTGCTCAGTTTCATCATT
ATTTTAGGACAGTGAAAAAAAGGGATTTATAAATAAAATCTATGCCATCCAGGAGGTATGTGTCAGT GTCCAGAACATCCTAGATGAAGTGGCTTCCTTTGGCGAA[A/ηGGATAAAGAAGTGAGTGACGGTGA CCTGTGAGCCCCATTCTTCTGTGGGATAAGGTGTCCATTTGTTTCTTGGAGGGTGAAATGCCACATTC
Wl-6786a 1 06 TTTTTGGCAGGGGACACTCCTTCTGGGTGCTCTATTGCTCAGTTTCATCATT
GGCTATTTGTAAATGCTTGGTTATTTGACTCCAAAATTGAATAAGTATTGGGGAAGAATCCCTCACCT ACTTCCAMTCCCTTACATATCMTTTTACACAAAGCCCCTAAACCTTCAGTTCCAATCACTCTGAAT TTCATATACCTCCATTATTAAATTCAATACATCATTGCAGAGAAAAGACAACGGTGCCAACTGGGTT
Wl-671 1 b 226 T TGGTTGGTGCCTGCACACCCACA[G ηTGGCAACTAAGTGTAATCTCTAAA
GGCTATTTGTA TGCTTGGTTATTTGACTCCAAAA[T/C]TGAATAAGTATTGGGGAAGAATCCCTC
ACCTACTTCCA TCCCTTACATATC TTTTACACAAAGCCCCTAAACCTTCAGTTCCAATCACTCT GAATTTCATATACCTCCATTATTAAATTCAATACATCATTGCAGAGAAAAGACAACGGTGCCAACTG
Wl-6711 a 36 T Ci - GGTTTGGTTGGTGCCTGCACACCCACAGTGGCAACTAAGTGTAATCTCTAAA
CCTCCTCTGAG GCAGTTCTCTGAAAGACMTGGATTGTGGAGCATACTGMGACTATTCCTAMTGGCTATTTGTGTTG
TTTGTGTTGGG ATTTTCTGAAT GGTGGTCMG[A G]CTATTCAGAAMTCTCAGAGGAGGACAMTGATAGTGCACTGCAGCCAGCTCG
WI-11909 78 G TGGTCMG_ AG GACTGGCTTGCAAGAGTC
TCCTGTAMGC
CATGAAGAGT CAATTTTATAT AAAAATACCATTTAGCATCMTTGCCCCMGTTTGGCAGGCATGMGAGTGGGCAGTTCAΓT/G]GTT
WI-1 1806 60 GGGCAGTTCA ACTAATAA TTATTAGTATATMAATTGGCTTTACAGGMGCATTATGG
CCCTAGTGMTACMCCTTTGTCCTGGAGAC[C/A]CCAGCTAGTCTMGAAMCTTCCTAGGCTGAG
WI-11946 31 CTCTCTTGGGMTCTMGATAMGMCTGAGATCCTGGGMGMGGGM
TGMGATCAG
ATCTCTGGTTT CAGCTGTGGTG ACAAAATTCACMGTACAACACTGCTTATTTTCTTGCTTGMGATCAGATCTCTGGTTTATTTM[T/
WI-11965 65 G ATTT MTGTTGAT G]ATCMCATTCACCACAGCTGMGGAAATTAMCTGMCCT
TGCCCTACTAC TGAGGAAATGT ACCTATTTTGMACTGCAGAAAGGGCAGGACAAMCAAATCACTTCATAGATTTTTCTGGGAMTAT
GCTTTTAAAA GTTACAGTATT TGCCCTACTACGCTTTTAAMAA[T/A]AATAAAAATACTGTAACACATTTCCTCATTTCTCTTACGA
WI-11027 90 A TTTATT ATACTTTC I I I I I GATATTGCAMTTCTATGGCATACACAGAGGCACCTCCTCMTGCCCTG
TTCTGCTGAAGATCACAAMCMTTTCMCCTCTGTGGTTCAAMTMTTTMGGATCTTGTACCTTT
GTGTTTATTTTCTGTTTCAACTAAGGA[C/ηAGACTTCAGMGGCATAGCTTCCCTTGTAACGTTTTT
WI-1 1049 95 AAACATCTTTTTCATTTGTAGGAAGGMCATTTCAAAAGCCCAA CΛ )
AAAAGGACAG TTTCCATCTTA CCAGATATCA TTTCATTTCTG CAACATTTATCAAACATGGTAGGGAAMGTTCTCACTCTGCACTATAAAAAGGACAGCCAGATATCA
WI-15488 69 AC TMC AC[CtT]GTTACAGAAATGAAATAAGATGGAAAATm AACAAATTG
MCAGTTAAT GAMCACATC GGCTGGTGAM TGCTCAATTTMTGTGATAATCTCCMCAGTTMTGAAACACATCCGTA[A/G]GTATGACATCATTT
WI-13654 49 CGT TGATGTCAT CACCAGCCAGCTACTTCATGTGGCAGAAMGGTMCCTTTTCCCCATTTTACAGACAAMCCAGT _
ATGAGACCCTGCTTTGMCGTTAMCGTTTTGGMTMTGGAAAAGGAGCTAGGACMTTCTTGCTT
Wl- TCMGTAAMTTGTGACTGAGCAGAAMTCAGCCAGCTATCTTGGGTGCAGAGAGGTACTCCMGTA 1 1 070b 1 3_5j C| C[C, ]GTGGGGGTTCTGATGACTTCCACGGTCACTGGGGATCCMCAGMGGGM
CAGAAMTCA ATGAGACCCTGCTTTGMCGTTAMCGTTTTGGMTMTGGAAMGGAGCTAGGACMTTCTTGCTT
Wl- GCCAGCTATCT TFGGAGTACCT TCMGTAAMTTGTGACTGAGCAGAAMTCAGCCAGCTATCTT[G/ηGGTGCAGAGAGGTACTCCM 1 1070a 1 1 0 | G T T CTCTGCACC GTACCGTGGGGGTTCTGATGACTTCCACGGTCACTGGGGATCCMCAGMGGGM
MTCTTTTATATTTCCAGCTGTTGAGACAGTATTTTTGAGGGCTGATGTTACCTCTAGCGGCGAMCC AGAGCCAGCTATTMGCAGCCAGMAGCTACAGTMTTGMTACATGACCATT[T/C]CTCTTTTAGC
WI-12020 1 21 T C -- ACGTTCTTTGTTCTCCTC
TGCTC I I I I I ATTTCACGTTTCACMCACACGCCGTG[G/ηTGGCACAGTCTACCAMGTGCCCGCAG CGCCACGCTTGGGCCGGMGGTCTCATTCTGTTCGTCTCTATGGACTGATTGMTTTGGGATGGCCAG CTCCAGMTGTTCCACGTGGGGGCACTCTGTGGGCAGAGAGGCTGAGCCCTTGCCCACACTGGCACCA
WI-9617 37 G MGAGGTTGCACGATGCAGCTTGCAGTGGGTCCMGCCGGGTGTGCTGTG
MTGCTGGAGAAMCATCMCATTGAGTTGACATTTGTTTTGCTGMGTATAGCTACCATCCACTAT CATGAATTTTTGTTTCATTACAMTGATAGAAMGCCAGATTCTCAAAATAAAG[T/G]ATMTTCTT TGTATTAAATAMTGTTTATAAATGTTTATGAAGCTCATTACATTATC I I I I I I AAMAAGTAAAAA
WI-9657 1 21 TTTTAGMCATATGACGCTTTTCATMTTMTGCTTTTGATATAGATTTGAGG
AAAAATTAAC CAGGGTCTTGCTCTGTCTCCCAGGCTAGAGTGAGGTGACACMTCMGACTCACAGTAGCCTCMCCT
Wl- CCTCCCMGTA CAGGTGTGGTG CCTATGCTCMGCCAGCCTCCCMGTAGCTGGGACTACAGGCATGT[G/C]ACACCACACCTGGTTM 131 19b 1 14ι G ' GCTGGGA T I I I l I I IM I I I I I l GTMAGATAGGGTCTCACTATGTTGCCCCGTCTCAAAAMCMACCMCTMC
CAGGGTCTTGCTCTGTCTCCCAGGCTAGAGTGAGGTGACACMTCMGACT[C/G]ACAGTAGCCTCA ACCTCCTATGCTCMGCCAGCCTCCCMGTAGCTGGGACTACAGGCATGTGACACCACACCTGGTTA
Wl- A l I I I I I I MI I I I I I GTAMGATAGGGTCTCACTATGTTGCCCCGTCTCAAAMACAMCCAACTAA 131 19a 51 C _
ACAGGAATCTGAAAGTTACCMGGCAATTTTCCCTTTTAGGATCATAMGACTACAGACTTMGCTT 3
TCATAMGAC TTAGAAATTTT TTTT[C/T]C I I I I I CCATATMTACACAAAATTTCTAAATATCCTTAAAAAAGAAAATATAAATAGT TACAGACTTA GTGTATTATAT TTCAGTATGTTATGTAGAGTCACATACTATGGCAAMATATTTTATTMTTGAGGGMTAGGCCMT
WI-13112 71 AGC I I I I I GGAAAMG
TGTTAACATTTTTATTGGTACGTGCTCTCAGTACM[C/A]AMCAGCATCAGTAGTGTACACTTTGAT
CAMGTGTACA MAMGGMTTTTTAGCTTAGTAGMMGMAGCCCAMGGTCAGAAGTATMTGMTATGTACAT
TGGTACGTGCT CTACTGATGCT CTTTATGGAMCTGTTTGTGTGACCATCTTTATCTTCCCCTGTGGATGAGATGTATGCACACACMGT
WI-12988 36 CTCAGTACM GTTT AAA
TGCTATTCATGACAGACACGTGAGACAMTATTCTTATTTTACAGATGGAMTAGACCCAGACATTA
CTMTAGTGG TTCAGTACTTTMCCACTAATAGTGGMCCCTGAGACTTTA[G/A]ATCTGCAMGGGGTTTAATAAT
Wl- MCCCTGAGA CATTATTAAAC GCMATATCACATATATTTCCA I I I I I AACACCATATTTAAGTTTTCCATTTTCTTMTAGAMATGA 13020a 1 08 GL CTTT CCCTTTGCAGA TAMAAATGTTTTCCCCMTAT
TGTATAAAMATCCMCTTGTTCCACMGTACATATGTCCTATGATTTTATGCATACATCCATATAC
CCATATACAT ATATATCAAGGTMAGTCCA[A/G]TACAMMMCAGCATTTCCTATGGCCAGTGTFCTACAGAAGT ATATCMGGT GCCATAGGM MGACTGTGCAMCTTTATCGTATAGTCAMTGAGATTGCACACTMGGCAGGATGAGGCAGMGCA
WI-12837 87' MAGTCCA ATGCTG I I I I I AGTTGTGTCCA
GTCCTCAGGCCCTTCTCTGGCTGCAGAGCCGTCTTCTCAGGTTGCCTGTC[G/C]TCTCCTGGCCTCTAG TCTTCCCTGCTCTCCGAGGTAGAGCTGGGTATGGATGCTTAGTGCCCTCACTTCTCTCTGTCTATACCT GCCCCATCTGAGCACCCATTGCTCACCATCAGATCMCCTTTGATΠTACATCATMTGTATTCACCA
L4261 1 b 50 CTGGAGCTTCACTTTGTTAC
GTCCTCAGGCCCTTCTCTGGCTGCAGAGCCGTCT[T/C]CTCAGGTTGCCTGTCGTCTCCTGGCCTCTAG TCTTCCCTGCTCTCCGAGGTAGAGCTGGGTATGGATGCTTAGTGCCCTCACTTCTCTCTGTCTATACCT GCCCCATCTGAGCACCCATTGCTCACCATCAGATCMCCTTTGATTTTACATCATMTGTATTCACCA
L4261 1 34 T CTGGAGCTTCACTTTGTTAC
TGAACGTGTGGTTAAAACTAGGCMTTGGTTMMATCMTTTMMAACAGGCCTAGAMCAGTG
TGMGAAATG ACCACACCTCMGCAATGATTATCCCTAGCACTCAGATTATGTTCTTGAMTACCATTTTCTGCTTTC GCTGATACCA ATGTGCATTTT AAMGAAAGACATGAGGGCTTCTTGMGAMTGGCTGATACCMG[CtηCTGCAGTGAAAMTGCA
Wl-1172b 1 79 A TCACTGCAG CATGATGAGCCTGGMCATGTTGT
TGAACGTGTGGTTMAA[C/A]TAGGCAATTGGTTAAAAATCAATTTAAMAACAGGCCTAGAAACA GTGACCACACCTCMGCMTGATTATCCCTAGCACTCAGATTATGTFCTTGAMTACCATTTTCTGCT TTCAAMGMAGACATGAGGGCTTCTTGMGMATGGCTGATACCMGCCTGCAGTGAAMATGCA
Wl-1 172a 1 7 CATGATGAGCCTGGMCATGTTGT
AGAGGCAGATTGGAAGTGTGAAAAAAATGAAAGM[G/C]MGMAAAAAGAGTCTAAATATTCAG 4-.
GCAGATTGGA CACTTACATTT MATGTMGTGCTGCCCTCMCTGTTCTTTACCCACTTMTTCTGCMTTTTGAAAACTAGATTGMT AGTGTGAAM CTGAATATTTA TCCTTTGCAAMCCCTTGCATCATGGATACCCGAGTTAMCCGTTMTTAAMGACATTAMCATGG
WI-1 177 35 G O A GACTCTTT CCTGGTG
TCCATGGTTTGGTTGCTACTGACTTTGTTAGCCTTACTGCCCACTATGCATTGGMCATTCCCATATTC CMCTMGCAGGAGTGTTCACMTM.ACMCATAGGCTCTTTATTCTCCTTCTTTCATTMTTTTCTT TCAC[GyA]TTATTCCCTCACCCTGMCGCCCTTCTTCCTTCGTAGTGACATTTTAAMTCCACTTTAC
Wl-1231 b 1 41 I G ACATTCGGACC
TCCATGGTTTGGTTGCTACTGACTTTGTTAGCCTTACTGCCCACTATGCATTGGMCATTCCCATATTC
GGCTCTTTATT CMCTAAGCAGGAGTGTTCACMTMACMCATAGGCTCTTFATTCTCCTTCTTTCA[T/C]TMTTTT
CTCCTTCTTTC CGTTCAGGGTG CTTTCACGTTATTCCCTCACCCTGMCGCCCTTCTTCCTTCGTAGTGACATTTTAAMTCCACTTTACA
Wl-1231 a 1 26 T!C A_ __ _ AGGGAATM CATTCGGACC
ACATACATAT GMGGCAGGACTGTGTTTTGGAGGACMMAGTAAMTC I I I I l ATATCTTTA I I I I I I MTTTTATT [CCATTATACA GACCTTTCTTT TTTTTTCAGGCATATAGACATACATATCCATTATACMCAGMMG[G/C]GGGCTGGAAAAGMAG
WI-472 1 1 4 G C ACAGAMAG TCCAGCCC GTCMGTGAGATTTCAGATATTCTTAAATGCMGGCTGACAMTTTGGGCTTGATT
TΓTTTGTTTΌCTCTGGACACCCACTGCTCCCAGGATGAMGGAGAG[G/A]MTGAGATCAGTTTTGGA
WI-7593 46 CACTTCCTCTTGAMTATAMGMTCMCMGTFACAGTCATGTTGGGGACTTCTTCTCTCTCCM
AGTGCATCTTGGGGGAMGGGCTCCAGTGTTATCTGGACCAGTTCCTTCATΠTCAGGTGGGACTCTT GATCCAGAGA[A G]GACAMGCTCCTCAGTGAGCTGGTGTATMTCCMGACAGMCCCMGTCTCC TGACTCCTGGCCTTCTATGCCCTCTATCCTATCATAGATAACATTCTCCACAGCCTCACTTCATTCCAC
WI-6962 78 CTATTCTCTGAAMTATTCCCTGAGAGAGMCAGAGAGATTTAGATMGA
GCAGAGMGAGMCCATGCCAGGGGAGMGGCACCCAGCCATC[C/G]TGACCCAGCGAGGAGCCM
MGGCACCCA GCTCCTCGCTG CTATCCCAMTATACCTGGGTGMATATACCAAATTCTGCATCTCCAGAGGMMTMGAMTMA
WI-7059 43 GCCATC GGTCA GATGAATTGTTGCAACTCTTAAAAAM
CACTTCACTGA MGACACCAT TCTACTTTCTG AGCAGCCATCACATGATCTGTTTTTCACCACTTCACTGAMGACACCATTTAT[A/C]TACCCMGGG
WI-9063 53 CCCTTGGGT CAGAMGTAGMCTTACTATTCATTAMTGTTTGACACMTTGGMTTGTC
MGGGGCATTGAGACTATAMGCAGTAGACAATCCCCACATACCATCTGTAGAGTTGGMCTGCATT CTTTTMAGTTΓΓATATGCATATATTTΓAGGGCTGCTAGACTTACTTTCCTATTTTC'FTTTCCATTGC TATTCTTGAGCACAMATGATMTCMTTATTACATTTATACATCACC I I I I I GACTTTTCCMGCCC
WI-7079 293 TTTTACAGCTCTTGGCATΠTCCTCGCCTAGGCCTGTGAGGTMCTGGGAT
GGTAAMGTT GACAGAI I I I I o CTTTTTGCTCT GACCTAGTTCC TGGATGCCGAGGTAAMGTTCTTTTFGCTCTAAAAGM[A/G]AAGGMCTAGGTCAAAMTCTGTCC 1
WI-9074 38 AAAAG TT GTGACCTATCAGTTATTM I I I I I MGGATGTTGCCACTGGCAMTGTMCTGT
GGAGTTTGCCCCTTCCTMGGGMGGAGATCTTTATCTTTCTGGTTGGCTTGACCAGTCACGTTGGGA GMGAGAGAGAGTGCCAGGAGACCCTGAGGGCAGCCGGTTCCTACTTTGGACTGAGAGMGGGAGCC CCAGGCTGGAGCAGCATGAGGCCCAGCMGMGGGCTTGGGTTCTGAGGMGCAGATGTTTCATGCT
Wl-7104b 249 GTGAGGCCTTGCACCAGGTGGGGGCCACAGCACCAGCAGCATCTTTG[CtFJF
GGAGTTTGCCCCTTCCTMGGGMGGAGATCTTTATCTTTCTGGTTGGCTTGACCAGTCACGTTGGGA GMGAGAGAGAGTGCCAGGAGACCCTGAGGGCAGCCGGTTCCTACTTTGGACTGAGAGMGGGAGCC CCAGGCTGGAGCAGCATGAGGC[C/A]CAGCMGMGGGCTTGGGTTCTGAGGMGCAGATGTTTCAT
WI-7104 1 57 GCTGTGAGGCCTTGCACCAGGTGGGGGCCACAGCACCAGCAGCATCTTTGCT
CCTGAGCCCTC TGTAGGGCTGA CATACMTGAGAGCCCTGAGCCCTCMGMCTCA[CtηGCCAGCTCAGCCCTACACCAGTTTCCACC
WI-8974 34 AAGMCTCA GCTGGC TGGAGTTCATGCMGGGCMMGGCAGTGCCATGCMGCTGTTTM
GCTTACAGGAG
CCTMGCATTG AGACTAGACA CTGTGAGGGTGACGTTAGCATTACCCCCMCCTCATTTTAGTTGCCTMGCATTGCCTGGC[Cπ TC
WI-9161 61 1 CCTGGC GGM CTGTCTAGTCTCTCCTGTMGCCAMGMATGMCATTCCA
CCCTGTTCCCATGCTGACCTGTGTTTCCTCCCCAGTCATCTTTCCTGTTCCAGAGAGGTGGGGCTGGAT
WI-9014C 93 lT Cl- GTCTCCATCTCTGTCTCMCTTTAΓF/CIGTGCACTGAGCTGCMCTTCT
CCCTGTTCCCATGCTGACCTGTGTTTCCTCCCCAGTCATCTTTqCtFJFGTTCCAGAGAGGTGGGGCTG
Wl-9014b 44 GATGTCTCCATCTCTGTCTCMCTTTATGTGCACTGAGCTGCMCTTCT
TCTGAGAGAMTGACTTGTGGGAGACACCCTGCAGATCCTCATGGGTTTGTGACAGACCCTGCGTGCT CAGTGCCCTTTMGTGCATCCCGCTGTGCTGACTTTGAGTGGGATCMCATCTGTCCTACGGGTCCCC TCI I I I I IGGCCCCAGTATTCATGGCAGGGTTTGTTGGACACCTACTAGCTTCCCTTCCCATTCMCAC
Wl-7023b 206 A[C/A]ACACACATTCTTGCTCTACCCAMGCTCTGGCTGGCAGCACTM
TCTGAGAGAMTGACTTGTGGGAGACACCCTGCAGATCCTCATGGGTTTGTGACAG[A/C]CCCTGCGT GCTCAGTGCCCTTTMGTGCATCCCGCTGTGCTGACTTTGAGTGGGATCMCATCTGTCCTACGGGTC CCCTC I M IT IGGCCCCAGTATTCATGGCAGGGTTTGTTGGACACCTACTAGCTTCCCTTCCCATTCM
Wl-7023a 56 CACACACACACATTCTTGCTCTACCCAMGCTCTGGCTGGCAGCACTM
CTGAMTCCCCCTCTCTGCCCTGGCTGGATCCGGGGACCCCTTTGCCCTTCCCT[CT]GGCTCCCAGCC CTACAGACTTGCTGTGTGACCTCAGGCCAGTGTGCCGACCTCTCTGGGCCTCAGTTTTCCCAGCTATG AAAACAGCTATCTCACAMGTTGTGTGMGCAGMGAGAAMGCTGGAGGMGGCCGTGGGCCMT
WI-7093 54 GGGAGAGCTCTTGTTATFATTMTATTGTTGCCGCTGTTGTGTTGTTGTTA
ACATATCTGAAAAATGTTGAMGCCTMGCCAGGAATAMAGAAMGTAGAGATMTAATCA[G/A]
WI-9171 62 TTCTTTACMCCGATGGTMTTMGCTTGTATTCACMGACTTCATGC
CTAGGACCCC TCTAGAGGGTA vo 00 ATTCTCCTATT TATAGGACAGG GTGTGAGACCATCATGGTGCCAGTCTAGGACCCCATTCTCCTATTTAΓT/C]CAGTCCTGTCCTATATA
WI-9174 47 T ACTG_ CCCTCTAGAMCAGAMGCMTTTTTAGGCAGCTATGGTCAMTTGAG _ _
CAGAGGTCTTG MGGCCAGATGCACATCCCTGGMGGACATCCATGTTCCGAGMGMCAGAT[A/G]ATCCCTGTATT
CCATGTTCCGA AMTACAGGG TCMGACCTCTGTGCACTTATTTATGMCCTGCCCTGCTCCCACAGMCACAGCMTTCCTCAGGCTA
WI-7753 52 GMGMCAGA A AGCTGCCGGTTCTTAMTCCATCCTGCTMGTTMTGTTGGGTAGM
AMGGGMAG
CCACTTCTCCC TCTGACCTAGG MAGMCTACAGAGGACGATGTCCAAMCMAAMTGGCATCACCTGTCAAAMTGGAGTTCCACT
WI-9186 76 CGCA T TCTCCCCGCA[G/A]ACCTAGGTCAGACTTTCCCTTTCATCTT
AGMTATTGT
CTGCCTTAMG GGTGTGTGTGG TTGGACAMCCTAGMTTTTCTCCCTFTATGTATCTCTATCGATTGTGTAGCMTTGACAGAGMTM
WI-9193 94 G, CA TAGGGGG CTCACMTATTGTCTGCCTTAMGCA[G/A]TACCCCCCTACCACACACACCCCTGTCCTC
TTTGGATTGATATCGTGAMTCCTCAGCCGAGAMTTGGGCTGGATTG[CtF]GCTTTGGTTMTACAT
WI-9015 48 CTTTCCCTMAGMGATAMCACAAMTCCATTCCAGGTAGCTCGGCACCMCTMGM
GGAGCCAGGAGACAGCAGGGTCTGAGAGAGGAGCCAC[A/G]GTCCCTMTGACACCCACTCCTAGCC
GGTCTGAGAG GGAGTGGGTGT CTGAGGCTCGTGCCCCTCAGACTGGGGMGAGTCCMGGMGGGAGGGAGCAGCCACTCCTCMTGC
WI-7254 37 AGGAGCCAC CATFAGGGA TCMTGGCTCCCCTGMATCMGACAGG
CMGAGAGAG TGCAAAGAAA CCAGGAGCACTAGAGAGGGAGGGGGMGAGCAGMGTTAGAGAAAAAMGCCACCGGAGGAMGG AGAGGAMGA GMTGAAAGTT AAAAAACATCGGCCAACCTAGAMCGTTTTCATTCGTCATTCCMGAGAGAGAGAGGAMGMAM
WI-7424 1 31 AAAA G [T/A]ACAACTTTCATTCTTTCTTTGCACGTTCATAMCATTCTACATA
TCCTGCMGMGTTCTCMGCC I I I I I GATTTTTGTGCMTMAGTACAGCTTTGCATMGAGTGAM TTGGGCTAGCTTAMTGGATCCATAMCTTTCTTCTMTTTTMGTGAGA[A/C]TCTTTTAMCACCT GTTMATTTMTGTAGCAGTCTGAGAATCTAAMTTATGTACCACTCGTTTATFTGTTCATTCATCCA
X86400 1 1 8 TCCCTTTTCCCATGMTATTTCA
GTGGCCACTACATGTTATAGAMCCATCATCTTGTCACACAGCACAGTCTATGMTMMGGCTGAG TTATCACTMGCAGGAGAAAMGCATTAAAMGTGTCCCATTMMGGGACTTTTMTCMCCTAA TMACTCTMTTCTGCTGACTTTTTAMGATCTMGGTCATTFTMTACATGCTGAAMGGGTCACA
WI-8053 242 T ATTAATTCTTTGATCTTTTTTACTCACTGTTAACTTATATAA[T/A]TTCAGMC
TACACMTGMTTGCTTTTATTTCGGTATGCATCCACATTTCAGCATTTAGTGGTCCTGMCAGCMG TGGMAGACGCAGCMTTTGCCAGGAGGTCMGCCCACCMTTTCGGGGATCTGCTGTGCACACCGG GTTCCTTCTTMTCCCTGCTGAGGATCTTG[G/A]GMGCAGCAGCAGCACCAMACCMGGCATGCA
WI-6190 ' 1 65 CCGGATTCMGGTTCTTTTTGTTCCAGTTGTCAGATTCCAMCTAGACCCCA
MCAGTCACCACCMCCACATGACMCTCGCCAGGCMGGCCTTGCTTCCCTCCCTCCTTTGCGTCCC ATGTGCCTAGTCAGCMGGTCGGGGAGGCACCGATGTTAGCTTCGCCCAMGGGAGTATTACAGAGA GAGGCTTGGGAM[G C]GGMGGMACCTGGACAGGCTTTTCAGCACTGAGAMTCACTTAMACTG
WI-6275 I 1 48 G Cl ATTTGCTTTCAGTMCTGGTATGTCTGM
ACCMGAGATCAGCTGTCTAMCAGCAGCT rGATTGT[G/ηGGGCTTCCTGAMGMACCTTGC
TGACAGCTTCTCACTGACCTGCAGGACGGMCCGTACCTGAGAGGGGATGGGGGCTCTCTCACAAM GMTATTTGGGGCAGMCCCTGGMCTGGCCACCAGGGACATCCCAMTATCCCCTCCTCCTCAGGG
WI-6421 41 CTCACCCCGACATCCTCAGCCAMTGMGGCTCTGM
GGGTGAGACGGGTTTATTGTGCACATFTACACAGCGTCACAGCGTCTGGGCTGGCAGCGGCCATGCTC CTGTGGTCGGGCTGCTCTACMGGGCGTTCACTTTTCTTCACCACACTATGTACAGTCAGTGCTCCM GGTGATGGGCTACAGTGCTGCATCAGTGAGTCTGTACACACATTTTTACATAMTTACACACGACTC
WI-6905 21 51 T j A ATACATGMAAA[T/A]AGAGCCTAAGGGCCTGTATTTTAATGAGAAMAAA
MCTTGTTTACAAMTAGGCTTTGCAMCTTCATTACTGMTTGTMAGTCMTGACTGTGTTGTTTT TAAAATATGTACCMGGAMTACAMTTGGATMTGATCATTTTTCATGCTCAGGAGAGMCAGCAC AGAAATAMGGATACTGCACMGGTGCMGGAAACCGGMCCCATTGTGTACACTGTCTTCACACAG
WI-9420 202 G A' — [G/A]GCATTCTTTCTCACCTTMCTGCAGCTGTGCMGATGCCTCAGTGTG
TTTCTAGGCTGTACAGTCTGATGCATGA I I I I I I I ATAMTATTTCATACTCTTGTGMTTTGGATCTT TTTACTTTGAGCATATATTTTAGMTATGTGT[A/G]TGTTAMGGATCTCCACMTGTCTGCAGTGTG MGGCAGGTTCATTGTGGMTAGTTTMCAGTCAGGMGGCTAMCTGGTCAGTATTMTGTGTAGC
WI-7805 1 01 G CCTACCAMAATAGCCAGTAGTATCTGAAMTGMAAATAMTGMGTAT
GGCCAGGAGATTAGCMCMGGATTCATTCTGTTACTTACTTGCCCCTTTTTATCTTTCCCTCTTGCCC CAGTCCCTTCTCTCCAGCTTCATGTGMGCTCTGCACAGACMGACACTCAGTGTCCTTGGCAGTGCT [G/T]CTACTCCTCAGGTGCAGCATACATMCCAGTMGAGACTMATCTGCMTATATMAGAGCTC
WI-7416 1 37 CTACAMTCAGTAACATGMGMCACTCMMATTGGCAMTGTCATCAG
ATTTGMGATTTGGAGGGCTTTGCAGAGGAAMTAGATTTCMTTGGATCCCCAAACTATMTGACA AG I I I I I MTTAGGTGTGATCMGGCTFCTMMGTGAMTGCMGTTGTTACCAGTAMGTTTATA TCTTCCATTCAGCCCAGCTCATTTGCCAGAAMTTCAGGTGAGTGGATTGGCCAGACTATCTGGCMG
WI-140 252 GATGAAAATTTTAGTTTAMMJGTGTCATTTGTCTGTAπGGCAπ
GAGGTCTTTCAGCMCATGGMGCCCTACTGCTTCMCCCCGAGTTCCCCGGATCMGTGCTGGCACC CATGATGGAAACTCTTGCCATGGTTTTAGTACCCTGGACCMGTAGTCATTCCATCCTGACTTTAAM TTCTMACAGCCTTTGATGGGACMTCTCTGCTMAGACTMCCACTTCCTTATCTTATCTTCAGCTA
WI-198 21 8 CCTGCTTCCCTTTC[C/ηGTTTAACMAGCATAGMTATTCTGMCMCT
TTCATGGTCCCAAGACAGATTTTAMGAAAGAAMTAAGCCTCATCTCCTMCTATGACTTGGTCGG o ON MGCCMGAACCTACTTCAACATTTGACCCATMCCTTCTCTTGAGATGATGGGCTGACTTTTTCMT GCATGAGTTTGΓT/C]CCAAAGGCTTGATGGGAAMTCTCMCATTTGTTACCTMGMAGAGGATGT
WI-205C 1 46 ATCTTACTTTGTTTAAMMCTGCATATGCCTTTA I I I I I GTTTTAGTTCCC
TTCATGGTCCCMGACAGATTTTAMGMAGAAMTMGCCTCATCTCCTMCTATGACTTGGTCGG MGCCMGMCCTACTTCMCATTTGACCCATMCCTTCTCTTGAGATGATGGGCTGACTTTTTCMT GCATGAGTTTGΓT/C]CCAMGGCTTGATGGGAAMTCTCMCATTTGTTACCTMGMAGAGGATGT
Wl-205b 1 46 ATCTTACTTTGTTTAAAAMCTGCATATGCCTTTATTTTTGTTTTAGTTCCC
GMGACTGAGTTTCCAGGAGGTTGCAGCCGTTTCTCTCGGGCCATATGGCTMTMGGAGCTTGAGCA GGGATFCMCCTGTTTGCMCCCMGTNCTTTCCMGAGGTCTCAGACTACCTCCTCCATCTCCCCCT CTCCCCCACMCACACAMTACAGAGATT[G/C]MTTCAGGAGCCAGTTTCTAGGTGGGCTTTGAGC
WI-234 1 65 MTCATACACAGTMTCTCTTGGTGCTTTAGTTTTCTCAMTGGGMATGG
AGCTTTTGAMTCCAAMACCACAT[A/G]CTTGACTCTCTTATCCTCCTCTTGTTGTMCATCTATCC CTGAGGCAGAAMTACAGMCACCCTGTGGCTGCCTGMCGGAGGMGGATGGGGGCGGGGAGACAT CGGTCMTGTATCAMGCATCTCTCTGCCTGAMGACCTCTCCTGAMGACATGAGCTATTAGGAGC
Wl-276b 25 A G — TCTGGCMGGGCTTTGTCTTATCCTCCTTGCTATCCCTGATGACTGGGCAM
TGTTCTCTGGTCCAGGCACCGGGCTMGTCTTGTCTGCATMTGGMTMTCMCTGGACMCCCCNG CTNAGGTAGGNTACCTNGGCMTTAGCCCCATCTTACAGCTGCAMAGAGG[C/T]GCTCTGAGAGGT AMGTGCCCTGCCCCMCGCGCACMCTAGAGAGCAGCCAMCAGGTGTTTGMCCCAGCTCTGCCT
WI-1900 1 1 9 GACTTCAGATCTGTGTGCTTMCTGCCATGAGAMCCACTTTTCTTTGCTCC
ATTCCAGTTTCACAGTGGGCACAGGAGTCAGATTAGGGCTMGTTGGGGGGACAGGATGCACAGCGT GTTGGCTCAGGATCTCTGGGAGGTGGCACCTGTGACCTGGGCTMNCATGCTACTTTCAGAGTCMGC AGCMGCCMTGGGTAGGGAMGACCAGCC[C/T]CTCTGMNCTGGGTCCCACGTGGAGATAGTGM
WI-1943C 1 65 TACAGGGCACCGNTGAGCATTCCAGATGACTCCAMGCCCCGGCTGGAGTAT
ATTCCAGTTTCACAGTGGGCACAGGAGTCAGATTAGGGCTMGTTGGGGGGACAGGATGCACAGCGT GTTGGCTCAGGATCTCTGGGAGGTGGCACCTGTGACCTGGGCTMNCATGCTACTTTCAGAGTCMGC AGCMGCCMTGGGTAGGGAMGACCAGCC[C/T]CTCTGMNCTGGGTCCCACGTGGAGATAGTGM
Wl-1943b 1 65 TACAGGGCACCGNTGAGCATTCCAGATGACTCCAMGCCCCGGCTGGAGTAT
ATFCCAGTTTCACAGTGGGCACAGGAGTCAGATTAGGGCTMGTTGGGGGGACAGGATGCACAGCGT GTTGGCTCAGGATCTCTGGGAGGTGGCACCTGTGACCTGGGCTMNCATGCTACTTTCAGAGTCMGC AGCMGCCMTGGGTAGGGAMGACCAGC[C/T]CCTCTGMNCTGGGTCCCACGTGGAGATAGTGM
WI-1943 1 64 TACAGGGCACCGNTGAGCATTCCAGATGACTCCAMGCCCCGGCTGGAGTAT
CCAGGTGAGGCTGAMGMGGMGGAGGCMTTGCTGTTGGAGTGAGGGATTCTGGAGMGCACCCT GCAGAGCTTCATTCTGTTTTCAAMGTGTGCCATGCANGGTCNTCTGGGTTGTGAGCTCATNGCTGAG TTATCACAGCTCCTGATGACAGATCATGAAAMTAGGTACTTCCCMGCTCTGACTAGACCTTGGCA
WI-1960C 270 GTTGCMTTMATCCGTGGTGTCTGAAMCTTAAAMTGCACCTCCCMCTTT
CCAGGTGAGGCTGAMGMGGMGGAGGCMTTGCTGTTGGAGTGAGGGATTCTGGAGMGCACCCT GCAGAGCTTCATTCTGTTTTCAAMGTGTGCCATGCANGGTCNTCTGGGTTGTGAGCTCATNGCTGAG TTATCACAGCTCCTGATGACAGATCATGAAAMTAGGTACTTCCCMGCTCTGACTAGACCTTGGCA
Wl-1960b 270 GTTGCMTTAMTCCGTGGTGTCTGMMCTFAAMATGCACCTCCCMCTTT
CTGATGCCMGTGCAGCTTAGAGTNAGGMTCCAGAGAMGTNTTTGGATCTGGTMGTAGGAGTCA TTCTGGGCATTTCTTCATAGAGTNTTG I I I I lAGTCTCGTMTMTACTGTTGCCCTAGGMGGTTGTT
TTTCCTACTGCGTCTGTGAMGCCTTTCCCCATCGAGTGATACAGTACTTTCCAGTTATGGAGATTT[
WI-1977 203 /C]TMCMTCMACACTGGCTGAGGCTGTTGG
AMTTCTAGMGCCAGMGTCAGCTCACGATTTATAMGTTGMGTMATGCATTGTAGTTTCATGT TTTCTCTTAATTCTGCACAMACTAGCTAAAMTC[T/C]TTTMATCAGTTACCAGAGGCAATACCT GGGTTMTGTMGCACTCAMAGTTATGTAGAGTAGCTGTCTCTGAGTCAC I I I I I I CTACTCTCATT
WI-2012 1 02 | GGCTTCACCMTGCTTCCACTGGATC
TTGGTTGGCATTTTAGCCTCATAACMCTATTTACMTCATAATTGTTACTCTTATTTTACMACMG MAMTGAGGCTTMCATCACACTTCTGCTTAGTCGCAGAGCCMGATTTGMCCCAGGMTCCATT CACCGGTAC[A/G]TGCTACCTGGGTAAMAATGTTTAATTAAMTCTATGGCATTAGATTTCAMGA
WI-4584 144 GTCCTMTGTGGTTTTGMMTAGGTGTGCTTTMTTTGTFTATCAGTATGC
TTTCTGCATTTGMTGTGTATGGTCAGACTTCAGAGGMCCCAGGMTCTCATTTATTCAGTACMTA TGGTGGCCAGGTGCTCAGGCCCTATTATCAGAGAGATCTCAGTTTMCTTTCCMTTCCACCATTTAC TGACCATATGACTTGGGGMCATTATCTCACCTATCTGAGTCTGTATCC[CtηCATCTTTAMTTGTA
WI-4639 1 85 AATTTTAAGGACACCTATCATAGTAATATTGTGAGGATAAAATGAAATAA
AMTGMTCCGCTFTAGAGCAMTACCAGTMGGGCTGGTGCAGGATGGTGGTGGCTGAGAGA[A/- ]GATTACTCATAAAAGCATATTAATTTTATAAATATGGAMATTFAACTAGATAATTAAATGTGAAT TGAGTTTGMGGTTGCATGAGAGTAGGGAGGAGGTAGTTTCTACTTATAGGGTTTATATMGTNTGCT
WI-5327 63 A TCMTAGMTGGCTCTTTCGGATGACAATGATGAACTGTTCTMGCAGACAG
GCTTTTGAGMTGMMGGGGAGCCTGGACCATTGCAGGGCTTCTTCATCTCTGATTATTTTGTGTAT TTATTGTTCACTTATTTAT[C/T]GTCTGTCTCCCCTTCTGGTATGCTFGTGTCATGAMCMT GMTTC CCCAGTGCCTGGCCCGATTCGTGGCTCCTAGAGGTGTCCAGAAAAAMGTTTCGGTGMTAGMTTG
WI-5390 87 ACGMTGGGTTCAGMTTGMACCTGTGMTCTATGGMGACAMCGAAT SI
CCTTGCCTGCTTTATGCATMTGAGMTAGAGTTGACTCTCCTGTCMGMATCMTFATTMGCAGT GCAAACATTATTTTAATTT[G/A]AAAGAAACTTGTTTCTGAAACTTTGTACTCTTGTAGTNAAATTG MTCTTTCCTTCTCAGCAGTTTCCATGGTCGTGMTCCACCCCATCTCTTTTCACCAGTAGCMGATT
Wl-5404b 87 GCTACTTATATGGAAGGGTTTTAGAGTTCATMCM
CCTTGCCTGCTTTATGCATMTGAGMTAGAGTTGACTCTCCTGTCMGMATCMTTATFMGCAGT GCAAACATTATTTTAATTT[G/A]AAAGAAACTTGTTTCTGAMCTTTGTACTCTTGTAGTNAAATTG MTCTTTCCTTCTCAGCAGTTTCCATGGTCGTGMTCCACCCCATCTCTTTTCACCAGTAGCMGATT
WI-5404 I 87 GCTACTTATATGGMGGGTTTTAGAGTTCATMCM
TAGGAMGGGGATGGTGATGGCCTCTGAGACATFTAMTCTATTCTTTCACCACTCACACTGCCGCCA TATCTCCTC[A/C]CCMCACCTCTGTTTTCTGACAGCCMGTTTCCATCAGTTGATATGGGACTATTT GTTGCAAMCMTTGTTMMGATTTGGCTGACTTTGGCTGMTTTGCTACMCTCCMMAGANTC
Wl-5545b 77 GAGATACACCATGMTTTTATTTTCATTTCA
TAGGAMGGGGATGGTGATGGCCTCTGAGACATTTAMTCTATTCTTTCACCACTCACACTGCCGCCA TATCTCCTC[A/C]CCMCACCTCTGTTTTCTGACAGCCMGTTTCCATCAGTTGATATGGGACTATTT GTTGCAAMCMTTGTTMMGATTTGGCTGACTTTGGCTGMTTTGCTACMCTCCAAAMGANTC
WI-5545 77 A' C GAGATACACCATGMTFTTATTTTCATTTCA
TMTTGCACAACTTACATATCAGGGTTTCTGATTGAMGGMGAGMTATTCCTTTCTTTTAGTGATT GCTTAATATTMTTCATAATMGTGCACCATCTCTΓF/CJGCTCCTTATAAATGTGTTTAGMGMGG MATTGAGTGTTGGGMTTMGCMCCAGGAGACATTTTTATATACTCCTACAGTGGGGGMGACTT
WI-6244 1 03 CCTATTTTCTΓTCCCMGGATGGATACATTTCTAC
CTGGCCTTATMTCCMGTFTAGGATTMTCTTACCCCMCTTMTAGACTFCCAGACAGTTGCAGTT GTCTACMGATTTCCTCCTAGTAGGGCTTTGGGTGTTGGCACCGTTTGGCTCATTC[C/ΗACTCTCCCT GGGTCTTATTGACTTTCAGGGAGCCTAGMGAGCTGGACMMCCTGCTTCTTTGCAGAMGAGTCG
WI-6268 1 24 GGGTTCCAMGATTTCGTTACGAI M I M A
AGGTGCCATTTMTCCATTCAMTTTGGMGCTACATCTTCMGGGTCTGAGAGAGCTCACTCCCCCC ATATATTCCCCCTTFACATGTTTFCTFATMGACATACAGTTTAATCMTTAACAMCTMACAGCTT ATATACTGGCMTATATTACAGATGGGTTTATGTCAGAGTAATAGATCACATGAMTGGACCATGTG
Wi-6336b 234 GTACCCCAGTGCATTATGTCTTGGTAGAGCC[C/T]TGAGGACACTGACAGT
AGGTGCCATTTMTCCATTCAMTTTGGMGCTACATCTTCMGGGTCTGAGAGAGCTCACTCCCCCC ATATATTCCCCCTTTACATGTTTTCTTATMGACATACAGTTTMTCMTTAACAAACTAMCAGCTT ATATACTGGCAATATATTACAGATGGGTTTATGTCAGAGTAATAGATCACATGAAATGGACCATGTG
WI-6336 234 T GTACCCCAGTGCATTATGTCTTGGTAGAGCC[CtTTFGAGGACACTGACAGT
TTGGATACAAAAATTCAGTTACACAATCAGTAGCATTCMMTTAGTTATGAGTATTTATACAATTA ON CAAAAATGGNTTCATGTrTTMCM[C/A]GTATTTTAMAGCTCAAACATTTTAAAACAGGCACAAT ATTCTMNGGCATATGCATTCACCATGGGCTTTTGMTGTCCTCACTCCCMCTTCACMTCAAMTC
WI-6381 92 TACAGANGCGGCAAMGATCAGAGTTCAG
GGTTGAGGCATTGGGAMGGCAGAMTTGAGGCAGTAGAMATGGACATTTTAGGAAMGAGMGT TCAGAGGCAAAGTCATGACAGACAGGAMTACMGGCTTAGGMGACAGTAGTCTCTGTGGTTGM ATTTTGGTGTCATMTMGMGTTTAGACTTTGGTGGTTGTAGTAGTTGTAGTAGTAGGTAGCGTT[C/
WI-6436 1 98 G]ATTGGGTGTATTCCACAGACMGGTGATGTTCTMGATTTGATATTTATTGT
GAGGCCTCTTTGCTTTTCCTCAGTCMGGCTGTATCCAGGGTTGATATCTAGCCTATATGCCATATGT GTATGGCTAGTGTTTGTTCTGATTGGTTGGTGCTCACACTGCCCAGATTGTTAMTATTTTGAAMTC GTATCTGGTTCTATTCATCTGCATTCTCTGATCTTATGTCTGGCTCTATT[C/ηATCCCTATTCTCTGA
WI-6449 1 86 TCTTATGTCAGACCTGMGTTCCTCTMI I I I I CTGTGGTGTATTTATA
GAGGCCTCTTTGCTTTTCCTCAGTCMGGCTGTATCCAGGGTTGATATCTAGCCTATATGCCATATGT GTATGGCTAGTGTTTGTTCTGATTGGTTGGTGCTCACACTGCCCAGATTGTTAMTATTTTGAAMTC GTATCTGGTTCTATTCATCTGCATTCTCTGATCTTATGTCTGGCTCTATT[CA]ATCCCTATTCTCTGA
WI-6449 1 86 C T — TCTTATGTCAGACCTGMGTTCCTCTM I I I I I CTGTGGTGTATTTATA
GCTGGAGAGAAMGACCTCCAAMGMGAMCTMATCAGAGTCTCTTGAGCMGAGGMTTGMA AGAACA[T/C]TGAAAAMATTAAAGTAGAACTCAMGAGCCMAAAGTCCCCMTTGTGTCCATTA TMGMATATTFTGAATGGAMTCTTMGAATGATTTTATTGATCAGTTAMTGTTCTTCCTCTCCTC
WI-6463 72 CAGTCCCATTTATATGACATTCCGCATGCTG
MGCAGTAMTCTTCCATCATGCCATGGATGCCAGTGGGTAMTGTTATAGAMCTTCAGAGGANAC AGAGGCAM[C/T]GTTGGTTATAGCAGTCMCGACATCATCMTGMGACATGACTTGCTTAGAGCC MGMMAGTAGGATTTTGAMGGCACAGAGAAMGGGGTGTACTAGAGGAGMCTATGTMGCAG
Wl-6474b 76 T AGGTATAGAGGMCTMAGTATAAMGAGTGAGCCATMCTTAGGGTACCATAA
MGCAGTAMTCTTCCATCATGCCATGGATGCCAGTGGGTAMTGTTATAGAMCTTCAGAGGANAC AGAGGCAM[GT]GTTGGTTATAGCAGTCMCGACATCATCMTGMGACATGACTTGCTTAGAGCC MGAAAMGTAGGATTTTGAMGGCACAGAGAAMGGGGTGTACTAGAGGAGMCTATGTMGCAG
WI-6474 76 AGGTATAGAGGMCTMAGTATAAMGAGTGAGCCATMCTTAGGGTACCATM
GMCTCMTTMCTTTGCMCACTGAGAAMTCGGATTTGGAGATCTGCAMGCTGAGGTTGAGATT TTGGACCTTGGTGATCCAMTGGGGMTGCCACGCTTCGAGGCCTGTCTATATGCTTTATTTTTGTGA CACTGTCTATTTACCCTCCCCCMTAGTGGAGMTCAGAG[T/A]GCTCCTTGTCAGTGTTGCTACAGA
Wl-6478b 175 GMGATATACAGGATGGMGGACAGCTCCTCGTAGGACCTAGACACMCTG
GMCTCAATTMCTTTGCMCACTGAGAAMTCGGATFTGGAGATCTGCAMGCTGAGGTTGAGATT s
-4 TTGGACCTTGGTGATCCAMTGGGGMTGCCACGCTTCGAGGCCTGTCTATATGCTTTATΠTFGTGA CACTGTCTATTTACCCTCCCCCMTAGTGGAGMTCAGAGΓF/A]GCTCCTTGTCAGTGTTGCTACAGA
WI-6478 1 75 GAAGATATACAGGATGGMGGACAGCTCCTCGTAGGACCTAGACACMCTG
CACATTTTGMTGCMCTGAGAMNTGGTTΓTNTAGGCCTACCTTTTATTTMGAGTACATCTGGCTC CMTGTTACCCCAMCATGCAMACATMGGCMCMTTCTGATCATTTTATAGGNTCCCMGCCCA
TTAGCAATATCTTA[G/A]TCAAATTTTAMAAGAGMCAGGAAATMGGAAGGCCTMCAGAGGAG
WI-6559 149 TTAAATMTTGTGCAAMCTTATCAGTTCTTC
TTCTTTATTGGTCCTACCMTGTGACTCTTTACCCAGGCCCACTGTTCCTATGC[G/A]CACTGGCTTTG TAGGCATTCACATCATATGTCTGTGTCCTGAAMTCTCMTTMTTTCTCCTNCCTATFCCTTTTCCATl GCTCTGCCTCATTTNCTCAGAMTTGMGGCATTTGATTATNA I I I I I I I GTTTGGGTCTGTGTAMG
Wl-6564b 54 GTTCCTTGGCAGGAGMCATGCATATGACTTTAAMTMAGACCMCA
TTCTTTATTGGTCCTACCMTGTGACTCTTTACCCAGGCCCACTGTTCCTATGC[G/A]CACTGGCTTTG
TAGGCATTCACATCATATGTCTGTGTCCTGAMATCTCMTTMTTTCTCCTNCCTATTCC' FCCATl
GCTCTGCCTCATTTNCTCAGAMTTGMGGCATTTGATTATNATTT1 GTTTGGGTCTGTGTAMG
WI-6564 54 G! GTTCCTTGGCAGGAGMCATGCATATGACTTTAAMTMAGACCMCA
GCATGATTAMCCAGTGCAGAMMTACCMGTACATTGGGTGMCGATGAGCTAGCTGTTCTAGTA TTTGCTTTTTGTMTCCAGTTMGACCATCAGCATATACAACATCATCACTAACTCMCMTGTAGCT GCAGGGTMC[C/A]TGTGGATACCCTGTGTGCTCTACTNGCCTCCAMGGCATTCAGGGGATCATCA
Wl-6817 1 45 MGATGTTGGACACCTTGTGTTCAMTCTTGGTTCAGGTGCGGCCTGTGCAG
GATGGMAGCCATTTTA I I I I I CTCTMATTTTAAAATAGMGACTTTAATGGAAMCATTTAGTAC CATCATGTCACCCTGMTGCCAGCMTACCTCGACTTTTACACACGCAGGMGCCTAGTAAMGCCC CGTCAGTAGTACACATTTCTCTATGGTCCTTCMCAGTT TTTTGGCCAATTAATTAACCAAAAMMTTTTTCTGCTATTT
Wl-6819b 221 CTTTAGCAMCAGCMTMCT TTGTGTTTCCTATATGACACCTAATATCCAG
GATGGAAAGCCATTTTA I I I I I CTCTAAATTTTAAAATAGMGACTTTAATGGAAMCATTTAGTAC CATCATGTCACCCTGMTGCCAGCMTACCTCGACTTTTACACACGCAGGMGCCTAGTAMAGCCC CGTCAGTAGTACACATTTCTCTATGGTCCTTCMCAGTTTT[G/T]CATATACAMATTTTCTGCTATT
Wl-6819a 175 TTGCTTTAGCMACAGCMTMCTTTTGTGTTTCCTATATGACACCTMTAT
GCAAAMGCTTTATTGGCTCCMCMATTATCCCTTTTAAMCTCCTCTTCTTCTTCTGGTCTCAGTG GAACAACACATTTGMTTTCAGATTTGCAGTTTATAGCA I I I I I I I I CCCTAAGMCCATATAMTAC ATGCAAAACCTTGTACAT[A/G]GAGCTTAAATMTATCAAAATGCAAATATAGATTGGGTGCACTGT
Wl-6826b 1 54 TMGCTGMTTGCAMTTATGGCMCACACACTGGACTGGGGTATACGTTG
GCAAAMGCTTTATTGGCTCCMCAMTTATCCCTTTTAAAACTCCTCTTCTTCTTCTGGTCTCAGTG GAACAACACATTTGAATTTCAGATTTGCAGTTTATAGCA I I I I I I I I CCCTMGAACCATATAMTAC ATGCAAAACCTTGTACAT[A/G]GAGCTTAAATMTATCAAAATGCMATATAGATTGGGTGCACTGT
WI-6826 1 54 TMGCTGMTTGCAMTTATGGCMCACACACTGGACTGGGGTATACGTTG
AGTGCAMCTATTTTGMCAAMGTAMCTATGAGTCACAGCATTCAGCMGACATCAGACACGGA AGAGTGMCMTATTCACTMGTMAATACAGCAGATGAGATGTCTCTCACATGTA[T/C]ATTTMT TATFCATGC I I I I I CMTAGTCTCTTAGTCMCTTTCAGTGTMTTTCCACAMTATATAGCAGCTCA
Wl-6857a 1 22 MCACMATGCAGGAGCACMTGGCMAGTTTGGCMCTGTTTTGGGCTMTT
TTATAGMTACTTATGGGGCATACGNGTAMTGAACTGTCMCCTTMAATCTAMCMACAGCTTG TTTGTGGTTCGTCCTGMATCCTCCCTGCTCACAAMCAGCCAGCTACTNGGTTTTCTAAMGACGTA ATTTTGCAGGCAMCTTC[G/A]TAGAGCCATTCTGTGCAGMGMGGGMGGGAGMGCTGTTTGTT
WI-6865 1 53 G A TTACCTGTAGTATGMGATATTCTTTGCGCTGTTAGMCTGAGCTCATFM
ATTGAMACTGGTTAGCMCAGATAMTTACMTAGAGCCTGGATATAMMTGAGAGMGMTGC AGACTTA[C/T]MGCTTATAGAGAMGTCAAAMGGAGCMGTTTTTGMATCAGATTTTATGATAC GGAAAAAAMTTFCCTTFTTTTGCCMCAGGATTATTTCGMTAATAMTCTGCCAGTGCCMTCAG
WI-6909 73 C T AMCACCATTTCCACMTATTTGCATGCCCCTAGTTGCCTATTTTATACATATC
ACTTCTAGTGCCTCTGTTACCACCACCTCTMTGCCTCTGGTCGCCGCACTTCTGATGTCCGTAGGCCT TMATCTGCCTGGCGTCCCCTCCCTCTGTCTTCAGCACCCAGAGGAGGAGAGAGCCGGCAGTTCCCTG CAGGAGAGAGGAGGGGCTGCTGGACCCMGGCTCAGTCCCTCTGCTCTCAGGACCCCCTGTCCTGACT
WI-6996b 242 CTCTCCTGATGGTGGGCCCTCTGTGCTCTTCTCTTCC[G/T]GTCGGATC
ACTTCTAGTGCCTCTGTTACCACCACCTCTMTGCCTCTGGTCGCCGCACTTCTGATGTCCGTAGGCCT TAMTCTGCCTGGCGTCCCCTCCCTCTGTCTTCAGCACCCAGAGGAGGAGAGAGCCGGCAGTTCCCTG CAGGAGAGAGGAGGGGCTGCTGGACCCMGGCTCAGTCCCTCTGCTCTCAGGACCCCCTGTCCTGACT
WI-6996 228 CTCTCCTGATGGTGGGCCCTCTG[T/G]GCTCTTCTCTTCCGGTCGGATC
TGGGGAGGACAGGGAGATGCTGCAGTTCCAAMGAGMGGTFTCTTCCAGAGTCATCTACCTGAGTC CTGMGCTCCCTGTCCTGAMGCCACAGACMTATGGTCCCAMT[G/A]CCCGACTGCACCTTCTGTG CTTCAGCTCTTCTFGACATCMGGCTCTTCCGTTCCACATCCACACAGCCMTCCMTTMTCMACC
Wl-7021 b 1 1 2 ACTGTTATTMCAGATAATAGCAACTTGGGAMTGCTTATGTTACAGGTTA
TGGGGAGGACAGGGAGATGCTGCAGTTCCAAMGAGMGGTTTCTTCCAGAGTCATCTACCTGAGTC CTGMGCTCCCTGTCCTGMAGCCACAGACMTATGGTCCC[A/G]MTGCCCGACTGCACCTTCTGTG CTTCAGCTCTTCTTGACATCMGGCTCTTCCGTTCCACATCCACACAGCCMTCCMTTMTCMACC
WI-7021 1 08 ACTGTTATTMCAGATAATAGCAACTFGGGAMTGCTTATGTTACAGGTTA
GGCAGTAGGACCACCAGTGTGGGGTTCTGCTGGGACCTTGGAGAGCCTGCATCCCAGGATGCGGGTGG CCCTGCAGCCTCCTCCACCTCACCTCCATGACAGCGCTAMCGTTGGTGA[C/T]GGTTGGGAGCCTCT GGGGCTGTTGMGTCACCTTGTGTGTTCCMGTTTCCMACMCAGAMGTCATTCCTTCTTTTTAM
WI-7056C 1 1 8 ATGGTGCTTMGTTCCAGCAGATGCCACATMGGGGTTTGCCATTTGATA
GGCAGTAGGACCACCAGTGTGGGGTTCTGCTGGGACCTTGGAGAGCCTGCATCCCAGGATGCGGGTGG CCCTGCAGCCTCCTCCACCTCACCTCCATGACAGCGCTAMCGTTGGTGA[C/T]GGTTGGGAGCCTCT GGGGCTGTTGMGTCACCTTGTGTGTTCCMGTTTCCMACMCAGAMGTCATTCCTTCTTTTTAM
Wl-7056b 1 1 8| ATGGTGCTTMGTTCCAGCAGATGCCACATMGGGGTTTGCCATTTGATA
MTTCGCTGMAAAGGMCTACCTATCCTTACATTTCACCTACTMTGTCTCTTCTMCATCTTAGAG GTCCATGGAGMGGCATATGGAGMCATGTTTTATACTGCTCTATAMTAGTATTCCMTCACTGTG CTTAATTTAAATAGCATT[A/C]TCTTATCATTFATCAGCCTTTTATGTATTTTCCAAGTAAAATATTA
Wl-7091 b 1 53 ACATATTATTTCATFGGTCTTC I I M I I ATCTGGTTCTATATGMTGCTAT
MTTCGCTGMMAGGMCTACCTATCCTTACATTTCACCTACTMTGTCTCTTCTMCATCTTAGAG GTCCATGGAGMGGCATATGGAGMCATGTTTTATACTGCTCTATAMTAGTATTCCMTCACTGTG CTTAATTTAAATAGCAT [A/C]TCTTATCATTTATCAGCCTTTTATGTATTTTCCMGTAAAATATTA
WI-7091 1 531 ACATATTATTTCATFGGTCTTC I I I I I I ATCTGGTTCTATATGMTGCTAT
TGTGMGCCACATTTTCCMCATGAGCCTCATGMGCCAACTMGTGTTATTGMCTG[T/C]MTTC TCTCMTMCTCAGTGTAGCACTTTAMGTCTGMGGACAGCMCATGAAMGAGCATATCMTGTG
GTGGAGAMGGGMGGGGTTGGC I I I I I MTTTAT TTTTCTTCATCTTTTATMCMGMAGNNNNN
WI-7136 58 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGTAGCTTTCTATATATG
GGGACGCCTGTTGTT1TGGCTCAATTTGGGTTTGTTGGTCACATGGAGCTCTTCCATTTCGTTTAGCTG MTMTGAGTTGTTCCTAGAGGAGACAGCCTGTCTCTCCTFGTTGCCCCCAMGCCCATGCCCTGCCG TGGTGGCAGCTGGGGCTGTGGATGGGAGGGGTCCCCMCATGGATGTGTTGCCCCTCCTCCGCATGCC
WI-7146C 21 0 MCGC[A/G]GTTCATGTACMGGCCCCTCTGCMCTGGAGAGAAMTTA
GGGACGCCTGTTGTTTTGGCTCMTTTGGGTTTGTTGGTCACATGGAGCTCTTCCATTTCGTTTAGCTG MTMTGAGTTGTTCCTAGAGGAGACAGCCTGTCTCTCCTTGTTGCCCCCAMGCCCATGCCCTGCCG TGGTGGCAGCTGGGGCTGTGGATGGGAGGGGTCCCCMCATGGATGTGTTGCCCCTCCTCCGCATGCC
WI-7146b 210 MCGC[A/G]GTTCATGTACMGGCCCCTCTGCMCTGGAGAGAAMTTA
GGGACGCCTGTTGTTTTGGCTCMTTTGGGTTTGTTGGTCACATGGAGCTCTTCCATTTCGTTTAGCTG MTMTGAGTTGTTCCTAGAGGAGACAGCCTGTCTCTCCTTGTTGCCCCCAMGCCCATGCCCTGCCG TGGTGGCAGCTGGGGCTGTGGATGGGAGGGGTCCCCMCATGGATGTGTTGCCCCTCCTCCGCAηGA
WI-7146 202 ICCMCGCAGTTCATGTACMGGCCCCTCTGCMCTGGAGAGAAMTTA
ATATTACMCTTGC I I I I I AGCTGATCTTCCATCCTCMATGACTC I I I I I I CTTTATATGTTMCATA I TATAAMTGGCMCTGATAGTCMTTTTGAI I I I lATTCAGGMCTATCTGAMTCTGCTCAGAGCCT ATGTGCATAGATGAAACNNNNNNN[A/T]AAAAAAAGTTATTTAACAGTAATCTATTTACTAATTAT
WI-7153 1 61 AGTACCTATCTTTAMGTATAGTACATTTTACATATGTAAATGGTATGTTT
TAGMTAGATGCGGTCATATTCTTCTTTGGCTTCTGGTTCTTCCAGCCCTCATGGTTGGCATCACATAT GCCTGCATGCCATTMCACCAGCTGGCCCTACCCCTATMTGATCCTGTGTCCTAMTTMTATACAC CAGTGGTTCCTCCTCCCTG[T/G]TAAAGACTMTGCTCAGATGCTGTTTACGGATATTTATATTCTAG
WI-7155 1 56 TCTCACTCTCTTGTCCCACCCTTCTTCTCTTCCCCATTCCCMCTCCAG
AGCTCCACCAGATGCAGATTTGTGTTΓTGTTTTCTTGTTATCACTGTCACACAGCTTATMCATGTAT
GCTTTTCAGMTACAGTTGTCTAGCCMGCCATCMGTGTCTGMATTCMTATTGGTTTATGCAMT
ACAGCAAACTTTTATTTAAGTAGAT[A/G]GGAGAATATGTTTAAAATATTAGGAATCCTAGACCATA
Wl-7169b 1 61 TTTCMGTCATCTTAGCAGCTAGGATTCTCAMTGGMGTGTTATATATA
CTCCTAGACTAGTGCTTTACCTTTATTMTGMCTGTGACAGGMGCCCMGGCAGTGTTCCTCACCA ATMCTTCAGAGMGTCAGTTGGAGAAMTGMGMMAGGCTGGCTGAAMTCACTATMCCATC AGTTACTGGTTTCAGTTGACAAMTATATMTGGTTTACTGCTGTCATTGTCCATGCCTA[C/T]AGAT
Wl-7175b 1 94 MTTTATTTTGTA I I I I I GMTMAMACATTTGTACATTCCTGATACTGGG
TGAMTCCTGGGTCTCTTGGCCTGTCCTGTAGCTGGTTTATTTTTTACTTTGCCCCCTCCCCAC I I I I I I TGAGATCCATCCTTTATCAAGAAG[T/A]CTGAAGCGACTATAMGGTTTTTGMTTCAGATTTAAM ACCMCTTATAMGCATΓGCMCMGGTTACCTCTATTTTGCCACMGCGTCTCGGGATTGTGTTTGA
WI-7388 94 CTTGTGTCTGTCCMGMCTTTTCCCCCAAAGATGTGTATAGTTATTGG
TTAGATTTTMTFGGCMCCAGCMCTCACTGCCACCATTCCACTGCAGATCTNCTATTCCTGG[A/G] GTTGATATGACMGGMACCCTATTGGMCCMGTCTTCAGATTGTNCCATGTGCAGACAGGCTCCT TGTCTGTAGGTGTAGTAGCATGTACACTGTACTGTTCACTGTMCATAGTTTGTNCTGGTATTTGTTA
WI-7438 64 TTGGAMTGMTATCGCTTCCACTGACTTTTACCA
CCATGATCCCCTCCTCTTGCCAMTGGAGGMGCCTGTGGATGGTACCMCAMCMGCCCCAMCC CAGTACAMCTGAGMTGAGAGMCCCTGATAGCACTGTCTGMTTGCCAGGAGCCTCCMGGCTM TCCTACCCCTGGATTTCT[T/C]TGTTGTTTAAGTTATTTCTAGCCACCACAMGAGGGTACTGCCCM
Wl-7454b 1 52 CAGACTCATCCTTAAAMATCCCATTTGTCTACTTCTCAMTG I I I I I GACA
CCATGATCCCCTCCTCTTGCCAMTGGAGGMGCCTGTGGATGGTACCMCAMCMGCCCCAMCC CAGTACAMCTGAGMTGAGAGMCCCTGATAGCACTGTCTGMTTGCCAGGAGCCTCCMGGCTM TCCTACCCCTGGATTTCT[T/C]TGTTGTTTMGTTATTTCTAGCCACCACAMGAGGGTACTGCCCM
WI-7454 1 52 CAGACTCATCCTTAAAAAATCCCATTTGTCTACTTCTCAMTG I I I I I GACA
AATTTGAAAATCTGAAAAAAAGTGCATAAGCAGAGAMTGACACTTATTCCAAATAAATAAATTGT CCA I I I I I CACTCAGTCCATCTTAACCATGTACAATGCACTAMTTACTATTTATMTTTCCTATGTA CMCAGAGCCACAGCACMGAGGGTGGGCATMGCAGTTGCCA[G/C]CCAGMGAGCTTTCACTCAT
Wl-7464c 1 77 GAMGMAGCCCTACAMTAGGCCCAGGAGMGCMCGTTCACCMCMTTAT
AATTTGAAMTCTGAMAAMGTGCATAAGCAGAGAMTGACACTTATTCCAAATAMTMATTGT CCA I I I I I CACTCAGTCCATCTTMCCATGTACAATGCACTAMTTACTATTTATMTTTCCTATGTA CMCAGAGCCACAGCACMGAGGGTGGGCATMG[C/A]AGTTGCCAGCCAGMGAGCTTTCACTCAT
Wl-7464b 1 68 GAMGMAGCCCTACAMTAGGCCCAGGAGMGCMCGTTCACCMCMTTAT
MTTTGMMTCTGAMMMGTGCATMGCAGAGAMTGACACTTATTCCAMTMATAMTTGT CCA I I I I I CACTCAGTCCATCTTMCCATGTACAATG[C/A]ACTAAATTACTATTTATMTTTCCTAT GTACMCAGAGCCACAGCACMGAGGGTGGGCATMGCAGTTGCCAGCCAGMGAGCTTTCACTCAT
Wl-7464a 1 03 GAMGMAGCCCTACAMTAGGCCCAGGAGMGCAACGTTCACCMCMTTAT
CMTTCTCMTCCMCCTAGTCTGTNTGCCTAMCCATTCCAGACAMCTTCCACTTCGMGGTTTTA MTGCATMGTCAGATAGCMTCCTTCAGTTGCCCCAGAGGCACATCACGTTCTTTGMTGCTTCA[T /G]TATAGTCCTCTTCATTTAGCMTCAGTGAGGCMTACACTGGCATCATGATCCCI I I I I I IAGGA
Wl-7499b 1 34 G! ACTCTGTACAAMTTCCCTTTGMMTATAMTTTTGGAAATGAGTGATGA
CMTTCTCAATCCMCCTAGTCTGTNTGCCTM[A/G]CCATTCCAGACAMCTTCCACTTCGAAGGTT TTAMTGCATMGTCAGATAGCAATCCTTCAGTTGCCCCAGAGGCACATCACGTTCTTFGMTGCTTC ATTATAGTCCTCTTCATTTAGCMTCAGTGAGGCMTACACTGGCATCATGATCCCTTTTTTTAGGM
Wl-7499a 33 CTCTGTACAAMTTCCCTTTGAMATATAMTTTTGGMATGAGTGATGA
TGGGMTAGTMGAGAMGATGGGAAAGGTGACCAMAACMTATAGAGGCAGAGGCCMGTGMT GCATCCCAGCAGCAGACCACTTNAAMGTAGTCCTGGTGCTGATFGCCTAGC[A/C]GGAGAGTTGAG TGCCACAGGTAAGAATGAGTGMGAGGAAAAMTCATGATGTCATGTATGCAGTMTTACTATGTCA
Wl-7506b 1 1 8 GMGMMTATTTTAAMTATTGGACCACTCTTGTTCTACCATCCCTACCCACT
TGGGMTAGTMGAGAMGATGGGAMGGTGACCAMMCMTATAGAGGCAGAGGCCMGTGMT GCATCCCAGCAGCAGACCACTTNAAMGTAGTCCTGGTGCTGATTGCCTAGC[A/C]GGAGAGTTGAG TGCCACAGGTMGMTGAGTGMGAGGAAMMTCATGATGTCATGTATGCAGTMTTACTATGTCA
WI-7506 1 1 8 GMGAAMTATTTTAAMTATTGGACCACTCTTGTTCTACCATCCCTACCCACT
TGTGMTTCTTAGCTCTGGMGGTGTTTATGCCTTTGCGGGTTTCTTGATGTGTTCGCAGTGTCACCCA AGAGTCAGMCTGTACACATCCCAAMTTTGGTGGCCGTGGMCACATTCCCGGTGATAGMTTGCT AMTFGT[CAF]GTGAAATAGGTTAGM I I I I I CTTTAAATTATGGTTTTCTTATTCGTGAAAATTCGG
Wl-7534b 1 43 AGAGTGCTGCTAAMπGGATTGGTGTGATCTTTTTGGTAGTTGTMTTT
TGTGMTTCTTAGCTCTGGMGGTGTTTATGCCTTTGCGGGTTTCTTGATGTGTTCGCAGTGTCACCCA AGAGTCAGMCTGTACACATCCCAAMTTTGGTGGCCGTGGMCACATFCCCGGTGATAGMTTGCΓΓ /C]MATTGTCGTGAAATAGGTTAGM I I I I I CTTTAMTTATGGTTTTCTTATTCGTGAAAATTCGG
WI-7534 1 35 AGAGTGCTGCTAAMTTGGATTGGTGTGATCTTTTTGGTAGTTGTMTTT
GGGMAGMTAAMTFAGCTTGAGCMCCTGGCTMGATAGAGGGGCTCTGGGAGACTTTGMGACC AGTCCTGTTTGCAGGGMGCCCCACTTGMGGMGMGTCTMGAGTGMGTAGGTGTGACTTGMC TAGATTGCATGCTTCCTCCTTTGCTCTT[G/A]GGMGACCAGCTTTGCAGTGACAGCTTGAGTGGGTT
Wl-7543b 1 62 CTCTGCAGCCCTCAGATFATTTTTCCTCTGGCTCCTTGGATGTAGTCAGTTA
GGGAMGMTMMTTAGCTTGAGCMCCTGGCTMGATAGAGGGGCTCTGGGAGACTTTGMGACC AGTCCTGTTTGCAGGGMGCCCCACTTGMGGMGMGTCTMGAGTGMGTAGGTGTGACTTGMC TAGATTGCATGCTTCCTCCTTTGCTCTr[G/A]GGMGACCAGCTrTGCAGTGACAGCTTGAGTGGGTT
WI-7543 M 62 I GJ A CTCTGCAGCCCTCAGATTATTTTTCCTCTGGCTCCTTGGATGTAGTCAGTTA
GGTGATCMGATCTGTTCCACAGGGCTMTGCCACCATCTCCCCTCAMATTTGTAGAGG[T/C]TCTA MMGAMGTGGTATGTTGTGTGATGATCAGCACTMGTCCTGCATTCCTGTTAMGCCACTTGGGTC ATMGMGGGMGTAAAAMTGAAGTCTGACTAGAMTTCTATTGCAGAGGCCMGTACATTTAGT
WI-7555C 60 T Ci ATGGCATTGAGTTGTGATATAGTTTTCATTTGATGTGCATTTTGMTTTCAG
MCCATGTTCCCTTCTTCTTAGCACCACAMTMTCAAMCCCMCATMGTGTΓTGCTTTCCTTTM AMTATGCATCAMTCGTCTCTCATTACTTTTCTCTGAGGGTΠTAGTA[A/G]ACAGTAGGAGTTMT AMGMGTFCATΠTGGTTTACACGTAGGAMGMGAGMGCATCAMGTGGAGATATGTTMCTAT
W1-7577J 1 1 7 TGTATMTGTGGCCTGTTATACATGACACTCTTCTGMTTGACTGTATTTC
MCCATGTTCCCTTCTTCTTAGCACCACAMTMTCMMCCCAACATMGTGTTTGCTTTCCTTTM AMTATGCA[T/C]CAAATCGTCTCTCATTACTTTTCTCTGAGGGTTTTAGTAMCAGTAGGAGTTMT MAGAAGTTCATTTTGGTTTACACGTAGGAMGAAGAGMGCATCAMGTGGAGATATGTTMCTAT
WI-7577J 77 TGTATAATGTGGCCTGTTATACATGACACTCTTCTGAATTGACTGTATTTC
MCCATGTTCCCTTCTTCTTAGCACCACAMTMTCMMCCCMCATM[G/C]TGTTTGCTTTCCTT TMMATATGCATCAMTCGTCTCTCATTACTTTTCTCTGAGGGTTTTAGTAMCAGTAGGAGTTMT AMGMGTTCATTTTGGTTTACACGTAGGAMGMGAGMGCATCAMGTGGAGATATGTTMCTAT
Wl-7577 50 TGTATMTGTGGCCTGTTATACATGACACTCTTCTGMTTGACTGTATTTC
MCCATGTTCCCTTCTTCTTAGCACCACAMTMTCAAMCCCMCATAAGTGTTTGCTTTCCTTTM MATATGCATCAAATCGTCTCTCATTACTTTTCTCTGAGGGTTTTAGTAMCAGTAGGAGTTMTM AGMGTTCATTTTGGTTTACAC[G/A]TAGGAMGMGAGMGCATCMAGTGGAGATATGTTMCT
Wl-7577g 1 57 ATTGTATMTGTGGCCTGTTATACATGACACTCTTCTGMTTGACTGTATTTC
MCCATGTTCCCTTCTTCTTAGCACCACAAATAATCAAAACCCAACAT[A/G]AGTGTTTGCTTTCCTT
TMAAATATGCATCAMTCGTCTCTCATTACTTTTCTCTGAGGGTTT FAGTAMCAGTAGGAGTTMT
AMGMGTTCAT GGTTTACACGTAGGAMGMGAGMGCATCAMGTGGAGATATGTTMCTAT
Wl-7577f 48 G TGTATMTGTGGCCTGTTATACATGACACTCTTCTGMTTGACTGTATTTC
AACCATGTTCCCTTCTTCTTAGCACCACAMTMTCMAACCCMCATMGTGTTTGCTTTCCTTTM MATATGCATCAAATC[G/A]TCTCTCATTACTTTTCTCTGAGGGTTTTAGTAMCAGTAGGAGTTMT MAGMGTTCATTTTGGTTTACACGTAGGAMGMGAGMGCATCAMGTGGAGATATGTTMCTAT
Wl-7577e 84 TGTATMTGTGGCCTGTTATACATGACACTCTTCTGMTTGACTGTATTTC
MCCATGTTCCCTTCTTCTTAGCACCACAMTMTCMMCCCMCATMGTGTTTGCTTTCCTTTM MATATGCATCMATCGTCTCTCAT[T/C]ACTTTTCTCTGAGGGTTTTAGTAMCAGTAGGAGTTAAT AMGMGTTCATTTTGGTTTACACGTAGGAMGMGAGMGCATCAMGTGGAGATATGTTMCTAT
Wl-7577d 93 TGTATMTGTGGCCTGTTATACATGACACTCTTCTGMTTGACTGTATTTC
MCCATGTTCCCTTCTTCTTAGCACCACAMTMTCMMCCCMCATMGTGTTTGCTTTCCTTTM MATATGCATCAMTCGTCTCTCATTACTTTTCTCTGAGGGTTTTAGTAMCAGTAGGAGTTMTM AGMGTTCATTTTGGTTTA[C/A]ACGTAGGAMGMGAGMGCATCAMGTGGAGATATGTTMCT
WI-7577C I 1 54 ' C' A ATTGTATMTGTGGCCTGTTATACATGACACTCTTCTGMTTGACTGTATTTC
TTAMTGAGTGTGTTTGTCACCGTTGGGGATTGGGGMGACTGTGGCTGCTGGCACTTGGAGCCMGG GTTCAGAGACTCAGGGCCCCAGCACTAMGCAGTGGACCCCAGGAGTCCCTGGTMTMGTACTGTG TACAGMTTCTGCTACCTCACTGGGGTCCTGGGGCCTCGGAGCCTCATCCGAGGCAGGGTCAGGAGAG
Wl-7743d 275 T GGGCAGMCAGCCGCTCCTGTCTGCCAGCCAGCAGCCAGCTCTCAGCCMCG
TTAMTGAGTGTGTTTGTCACCGTTGGGGATTGGGGMGACTGTGGCTGCTGGCACTTGGAGCCMGG GTTCAGAGACTCAGGGCCCCAGCACTAMGCAGTGGAC[C/A]CCAGGAGTCCCTGGTMTMGTACT GTGTACAGMTTCTGCTACCTCACTGGGGTCCTGGGGCCTCGGAGCCTCATCCGAGGCAGGGTCAGGA
WI-7743Θ 1 06 GAGGGGCAGMCAGCCGCTCCTGTCTGCCAGCCAGCAGCCAGCTCTCAGCC
TTAMTGAGTGTGTTTGTCACCGTTGGGGATTGGGGMGACTGTGGCTGCTGGCACTTGGAGCCMGG GTTCAGAGACTCAGGGCCCCAGCACTAMGCAGTGGACCCCAGGAGTCCCTGGTMTMGTACTGTG TACAGAATFCTGCTACCTCACTGGGGTCCTGGGGCCTCGGAGCCTCATCCGAGGCAGGGTCAGGAGAG
Wl-7743d 275 | T GGGCAGMCAGCCGCTCCTGTCTGCCAGCCAGCAGCCAGCTCTCAGCCMCG
TTMATGAGTGTGTTTGTCACCGTTGGGGATTGGGGMGACTGTGGCTGCTGGCACTTGGAGCCMGG GTTCAGAGACTCAGGGCCCCAGCACTAMGCAGTGGAC[C/A]CCAGGAGTCCCTGGTMTMGTACT GTGTACAGMTTCTGCTACCTCACTGGGGTCCTGGGGCCTCGGAGCCTCATCCGAGGCAGGGTCAGGA
WI-7743C 1 06 GAGGGGCAGMCAGCCGCTCCTGTCTGCCAGCCAGCAGCCAGCTCTCAGCC
TTAMTGAGTGTGTFTGTCACCGTTGGGGATTGGGGMGACTGTGGCTGCTGGCACTTGGAGCCMGG GTTCAGAGACTCAGGGCCCCAGCACTAMGCAGTGGACCCCAGGAGTCCCTGGTMTMGTACTGTG TACAGMTTCTGCTACCTCACTGGGGTCCTGGGGCCTCGGAGCCTCATCCGAGGCAGGGTCAGGAGAG
Wl-7743b 275 GGGCAGMCAGCCGCTCCTGTCTGCCAGCCAGCAGCCAGCTCTCAGCCMCG
TTAMTGAGTGTGTTTGTCACCGTTGGGGATTGGGGMGACTGTGGCTGCTGGCACTTGGAGCCMGG GTTCAGAGACTCAGGGCCCCAGCACTAMGCAGTGGAC[C/A]CCAGGAGTCCCTGGTMTMGTACT GTGTACAGMTTCTGCTACCTCACTGGGGTCCTGGGGCCTCGGAGCCTCATCCGAGGCAGGGTCAGGA
Wl-7743 1 06 GAGGGGCAGMCAGCCGCTCCTGTCTGCCAGCCAGCAGCCAGCTCTCAGCC
TTAMTGAGTGTGTTTGTCACCGTTGGGGATTGGGGMGACTGTGGCTGCTGGCACTTGGAGCCMGG GTTCAGAGACTCAGGGCCCCAGCACTAMGCAGTGGACCCCAGGAGTCCCTGGTMTMGTACTGTG TACAGA TTCTGCTACCTCACTGGGGTCCTGGGGCCTCGGAGCCTCATCCGAGGCAGGGTCAGGAGAG
WI-7743 275 GGGCAGMCAGCCGCTCCTGTCTGCCAGCCAGCAGCCAGCTCTCAGCCMCG
TGACATTTATTCAMGTTMMGCAMCACTTACAGMTTATGAAGAGGTATCTGTTTAACATTTCC TCAGTCMGTTCAGAGTCTTCAGAGACTTCGTMTTMAGGMCAGAGTGAGAGACATCATCMGTG GAGAGAAATC[A/G]TAGTTTAAACTGCATTATAAATTTTATAACAGAATTAAAGTAGATTTTAAAA
WI-7758 1 441 GATMMTGTGTMTTTTGTTTATATTTTCCCATTTGGACTGTMCTGACTGCC
ACAGGGCCTTTGGCAGGTGCAGCCCCCACTGCCTTTGACCTGCCTCCCTTCATGCATGGAMTTCCCT TCATCTGGMCCATCAGAMCACCCTCACACTGGGACTTGCAAAMGGGTCAGTATGG[G/C]TTAGG GMMCATTCCATCCTFGAGTCAMAMTCTCMTTCTTCCCTATCTTTGCCACCCTCATGCTGTGTG
Wl-7765b 1 26 ACTCAMCCAMTCACTGMCTTTGCTGAGCCTGTAMATAAMGGTCGGA
TTMTTTACTGATTCCAGCMGACCAMTCATTGTATCAGATTΛ I I I I I MGTTTTATCCGTAGTTT GATAAMGATTTTCCTATTCCTTGGTTCTGTCAGAGMCCTMTMGTGCTACTTTGCCATTMGGCA
GACTAGGGTTCATGTC I I I I LACCCTTTNNNNNNNNNTTGTAAMGTCTAGTTACCTACTTTTTCTTT
Wl-7773b 237 G GATTTFCGACGTTTGACTAGCCATCTCMGCM[C/G]TTTCGACGTTTGA
TGCMCCTCTTTTCGTGATGGGCAGCCTGCTGGTCAGCACTCCAGTAGCGAGAGACGGCACCCAGMT CAGATCCCAGCTTCGGCATTTGATCAGACCAMCAGTGCTGTTΓCCCGGGGAGGAMCACTTTTTTM TTACCCTTTTGCAGGCACCACCTTTAATCTGTTTTT/C]ATACCTTGCTTATTAMTGAGCGACTTMA
Wl-7774b 1 70 ATGATTGAAMTMTGCTGTCCTTTAGTAGCMGTAAMTGTGTCTTGCT
GCAGAGACCTTCCMGGACATATTGCAGGATTCTGTMTAGTGMCATATGGAMGTATTAGAMTA TTTATTGTCTGTAAATACTGTAMTGCATTGGMTAAMCTGTCTCCCCCATTGCTCTATGAMCTGC ACATTGGTCATTGTGMTANNNNNNNNNNNGCCMGGCTMTCCMTTATTATTATCACATTTACCA
WI-7785C 1 65 TMTTTA'ΓΓTTGTCCATTGATGTATTTATTTTGTAAATGTATCTTGGTGCTGC
GCAGAGACCTTCCMGGACATATTGCAGGATTCTGTMTAGTGMCATATGGAMGTATTAGAMTA TTTATTGTCTGTAAATACTGTAMTGCATTGGMTAAMCTGTCTCCCCCATTGCTCTATGAMCTGC ACATTGGTCATTGTGMTANNNNNNNNNNNGCCAAGGCTMTCCAATTATTATTATCACATTTACCA
Wl-7785b 1 65 TAATTIATTTTGTCCATTGATGTATTTATTTTGTAMTGTATCTTGGTGCTGC
GCAGAGACCTTCCMGGACATATTGCAGGATTCTGTMTAGTGMCATATGGAMGTATTAGAMTA TTTATTGTCTGTAMTACTGTAMTGCATTGGMTMMCTGTCTCCCCCATTGCTCTATGAMCTGC ACATTGGTCATTGTGMTANN[-/T]NNNNNNNNGCCAAGGCTMTCCMTTATTATTATCACATTTACCATMTTTATTTTGTCCATTGA
WI-7785 1 56 TGTATTTAJTTTGTAMTGTATCTTGGTG __
TCTCCCCCTCATCCMCTCCGAMGTCTGMTCFCCCMGGAGGGCACCATCTTACAGAGACTCTCCC TGACGGTGGMTTTM[G/A]TTTAGGGTCCCTAAMGCATTTGACACACAGTTGTTGMTGACTGAC CCAAMTGTGMTGMGCTMTGTGMTGTGAGTGMGCTCCCTTCAGGCCCGCTGCCCTAGGATAT
WI-7789C 84 GCCCTCCTGGTGACTCGGGGGCTGTCTCAGACGACTAGCCCAGGACCCATCT _
TCTCCCCCTCATCCMCTCCGMAGTCTGMTCTCCCMGGAGGGCACCATCTTACAGAGACTCTCCC TGACGGTGGMTTTM[G/A]TTTAGGGTCCCTAAMGCATTTGACACACAGTTGTTGMTGACTGAC CCAAMTGTGMTGMGCTMTGTGMTGTGAGTGMGCTCCCTTCAGGCCCGCTGCCCTAGGATAT
Wl-7789b 84 ' G Ai — GCCCTCCTGGTGACTCGGGGGCTGTCTCAGACGACTAGCCCAGGACCCATCT
GCAGGAAATAGTCACTCATCCCACTCCACATMGGGGTTTAGTMGAGMGTCTGTCTGTCTGATGA TGGATAGGGGGCAMTC I I I I I CCCCTTTCTGTTMTAGTCATCACATTTCTATGCCAMCAGGMCG
ATCCATMCTTTAGT[C/T]TTAATGTACACATTGCATTTTGATAMATTMTTTTGTTGTTTCCTTTG
Wl-7830d 1 50 T AGGTTGATCGTTGTGTTGTTRTGCTGCACTTTTTACI I I I I IGCGTGTGGA
GCAGGAMTAGTCACTCATCCCACTCCACATMGGGGTTTAGTMGAGMGTCT[G/A]TCTGTCTGA TGATGGATAGGGGGCAMTCI I I I I CCCCTTTCTGTTMTAGTCATCACATTTCTATGCCAMCAGGA ACGATCCATMCTTTAGTCTTAATGTACACATTGCATTTTGATAAMTTMTTTTGTTGTTTCCTTTG
WI-7830C 54 AGGTTGATCGTTGTGTTGTTTFGCTGCACTTTTTAC I I I I I I GCGTGTGGA
GCAGGAMTAGTCACTCATCCCACTCCACATMGGGGTTTAGTMGAGMGTCTGTCTGTCTGATGA TGGATAGGGGGCAMTC I I I I I CCCCTTTCTGTTMTAGTCATCACATTTCTATGCCAMCAGGMC[ G/A]ATCCATMCTTTAGTCTΓAATGTACACATTGCATTFTGATAMATTMTTTTGTTGTTTCCTTTG
Wl-7830b 1 34 AGGTTGATCGTTGTGTTGTTFTGCTGCACTTTTTAC I I I I I I GCGTGTGGA
GCAGGAAATAGTCACTCATCCCACTCCACATMGGGGTTTAGTA[A/G]GAGMGTCTGTCTGTCTGA
TGATGGATAGGGGGCAMTCI I I FT CCCCTTTCTGTTMTAGTCATCACATTTCTATGCCAMCAGGA ACGATCCATMCTTTAGTCTTAATGTACACATTGCATTTTGATAAMTTMTTTΓGTTGTTFCCTΓTG
Wl-7830 44 AGGTTGATCGTTGTGTTGTTTTGCTGCACTTΓTTACTTTTTTGCGTGTGGA
CCACTTCCTATCTGA I I I I I CCCAG[C/T]AAATGAGGCAGGCAATTCTAGTCTTCCACAAAACATCTA GCCATCTAAMTGGAGAGATGAATCATTCTACCTATACAMCMGCTAGCTATTAGAGGGTGGTTGG GGTATGCTACTCATMGATTTCAGGGTGTCTTCCMCTGMATCTCMTGTTCTCAGTACGAAAMC
Wl-7865e 25 CTGAAATCACATGCCTATGTAAGGAMGTGCTATTCACCCAGTAMCCCAM
CCACTTCCTATCTGATTTTTCCCAGCAMTGAGGCAGGCMTTCTAGTCTTCCACAAMCATCTAGCC ATCTAAMTGGAGAGATGAATCATTCTACCTATACAMCMGCTAGCTATTAGAGGGTGGTTGGGGT ATGCTACTCATAAGATTTCAGGGTGTCTTCCMCTGAAATCTCMTGTTCTCAGTA[C/T]GAAMAC
Wl-7865d 1 91 CTGAMTCACATGCCTATGTMGGMAGTGCTATTCACCCAGTAMCCCMA
CCACTTCCTATCTGATTTTTCCCAG[C/T]AAATGAGGCAGGCMTTCTAGTCTTCCACAAAACATCTA GCCATCTAAMTGGAGAGATGMTCATTCTACCTATACAMCMGCTAGCTATTAGAGGGTGGTTGG GGTATGCTACTCATMGATTTCAGGGTGTCTTCCMCTGMATCTCMTGTTCTCAGTACGAAAMC
WI-7865C 25 l C CTGAMTCACATGCCTATGTMGGMAGTGCTATTCACCCAGTAMCCCMA
CCACTTCCTATCTGATTTTTCCCAGCAMTGAGGCAGGCMTTCTAGTCTTCCACAAMCATCTAGCC ATCTAAMTGGAGAGATGMTCATTCTACCTATACAMCMGCTAGCTATTAGAGGGTGGTTGGGGT ATGCTACTCATAAGATTTCAGGGTGTCTTCCMCTGMATCTCMTGTTCTCAGTA[C/T]GAAMAC
Wl-7865b 1 91 CT CTGAMTCACATGCCTATGTMGGMAGTGCTATTCACCCAGTAMCCCAM
GCTCACTGTGACCCATCCTTACTCTACTTGGCCAGGCCACAGTAAMCMGTGACCTTCAGAGCAGCT GCCACMCTGGCCATGCCCTGCCATTGAMCAGTGATTMGTTTGATCMGCCATGGTGA[C/T]ACA AAMTGCATTGATCATGMTAGGAGCCCATGCTAGMGTACATTCTCTCAGATTTGMCCAGTGAM
Wl-7900d 1 28 T TATGATGTATTTCTGAGCTAAAACTCAACTAΓAGAAGACATTAAAAGAAATC
GCTCACTGTGACCCATCCTTACTCTACTTGGCCAGGCCACAGTAAMCMGTGACCTTCAGAGCAGCT GCCACMCTGGCCATG[C/T]CCTGCCATTGAMCAGTGATTMGTTTGATCMGCCATGGTGACACA AAAATGCATTGATCATGMTAGGAGCCCATGCTAGMGTACATTCTCTCAGATTTGMCCAGTGAM
Wl-7900e 84 TATGATGTATTFCTGAGCTAAMCTCAACTATAGMGACATTAAMGAAATC
GCTCACTGTGACCCATCCTTACTCTACTTGGCCAGGCCACAGTAAMCMGTGACCTTCAGAGCAGCT GCCACMCTGGCCATGCCCTGCCATTGAMCAGTGATTMGTTTGATCMGCCATGGTGA[C/T]ACA AAMTGCATTGATCATGMTAGGAGCCCATGCTAGMGTACATTCTCTCAGATTTGMCCAGTGAM
Wl-7900d 1 28 TATGATGTATTTCTGAGCTAAMCTCMCTATAGMGACATTAAAAGAAATC
GCTCACTGTGACCCATCCTTACTCTACTTGGCCAGGCCACAGTAAMCMGTGACCTTCAGAGCAGCT GCCACMCTGGCCATG[C/T]CCTGCCATTGAMCAGTGATTMGTTTGATCAAGCCATGGTGACACA AAMTGCATTGATCATGMTAGGAGCCCATGCTAGMGTACATTCTCTCAGATTTGMCCAGTGAM
WI-7900C 84 TATGATGTATTTCTGAGCTAAMCTCMCTATAGMGACATTAMAGAAATC
GCTCACTGTGACCCATCCTTACTCTACTTGGCCAGGCCACAGTAAMCMGTGACCTTCAGAGCAGCT GCCACMCTGGCCATGCCCTGCCATTGAMCAGTGATTMGTTTGATCMGCCATGGTGA[C/T]ACA AAAATGCATTGATCATGMTAGGAGCCCATGCTAGMGTACATTCTCTCAGATTTGMCCAGTGAM
Wl-7900b 1 28 T TATGATGTATTTCTGAGCTAAAACTCAACTATAGAAGACATTMAAGAAATC
GCTCACTGTGACCCATCCTTACTCTACTTGGCCAGGCCACAGTAAMCMGTGACCTTCAGAGCAGCT GCCACMCTGGCCATG[C/T]CCTGCCATTGAMCAGTGATTMGTTTGATCMGCCATGGTGACACA AAMTGCATTGATCATGMTAGGAGCCCATGCTAGMGTACATTCTCTCAGATTTGMCCAGTGAM
WI-7900 84 TATGATGTATTTCTGAGCTAAAACTCAACTATAGAAGACATTAAAAGAAATC
AGACTTAGGTACAATTGCTCCCCTTTTTATATA[C/T]AGACACACACAGGACACATATATTAMCAG ATTGTTTCATCATTGCATCTATTTTCCATATAGTCATCMGAGACCATTFTATAAMCATGGTMGAC CCTTTTTAAMCAMCTCCAGGCCCTTGGTTGCGGGTCGCTGGGTTATTGGGGCAGCGCCGTGGTCGT
WI-7901 C 33 CAC I CAGTCGCTCTGCATGCTCTCTGTCATACAGACAGGTMCCTAGTFCT
AGACTTAGGTACAATTGCTCCCCTTTTTATATA[C/T]AGACACACACAGGACACATATATFAMCAG ATTGTTTCATCATTGCATCTATTTTCCATATAGTCATCAAGAGACCATTTTATAAMCATGGTMGAC CCTTTTFAAMCAMCTCCAGGCCCTTGGTTGCGGGTCGCTGGGTTATTGGGGCAGCGCCGTGGTCGT
Wl-7901 b 33 CACTCAGTCGCTCTGCATGCTCTCTGTCATACAGACAGGTMCCTAGTTCT
ACMTCTCAGMGGACTGTGCMGTCMTGAGTCGCTTGTGMTTCTCATCTGGAM[C/T]GATCCC ACGTCTTAGMCCTTCACCACMGGAG I I I I I CTTGTAGTGATTCTCAMGTCTTGGTAGGCATTCGA ACTGGTCCTTTCACTTTGAGATTCTTTTCTTTTGCGCCTCTTATCMGTCAGCACACACCTTTTCCMG
Wl-8021 b 57 GATTTTACGTTGCGGCTTGTTAGGGGTGATTCGMTTCGGTGMTTGCCA
ACMTCTCAGMGGACTGTGCMGTCMTGAGTCGCTTGTGMTTCTCATCTGGAM[C/T]GATCCC ACGTCTTAGMCCTTCACCACMGGAG I I I I I CTTGTAGTGATTCTCAMGTCTTGGTAGGCATTCGA ACTGGTCCTTTCACTTTGAGATTCTTTTCTTTTGCGCCTCTTATCMGTCAGCACACACCTTTTCCMG
WI-8021 57 GAT ΓACGTTGCGGCTTGTTAGGGGTGATTCGMTTCGGTGMTTGCCA
CTGAAMTTTACTATGCTCTCCACMCMGAGCTCCCATTTTCCACAGACACAGTCMTGTCAGTCA GCTTGTATTCAGGAGGACAGGGCAGAGGGATCCCAGTGGCACTTCCCATGGGMGACAGMGAGAGT GGGCCCCAGAGATGGMGGACCCCAGTGTCATCACCAMCMCCATTTCAGCCGCTCTAGCCTCTM
WI-8024C 206 TTCCC[A/G]CTCTAGMCAGCTGGCCCTGGTCGTCAGTACACMGGAMGAGC
CTGAAAATTTACTATGCTCTCCACAACMGAGCTCCCATTTTCCACAGACACAGTCMTGTCAGTCA GCTTGTATFCAGGAGGACAGGGCAGAGGGATCCCAGTGGCACTTCCCATGGGMGACAGMGAGAGT GGGCCCCAGAGATGGMGGACCCCAGTGTCATCACCAMCMCCATTTCAGCCGCTCTAGCCTCTM
Wl-8024b 206 TTCCC[A/G]CTCTAGMCAGCTGGCCCTGGTCGTCAGTACACMGGMAGAGC
GMTGAGCCTTCCTAGCGCCGAGGGACCTGCTGCTGTTGTTGGCCTGCACATGCATTCTATGGMTGC TTTTTGGCCMGCGGGGGCACTGAGGACTMGCTCTGANNNNNNNNNATCTCGCCCAMCTCCTTTCT MGGAGTCTGGGGTGTCATGCCCTACAMCC[A/G]TAAATTCTCATCAGATGGATTTTATTTMCGTT
WI-8077 1 67 GTGTATTGTGACTTACTTTCCAATCTGACTCTGGCATMCMGGGMAAA
TCTAGGTTTMTCMAGCMTTTGCANTTTGGATTTTGGMTGACCACTCCCTTGCTMGGMGCTAT GTACTTCATGCTGTGGAMCTGGCAMTACAGMTGTAGCTTGTTT[G/C]TTTTCTTAGCCTTGMGA TGACCAGGTAGAGAGACAGAGTGAGACCMCAGTTTTTCTGATTTCCCTGCTCCTCCTATTCCTTCCT
Wl-81 18f 1 1 4 AAAMTCAGACTCATTGTGACCAGTAGTCTTGAGGACTCMGCTGMTGA
TCTAGGTTFMTCMAGCMTTTGCANTTTGGATTTTGGA[A/G]TGACCACTCCCTTGCTMGGMGC TATGTACTTCATGCTGTGGAMCTGGCAMTACAGMTGTAGCTTGTTTGTTTTCTTAGCCTTGMGA TGACCAGGTAGAGAGACAGAGTGAGACCMCAGTTFTTCTGATTTCCCTGCTCCTCCTATTCCTTCCT
WI-8118Θ ! 40 A d- AAAAATCAGACTCATTGTGACCAGTAGTCTTGAGGACTCMGCTGMTGA
TCTAGGTTTMTCMAGCMTTTGCANTTTGGATTTTGGMTGACCACTCCCTTGCTMGGMGCTAT GTACTTCATGCTGTGGAMCTGGCMATACAGMTGTAGCTTGTTTGTTT[T/G]CTTAGCCTTGMGA TGACCAGGTAGAGAGACAGAGTGAGACCMCAGTTTTTCTGATTTCCCTGCTCCTCCTATTCCTTCCT
Wl-81 18d 1 1 8 ' T G — AAAMTCAGACTCATTGTGACCAGTAGTCTTGAGGACTCMGCTGMTGA
TTFTTAMTATGCCCGTTTAGAGCAGACACAGTCACMTMMGTTMAMGTTACMTGTGTCCAG TGTATATACCCAGGNMTCCATTCTTGGTACTTTTCMGAGCTGCTGTTATACTGAGTCTCTGAGMG TCCCCTTAGATMTAGCTGCCACTTTTCAGTATGGTTCAGMT[G/A]AGTATCTTAGTATTCTTTCTA
WI-8321 1 78 TTTTGCTATGGTTCTAGTTFATCMCCTACTTTATTAGCTGMCTGTTGGC
TTTTTMATATGCCCGTTTAGAGCAGACACAGTCACMTMMGTTMAMGTTACMTGTGTCCAG TGTATATACCCAGGNMTCCATTCTTGGTACTTTTCMGAGCTGCTGTTATACTGAGTCTCTGAGMG TCCCCTTAGATAATAGCTGCCACTTTTCAGTATGGTTCAGAAT[G/A]AGTATCTTAGTATTCTTTCTA
WI-8321 1 78 TTTFGCTATGGTTCTAGTTTATCMCCTACTTTATTAGCTGMCTGTTGGC
TATGTACTCACTTTCAGTTACCCCCGTGCCTCCAGMTCGCATGTTGCTCCACCTGGGGGCGGATATA MTTACCTCTAGATTGTCCAMGCCCAGTCTTTCCCTTCCCTGTGCAGCCTTAGA[A/C]ACTMGTAG CAGTACTGTTTGGTGTGTGTTTGTTTCTTCCCCAGCMTGCCTACTGCAGCTACTTAGTMCMCTAG
Wl-8332b 123 AGGTGGAGGGTNTCCGGGGMGCAGTTAGATGAGTTMGTGTGATGCACA
TATGTACTCACTTTCAGTTACCCCCGTGCCTCCAGMTCGCATGTTGCTCCACCTGGGGGCGGATATA MTTACCTCTAGATTGTCCAMGCCCAGTCTTTCCCTTCCCTGTGC[A/C]GCCTTAGAMCTMGTAG CAGTACTGTTTGGTGTGTGTTTGTTTCTTCCCCAGCMTGCCTACTGCAGCTACTTAGTMCMCTAG
WI-8332 114 AGGTGGAGGGTNTCCGGGGMGCAGTTAGATGAGTTMGTGTGATGCACA
TGCGGGCTTMCAGGMGCATGACTGGGAGGCCTCAGGMGCTTATMTCATGGCAGMGGCGMGG GGMGCMGGACCTTCTTCACATGGCAGCAGGAGAMGAGMGMGGGAGMGTCTACACACTTTT AMCMCCAGATCTCATGAGANTTCCATCGGGAGACAGCACTAGGGGGATGGCACTAMCCATTAGA
Wl-8378b 31 1 MCTGCCCCCATGATCCMTCACCTNTCACCAGGCCCCTCCTCCMCACGTGGGG
TGCGGGCTTMCAGGMGCATGACTGGGAGGCCTCAGGMGCTTATMTCATGGCAGMGGCGMGG GGMGCMGGACCTTCTTCACATGGCAGCAGGAGAMGAGMGMGGGAGMGTCTACACACTTTT MACMCCAGATCTCATGAGANTFCCATCGGGAGACAGCACTAGGGGGATGGCACTAMCCATTAGA
WI-8378 308 MCTGCCCCCATGATCCMTCACCTNTCACCAGGCCCCTCCTCCMCACGTGGGG
AGCACATATTTAGCATTMGCCTCMACGATACAGCMTATGTTACATTCTCTTGTGAAMCAG TTGTTGTAGACTGTTMNNNNNNNNMATGTMCTCCGACTTGTGCCTMTAGGATTTGACCNTTAA GAGGNTTCTTTTGCTGTGGANGGGGTGGCTTTGCTTGMCTTCCATTCTG[T/G]GCCTTGTAGCTGGTG
WI-8426 1 84 G AGGCTGGGAGTATGGANGGNCCCGGGGCCCTTGGCNATNGNATFCAGTGAG
TTGAGCCTCCACAMTMTGCAACCMGTTTTACATTTTTMCAGCCCTTCTACATACACT[C/A]CA TCTTCTCTATCTTAGTTCCMGTTTTAGTTTTCMTCCCMπATACCMTTCCATTGTTATTTTMGA MAMCCTTCCCAGTTATTGTCAGAMCTATGATTTAGCTTACCCCCTCCACTACCCAGCAMCTAC
Wl-8450h 61 C A AGAGAGGATGGGAGTGTMTATGAGCAGTACAGAGTCTTMTGCMTTCAT
GGCCACTGTCCAMGTCTGTCACAGTCCTCCATATGGCAMGATGMGMMTTGGCMTCT TTA
GGGGTACCMGGNTCTGAGTTTGTACGGTCTTTATAMTGCAGAGCMGATGTGGCTTTCCTGCCCC[C/A]ATTTCACCTCMGGCATCTTCAGCMCCCCACATGGCTTCCCTCTGTGCGCATGAMTMCTTG
Wl-9676h 1 34 AGGCCAGGGTCTCTCAGCTTTAMGCCTTGGMTCCTATGCATTGTTTGTTT
GGCCACTGTCCAAAGTCTGTCACAGTCCTCCATATGGCAMGATGAAGAAAATTGGCMTCTTTTTA GGGGTACCMGGNTCTGAGTTTGTACGGTCTTTATAMTGCAGAGCAAGATGTGGCTTTCCTGCCCCC ATTTCACCTCMGGCATCTTCAGCMCCCCACATGGCTTCCCTCTGTGCGCATGAMTMCTTGAGG[C/T]
WI-9676g 202 CAGGGTCTCTCAGCTTTAMGCCTTGGAATCCTATGCATTGTTTGTTT
GGCCACTGTCCAMGTCTGTCACAGTCCTCCATATGGCAMGATGMGMMTTGGCMTCTTTTTA GGGGTACCMGGNTCTGAGTTTGTACGGTCTTTATAMTGCAGAGCMGATGTGGCTTTCCTGCCCCC ATTTCACCTCMGGCATCTTCAGCAACCCCACATGGCTTCCCTCTGTGC[G/T]CATGAMTAACTTGA
Wl-9676f 1 84 GGCCAGGGTCTCTCAGCTTTAMGCCTTGGMTCCTATGCATTGTTTGTTT
GGCCACTGTCCAMGTCTGTCACAGTCCTCCATATGGCAMGATGAAGAAMTTGGCMTCTTTTTA GGGGTACCMGGNTCTGAGTTTGTACGGTCTTTATAMTGCAGAGCMGATGTGGCTTTCCTGCCCCC ATTTCACCTCMGGCATCTTCAGCAACCCCACATGGCT[T/C]CCCTCTGTGCGCATGAMTMCTTGA
Wl-9676e 1 73 T GGCCAGGGTCTCTCAGCTTTAMGCCTTGGMTCCTATGCATTGTTTGTTT
GGCCACTGTCCAMGTCTGTCACAGTCCTCCATATGGCAMGATGMGMMTTGGCMTCTTTTTA GGGGTACCMGGNTCTGAGTTTGTACGGTCTTTATAMTGCAGAGCMGATGTGGCTTTCCTGCCCC[C/A]ATTTCACCTCMGGCATCTTCAGCMCCCCACATGGCTTCCCTCTGTGCGCATGAMTMCTTG
Wl-9676d 1 34 AGGCCAGGGTCTCTCAGCTTTAMGCCTTGGMTCCTATGCATTGTTTGTTT
GGCCACTGTCCAMGTCTGTCACAGTCCTCCATATGGCAMGATGMGAAMTTGGCMTCT TA
GGGGTACCMGGNTCTGAGTTTGTACGGTCTTTATAAATGCAGAGCA[A/G]GATGTGGCTTTCCTGCC CCCATTTCACCTCMGGCATCTTCAGCMCCCCACATGGCTTCCCTCTGTGCGCATGAMTMCTTGA
WI-9676C 1 1 4 GGCCAGGGTCTCTCAGCTTTAMGCCTTGGMTCCTATGCATTGTTTGTTT
GGCCACTGTCCAAAGTCTGTCACAGTCCTCCATATGGCAMGATGMGAAMTTGGCMTCTTTTTA GGGGTACCMGGNTCTGAGTTTGTA[C/T]GGTCTTTATAMTGCAGAGCMGATGTGGCTTTCCTGCC CCCATTTCACCTCMGGCATCTTCAGCMCCCCACATGGCTTCCCTCTGTGCGCATGAMTMCTTGA
Wl-9676b 92 GGCCAGGGTCTCTCAGCTTTAMGCCTTGGMTCCTATGCATFGTTTGTTT
GGCCACTGTCCAMGTCTGTCACAGTCCTCCATATGGCAMGATGMGMMTTGGCMTCTTTTTA GGGGTACCMGGNTCTG[A/C]GTTTGTACGGTCTTTATAMTGCAGAGCMGATGTGGCTTTCCTGCC CCCATTTCACCTCMGGCATCTTCAGCMCCCCACATGGCTTCCCTCTGTGCGCATGAMTMCTTGA
|WI-9676a 84 A C GGCCAGGGTCTCTCAGCTTTAMGCCTTGGMTCCTATGCATTGTTTGTTT
TGGACCAMCACAGACAGATGTATTCCTGGTGCCTGTGTA[C/A]ATTACMCTCATTGATCACATGC AGCMCATCMCATCTCMGGAGTCCATTTGTTCAAAACACAGTAMTGACTCCACATTTTCCCTTT GAGTCMCAAMGACTCTGCTTGTCACCTTGCCTGGAGCGGGGTGGTTTTTCACTATGTGAGTATCTA
Wl-9738b 40 TCTTTTTATTTCTGTCCCTTATGTTGGTGGGCACATGTCTGTATTGCTGTCC
TGGACCAMCACAGACAGATGTATTCCTGGTGCCTGTGTA[C/A]ATTACMCTCATTGATCACATGC AGCMCATCMCATCTCMGGAGTCCATTTGTTCAAMCACAGTAAATGACTCCACATTTTCCCTTT GAGTCMCAAMGACTCTGCTTGTCACCTTGCCTGGAGCGGGGTGGTTTTTCACTATGTGAGTATCTA
WI-9738 40 TCTTTTTATTTCTGTCCCTTATGTTGGTGGGCACATGTCTGTATTGCTGTCC
ACTGAMTGTAMTGGCCMGGCACCCAGGACCTTAAAMTCATMGMGTTMTCTGTGGGMM GAGTMCTACAAMGCATCTAMCMGAGCAGGATGTGATGTAATGTGTCCCCTTATCACTTTAGTC AGTAAAGATMGMAGCCCTGGTGAGTATCCACTTCCACAMCACACAGMTATACACTTTTGGMG
WI-9756 47 ATΓTCCACTTMCCACTTGATTCTTCAC I I I I I I ATGATTTAAAACTCTCCGTGG
GATGGTCCCTTMGGATTTGCATTGGTTMTGGGCAGACTGGTGCAAMGAGGCTGMTTGMTMT TAGGMACTGGGAGMTTCAATTCAMGMGMTTCTTGTTCGCMGGTCMTTTTTATACTATTTA A[A/G]TAAAATAACTCTGGTAGGTTCTATAGCAMTGCTAAGTAAAGTAACCGCTGGTTTCTAAATT
WI-9758 1 35 A G ATTACG
ATTTAAATCCAGGCAGCGGGGAAMTGGATACTTTCATATGTCTCTGTACCCMCTATAMCTTTTG GTTCTCATGCACCATTTTCATTTTGCCTTCTCACTCCMGTACCACTGATTTTACCMTT[G/A]CTCTC ATMTTGACTTTGCTACTGGMGAMCTCTTAGMTGTTGGMTTTCTCTATTACACACTTTGCCTCA
WI-9778 1 27 MGMTGTGTCAGTCAGGACTAMGGCMTAGTCTCAGGGCAGACAGCC
TCTCCCCTTTGCCTCCTCATGCCCACTCCCTCAGCCTGCACAGAGCGTTTCTCCAGTGTAGTCTCTGGT CCATCTGCATCAAMTCACCTGCAGGACTTGCTGACMTGCAGTTTC[C/A]TGGATCCCACCCAGGA CTCAAAAAMCTAGGMTTGGGAGMGAGGGACCTGGMTCGGTGTTGCTAGCMGCCCCCAGGTGG
WI-9832 1 1 6 A TTTGTMGTGGACTAMGTTTGAGGACCAGACATGGMGGTTGGCTTTGGC
TGGAMMTAGC" TTTATCAATCTCTGATATGCTACATATGTCATGGAGAMTGCAGMTGGCATGA TATGAAATTCCA" TTTTGAATGAATAAAATATAC[A/G]TGTGTATGTATATATACTTATFAACACTT
AGGATTATATACACACMTMAACGTCTGTMGGATAMCTMGGTTCTATCAGTGGGAMTGAGA
WI-9841 1 01 G- TTGAAMGAGGGGGATGTGTTACTTGATATGCTGTTG
GMCTMCACCTTTCTTGCATGGA I I I I I CTTGATTATTGGCAGTTAACMTMMTGTTATTAGATC ACTGGTGCTTCTGTGTGGGGTTGAG I I I I I l ATGATATCTCCTGTTAGACCCATMGGGAGGCTGTGA GTTGTTTTCTACATCCTTGGACTATATAAGATCCTCTTTTMMTTATATTTTATATMGCACATGAA
WI-9880C 222 G A AATGGAATGAAATAATGA[G/A]TTGACATAGGAATTACCTACATATTTTG
AGTATACAMCATTTMGCTGTGGTCMGGCTACAGATGTGCTGACMGGCACTTCATGTAMGTGT CAGMGGAGCTACAAMCCTACCCTCA[A/G]TGAGCATGGTACTTGGCCTTTGGAGGMCMTCGGC TGCATTGMGATCCAGCTGCCTATTGATTTMGCTTTCCTGTTGMTGACAMGTATGTGGTTTTGTA
DWU-252 94 AT
GMCATTCCTCTGCAGCACTTCACTACCAMTGAGCATTAGCTACTTTTCAGMTTGMGGAGAAM TGCATTATGTGGACTGAA[C/T]CGACTTTTCTAAAGCTCTGMCAAAAGCTTTTCTTTCCTTTTGCM CMGACAMGCMAGCCACATTTTGCATTAGACAGATGACGGCTGCTCGMGMCMTGTCAGAM
DWU-330 85 CTCGATGMTGTGTTGATTTGAGAMTTTTACTGACAGAAATGCMTCTCCCT
GAAAATGTTAATTGGGCAGGTGAAMGGGTACAGATGTGCTGTAGCAGACCTTTGGTTTTAAMGAG
MGCATCATTTCCCCMCAGGGCMCTGTAGMGGCCAGCTGMGAGTAMGGAAMGGTCTGAGG
ACTGAGCCTGTGGCTGGCTGGAAAMGGTGMTGTTGAGGGCCCTTCACTTCCATCACMGAMGTC
DWU-370 231 i ATTAGACGGTACCMTTCAGTGTCTGTTCCT[A/G]GCATCTATTTCCTCTGTGC
CTCTTAACTTCAGTTCCCTCATCTATAAGMTAAGGGATTCAGTTGTGATCACATAGCTCAGGTAATC
DWU- CAGGACCAGAMCCCAGGAGC[A/G]TGGGACCTGATCCACAGCTAGAGGATGGGGGACTCTGTAGCT 1537b 89 ACAGCATTTTCCTGMCACACMGMATCCAGTMGCAGCACACACTGGCTGA
CTCTTMCTTCAGTTCCCTCATCTATMGMTMGGGATTCAGTTGTGATCA[C/T]ATAGCTCAGGTA
DWU- ATCCAGGACCAGAMCCCAGGAGCATGGGACCTGATCCACAGCTAGAGGATGGGGGACTCTGTAGCT 1537a 52 T ACAGCATTTTCCTGMCACACMGAMTCCAGTMGCAGCACACACTGGCTGA
ACCATCTTATACTATGGCAGGTMGTCCATACAGMGAGCCCTCTCTCCCTGGGATTTGAGTGGGGTC CCCAGCTCCACCCAGAGGCCCCTGGGGMTTCCAGGGTCACTGTTCCTTCCTGTCTCCCTGTGGGMT
ESTD- CMGCCAGCTCCAGGCCAGMGTGGGACTGTGAGGACATGGAGGCCTCGGCACTGAGCTG[C/G]AGA ADAb 1 96 CCCGCAGACCMCTCCTGAGCTTTCTGGGCCTCTGAGTCTTGTCCTC
ACCATCTTATACTATGGCAGGTMGTCCATACAGMGAGCCCTCTCTCCCTGGGATTTGAGTGGGGTC CCCAGCTCCACCCAGAGGCCCCTGGGGAATTCCAGGGTCACTGTTCCTTCCTGTCTCCCTGTGGGMT
ESTD- CMGCCAGCTCCAGGCCAGMGTGGGACTGTGAGGACATGGAGGCCTC[G/A]GCACTGAGCTGCAGA ADAa 1 84 CCCGCAGACCMCTCCTGAGCTTTCTGGGCCTCTGAGTCTTGTCCTC
TCTCCTGTCATTCCTACTCCATTAGTTCMGGTCAGTGMGMCTGGGGCMTTMCCMGTMTTCA
ESTD- TGGACTGCCCMCTGCGMACMGMGGGCGCAGTGGAGCAGGAGTATTATGCTACGCGGTTACCTT ANT1 1 60 Tl TTTTTATGGAGGACCGMCTGAGGCrr/qGAGCTCAGATGATCCTGT
TGCCTGGGGTGGCMGGCTGCAMCMGGAGGCMCCCAGGAGGCTTTTATGMGCGGGCCATGGTA
EST10398I AGATGCTGCCACCTCTTATCTACTTGATGATGTTCACATTTGGGGCTTGACTTTCCMCACGGAGMG 2b 1 68; A' G - CATTGTTTTCTTCGGGCCMGMGGTATCTACCrA/GIATAGTGTCTATTAGGCATTTG
AAAGCATGAC CGCTTATGTTA AATAAAATGA ATAGTMTTCC CMGTGAATATTGATACATGGCTGACMAGCATGACMTMMTGMCAC[A/G]TACGGGMTTAC
WI-17904 50 G ACAC 03 TATTAACATMGCGATAACATCAAAACATCTGGTAMATGCAGTTAAAACMCAACACAMTGA
TGCCAMTAC MCTACTAGCG G I I I M I CTTTGAGTGACACMGCTTGTTCA I I I I I GAGAAMTGTGTGCCMATACTCMGTGTGM
EST34149 TCMGTGTGA AGMCAACTA T[A G]GATTTTATTAGTTGTTCTCGCTAGTAGT ΓGGTATTCTATGMAMMGCAGCTAGTTCAGC 5 69 AT ATAAAATC TT ACAAATCACACAAGT
TGGGAAMCATMGTTMCTCMGMTATATFCCAGTCTTTATGTTACTMMCATTGTMTAGTGT
EST34343 TTTTATCAATGATGCCGAGGTCACTGCT[C/A]TACAMGATTAMGMACTTACCATCAAACACTTC 8 95 CAGTGCATCM
GGACCATATG CAGAMTTATG GGTACACAATTTTMTGGMGGMCCACAGGTATGTTGAMGMCATCAGTACAGCTGGAGACAGG ATATATAACT TGATAATAACT GAGGGACCATATGATATATMCTCCTAMAGC[C ]GGMGGAGTTATTATCACATMTTTCTGGGC
WI-17982 98 CCTAAMGC CCTTCC GCTACAGMG I I I I I CATCA
CTCAGTMCTCCGGTGTATMTCTGCCATTTATTGATTTATTTATGATAAMCMCCTCTCATTGTGA AAMCAGCTMGGGTGACATCTCCAGACCCMCCACTGTCCCTGTMTGT[A/C]CTGCTGAGAGTCC
WI-17993 1 1 8 ACATTTTGGAAATCCAAT
CCCATCCAGAMCCCCAGTGTGATGGTGGMGCAGCATGAAMCMCATCTCCCCAGGCCTCGCAGT
GTAGAGGCGA AGGCACATGGG AGAGGCGAAGGGMCAG[A/GJGCTGCCCATGTGCCTGTCTCTAMGACGCCACCCTCAGGTTGATGT
WI-17996 84 G AGGGMCAG CAGC CACCTGTGGGAGACCGGGT
ATTCTTTATAAAAACACCATGTCCCTAAAATGT[C/GJATTCMCATATATGCACACCTTCGATGTAT
WI-17136 33 AGGACACTGATCAAMMGACAGAGAMTGTGTCCCT
GCCACTGAAAAMGGTGCTCTTCC[A/C]GTTTCTMCTCCCTGGACTCCCTCATTGGMCTGMGCTC ACAGATGTTTCAGCTGGACTAGT1TAGACTTTGCTGTATTTTAMAGGCAGTGTTGATGCTCCAGGAT
WI-18041 24 TCAAATACTTAATCA
EST35164 CACAGCCCTGC CCCTCTGGATT TTGMCCMGGCCCTMCAGATGACTCAGCAGGGCCTTCMGCACAGCCCTGCCCCC[AG]TCTTGA 8a 57 G CCCC CTGMTCTCM GATTCAGMTCCAGAGGGTGCTCAGTCCTTGGTTTAGGTGCTTCTGTGACATTTCCTCTTG
AGCGMTGMMTGCTACATAGGCTCCCTGAGTTCTTTCATGTACGMTCTTGGTTACACATCTTAGl
Wl- AGJACAGCAGAGCTGCCTGAGGGAGGGTTGTGTTTMTGTCGTATGCATGCTCAGCACAGTGCTGGC 18052b 67 ATGGCCCATCCATGCTTT
CCTGAGTTCTT AGCGMTGMMTGCTACATAGGCTCCCTGAGTTCTTTCATGTACGMTC|T/C]TGGTTACACATCTT
Wl- TCATGTACGA CTCAGGCAGCT AGMCAGCAGAGCTGCCTGAGGGAGGGTTGTGTTTMTGTCGTATGCATGCTCAGCACAGTGCTGGC 18052a 50 ATC CTGCTGT ATGGCCCATCCATGCTTT
GGGAGTGGGG CGTCACCCTGC CTGTTGTGCTGAGMCAGMGGGGTCMGGGAGTGGGGGAGTAAAA[G/A]TGGMGCAGGGTGACG
WI-18054 46 GAGTAAAA TTCCA CATGCAGGAGTCCAGACAAMGACGGGTGATTTTGCTCAGGTTGGTAGCMCAGAGGTMTG
TCATCTGAGA CATTATAGGTA TCC I I I I I ATTCATGATTTGTTTCATCTGAGAATAMCTTCCTGTCTMTTTTCCAA[C/G]ACTATGTT
EST39236 ATAMCTTCCTj CTGAGTCATAC TAATGTATGACTCAGTACCTATMTGAGACTGGAMTATATTACCTGGCAMTGMTGAGGTGTCTC Ob 57 GTCT ATTAAACA
GCACMTTAA CAMCAGACCTTTGGTTTGAGCTCACCTGGTGACAGGAGACTCCTACCTGAMCAGGGATGCC[G/η
EST39294
CCTGAMCAG ACATAGTACCG TTCTCGGTACTATGTTTMTTGTGCTGAGCCAGCMCCCTCGAGTTACCCGGCCTTTTACCCCACGCC 4 63 GGATGCC AGAA AGCTCTGCTTGTCTGCAT
AGMAACATTCTGTCTGATCAGAGGMGATGTATGTAGAAMTCAGMTCTGACTGMTTCCTMA
EST39366 ATCTAT[T/C]ACACTGAGAGGAAAATGGAAAAGAAMTGTTTGCATAMGCTTTTCCCTGACTCTCA 2 GAGGGGTTCAGA
TGATTTGAGAC AAAMGCTGTAGCTGGCMGTCAMGTTTATTTTATGTGTGTAMTTCCCAGTTGAGCATTFTTTCAT
EST39371 CATTTGGATTA ATTTCACATTT TTGGATTAGCGTGAGAGG[A/G]AAAAATGTGAMTGTCTCMATCAAATGCTTCCTTCTMAGATTA 9 86 GCGTGAGAGG TT GACATTGCCCMCCCTGC _
ACMGTGACATATCCMCCMCC[A/G]TCCATCCCCACCTGTGCCCTATTCTTTCCTTGTGTTTCTTT AGAGCCTTTTCAGCTATTTCCTGTGMGCAMCTGCACGMGGCCTCCCCCGTACTCCTCCCCTGGM
WI-17177 23 G _
AGGTTCCTGGTTGCTCCCCACMTTTTGATTIC/TJGGTGGCTTCATAAGGGACCCAGGATFCTGCATT S
EST39428 GCTCCCCACA GGTCCCTTATG TTCTGGGTGGGGCCTAGGTMTTCTGTTGCCTTTGGTCCACAGAGCACMTTAMGMGATCAGGTCT CΛ 8 31 T ATTTTGATT AAGCCACC GGCTGTTGC _
GGCAGAGGM
EST39430 TMCTGATGTT CAGGGGTCGGG MTTTAGCAGAMCMTGMGTTGGCAGAGGMTAACTGATGTTC[A/C]CAATACCCCGACCCCTGA 2 45 C GTATTG CCCAGTACCTTTCCCTCAGGCCCAGGCTCCGGTGGAGGATGTCCTGGG
CTACTGACAT AAAGCCCTGTAMCTGMGCTAGACMCGTCMCTTTGGMGMAATMCAGGMCCTATTTATAT
EST39446 AGGGACTTCA TCCTGGAAMC ACGTAMTCACTTTCATACCTGCCTACTGACATAGGGACTTCAGAGTMTA[CAF]GGTTTATGTCAGT 7b 117. GAGTM TGACATAMCC TTTCCAGGATTGTTCTCCC
EST39465 MTGCAGGAG CMTCTCGGCC ATGGTGTCATTAGAGGGCCACAGGGGATGGGGGAGTAAMAATMCATAMCGMCTGMCAGAM 2 80 GGTGGC CCTCT TGCAGGAGGGTGGC[A/G]AGAGGGGCCGAGATFGGGTGTTCAGGGCAGAGAGGTGGMGACCAG
AMGATTCCT
EST39501 GTAGACATCT CACTTGCMTT TGCTTACMCCCATMCCATAGGCCATGTGTTCAGACATTCTTGACCMGCCTMAGATTCCTGTAG 0 81 MCATTAG CTGMGGCT ACATCTMCATTAG A/GJTAGCCTTCAGMTTGCMGTGCMGTTCAAGTCMACCMTTC
CACAAMTGGGACTGCTGMGAGTGGACAGTTGGACCTTACTTTGGTGACCCCATACATTTGTGGTCAI
Wl- CATGCTTTAGCCATAqA/C]CATGGTMCATTGACTATGGAGTCTTGTGAMGTGTMTGTGCGATG 1 8387b 84 A C - GCTATGTAGACATAAAGA
CAGGCAGGACTTCAGTGTCAGTATCCCTGCCTTCAGTCTTCTTTAGAMTCACATCTGTGTTCMTCC ATTGTTTAGAGGGAGTGTA M i l l CCTGTTCCA[C/ηGMGAGGACTTTTTGTTCACMTTGGATCAC
D63807 1 01 T MTGCAGAGGAGTCTGTTCCTCCCCCGTCGGCTTCTCGGTGCTGGGAGGGTGACCTGTCCCAGATGAC
TGGGMCATGCGTGTGACCTC[T/C]ACAGCTACCTCTTCTATGGACTGGTTATTGCCAMCAGCCACA CTGTGGGACTCTTCTTAACTTAAATTTTAATTTATTTATACTATTTAG I I I I I ATAATTTATTTTTGAT TTCACAGTGTGTTTGTGATTGTTTGCTCTGAGAGTTCCCCCTGTCCCCTCCACCTTCCCTCACAGTGTG
D90145 21 C - TCTGGTG
EST14035 ATTATCACTCTCMAAATTTTGGTGTGTGTGTTTMGTACTTTCTTATTTATGAGCCCC[T/C]GAGGA 1 a 59 c - CCAGACATGTTATTATCMGCCCCTTATATACCATCTMT
EST16668 GCATTTTAAAATTCACATTGMTCATTATTTACTATTTATGATGTTTACATAACMTTCAGTATCATT 5 71 T ATG[C/T]TGTAGATTTCAGATGTAGGTCGTCAATACTGAGCACTTATCT
EST16904 ACAGACTATCGCCAACTTATMTGCTTAAACTTTATGATCMTAGTAATAMTTACA[C/T]GAGATA 7 5_7 T TTCACACTTTATTATAAAATAGGGTTTGTGTMGATGA I I I I I CCCMCTGTAGGTTMCAT
EST21863 TTTTTMGTACCAGAGGCACTGCTGGMCAGGATGAAMCTGATACACCIA/G]GTTACTACTTACTC 9 49 G - TTCACTCTTCAMCTGATTCCCCTAMGACTTCTACTTAGCAM
EST21885 GGCTGTMGTAGAATCAMGGTTMGMCATTTTATGCACTTATTCCACAAACATTTACTGAGCATA SI
UI 6 80 A CTAGGTGCTGGGA[G/A]TGTGACAGTGAGCAAAMACACM
EST22623 ATTTTAGTGCAAATGACAMGCCCAA[A/G]AGMCAGAGGATCAMTAAGATTGAAATGTATTACC 8a 26 G_ TTCTCATMGTATACGAAGTTTAACACMGTATGGGAGT
EST22644 AAAATGATTGMTTCAGCAAGTACATTTATGATCTATCTACATTGTTAAAACAGCACTAAAAATAA 2 98 G MA M I L L AAAATGATTATCCATTATTTACAG A/GJAAATGTGGAAAAGATGGCTTTTAAACCC
EST23587 | CCTCATTTATTTAAAAAGACGGACATAAAAA[T/A]TATACAACAAAMACCCAAGTCACATTTCAG 1 31 A GAGGTAAMACTAAAMGTCTGATATGAMATATGGTGG
MAGATCTGGCATTATTCACATCATTCTMATATTTTGTAATTAC I I I I I CCATGAGTATTTTTTTCA
EST24246 TGTCCMGCATTTTMCTATCATTTTAGCGTAMTACCΓF/C]GMTAACCCATAGTTACAGAATTGG
7 1 06 GTCTGTGTMCCTCMTT
TAGTTTMTTTTCTGMCCTTTGGCTTATAM I I I I I CTCAACTT[A/G]CATTTAAAMTGTATCAAT 45 GCACCTTCTTCAGTAGTACCACATGAAMTATAMCCTCGTTC
EST24435 CTTGMCTTCTGGTCTCMGTGGTACGTCCGTCTCMCCTCCCAAMTGATGCGATTACAGGCATMG
(3 73 CAGCCLGYALTGCCTGACCCACATTTFCTTTATCCGATCTGTTGATGGACATTCAGGTTGTTTC
EST25089 TATTGTTGCATTATCAAAATGGTTAΓF/C]AGTTTTCMTTAAAACTGTAATTGATTTCTATGTATAAA 6 25 ACAGCTFTGMGTTGTMATGTAGTTTCCMTCGTTAGTTMTGCTACATT
AGTTGCCAGCTCCCATGTACCAGCAGCTGGMTCTGMGGCGTGAGTCTTCATCTTAGGGCATCGCTC CTCCTCAC[G/A]CCACAMTCTGGTGCCTCTCTCTTGCTTACAMTGTCTAGGTCCCCACTGCCTGCT GGAMGMMCACACTCCTTTGCTTAGCCCACAGTTCTCCATTTCACTTGACCCCTGCCCACCTCTCC
U31416C 76 MCCTMCTGGCTTACTTCCT
AGTTGCCAGCTCCCATGTACCAGCAGCTGGMTCTGMGGCGTGAGTCTTCATCTTAGGGCATCGCTC [C/ηTCCTCACGCCACAMTCTGGTGCCTCTCTCTTGCTTACAMTGTCTAGGTCCCCACTGCCTGCTG GAMGMMCACACTCCTTTGCTTAGCCCACAGTTCTCCATTTCACTTGACCCCTGCCCACCTCTCCA
1131416b 68 ACCTMCTGGCTTACTTCCT
ACGGGTCACACAGAGAMCCTGAGTCTAGCCATGAGGGGCTTATGCTCCCMCTCACATTGTTCCTCC AGACCGCAGG[CTJTCCCCCAGCCTCAGGTTGCTGGAGCTGTCACATGACTGCATCCTGCCTGCCAGG GCTGCAMGCMGGTCTTGCTTCTATCTGGGGGACGCTGCTCGAGAGAGGCCGAGAGGCCGCAGMC
U37519a 78 ATGCCAGGTGTCC
GACCACGCTGAMCCCACCCACCCGCTGTGCTGACCATGGGCCCTGAGCGTCCT[A/G]CCCCGMTTC ACGAGGCTGAGGCATCCGGGAGCTGGCGTMTGCCTGGCCGCAGTGTGTGTGTATCCCATACCCCACT
U37690 54 A! G CTGGMGGMCCATCCAGTAMGGTCTTT
TGAAACCGTTTCMCATGGAAATGATCTGTATTGACTM[T/C]ACACCAGTCCACACTTCTATGACT ) UI TCTGCCATTTCAMGACTCATTTCTCCTATMCCACCGCATGAGTTGMTCAAMTTTTCAGATCTTT SI TCAGGAGTGTMGGMACATCATGTTTACCTGTGCAGGCACTAGTCCTTTACAGATGACCATGCTGAT
V00540 39 A
TCMGMGGTGACTGCCCTTGTATGATGGGATGGGMGATGAATGACTGGTTTTTACTGGGGTGTM AACCACTCTGAGCCTCTCTGAGACCATGTGGTTTTAAM[A/ ATCCATMGGGMGGTACCCACAC CAGTATCTGAGTTCCAGTAGCTMGACCCTAGMTTTGGATTCATCTCTG I I I I I I CATGTCTCTCCTT
X15943 1 06 GTMCCCTGAGATCATCAG _
AGGMGATCCCACCGACCCTTCCTGGCCTMTCCTTTAGATFAGGTCACATTACATTMCATTTAGGA ACCCAGACCGAMAGTTGCTGAMGGGMGGAGACACATTCACAMGAAMGTTGCGMMTTGCG MATCTGTTGTGCA[C/ηGCTCAMTGMAACGCCTTTCGGCTTTGGGCTTTTA I I I I I I I GGMCTG
X5201 1 b 1 48 T CGAGTGGCTTAGGTCTAGCCT
AGGMGATCCCACCGACCCTTCCTGGCCTMTCCTTTAGATTAGGTCACATTACATTMCATTTAGGA ACCCAGACCGAMAGTTGCTGMAGGGMGGAGACACATTCACAMGAM[A/C]GTTGCGAMATT
I GCGAMTCTGTTGTGCACGCTCAMTGMMCGCCTTTCGGCTTTGGGCTTTFA I I I I I I I GGMCTG X5201 1 a ' 1 1 8 C - CGAGTGGCTTAGGTCTAGCCT
CATCCCMGGCACTGGTGGTGACTCTGCTTCCTG[C/T]ACTGACCCAGAGCCTCTGCCTGTGCACTGC MGCTGTGTCTACTCAGGCCCCMGGGGACTCTCTGTTTCCATTCTCCCCCCACAGACCTGTCMGAG
X87344 34 T MGCATGACAMCMMTCATTTACCGACTTTAGTGCTTTTTT
GGTGGGCTGGTATCTCAGAMGTGCCTGACACACTMCCMGCTGAGTTTCCTATGGGMCMTTGA AGTAMCTTTTTGTTCTGGTCCTTTTTGGTCGAGGAGTMCMTACAMTGGATTTTGGGAGTGACTC MGMGTGAAGMTGCACMGMTGGATCACMGATGGMTTTA[GtηCAMCCCTAGCCTTGCTT
X87838 1 79 G GTJAAAATT
GTTCTGCTGCCTCTACACAGGGGCCCTGTACAGTGMTGGTGCCATTTTCGMGGAGCAGCAGTGTGA CCTCCTGTGACCC[A G]TGMTGTGCCTCCMGCGGCCCTGTGTGTTTGACATGTGMGCTATTTGAT ATGCACCAGGTCTCMGGTTCTCATTTCTCAGGTGACGTGATTCTMGGCAGGATTTGAGAGTTCACA
Z14138 81 GMGGAT
TMTCCTCACCATTCCTCAGGTATMGTTCTATAMCAGGCTTGGMTCTGGGTMTTMAMCAGA AMTTATAGTCMTATACCATGACATGMGMTGAATCCATTCTTTGGAGATGGAGTATACATGACT GCMCTGTATTTCATACGTTCTTTTCAMGTGGGATAGCTATTGCAGCTTAMGAGC[A/C]CAGGTTC
Z18859 1 91 CAGTACTGGTTTTCCM
AGMCCTGACCAGATGTGGCTCGGAGGGGMTCCAGACCCGCTGCTGTCTTGCTCTCCCTCCCCTCCC UI CACTCCTCCTCTCTTCTTCCTCTTCTCTCTCACTGCCACGCCTTCCTTTCCCTCCTCCTCCCCCTCTCCG CTCTGTGCTCTTCATTCTCAC[GA]GGCCCGCMCCCCTCCTCTCTCTGTCCCCGCCCGTCTCTGGMA
Z23091 1 59 G CTGAGCTTGACGTTTG
GTTGGCATTGTTAGTAAMCTTCATAGGTGMGAGGAGGATCAGTGAGATTMGTTATTTTATCAM GTGTGGTTTTCTGCMGGGCAGGTTTGAMCCTGACCCTAGTTGTGCTCCAGGACCTAIA/G]GCGTGC TCACTCTACCTTGTCTTTGTGTTGAMGGAGTGGTTTCCCATGACTGTTTMGTGACMGTGCCATGG
1 1595b 1 25 ATATCTACACCGTCACCAGACTAGATTGTCTCMTGTCCTTGGCTTGCGAC
GTTGGCATTGTTAGTAMACTTCATAGGTGMGAGGAGGATCAGTGAGATTMGTTATTTTATCAM GTGTGGTTTTCTGCMGGGCAGGTTTGAMCCTGACCCTAGTTGTGCTCCAGGACCTA[A/G]GCGTGC TCACTCTACCTTGTCTTTGTGTTGAMGGAGTGGTTTCCCATGACTGTTTMGTGACMGTGCCATGG
1 1 595 1 25 ATATCTACACCGTCACCAGACTAGATTGTCTCMTGTCCTTGGCTTGCGAC
TATATCACATTAGTATGTCACTGCCATGGTMGGACTTTGATCACTAGGAAATMGMCACTTTGM TGGTCTTGTCCTTTCMTMMAGAGTGACATGATTGMCATGTGTTTTAGATMAGGGCACTT[GtF ]GCAGGAGTGTTTAGGATGMGAGAGMGAGATTMGGMGATCAGGMGMMGTAGCMTGGGA
1241 1 31 G T ATGAMATAGGAGGCCCTGAGATCCACTGGATMTCTMAAMCCMGAGAMG
GTGCGATCACCACTACAGTCTMTTTCAGATGFTTTCATTACCCCTAMAGAMTCTTGTACCCATTA GCMTTATTCCTCATTCCTGCCCTCACCCCCAGGCCCTACTCTTTATCGCTATAGATTTGCC[C/ηACT TGACATATCATACACATGGAGCCATACATATGTGTGCCCTTCATGATTGGCTTCTTTCACTGAGMTA
1 282 1 30 ATGTTTTCAAGGT
AGTATCACACATACTTAATATATTAGATATACACMTMTMAATCACTCCCTACCTTGAAMCTTT A[C/ηAGAAGCATTTTTAATTTTACAACACAMGCTCAAACGMCCTACAATMGTCTAGTAGTCTG TTTACGTGCCAAGGGATAAGGCTGAACMTAMTTMCCCTTTAAAAATGTCTATGMCAAGTACAA
681 0 68 T TTTTC I I I I I GAGTTCTGCAGAGCAATGACCACTMGMATA I I I I I AMGGC
CCMGTACATTGGGTGMCGATGAGCTAGCTGTTCTAGTATTTGC I I I I I GTMTCCAGTTMGACCA TCAGCATATACAACATCATCACTMCTCMCMTGTAGCTGCAGGGTMC[A/C]TGTGGATACCCTG TGTGCTCTACTGGCCTCCAMGGCATTCAGGGGATCATCAMGATGTTGGACACCTTGTGTTCAMTC
681 7 1 1 8 πGGTTCAGGTGCGGCCTGTGCAGΛTCGGCTTTTTGGTTTGGTTGTCTTAG
CCATTTTA I I I I I CTCTAAATTTTAAAATAGMGACTTTMTGGAAAACATTFAGTACCATCATGTCA CCCTGMTGCCAGCMTACCTCGACTTTTACACACGCAGGMGCCTAGTAAMGCCCCGTCAGTAGT ACACATTTCTCTATGGTCCTTCMCAGTTTTGCATATACAAMTTTTCTGCTATTΓTGCTTTAGCAAA
6819b 21 2 CAGCMTMCTTTTGTGTTTCCTATATGACACCTAATATCCA SI u>
CCATTTTA 1 1 ΓI FCTCTAAATTTTAAAATAGAAGACTTTAATGGAAAACATTTAGTACCATCATGTCA CCCTGMTGCCAGCMTACCTCGACTTTTACACACGCAGGMGCCTAGTAMAGCCCCGTCAGTAGT
ACACATTFCTCTATGGTCCTTCAACAGTTTT[G ηCATATACAAMTTTTCTGCTATTTTGCTTTAGC
6819a 1 66 UT AMCAGCMTMCTTTTGTGTTTCCTATATGACACCTMTATCCA
CTGGTATGTCATAAGCMTCCATAATTGTTATAGCTATT[A/G1TTATACTATGGCACCATTTGGGACA
CAGATTATATATGTCAGACACCACGMTGTCCTTTMGATATGCAGCMGCACAMTCTGTCATGGT
681 xx 39 TTAACAAMGAAATGMCGTCTAGG
AGGATTCCCTCTTFTFCTATTGATTGGMTAGTTTCAGMGGMTGGTACCAGTTCCTCCTTGTACCT CTGGTAGMTTCGGCTGTGMTCCATCTGGTCCTGGACTCTTTTTGGTTGGTAMCTATTGATTATTGC CACMTTTCAGA[GtηCCTGTTATTGGTCTATTCAGAGATTCMCTTCTTCCTGGTTTAGTCTTGGGA
6972b 1 49 GAGTGTATGTGTCGAGGMT
AGGATTCCCTCTTTTTCTATTGATTGGMTAGTTTCAGMGGMTGGTACCAGTTCCTCCTTGTACCT CTGGTAGMTTCGGCTGTGMTCCATCTGGTCCTGGACTCTTTTTGGTTGGTM[A/G]CTATTGATTA TTGCCACMTTTCAGAGCCTGTTATTGGTCTATTCAGAGATTCMCTTCTTCCTGGTTTAGTCTTGGGAl
6972a 1 22 ' © GAGTGTATGTGTCGAGGMT
EQUIVALENTS
While this invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described specifically herein. Such equivalents are intended to be encompassed in the scope of the claims.

Claims

CLAIMS
WE CLAIM:
1. A nucleic acid segment shown in column 7 of the Table, or a portion thereof which includes a polymorphic site, or the complement of the segment or portion thereof.
2. The nucleic acid segment of claim 1 that is DNA.
3. The nucleic acid segment of claim 1 that is RNA.
4. The segment of claim 1 that is less than 100 bases.
5. The segment of claim 1 that is less than 50 bases.
6. The segment of claim 1 that is less than 20 bases.
7. The segment of claim 1, wherein the polymorphic site is biallelic.
8. The segment of claim 1, wherein the polymorphic form occupying the polymorphic site is the reference base for the fragment listed in the Table, column 3.
9. The segment of claim 1, wherein the polymorphic form occupying the polymorphic site is an alternative form for the fragment listed in the Table, column 4.
10. An allele-specific oligonucleotide that hybridizes to a segment of a fragment shown in the Table, column 7 or its complement.
11. The allele-specific oligonucleotide of claim 10 that is a probe.
12. The allele-specific oligonucleotide of claim 10, wherein a central position of the probe aligns with the polymorphic site of the fragment.
13. The allele-specific oligonucleotide of claim 10 that is a primer.
14. The allele-specific oligonucleotide of claim 13, wherein the 3' end of the primer aligns with the polymorphic site of the fragment.
15. The allele-specific oligonucleotide of claim 10, which is selected from the group consisting of the nucleotide sequences of the Table, column 5.
16. The allele-specific oligonucleotide of claim 10, which is selected from the group consisting of the nucleotide sequences of the Table, column 6.
17. An isolated nucleic acid comprising a sequence of the Table, column 7 or the complement thereof, wherein the polymorphic site within the sequence or complement is occupied by a base other than the reference base shown in the Table, column 3.
18. A method of analyzing a nucleic acid, comprising obtaining the nucleic acid from an individual; and determining a base occupying any one of the polymorphic sites shown in the Table.
19. The method of claim 18, wherein the determining comprises determining a set of bases occupying a set of the polymorphic sites shown in the Table.
20. The method of claim 18, wherein the nucleic acid is obtained from a plurality of individuals, and a base occupying one of the polymorphic positions is determined in each of the individuals, and the method further comprising testing each individual for the presence of a disease phenotype, and correlating the presence of the disease phenotype with the base.
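
The geometry recited in claims 12 and 14 can be illustrated with a minimal Python sketch. It assumes the bracket notation used in the Table, where a biallelic site is written as [X/Y] inside the segment. The function names, the default lengths, and the toy segment are hypothetical choices made for illustration; the toy segment is not taken from the Table, and real probe or primer design would also weigh length, melting temperature, and strand choice.

```python
import re

# Matches a biallelic site written as [X/Y], the notation assumed for the Table.
SITE = re.compile(r"\[([ACGT])/([ACGT])\]")

# Hypothetical toy segment for illustration only; it is NOT a Table entry.
EXAMPLE_SEGMENT = "ACGTTGCA[A/G]TTCGGATC"


def parse_segment(segment):
    """Split 'LEFT[X/Y]RIGHT' into (left flank, (X, Y), right flank)."""
    match = SITE.search(segment)
    if match is None:
        raise ValueError("no [X/Y] polymorphic site found")
    alleles = (match.group(1), match.group(2))
    return segment[:match.start()], alleles, segment[match.end():]


def centered_probe(segment, allele, flank=3):
    """Return a probe whose central base is the polymorphic site (cf. claim 12)."""
    left, alleles, right = parse_segment(segment)
    if allele not in alleles:
        raise ValueError("allele not present at this site")
    if len(left) < flank or len(right) < flank:
        raise ValueError("flank longer than the available sequence")
    # Equal flanks on both sides keep the polymorphic base in the middle.
    return left[len(left) - flank:] + allele + right[:flank]


def three_prime_primer(segment, allele, length=6):
    """Return a forward primer whose 3' end is the polymorphic base (cf. claim 14)."""
    left, alleles, _ = parse_segment(segment)
    if allele not in alleles:
        raise ValueError("allele not present at this site")
    if length < 1 or len(left) < length - 1:
        raise ValueError("primer length not supported by the left flank")
    # The primer ends exactly on the polymorphic base.
    return left[len(left) - (length - 1):] + allele
```

For the toy segment, `centered_probe(EXAMPLE_SEGMENT, "G")` yields a 7-base probe with G in the middle, and `three_prime_primer(EXAMPLE_SEGMENT, "A")` yields a 6-base primer ending in A. One probe or primer per allele gives the allele-specific pair.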
EP97946582A 1996-11-06 1997-11-05 Biallelic markers Withdrawn EP0941366A2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US3045596P 1996-11-06 1996-11-06
US30455P 1996-11-06
PCT/US1997/020313 WO1998020165A2 (en) 1996-11-06 1997-11-05 Biallelic markers

Publications (1)

Publication Number Publication Date
EP0941366A2 (en) 1999-09-15

Family

ID=21854280

Family Applications (1)

Application Number Title Priority Date Filing Date
EP97946582A Withdrawn EP0941366A2 (en) 1996-11-06 1997-11-05 Biallelic markers

Country Status (2)

Country Link
EP (1) EP0941366A2 (en)
WO (1) WO1998020165A2 (en)

Families Citing this family (73)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6391550B1 (en) 1996-09-19 2002-05-21 Affymetrix, Inc. Identification of molecular sequence signatures and methods involving the same
US6759515B1 (en) 1997-02-25 2004-07-06 Corixa Corporation Compositions and methods for the therapy and diagnosis of prostate cancer
US6277977B1 (en) * 1997-06-11 2001-08-21 Smithkline Beecham Corporation cDNA clone HAPOI67 that encodes a human 7-transmembrane receptor
US7105353B2 (en) 1997-07-18 2006-09-12 Serono Genetics Institute S.A. Methods of identifying individuals for inclusion in drug studies
FR2767135B1 (en) 1997-08-06 2002-07-12 Genset Sa LSR COMPLEX RECEPTOR, ACTIVITY, CLONING, AND APPLICATION TO DIAGNOSIS, PREVENTION AND / OR TREATMENT OF OBESITY AND THE RISKS OR COMPLICATIONS THEREOF
US6849719B2 (en) 1997-09-17 2005-02-01 Human Genome Sciences, Inc. Antibody to an IL-17 receptor like protein
WO1999014240A1 (en) 1997-09-17 1999-03-25 Human Genome Sciences, Inc. Interleukin-17 receptor-like protein
US6482923B1 (en) 1997-09-17 2002-11-19 Human Genome Sciences, Inc. Interleukin 17-like receptor protein
ATE513042T1 (en) 1998-02-05 2011-07-15 Glaxosmithkline Biolog Sa TUMOR-ASSOCIATED ANTIGEN DERIVATIVES OF THE MAGE FAMILY, NUCLEIC ACID SEQUENCES ENCODING SAME FOR PRODUCING FUSION PROTEINS AND VACCINATION COMPOSITIONS
AU2577699A (en) 1998-02-06 1999-08-23 Human Genome Sciences, Inc. Human serine protease and serpin polypeptides
US6692909B1 (en) * 1998-04-01 2004-02-17 Whitehead Institute For Biomedical Research Coding sequence polymorphisms in vascular pathology genes
CA2324869A1 (en) * 1998-04-09 1999-10-21 Whitehead Institute For Biomedical Research Biallelic markers
WO1999054500A2 (en) * 1998-04-21 1999-10-28 Genset Biallelic markers for use in constructing a high density disequilibrium map of the human genome
US6537751B1 (en) 1998-04-21 2003-03-25 Genset S.A. Biallelic markers for use in constructing a high density disequilibrium map of the human genome
US20020192751A1 (en) 1998-05-15 2002-12-19 Genentech, Inc. Secreted and transmembrane polypeptides and nucleic acids encoding the same
US6251592B1 (en) * 1998-05-26 2001-06-26 Procrea Biosciences Inc. STR marker system for DNA fingerprinting
US6759192B1 (en) 1998-06-05 2004-07-06 Genset S.A. Polymorphic markers of prostate carcinoma tumor antigen-1(PCTA-1)
CA2328500A1 (en) * 1998-06-05 1999-12-16 Genset Polymorphic markers of prostate carcinoma tumor antigen-1 (pcta-1)
US6825004B1 (en) 1998-08-07 2004-11-30 Genset S.A. Nucleic acids encoding human TBC-1 protein and polymorphic markers thereof
CA2337694A1 (en) * 1998-08-07 2000-02-17 Genset S.A. Nucleic acids encoding human tbc-1 protein and polymorphic markers thereof
US6703228B1 (en) 1998-09-25 2004-03-09 Massachusetts Institute Of Technology Methods and products related to genotyping and DNA analysis
DE69936379T2 (en) * 1998-09-25 2008-02-28 Massachusetts Institute Of Technology, Cambridge METHOD FOR GENOTYPIZING AND DNA ANALYSIS
US7067627B2 (en) 1999-03-30 2006-06-27 Serono Genetics Institute S.A. Schizophrenia associated genes, proteins and biallelic markers
US6476208B1 (en) 1998-10-13 2002-11-05 Genset Schizophrenia associated genes, proteins and biallelic markers
US6902892B1 (en) 1998-10-19 2005-06-07 Diadexus, Inc. Method of diagnosing, monitoring, staging, imaging and treating prostate cancer
WO2000024939A1 (en) 1998-10-27 2000-05-04 Affymetrix, Inc. Complexity management and analysis of genomic dna
JP2002528118A (en) 1998-11-04 2002-09-03 ジェンセット Genomic and total cDNA sequences of APM1 specific to human adipocytes and their biallelic markers
DE69920032T2 (en) * 1998-11-10 2005-09-15 Genset METHODS, SOFTWARE AND APPARATUS FOR IDENTIFYING GENOMIC AREAS CONTAINING A GENE ASSOCIATED WITH A DETECTABLE CHARACTERISTIC
US6670464B1 (en) * 1998-11-17 2003-12-30 Curagen Corporation Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US8367322B2 (en) 1999-01-06 2013-02-05 Cornell Research Foundation, Inc. Accelerating identification of single nucleotide polymorphisms and alignment of clones in genomic sequencing
EP1141384A2 (en) 1999-01-06 2001-10-10 Cornell Research Foundation, Inc. Method for accelerating identification of single nucleotide polymorphisms and alignment of clones in genomic sequencing
CA2359132A1 (en) 1999-01-15 2000-07-20 Roxanne D. Duan Bone marrow-specific protein
CA2359757A1 (en) 1999-02-10 2000-08-17 Genset S.A. Polymorphic markers of the lsr gene
US8133734B2 (en) 1999-03-16 2012-03-13 Human Genome Sciences, Inc. Kit comprising an antibody to interleukin 17 receptor-like protein
WO2000055375A1 (en) * 1999-03-17 2000-09-21 Alphagene, Inc. Secreted proteins and polynucleotides encoding them
AU780836B2 (en) 1999-03-24 2005-04-21 Serono Genetics Institute S.A. Genomic sequence of the (purH) gene and (purH)-related biallelic markers
EP1165836A2 (en) * 1999-03-30 2002-01-02 Genset Schizophrenia associated genes, proteins and biallelic markers
AU4050000A (en) * 1999-03-31 2000-10-16 Affymetrix, Inc. Charaterization of single nucleotide polymorphisms in coding regions of human genes
IL129734A0 (en) * 1999-05-03 2000-02-29 Compugen Ltd Novel nucleic acid and amino acid sequences
MXPA01011882A (en) * 1999-05-25 2002-05-06 Aventis Pharma Sa Expression products of genes involved in diseases related to cholesterol metabolism.
FR2794131B1 (en) * 1999-05-25 2003-12-12 Aventis Pharma Sa GENE EXPRESSION PRODUCTS INVOLVED IN CONDITIONS ASSOCIATED WITH THE METABOLISM OF CHOLESTEROL
AU781437B2 (en) * 1999-06-25 2005-05-26 Serono Genetics Institute S.A. A novel BAP28 gene and protein
EP1088900A1 (en) * 1999-09-10 2001-04-04 Epidauros Biotechnologie AG Polymorphisms in the human CYP3A4, CYP3A7 and hPXR genes and their use in diagnostic and therapeutic applications
US6555316B1 (en) 1999-10-12 2003-04-29 Genset S.A. Schizophrenia associated gene, proteins and biallelic markers
US6902890B1 (en) 1999-11-04 2005-06-07 Diadexus, Inc. Method of diagnosing monitoring, staging, imaging and treating cancer
AU1928801A (en) * 1999-11-24 2001-06-04 Curagen Corporation Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US6869762B1 (en) 1999-12-10 2005-03-22 Whitehead Institute For Biomedical Research Crohn's disease-related polymorphisms
AU2258601A (en) * 1999-12-10 2001-06-18 Ellipsis Biotherapeutics Corporation Ibd-related polymorphisms
EP1287013A2 (en) * 1999-12-27 2003-03-05 Curagen Corporation Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
CA2395786A1 (en) * 1999-12-27 2001-07-05 Curagen Corporation Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
CA2395926A1 (en) * 1999-12-28 2001-07-05 Curagen Corporation Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
AU2630801A (en) * 2000-01-07 2001-07-24 Curagen Corporation Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US6989367B2 (en) 2000-01-14 2006-01-24 Genset S.A. OBG3 globular head and uses thereof
US20020058617A1 (en) 2000-01-14 2002-05-16 Joachim Fruebis OBG3 globular head and uses thereof for decreasing body mass
US7338787B2 (en) 2000-01-14 2008-03-04 Serono Genetics Institute S.A. Nucleic acids encoding OBG3 globular head and uses thereof
US6566332B2 (en) 2000-01-14 2003-05-20 Genset S.A. OBG3 globular head and uses thereof for decreasing body mass
US20020032319A1 (en) * 2000-03-07 2002-03-14 Whitehead Institute For Biomedical Research Human single nucleotide polymorphisms
AU2001253487A1 (en) * 2000-04-13 2001-10-30 Millennium Pharmaceuticals, Inc. 23155 novel protein human 5-alpha reductases and uses therefor
GB0016169D0 (en) * 2000-06-30 2000-08-23 Univ London Diagnostic method
AU2001235895B2 (en) * 2001-02-20 2008-01-03 Serono Genetics Institute S.A. PG-3 and biallelic markers thereof
FR2824333B1 (en) * 2001-05-03 2003-08-08 Genodyssee NOVEL POLYNUCLEOTIDES AND POLYPEPTIDES OF IFN ALPHA 5
DE10122847A1 (en) * 2001-05-11 2002-11-21 Noxxon Pharma Ag New nucleic acid that binds to staphylococcal enterotoxin B, useful for treating and diagnosing e.g. septic shock, identified by the SELEX method
JP4336877B2 (en) * 2003-04-18 2009-09-30 アークレイ株式会社 Method for detecting β3 adrenergic receptor mutant gene and nucleic acid probe and kit therefor
US20090148458A1 (en) * 2005-06-23 2009-06-11 The University Of British Columbia Coagulation factor iii polymorphisms associated with prediction of subject outcome and response to therapy
WO2007035600A2 (en) * 2005-09-16 2007-03-29 Mayo Foundation For Education And Research Natriuretic activities
US9388457B2 (en) 2007-09-14 2016-07-12 Affymetrix, Inc. Locus specific amplification using array probes
US9074244B2 (en) 2008-03-11 2015-07-07 Affymetrix, Inc. Array-based translocation and rearrangement assays
WO2009143576A1 (en) * 2008-05-27 2009-12-03 Adelaide Research & Innovation Pty Ltd Polymorphisms associated with pregnancy complications
US20120108514A1 (en) 2009-07-09 2012-05-03 University Of Iowa Research Foundation Long acting atrial natriuretic peptide (la-anp) and methods for use thereof
WO2011151405A1 (en) 2010-06-04 2011-12-08 Institut National De La Sante Et De La Recherche Medicale (Inserm) Constitutively active prolactin receptor variants as prognostic markers and therapeutic targets to prevent progression of hormone-dependent cancers towards hormone-independence
EP2751136B1 (en) 2011-08-30 2017-10-18 Mayo Foundation For Medical Education And Research Natriuretic polypeptides
US9611305B2 (en) 2012-01-06 2017-04-04 Mayo Foundation For Medical Education And Research Treating cardiovascular or renal diseases
CN111139301B (en) * 2020-03-10 2020-12-18 无锡市第五人民医院 Breast cancer related gene ERBB2 site g.39397319C > A mutant and application thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0726905B1 (en) * 1993-11-03 2005-03-23 Orchid BioSciences, Inc. Single nucleotide polymorphisms and their use in genetic analysis
FR2722295B1 (en) * 1994-07-07 1996-10-04 Roussy Inst Gustave METHOD OF ANALYSIS OF SADDLE DNA AND ELECTRO-PHORENE GEL

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9820165A2 *

Also Published As

Publication number Publication date
WO1998020165A2 (en) 1998-05-14
WO1998020165A3 (en) 1998-11-12

Similar Documents

Publication Publication Date Title
US5856104A (en) Polymorphisms in the glucose-6 phosphate dehydrogenase locus
WO1998020165A2 (en) Biallelic markers
US6525185B1 (en) Polymorphisms associated with hypertension
US20060263807A1 (en) Methods for polymorphism identification and profiling
US6869762B1 (en) Crohn's disease-related polymorphisms
US20060188875A1 (en) Human genomic polymorphisms
WO1998038846A2 (en) Genetic compositions and methods
US20020037508A1 (en) Human single nucleotide polymorphisms
WO2001066800A2 (en) Human single nucleotide polymorphisms
EP0812922A2 (en) Polymorphisms in human mitochondrial nucleic acid
WO1999050454A2 (en) Coding sequence polymorphisms in vascular pathology genes
WO2001018250A2 (en) Single nucleotide polymorphisms in genes
WO1998024796A1 (en) Brassica polymorphisms
EP1068354A2 (en) Biallelic markers
WO1998058529A2 (en) Genetic compositions and methods
US20030039973A1 (en) Human single nucleotide polymorphisms
US20030054381A1 (en) Genetic polymorphisms in the human neurokinin 1 receptor gene and their uses in diagnosis and treatment of diseases
EP1024200A2 (en) Genetic compositions and methods
WO2001038576A2 (en) Human single nucleotide polymorphisms
EP1276899A2 (en) Ibd-related polymorphisms
WO1999014228A1 (en) Genetic compositions and methods
WO2000058519A2 (en) Charaterization of single nucleotide polymorphisms in coding regions of human genes
WO2001034840A2 (en) Genetic compositions and methods
US20020155446A1 (en) Very low density lipoprotein receptor polymorphisms and uses therefor
US20030008301A1 (en) Association between schizophrenia and a two-marker haplotype near PILB gene

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

17P Request for examination filed

Effective date: 19990604

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20030531