WO1999006598A2 - Determining common functional alleles in a population and uses therefor - Google Patents

Determining common functional alleles in a population and uses therefor

Info

Publication number
WO1999006598A2
WO1999006598A2 PCT/US1998/016574 US9816574W WO9906598A2 WO 1999006598 A2 WO1999006598 A2 WO 1999006598A2 US 9816574 W US9816574 W US 9816574W WO 9906598 A2 WO9906598 A2 WO 9906598A2
Authority
WO
WIPO (PCT)
Prior art keywords
gene
individual
haplotype
sequence
nucleotide sequence
Prior art date
Application number
PCT/US1998/016574
Other languages
French (fr)
Other versions
WO1999006598A3 (en
Inventor
Patricia D. Murphy
Original Assignee
Oncormed, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oncormed, Inc. filed Critical Oncormed, Inc.
Priority to AU87768/98A priority Critical patent/AU8776898A/en
Publication of WO1999006598A2 publication Critical patent/WO1999006598A2/en
Publication of WO1999006598A3 publication Critical patent/WO1999006598A3/en

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6827: Hybridisation assays for detection of mutation or polymorphism

Definitions

  • the invention relates to methods for identifying functional alleles commonly occurring in a population, for finding new functional alleles, for determining the relative frequencies at which such alleles occur, and for genetic and pharmacogenetic applications of the methods and products produced thereby.
  • Detection of mutations in such genes is instrumental in determining susceptibility to or diagnosing these diseases.
  • Some diseases, such as sickle cell disease, are known to be monomorphic; i.e., the disease is generally caused by a single mutation present in the population.
  • methods for detecting the mutations are targeted to the site within the gene at which they are known to occur.
  • the mutation responsible for such a monomorphic disease can only be established in the first instance if there exists an accurate reference sequence for the non-pathological state.
  • genes for which the population displays extensive allelic heterogeneity and which have been implicated in disease include CFTR (cystic fibrosis), dystrophin (Duchenne muscular dystrophy, and Becker muscular dystrophy), and p53 (Li-Fraumeni syndrome).
  • breast cancer is also an example of a disease in which, in addition to allelic heterogeneity, there is genetic heterogeneity.
  • the BRCA1, BRCA2 and BRCA3 genes have been linked to breast cancer.
  • the NFI and NFII genes are involved in neurofibromatosis (types I and II, respectively).
  • HNPCC hereditary non-polyposis colorectal cancer
  • MSH2 MSH2, MLH1, PMS1, and PMS2
  • a cDNA sequence for MSH2 has been deposited in GenBank as Accession No. U03911; and a cDNA sequence for MLH1 has been deposited in GenBank as Accession No. U40978.
  • disease or disease susceptibility also results from the interaction of more than one gene or the interaction of an environmental, chemical or biological influence on one or more genes. For example, measles virus infects many people; some are immune due to vaccination or previous infection, some are infected but asymptomatic, some become sick with a rash, some develop an encephalitis and some die. Genetic susceptibility and many other factors are involved in the outcome.
  • the wild-type may not be the desirable form, or may be one of several possibilities.
  • the "wild-type" of any gene may not even be known.
  • in the Brassica family, debate exists as to exactly what is a wild cabbage plant, much less which of the many genes or traits constitutes a "wild-type".
  • a wild-type is not pathological but sometimes this definition seems inappropriate.
  • the Macintosh apple is propagated asexually exclusively. An inability to reproduce naturally may be considered the result of pathological mutation(s) but is nonetheless the desired trait.
  • different strains of a plant are cross-bred, where each set of genes from each parent strain may be considered "wild-type".
  • Certain wild-type sequences of a gene may be otherwise indistinguishable from others except under certain circumstances.
  • a gene involved in resistance or susceptibility to a certain infectious agent is only recognized when the individual plant or animal is exposed to the infectious agent.
  • chemical sensitivity may be a wild-type which is pathological under only certain circumstances which may never occur in the individual. Drought tolerance traits are significant only under environmental stress which may or may not occur. Therefore, the type of wild-type sequence is of importance.
  • a functional allele profile contains 1) the identity of the key functional allele or alleles for a given gene in the population, including the "consensus" sequence, and 2) the relative frequency with which these functional alleles occur in the population.
  • the functional allele profile includes the identification of the consensus normal sequence, i.e., the most commonly occurring functional allele.
  • the present invention, therefore, provides a normal sequence which is the most likely sequence to be found in the majority of the normal population (i.e., the "consensus normal DNA sequence").
  • a consensus normal allele sequence of a gene more accurately reflects the most likely sequence to be found in the population. Determining the consensus sequence is useful in both the diagnosis and treatment of disease. For example, use of the consensus normal gene sequence reduces the likelihood of misinterpreting a "sequence variation" found in the normal population as a pathologic "mutation" (i.e., one that causes disease in the individual or puts the individual at a high risk of developing the disease).
  • a consensus normal DNA sequence makes it possible for true pathological mutations to be easily identified or differentiated from polymorphisms.
  • the consensus sequence, or other sequences identified in the functional allele profile, allows for the selection of therapeutically optimal nucleotide sequences to be administered in gene therapy or gene replacement, or of the optimal amino acid sequence in the therapeutic administration of active proteins or peptides.
  • the consensus sequence is generally the easiest target for various agonists, antagonists and measuring interactions with the gene or expression product appropriate for pharmacogenetic analysis.
  • determining a functional allele profile of genes allows for an evaluation of the degree to which the gene is under selective pressure.
  • Such a technique applies to one and plural genes, especially genes which interact or express products which interact with each other directly, interact with the same or similar other compound or are along the same metabolic pathway.
  • the method of the present invention determines combinations of haplotypes in different genes.
  • Another embodiment of the present invention is determining how an individual will react to a particular chemical, environmental or biological influence. It is a premise of the present invention that different wild-type genes or their expression products interact differently in some circumstances.
  • Another embodiment of the present invention is the determination of traits and susceptibilities of plants and animals during breeding experiments by detecting the polymorphisms constituting the gene haplotype associated with the trait or susceptibility of interest.
  • FIG. 1 shows alternative alleles containing polymorphic (non-mutation-causing variation) sites along the BRCA1 gene, represented as individual "haplotypes" of the BRCA1 gene.
  • the alternative allelic variations occurring at nucleotide positions 2201, 2430, 2731, 3232, 3667, 4427, and 4956 are shown.
  • the BRCA1(omi1) haplotype is indicated with dark shading.
  • the haplotype available in GenBank is completely unshaded and designated as "GB".
  • Two additional haplotypes, BRCA1(omi2) and BRCA1(omi3), are represented with mixed shaded and unshaded positions (numbers 7 and 9 from left to right).
  • a functional allele profile can be determined for any gene in which an altered or deficient function produces a recognizable, phenotypic trait, including, but not limited to, pathology.
  • the invention is set forth for the purpose of illustration, and not by way of limitation, for determining the functional allele profile of three different genes associated with disease - for example, the MSH2 and MLH1 genes, each associated with hereditary non-polyposis colorectal cancer (HNPCC), and the BRCA1 gene, associated with breast, ovarian, prostate and other cancers.
  • "Allele" refers to an alternative version (i.e., nucleotide sequence) of a gene or DNA sequence at a specific chromosomal locus.
  • "Allelic variation" or "sequence variation" refers to a particular alternative nucleotide or nucleotide sequence at a position within a gene (e.g., a polymorphic site or mutation) whose sequence varies from one allele to another.
  • "Coding sequence" or "DNA coding sequence" refers to those portions of a gene which, taken together, code for a peptide (protein), or in which the nucleic acid itself has function.
  • "Composite genomic sequence" refers to the combination of the two allelic nucleotide sequences (i.e., maternal and paternal) obtained from sequencing a diploid genomic sample.
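One way to picture a composite genomic sequence is to merge the maternal and paternal allele sequences position by position, writing a two-base ambiguity code wherever they differ. The sketch below is illustrative only: the use of IUPAC ambiguity codes, the function name `composite_sequence`, and the example sequences are assumptions, not something the source prescribes.

```python
# IUPAC two-base ambiguity codes, keyed by the unordered pair of bases.
IUPAC = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def composite_sequence(maternal, paternal):
    """Merge two aligned allele sequences into one composite string.

    Hypothetical helper: identical positions keep their base, and
    heterozygous positions are written as an IUPAC ambiguity code.
    """
    if len(maternal) != len(paternal):
        raise ValueError("allele sequences must be aligned to the same length")
    merged = []
    for a, b in zip(maternal.upper(), paternal.upper()):
        merged.append(a if a == b else IUPAC[frozenset((a, b))])
    return "".join(merged)


# Made-up alleles differing at one site (G on one chromosome, A on the other).
composite = composite_sequence("ACGT", "ACAT")
```

A composite read like this shows which sites vary but not which bases travel together on one chromosome, which is why haplotypes have to be resolved separately.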
  • "Consensus" refers to the form most commonly occurring in the population.
  • "Functional allele" refers to an allele which is naturally transcribed and translated into a functioning protein.
  • “Functional Allele Profile” refers to a set of functional alleles which are representative of the most common alleles occurring in a population, wherein the functional alleles are identified by nucleotide sequence and the relative frequencies with which the functional alleles occur in the population.
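As a minimal sketch of the definition above, a functional allele profile can be tabulated by counting how often each functional haplotype is observed and taking the most frequent one as the consensus. The function name `functional_allele_profile` and the sample haplotypes are hypothetical; they are not data from the source.

```python
from collections import Counter


def functional_allele_profile(haplotypes):
    """Return (consensus, frequencies) for observed functional haplotypes.

    Each haplotype is a string of the bases seen at the variable sites of
    one chromosome. Illustrative helper, not the patent's own procedure.
    """
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    counts = Counter(haplotypes)
    total = sum(counts.values())
    # Relative frequency of each distinct haplotype in the sample.
    frequencies = {hap: n / total for hap, n in counts.items()}
    # The consensus is the most commonly occurring haplotype.
    consensus = counts.most_common(1)[0][0]
    return consensus, frequencies


# Ten made-up chromosomes typed at three variable sites.
sample = ["ACG"] * 6 + ["ATG"] * 3 + ["GTG"]
consensus, frequencies = functional_allele_profile(sample)
```

With this made-up sample, "ACG" is the consensus because it accounts for six of the ten chromosomes.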
  • "Haplotype" refers to a set of nucleotides or nucleotide sequences occurring at sites of allelic variation occurring within a locus on a single chromosome (of either maternal or paternal origin).
  • locus includes the entire coding sequence.
  • “Mutation” refers to a base change or a gain or loss of base pair(s) in a DNA sequence, which results in a DNA sequence which codes for a non-functioning protein or a protein with substantially reduced or altered function.
  • "Agent for polymerization" refers to an enzyme which may be heat stable, e.g., Taq polymerase, or may function at lower temperatures, e.g., room temperature, and which effects an extension of DNA from a short primer sequence annealed to the target DNA of interest.
  • "Polymorphism" refers to an allelic variation which occurs in greater than or equal to 1% of the normal healthy population.
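The 1% threshold in the polymorphism definition can be expressed as a simple frequency check. This is a sketch under stated assumptions: the function name `is_polymorphism` and the idea of counting variant chromosomes in a sample of normal individuals are illustrative, and a real study would also need to account for sampling error.

```python
def is_polymorphism(variant_count, chromosomes_sampled, threshold=0.01):
    """Return True when the variant frequency meets the 1% definition.

    variant_count: chromosomes carrying the variant (hypothetical input).
    chromosomes_sampled: total chromosomes examined in the normal population.
    """
    if chromosomes_sampled <= 0:
        raise ValueError("chromosomes_sampled must be positive")
    if not 0 <= variant_count <= chromosomes_sampled:
        raise ValueError("variant_count must lie between 0 and the sample size")
    return variant_count / chromosomes_sampled >= threshold


# Made-up counts: 5 of 100 chromosomes carry the variant.
common_variant = is_polymorphism(5, 100)
# Made-up counts: 1 of 200 chromosomes carries the variant.
rare_variant = is_polymorphism(1, 200)
```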
  • "Single nucleotide polymorphism" refers to an allelic variation which is defined by two (and only two) alternative bases found at a specific and particular nucleotide in genomic DNA. It may be within a gene (i.e., exonic or intronic), outside of a gene (such as in a promoter or other regulatory structure), or between genes.
  • “Individual” refers to a single organism which may be human, plant or non-human animal. The individual may be intact or a biological sample taken from the individual which contains sufficient substances or information regarding the individual.
  • "Protein variant" and "variant amino acid sequence" refer to an amino acid sequence that differs from that of one naturally occurring wild-type protein but is generally considered the same protein. Some different haplotypes have variant amino acid sequences.
  • “Expression product” refers to an RNA, spliced or unspliced, a pre-, pro-, prepro- or a peptide which alone or in conjunction with other peptides constitutes a protein.
  • "Pharmaceutical" refers to any bio-effecting chemical drug or biological agent which alters or induces an alteration in the metabolism of an "individual". Pharmaceuticals include compositions for use on veterinary animals and agricultural and ornamental plants.
  • "Trait" refers to a phenotypically determinable characteristic resulting from the influence of one or more genes, alone or in conjunction with an environmental condition or exposure to other agents. Traits include susceptibilities to chemicals, infectious agents and environmental conditions (temperature, drought, etc.).
  • a person skilled in the art of genetic testing will find the present invention useful for diagnosis and treatment of diseases and susceptibility thereto.
  • the invention is especially useful for establishing the "standard” (i.e., consensus normal DNA sequence) and new haplotypes for clinical diagnostic, therapeutic, genetic testing and breeding uses.
  • the diagnostic applications for which determining a functional allele profile in accordance with the invention is useful include, but are not limited to, the following: a) identifying individuals having a gene with no coding mutations, which individuals are therefore not at risk or have no increased susceptibility to the pathology(s) associated with a mutation in the gene in question; b) avoiding misinterpretation of functional polymorphisms detected in the gene as mutations; c) identifying individuals having a potentially abnormal gene that does not match the consensus normal DNA sequence; d) determining ethnic founder haplotypes so that clinical analysis is appropriate for an individual from this ethnic group; e) determining a sequence under strongest selective pressure; and f) determining an amino acid and/or short nucleic acid sequence which may be derived from the consensus normal DNA sequence to make diagnostic probes and antibodies.
  • Labeled diagnostic probes may be used by any hybridization method to determine the level of protein in serum or lysed cell suspension of a patient, or solid surface cell sample such as for immunohistochemical analysis; g) detecting a new haplotype and determining the polymorphisms constituting the new haplotype; h) detecting a new protein variant type and determining the variant amino acids constituting the new protein variant; i) determining the combination of one haplotype or polymorphism for one gene and the haplotype or polymorphism for another different gene in the same individual.
  • the genes or their expression products interact with each other directly, e.g., bind to each other, or indirectly by functioning with each other on the same substrate, are in different stages in a metabolic pathway, or are related to the same disease, susceptibility, condition or trait; j) determining whether to administer a bioeffecting composition to an individual wherein individuals with different haplotypes for one or more genes respond differently to the composition; k) determining susceptibility to disease or other pathology to decide on prophylaxis, therapy or differential monitoring; l) determining a trait by quick assay of a genetically engineered or selectively bred individual. This permits one to determine the trait without actually measuring the trait phenotypically.
  • Therapeutics. Certain "normal" alleles may be more functional, or hyper-functional, than the minimum needed to maintain a normal phenotype in an individual, particularly when stressed. By determining the most common allele in a population one may be observing empiric data for such suitability for survival (the effects may be so subtle that scientists have not determined the basis of this selection). For example, alleles with longer mRNA or protein half-lives (i.e., stability) may produce healthier cells, and, thus, healthier people.
  • RNA half-life such as in proteins involved in the cell cycle pathway.
  • proteases are known to have favored cutting sites which may be present or absent in different normal alleles leading to peptides that have intrinsic activity themselves.
  • the determination of the functional allele profile or a new functional allele in accordance with the invention is useful in clinical therapy for: a) selecting optimal alleles for performing gene repair or gene therapy; and b) selecting optimal amino acid sequence for administration of functional protein in treatment or prevention of diseases.
  • the determination of the functional allele profile or a new functional allele in accordance with the invention is useful for: a) determining whether a particular gene is under strong selective pressure; and b) determining which of two or more genes which encode proteins with similar functions represents a redundant, or back-up copy of the gene.
  • a group of individuals determined to be at low risk for carrying a mutation in the gene of interest is used as a source for genetic material.
  • Any standard method known in the art for performing pedigree analysis can be used for this selection process. See, for example, Harper, P.S., Practical Genetic Counseling, 3d. ed., 1988 (Wright/Butterworth & Co. Ltd.: Boston), especially at pages 4-7.
  • individuals can be screened in order to identify those with no disease history in their immediate family, i.e., among their first and second degree relatives.
  • a first degree relative is a parent, sibling, or offspring.
  • a second degree relative is an aunt, uncle, grandparent, grandchild, niece, nephew, or half-sibling.
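The family-history screen described above can be sketched as a lookup of each affected relative's degree. The relative lists come from the definitions above; the function names `relative_degree` and `passes_family_screen` and the input format are assumptions made for illustration.

```python
FIRST_DEGREE = {"parent", "sibling", "offspring"}
SECOND_DEGREE = {
    "aunt", "uncle", "grandparent", "grandchild",
    "niece", "nephew", "half-sibling",
}


def relative_degree(relation):
    """Return 1 or 2 for first/second degree relatives, else None."""
    relation = relation.strip().lower()
    if relation in FIRST_DEGREE:
        return 1
    if relation in SECOND_DEGREE:
        return 2
    return None


def passes_family_screen(affected_relatives):
    """True when no first or second degree relative has a disease history.

    affected_relatives: relationship labels for relatives reported as
    affected (hypothetical questionnaire output).
    """
    return all(relative_degree(r) is None for r in affected_relatives)


# Made-up responses from two candidate sample donors.
donor_a = passes_family_screen(["cousin"])
donor_b = passes_family_screen(["cousin", "Aunt"])
```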
  • each person is asked to fill out a hereditary cancer prescreening questionnaire. More preferably, when an autosomal dominant cancer gene with such relatively high penetrance is the gene of interest, the questionnaire set forth in Table 1, below, is used.
  • Part A Answer the following questions about your family
  • FAP Familial Adenomatous Polyposis
  • Part B Refer to the list of cancers below for your responses only to questions in Part B
  • Bladder Cancer, Lung Cancer, Pancreatic Cancer, Breast Cancer, Gastric Cancer, Prostate Cancer, Colon Cancer, Malignant Melanoma, Renal Cancer, Endometrial Cancer, Ovarian Cancer, Thyroid Cancer
  • Part C Refer to the list of relatives below for responses only to questions in Part C
  • Part D Refer to the list of relatives below for responses only to questions in Part D.
  • a group is selected for genomic DNA sequence analysis. Any number of samples may be analyzed. Preferably, a number of samples which is small enough for convenient, accurate sequence analysis, but large enough to provide a reliable representation of the population is analyzed. Most preferably, initial sequencing may be performed on ten different chromosomes by analyzing samples from five unrelated individuals.
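To get a rough feel for what ten chromosomes can reveal, the chance of seeing an allele at least once can be estimated as 1 - (1 - p)^n, where p is the allele's population frequency and n the number of chromosomes sequenced. This is a back-of-the-envelope sketch that assumes chromosomes are sampled independently; the function name `detection_probability` and the example frequency are illustrative and do not come from the source.

```python
def detection_probability(allele_frequency, chromosomes):
    """Probability of observing an allele at least once in the sample.

    Assumes independent sampling of chromosomes, which is a simplification.
    """
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be between 0 and 1")
    if chromosomes < 0:
        raise ValueError("chromosomes must be non-negative")
    return 1.0 - (1.0 - allele_frequency) ** chromosomes


# Hypothetical allele carried by 10% of chromosomes, with ten chromosomes
# sequenced (five unrelated diploid individuals).
p_seen = detection_probability(0.10, 10)
```

Under these assumptions an allele at 10% frequency would be seen in roughly two out of three such samples, so rarer functional alleles may need a larger sample.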
  • sequencing template is obtained by amplifying the coding region and optionally one or more related sequences (e.g., splice site junctions, enhancers, introns, promoters and other regulatory elements) of the gene of interest.
  • Any nucleic acid specimen, in purified or non-purified form, can be utilized as the starting nucleic acid or acids, providing it contains, or is suspected of containing, the specific nucleic acid sequence containing a polymorphic locus.
  • the process may amplify, for example, DNA or RNA, including messenger RNA, wherein DNA or RNA may be single stranded or double stranded.
  • if RNA is to be used as a template, enzymes and/or conditions optimal for reverse transcribing the template to DNA would be utilized.
  • a DNA-RNA hybrid which contains one strand of each may be utilized.
  • a mixture of nucleic acids may also be employed, or the nucleic acids produced in a previous amplification reaction herein, using the same or different primers may be so utilized.
  • the specific nucleic acid sequence to be amplified, i.e., the polymorphic locus, may be a fraction of a larger molecule or can be present initially as a discrete molecule, so that the specific sequence constitutes the entire nucleic acid. It is not necessary that the sequence to be amplified be present initially in a pure form; it may be a minor fraction of a complex mixture, such as contained in whole human DNA.
  • primer pairs used are greater than required to amplify the particular polymorphisms
  • the primer set actually used is listed below. For larger scale testing of polymorphisms for haplotype determination, only the primer pairs actually amplifying the polymorphism are required. Additionally, primers which amplify a shorter region, as short as the one-nucleotide polymorphism, may be used.
  • DNA utilized herein may be extracted from a body sample, such as blood, tissue material and the like, by a variety of techniques such as that described by Maniatis, et al. in Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, NY, pp. 280-281, 1982. If the extracted sample is impure, it may be treated before amplification with an amount of a reagent effective to open the cells, or animal cell membranes of the sample, and to expose and/or separate the strand(s) of the nucleic acid(s). This lysing and nucleic acid denaturing step to expose and separate the strands will allow amplification to occur much more readily.
  • the deoxyribonucleotide triphosphates dATP, dCTP, dGTP, and dTTP are added to the synthesis mixture, either separately or together with the primers, in adequate amounts and the resulting solution is heated to about 90°-100°C for about 1 to 10 minutes, preferably from 1 to 4 minutes. After this heating period, the solution is allowed to cool, which is preferable for the primer hybridization. To the cooled mixture is added an appropriate agent for effecting the primer extension reaction (called herein "agent for polymerization"), and the reaction is allowed to occur under conditions known in the art. The agent for polymerization may also be added together with the other reagents if it is heat stable.
  • This synthesis (or amplification) reaction may occur at room temperature up to a temperature above which the agent for polymerization no longer functions.
  • the temperature is generally no greater than about 40°C. Most conveniently the reaction occurs at room temperature.
  • the primers used to carry out this invention embrace oligonucleotides of sufficient length and appropriate sequence to provide initiation of polymerization.
  • Environmental conditions conducive to synthesis include the presence of nucleoside triphosphates and an agent for polymerization, such as DNA polymerase, and a suitable temperature and pH.
  • the primer is preferably single stranded for maximum efficiency in amplification, but may be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent for polymerization. The exact length of primer will depend on many factors, including temperature, buffer, and nucleotide composition.
  • the oligonucleotide primer typically contains 12-20 or more nucleotides, although it may contain fewer nucleotides.
  • Primers used to carry out this invention are designed to be substantially complementary to each strand of the genomic locus to be amplified. This means that the primers must be sufficiently complementary to hybridize with their respective strands under conditions which allow the agent for polymerization to perform. In other words, the primers should have sufficient complementarity with the 5' and 3' sequences flanking the mutation to hybridize therewith and permit amplification of the genomic locus.
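Primer complementarity can be illustrated with a reverse-complement check: a primer anneals where the template strand contains the primer's reverse complement. The sketch below requires an exact match, which is stricter than the "substantially complementary" standard described above; the function names and the example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_binding_site(primer, template_strand):
    """Return the 0-based index where the primer anneals, or -1 if none.

    The template is written 5'->3'. Exact complementarity is assumed for
    simplicity; real primers tolerate some mismatches.
    """
    return template_strand.upper().find(reverse_complement(primer))


# Made-up template and primer, not sequences from the source.
site = primer_binding_site("GCAT", "GGGATGCAAA")
```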
  • Oligonucleotide primers of the invention are employed in the amplification process, which is an enzymatic chain reaction that produces exponential quantities of the polymorphic locus relative to the number of reaction steps involved.
  • one primer is complementary to the negative (-) strand of the polymorphic locus and the other is complementary to the positive (+) strand.
  • Annealing the primers to denatured nucleic acid followed by extension with an enzyme, such as the large fragment of DNA polymerase I (Klenow), and nucleotides results in newly synthesized + and - strands containing the target polymorphic locus sequence.
  • the product of the chain reaction is a discrete nucleic acid duplex with termini corresponding to the ends of the specific primers employed.
  • oligonucleotide primers of the invention may be prepared using any suitable method, such as conventional phosphotriester and phosphodiester methods or automated embodiments thereof.
  • diethylphosphoramidites are used as starting materials and may be synthesized as described by Beaucage, et al., Tetrahedron Letters. 22:1859-1862, (1981).
  • One method for synthesizing oligonucleotides on a modified solid support is described in U.S. Patent No. 4,458,066.
  • the agent for polymerization may be any compound or system which will function to accomplish the synthesis of primer extension products, including enzymes.
  • Suitable enzymes for this purpose include, for example, E. coli DNA polymerase I, Klenow fragment of E. coli DNA polymerase, polymerase muteins, reverse transcriptase, and other enzymes, including heat-stable enzymes (i.e., those enzymes which perform primer extension after being subjected to temperatures sufficiently elevated to cause denaturation), such as Taq polymerase.
  • A suitable enzyme will facilitate combination of the nucleotides in the proper manner to form the primer extension products which are complementary to each polymorphic locus nucleic acid strand.
  • the synthesis will be initiated at the 3' end of each primer and proceed in the 5' direction along the template strand, until synthesis terminates, producing molecules of different lengths.
  • the newly synthesized strand and its complementary nucleic acid strand will form a double-stranded molecule under hybridizing conditions described above and this hybrid is used in subsequent steps of the process.
  • the newly synthesized double-stranded molecule is subjected to denaturing conditions using any of the procedures described above to provide single-stranded molecules.
  • the steps of denaturing, annealing, and extension product synthesis can be repeated as often as needed to amplify the target polymorphic locus nucleic acid sequence to the extent necessary for detection.
  • the amount of the specific nucleic acid sequence produced will accumulate in an exponential fashion. Amplification is described in PCR: A Practical Approach, IRL Press, Eds. M. J. McPherson, P. Quirke, and G. R. Taylor, 1992.
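The exponential accumulation can be sketched with an idealized model in which each cycle multiplies the number of target copies by (1 + efficiency), so a perfectly efficient reaction doubles the target every cycle. The `efficiency` parameter and the function name `amplified_copies` are assumptions for illustration; the source gives no such figure, and real reactions plateau as reagents are consumed.

```python
def amplified_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a number of amplification cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 means perfect doubling). Hypothetical model, not measured data.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One starting molecule, thirty cycles, perfect doubling each cycle.
ideal = amplified_copies(1, 30)
```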
  • the amplification products may be detected by Southern blot analysis, without using radioactive probes.
  • a small sample of DNA containing a very low level of the nucleic acid sequence of the polymorphic locus is amplified, and analyzed via a Southern blotting technique or, similarly, using dot blot analysis.
  • the use of non-radioactive probes or labels is facilitated by the high level of the amplified signal.
  • probes used to detect the amplified products can be directly or indirectly detectably labeled, for example, with a radioisotope, a fluorescent compound, a bioluminescent compound, a chemiluminescent compound, a metal chelator or an enzyme.
  • Sequences amplified by the methods of the invention can be further evaluated, detected, cloned, sequenced, and the like, either in solution or after binding to a solid support, by any method usually applied to the detection of a specific DNA sequence such as PCR, oligomer restriction (Saiki, et al, Bio/Technology. 3:1008-1012, (1985)), allele-specific oligonucleotide (ASO) probe analysis (Conner, et al, Proc. Natl. Acad. Sci.
  • ASO allele-specific oligonucleotide
  • the method of amplifying is by PCR, as described herein and as is commonly used by those of ordinary skill in the art.
  • Alternative methods of amplification have been described and can also be employed as long as the genetic locus amplified by PCR using primers of the invention is similarly amplified by the alternative means.
  • Such alternative amplification systems include but are not limited to self-sustained sequence replication, which begins with a short sequence of RNA of interest and a T7 promoter. Reverse transcriptase copies the RNA into cDNA and degrades the RNA, followed by reverse transcriptase polymerizing a second strand of DNA.
  • Another nucleic acid amplification technique is nucleic acid sequence-based amplification (NASBA), which uses reverse transcription and T7 RNA polymerase and incorporates two primers to target its cycling scheme.
  • NASBA can begin with either DNA or RNA and finish with either, and amplifies to 10^8 copies within 60 to 90 minutes.
  • nucleic acid can be amplified by ligation activated transcription (LAT). LAT works from a single-stranded template with a single primer that is partially single-stranded and partially double-stranded. Amplification is initiated by ligating a cDNA to the promoter oligonucleotide and within a few hours, amplification is 10^8 to 10^9 fold.
  • LAT ligation activated transcription
  • the Qβ replicase system can be utilized by attaching an RNA sequence called MDV-1 to RNA complementary to a DNA sequence of interest. Upon mixing with a sample, the hybrid RNA finds its complement among the specimen's mRNAs and binds, activating the replicase to copy the tag-along sequence of interest.
  • Another nucleic acid amplification technique, ligase chain reaction (LCR), works by using two differently labeled halves of a sequence of interest which are covalently bonded by ligase in the presence of the contiguous sequence in a sample, forming a new target.
  • the repair chain reaction (RCR) nucleic acid amplification technique uses two complementary and target-specific oligonucleotide probe pairs, thermostable polymerase and ligase, and DNA nucleotides to geometrically amplify targeted sequences.
  • a 2-base gap separates the oligonucleotide probe pairs, and the RCR fills and joins the gap, mimicking DNA repair.
  • Nucleic acid amplification by strand displacement amplification (SDA) utilizes a short primer containing a recognition site for HincII with a short overhang on the 5' end which binds to target DNA.
  • a DNA polymerase fills in the part of the primer opposite the overhang with sulfur-containing adenine analogs. HincII is added but only cuts the unmodified DNA strand.
  • a DNA polymerase that lacks 5' exonuclease activity enters at the site of the nick and begins to polymerize, displacing the initial primer strand downstream and building a new one which serves as more primer.
  • SDA produces greater than 10^7-fold amplification in 2 hours at 37°C. Unlike PCR and LCR, SDA does not require instrumented temperature cycling.
  • Another method is a process for amplifying nucleic acid sequences from a DNA or RNA template which may be purified or may exist in a mixture of nucleic acids.
  • the resulting nucleic acid sequences may be exact copies of the template, or may be modified.
  • the process has advantages over PCR in that it increases the fidelity of copying a specific nucleic acid sequence, and it allows one to more efficiently detect a particular point mutation in a single assay.
  • a target nucleic acid is amplified enzymatically while avoiding strand displacement. Three primers are used. A first primer is complementary to the first end of the target. A second primer is complementary to the second end of the target.
  • A third primer is similar to the first end of the target and is substantially complementary to at least a portion of the first primer, such that when the third primer is hybridized to the first primer, the position of the third primer complementary to the base at the 5' end of the first primer contains a modification which substantially avoids strand displacement.
  • a number of methods well-known in the art can be used to carry out the sequencing reactions.
  • enzymatic sequencing based on the Sanger dideoxy method is used.
  • Mass spectroscopy may also be used.
  • the sequencing reactions can be analyzed using methods well-known in the art, such as polyacrylamide gel electrophoresis.
  • the sequencing reactions are carried out and analyzed using a fluorescent automated sequencing system such as the Applied Biosystems, Inc. ("ABI", Foster City, CA) system.
  • PCR products serving as templates are fluorescently labeled using the Taq Dye Terminator ® Kit (Perkin-Elmer cat# 401628).
  • Dideoxy DNA sequencing is performed in both forward and reverse directions on an ABI automated Model 377 ® sequencer.
  • the resulting data can be analyzed using "Sequence Navigator ® " software available through ABI.
  • the functional allele profiles identified in accordance with the invention may contain different alleles.
  • each allele may contain multiple allelic variations, such as multiple polymorphisms.
  • two different alleles may differ in sequence from one another at multiple nucleotide positions.
  • Two such multiply polymorphic alleles may be present in the same individual, i.e., a heterozygote.
  • When the genomic sample of the gene of such a heterozygous individual is sequenced, the variations at each position can be detected. They are the alternative sequences present at particular positions in the composite sequence obtained from the diploid genome.
  • However, which variations are grouped together in each individual haplotype or allele, i.e., the phase of the variations, cannot be determined.
  • For example, genomic sequence analysis of a hypothetical gene from a heterozygous individual may reveal that polymorphic positions 1, 2, and 3 each contain either an A or a G.
  • Heterozygous genomic sequences obtained for the purpose of determining a functional allele profile are compared to an initial haplotype sequence.
  • Some haplotypes can also be determined upon sequencing chromosomal samples from a homozygous individual according to the methods above.
  • homozygous sequence analyses contain no ambiguities in sequence between the two alleles because they are identical.
  • an initial haplotype sequence is obtained by determining the cDNA sequence of an individual identified as being at low risk for carrying a mutation as described above. Because the full-length of a cDNA of the gene of interest is derived from a single mRNA transcript, it contains the allelic variations of a single haplotype. It contains all of the allelic variations present in a single allele of the individual from which it was obtained. Thus, the cDNA sequence contains half of the allelic variations present in the composite genomic sequence of a heterozygous individual containing that allele. Moreover, unlike sequence information from a heterozygous chromosomal sample, such cDNA sequence indicates which of the allelic variations are grouped together in one allele, i.e., the phase of the variations.
  • The companion haplotype present in a heterozygote can be determined by subtracting this sequence from the composite genomic sequence. For example, if in the illustration set forth above, the cDNA sequenced has an A in position 1, a G in position 2 and an A in position 3, then the initial haplotype is A1G2A3. This sequence is then subtracted from the composite genomic sequence to yield the companion haplotype, namely G1A2G3.
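  • The subtraction step can be illustrated with a minimal Python sketch. It assumes the composite genomic sequence has already been reduced to the set of bases observed at each polymorphic position and that the initial haplotype is given as a string over the same positions; the function name and data layout are hypothetical and are not part of the described method.

```python
def subtract_haplotype(composite, initial):
    """Return the companion haplotype left after removing `initial`.

    composite: list of sets, the bases observed at each polymorphic
               position in the diploid (composite) genomic sequence.
    initial:   string of bases, one per position, from the cDNA-derived
               initial haplotype.
    """
    if len(composite) != len(initial):
        raise ValueError("composite and initial haplotype differ in length")
    companion = []
    for pos, (observed, base) in enumerate(zip(composite, initial), start=1):
        if base not in observed:
            raise ValueError(f"position {pos}: {base} absent from composite")
        remaining = set(observed) - {base}
        if len(remaining) > 1:
            raise ValueError(f"position {pos}: more than two bases observed")
        # A homozygous position carries the same base on both alleles.
        companion.append(remaining.pop() if remaining else base)
    return "".join(companion)


# The three-position illustration from the text: each site shows A or G.
composite = [{"A", "G"}, {"A", "G"}, {"A", "G"}]
print(subtract_haplotype(composite, "AGA"))  # GAG
```

  • The sketch raises an error when the initial haplotype is not contained in the composite sequence, which corresponds to the case where that haplotype is not present in the individual being examined.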
  • the initial haplotype identified in a given individual also can be used to determine the presence of the haplotype in other individuals by comparing the initial haplotype sequence to the composite genomic sequence from such other individuals.
  • this method of subtracting the initial haplotype sequence from the composite genomic sequence of other individuals readily provides recognizably distinct haplotypes which are independent of each other. See, for example, the OMI 1 and GB haplotypes in FIG. 1, which differ from each other in each of seven sites of allelic variation.
  • When a haplotype determined in one individual is used to determine the haplotypes present in the composite genomic sequence of other individuals, the presence of that particular haplotype, and its companion haplotype as determined by subtraction from a composite genomic sequence, should be confirmed.
  • confirmation of the occurrence of a given haplotype in the population can be carried out, for example, by 1) sequencing cDNA samples, as described in this section, from such other heterozygous individuals; or 2) identifying individuals homozygous for the haplotype either among the initial set of sequenced chromosomal samples or by additional confirmatory sequencing of chromosomal samples as described below.
  • cDNA sequences for determining the initial haplotype can be obtained using standard techniques well known in the art. First, mRNA is isolated from an individual, for example, from blood or skin cells. The mRNA is initially reverse-transcribed into double stranded cDNA and then amplified according to the well known technique of RT-PCR (see, for example, U.S. Patent No. 5,561,058 by Gelfand et al.).
  • The resulting cDNA, whose sequence represents a single haplotype, can be sequenced according to the methods above.
  • After all haplotypes have been identified in the study population, their relative frequencies are determined. For example, if five chromosomes out of a total of ten chromosomes are of one haplotype, then its frequency is 50%. Subsequently, each haplotype is ranked in order from the most frequent to the least frequent to yield the functional allele profile.
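  • The frequency and ranking step can be sketched in Python as follows. This is an illustrative sketch only: the haplotype labels H1, H2 and H3 and the function name are hypothetical, and the input is assumed to be one haplotype label per sequenced chromosome.

```python
from collections import Counter


def functional_allele_profile(chromosome_haplotypes):
    """Rank haplotypes from most to least frequent.

    chromosome_haplotypes: iterable with one haplotype label per
    chromosome (two entries per diploid individual).
    Returns a list of (label, relative_frequency) tuples.
    """
    counts = Counter(chromosome_haplotypes)
    total = sum(counts.values())
    # Sort by descending count; ties are broken alphabetically by label.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(label, count / total) for label, count in ranked]


# Ten chromosomes from five individuals; five carry the same haplotype.
chromosomes = ["H1"] * 5 + ["H2"] * 3 + ["H3"] * 2
for label, frequency in functional_allele_profile(chromosomes):
    print(f"{label}: {frequency:.0%}")
```

  • With the ten chromosomes above, the haplotype found on five of them is ranked first at 50%, matching the worked example in the text.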
  • initial sequence analysis is performed on a small group of individuals, most preferably five individuals, screened according to the methods described above.
  • haplotypes found occurring in the population are used as references to interpret the haplotypes present in any heterozygous individuals encountered during the confirmatory sequencing analysis of additional individuals.
  • additional samples can be added to the functional allele profile to provide more precise frequencies of occurrence of each allele in the population.
  • additional samples may contain a new functional allele with a new haplotype. This is particularly likely to be found for uncommon (<10%) or rare (<1%) haplotypes.
  • confirmatory sequence analysis ensures that the haplotypes determined by subtracting an initial haplotype from a composite heterozygous sequence are indeed represented in the population.
  • Such techniques may also be used when multiple common haplotypes exist for the gene and it is uncertain which to use for subtraction.
  • Approximately 150 volunteers are screened in order to identify individuals with no cancer history in their immediate family (i.e. first and second degree relatives). Each person is asked to fill out the hereditary cancer prescreening questionnaire shown in Table 1, above.
  • a first degree relative is a parent, sibling, or offspring.
  • a second degree relative is an aunt, uncle, grandparent, grandchild, niece, nephew, or half-sibling. Among those individuals who answered "no" to all questions, five individuals are randomly chosen for end-to-end sequencing of their MSH2 gene.
  • Genomic DNA (100 nanograms) is extracted from white blood cells of five individuals designated as low risk of being carriers of mutations in the MSH2 gene from analysis of their answers to the questionnaire set forth in Table 1 above.
  • the MSH2 coding region in each of the five samples is sequenced end-to-end by amplifying each exon individually. Each sample is amplified in a final volume of 25 microliters containing 1 microliter (100 nanograms) genomic DNA, 2.5 microliters 10X PCR buffer
  • the primers in Table 2, below, are used to carry out amplification of the various sections of the MSH2 gene samples.
  • the primers are synthesized on an DNA/RNA
  • PCR products are purified using Qia-quick ® PCR purification kits (Qiagen ® , cat# 28104; Chatsworth, CA). Yield and purity of the PCR product are determined spectrophotometrically at OD 260 on a Beckman DU 650 spectrophotometer.
  • Variant #2 is an uncommon derivative chromosome of variant #1
  • **Variant #3 is a rarer derivative chromosome of GenBank cDNA
  • allelic variations occurring at nucleotide positions 2201, 2430, 2731, 3232, 3667, 4427, and 4956 are shown.
  • the haplotype previously available in GenBank (as Accession No. U14680) is completely unshaded and designated "GB".
  • the most common, "consensus" haplotype occurs in five separate chromosomes labeled with the OMI symbol (haplotypes 1-5 from left to right).
  • Two additional haplotypes (BRCA1(omi2) and BRCA1(omi3)) are represented with mixed shaded and unshaded positions (numbers 7 and 9 from left to right).
  • the glucose-6-phosphate dehydrogenase gene is located on the X chromosome. Individuals with certain sequence variations in the G6PDH gene lead relatively normal lives unless they are exposed to certain chemicals found in fava beans, primaquine and sulfonamide antibiotics (sulfisoxazole, sulfamethoxazole, sulfathiazole, sulfacetamide, etc.). Upon administration of such compounds to the individual, severe reactions including hemolytic anemia occur in individuals having certain haplotype(s) of the G6PDH gene. These individuals are generally of African and Mediterranean heritage. Because these sequence variations are otherwise of little importance, they have been called both polymorphisms and mutations in the literature.
  • Missense mutations occur at amino acids 32, 48, 58, 68, 106, 126, 131, 156, 163, 165, 181, 182, 188, 198, 213, 216, 227, 282, 285, 291, 317, 323, 335, 342, 353, 363, 385, 386, 387, 393, 394, 398, 410, 439, 447, 454, 459, 463 and amino acid 35 deleted.
  • Many mutations are restricted to certain haplotypes. Thus, haplotype determination provides an indication of whether the individual is sensitive to the drugs listed above.
  • Blood is drawn from 30 individuals of African-American heritage with urinary tract infections having bacteria sensitive to sulfa antibiotics and for whom treatment with trimethoprim-sulfamethiazole is otherwise deemed appropriate.
  • 1 μg of genomic DNA from individuals is isolated from peripheral blood lymphocytes and amplified by PCR using the primers listed in Hirono et al, Proc. Natl. Acad. Sci. USA 85:3951-3954 (1988) and Beutler et al, Human Genetics 87:462-464 (1990) according to the methods in Example 1 above.
  • Amplified fragments are divided into five aliquots, four of which are each cleaved by a restriction enzyme, either PvuII, Nla III, Fok I or Pst I, according to the manufacturer's (Stratagene and New England Biolabs) instructions.
  • The digests are electrophoresed in a 4% agarose gel (NuSieve, FMC) with 10 μl of ethidium bromide (10 mg/ml) and the number of bands counted under ultraviolet light. The number of bands indicates the presence or absence of restriction enzyme cleavage and presence of a particular nucleotide at the polymorphic site.
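  • The band-counting logic can be sketched in Python. This is a simplified illustration under stated assumptions: a single recognition site per amplicon, idealized exact band sizes, and hypothetical amplicon and cut coordinates that are not taken from the cited references. The three-band case (site present on only one of two alleles) is an added assumption beyond what the text states.

```python
def digest_fragments(amplicon_length, cut_position=None):
    """Expected band sizes (bp), largest first, for a single allele."""
    if cut_position is None:
        return [amplicon_length]
    if not 0 < cut_position < amplicon_length:
        raise ValueError("cut position must lie inside the amplicon")
    return sorted([cut_position, amplicon_length - cut_position], reverse=True)


def call_site(observed_bands, amplicon_length, cut_position):
    """Interpret observed band sizes for one restriction digest."""
    observed = sorted(set(observed_bands), reverse=True)
    uncut = digest_fragments(amplicon_length)
    # Equal-sized fragments co-migrate, so duplicates collapse to one band.
    cut = sorted(set(digest_fragments(amplicon_length, cut_position)), reverse=True)
    if observed == uncut:
        return "site absent"
    if observed == cut:
        return "site present"
    if observed == sorted(set(uncut + cut), reverse=True):
        return "site present on one allele only"
    return "uninterpretable"


# Hypothetical 300 bp amplicon with a recognition site at position 120.
print(call_site([300], 300, 120))            # site absent
print(call_site([180, 120], 300, 120))       # site present
print(call_site([300, 180, 120], 300, 120))  # site present on one allele only
```

  • In practice band sizes read from a gel are estimates, so a real comparison would need a size tolerance rather than exact equality.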
  • An oligonucleotide probe for determining the polymorphic site at nucleotide 1311 is listed in Beutler et al, Human Genetics 87:462-464 (1990). The fifth aliquot is immobilized on a membrane and an ASO (allele specific oligonucleotide) hybridization assay is performed according to the method of Example 5 below. The presence or absence of the label indicating hybridization is considered indicative of the presence of a particular nucleotide at the polymorphic site.
  • Individuals with a haplotype, particularly the polymorphism at nucleotide 1116, indicative of a very low likelihood of a G6PDH mutation sensitive to sulfamethiazole are given 160 mg trimethoprim with 800 mg sulfamethiazole (SEPTRA DS).
  • Individuals with a haplotype or polymorphism indicative of a possible presence of a G6PDH mutation sensitive to sulfamethiazole are given a different antibiotic (varied with the patient) to which their infecting organism is susceptible.
  • Example 5 Pharmacogenetic Analysis of BRCA1, BRCA2, PTEN, BAP1, BARD1 and hRAD51 Haplotypes and the Use of Tamoxifen to Prevent Breast Cancer
  • The BRCA1, BRCA2, PTEN, BAP1, BARD1 and hRAD51 proteins are involved in breast, ovarian, prostate and other cancer susceptibility, act in the metabolic pathway of such proteins, or interact with such proteins. It was determined that the most common form of hereditary breast and ovarian cancer, the BRCA1 185delAG mutation, was found essentially exclusively in one haplotype, namely haplotype OMI1 as defined in Example 1, Fig. 1 and U.S. Patent 5,654,155. As such, it was applicants' hypothesis that the haplotypes of other related and similar genes, alone or in certain combinations, provide an indication of association with breast and other cancers associated with these genes, e.g. ovarian, pancreatic, prostate, colon, etc.
  • The various treatments and prophylactics useful against the disease are also believed to be related to the haplotypes. It is already known that certain mutant genes result in different presentations of cancers and different treatment. For example, BRCA1 mutations in the early part of the coding sequence generally form cancers at a younger age than mutations in the later part of the coding sequence. Likewise, breast cancers arising from BRCA2 mutations are typically more sensitive to radiation treatment than other breast cancers. Since some of these proteins actually bind to each other, different combinations of haplotypes may bind with different avidity to each other and operate slightly differently under certain circumstances. The same may hold for proteins which act at separate reactions within the tumor-suppressing mechanisms.
  • Blood samples are drawn from 47 women prescribed tamoxifen to prevent breast cancer or, having had breast cancer, to prevent reoccurrence of breast cancer.
  • The DNA sequence for BRCA1 is determined in the regions of the single nucleotide polymorphic sites which constitute the haplotype using the primers according to U.S. Patent 5,654,155. Those of BRCA2 are determined by using the primers of U.S. Patent application 09/084,471 filed May 22, 1998 or using the primers: TABLE 8 BRCA2 PRIMERS
  • the primers for amplifying hRAD51 are: 5'GGGCCCGGATCCATGGCAATGCAGATGCAGC 3' and 5'GGGCCCCAATGGATATCATTCAGTCTTTGGCATCTCCCACTCC 3'
  • the primers for amplifying BAP1 are: PRIMER SEQUENCE
  • the primers for amplifying BARD1 are: 5'AACAGTACAATGACTGGGCTC 3' and 5'CAGCGCTTCTGCACACAGT 3'
  • the PCR products are sequenced in entirety. All procedures (e.g., isolation of genomic DNA, amplification, sequencing, and analysis of sequence data) are carried out as described in Example 1. The method as described in Examples 1-3 is used to determine the common haplotypes in these genes.
  • the amplified fragments of BRCA1, BRCA2, PTEN and BAP1 produced by PCR are assayed by hybridization to allele-specific oligonucleotides (ASO) which distinguish the polymorphic site directly.
  • the PCR products are denatured no more than 30 minutes prior to binding the PCR products to the nylon membrane.
  • the remaining PCR reaction (45 μl) and the appropriate positive control mutant gene amplification product are diluted to 200 μl final volume with PCR Diluent Solution (500 mM NaOH, 2.0 M NaCl, 25 mM EDTA) and mixed thoroughly. The mixture is heated to 95°C for 5 minutes, and immediately placed on ice and held on ice until loaded onto the dot blotter, as described below.
  • PCR products are bound to 9 cm by 13 cm nylon ZETA PROBE BLOTTING MEMBRANE (BIO-RAD, Hercules, CA, catalog number 162-0153) using a BIO-RAD dot blotter apparatus. Forceps and gloves are used at all times throughout the ASO analysis to manipulate the membrane, with care taken never to touch the surface of the membrane with bare hands or latex gloves.
  • Pieces of 3MM filter paper [WHATMAN®, Clifton, NJ] and nylon membrane are pre-wet in 10X SSC prepared fresh from 20X SSC buffer stock.
  • the vacuum apparatus is rinsed thoroughly with dH2O prior to assembly with the membrane.
  • 100 μl of each denatured PCR product is added to the wells of the blotting apparatus.
  • Each row of the blotting apparatus contains a set of reactions for a single exon to be tested, including a placental DNA (negative) control, a synthetic oligonucleotide with the desired mutation or a PCR product from a known mutant sample (positive control), and three no template DNA controls.
  • the nylon filter is placed DNA side up on a piece of 3MM filter paper saturated with denaturing solution (1.5 M NaCl, 0.5 M NaOH) for 5 minutes.
  • the membrane is transferred to a piece of 3MM filter paper saturated with neutralizing solution (1 M Tris-HCl, pH 8, 1.5 M NaCl) for 5 minutes.
  • the neutralized membrane is then transferred to a dry 3MM filter DNA side up, and exposed to ultraviolet light (STRATALINKER, STRATAGENE, La Jolla, CA) for exactly 45 seconds to fix the DNA to the membrane. This UV crosslinking should be performed within 30 min. of the denaturation/neutralization steps.
  • the nylon membrane is then cut into strips such that each strip contains a single row of blots of one set of reactions for a single exon.
  • the strip is prehybridized at 52°C incubation using the HYBAID® (SAVANT INSTRUMENTS, INC., Holbrook, NY) hybridization oven.
  • 2X SSC (15 to 20 ml) is preheated to 52°C in a water bath.
  • a single piece of nylon mesh cut slightly larger than the nylon membrane strip (approximately 1" x 5") is pre-wet with 2X SSC.
  • Each single nylon membrane is removed from the prehybridization solution and placed on top of the nylon mesh. The membrane/mesh "sandwich" is then transferred onto a piece of Parafilm™.
  • the membrane/mesh sandwich is rolled lengthwise and placed into an appropriate HYBAID® bottle, such that the rotary action of the HYBAID® apparatus causes the membrane to unroll.
  • the bottle is capped and gently rolled to cause the membrane/mesh to unroll and to evenly distribute the 2X SSC, making sure that no air bubbles form between the membrane and mesh or between the mesh and the side of the bottle.
  • the 2X SSC is discarded and replaced with 5 ml TMAC Hybridization Solution, which contains 3 M TMAC (tetramethylammonium chloride - SIGMA T-3411), 100 mM Na3PO4 (pH 6.8), 1 mM EDTA, 5X Denhardt's (1% Ficoll, 1% polyvinylpyrrolidone, 1% BSA (fraction V)), 0.6% SDS, and 100 μg/ml Herring Sperm DNA.
  • the filter strips are prehybridized at 52°C with medium rotation (approx. 8.5 setting on the HYBAID® speed control) for at least one hour. Prehybridization can also be performed overnight.
  • the DNA sequences of the oligonucleotide probes used to detect the BRCA1, BRCA2, PTEN, and BAP1 single nucleotide polymorphisms (SNPs) are as follows (for each polymorphism both options for the oligonucleotide are given below): The complements of these probes may also be used. Preliminary laboratory data indicates that probes with either greater specificity or sensitivity can be prepared by slightly varying the length and the amount overlapping each side of the polymorphic region. It is expected that better probes will be prepared by routine experimentation.
  • Each labeling reaction contains 2 μl 5X Kinase buffer (or 1 μl of 10X Kinase buffer), 5 μl gamma-ATP 32P (not more than one week old), 1 μl T4 polynucleotide kinase, 3 μl oligonucleotide (20 μM stock), and sterile H2O to 10 μl final volume if necessary.
  • the reactions are incubated at 37°C for 30 minutes, then at 65°C for 10 minutes to heat inactivate the kinase.
  • the kinase reaction is diluted with an equal volume (10 μl) of sterile dH2O (distilled water).
  • the oligonucleotides are purified on STE MICRO SELECT-D, G-25 spin columns (catalog no. 5303-356769), according to the manufacturer's instructions.
  • the amount of radioactivity in the oligonucleotide sample is determined by measuring the radioactive counts per minute (cpm). The total radioactivity must be at least 2 million cpm. For any samples containing less than 2 million total, the labeling reaction is repeated.
  • Approximately 2-5 million counts of the labeled oligonucleotide probe is diluted into 5 ml of TMAC hybridization solution, containing 40 μl of 20 μM stock of unlabeled alternative polymorphism oligonucleotide.
  • the probe mix is preheated to 52°C in the hybridization oven.
  • the pre-hybridization solution is removed from each bottle and replaced with the probe mix.
  • the filter is hybridized for 1 hour at 52°C with moderate agitation. Following hybridization, the probe mix is decanted into a storage tube and stored at -20°C.
  • the filter is rinsed by adding approximately 20 ml of 2x SSC + 0.1 % SDS at room temperature and rolling the capped bottle gently for approximately 30 seconds and pouring off the rinse. The filter is then washed with 2x SSC + 0.1% SDS at room temperature for 20 to 30 minutes, with shaking.
  • the membrane is removed from the wash and placed on a dry piece of 3MM WHATMAN filter paper then wrapped in one layer of plastic wrap, placed on the autoradiography film, and exposed for about five hours depending upon a survey meter indicating the level of radioactivity.
  • the film is developed in an automatic Film processor.
  • the purpose of this step is to ensure that the PCR products are transferred efficiently to the nylon membrane.
  • each nylon membrane is washed in 2X SSC, 0.1% SDS for 20 minutes at 65°C to melt off the bound oligonucleotide probes.
  • the nylon strips are then prehybridized together in 40 ml of TMAC hybridization solution for at least 1 hour at 52°C in a shaking water bath. 2-5 million counts of each of the normal labeled oligonucleotide probes plus 40 μl of 20 μM stock of unlabeled normal oligonucleotide are added directly to the container containing the nylon membranes and the prehybridization solution.
  • the filter and probes are hybridized at 52°C with shaking for at least 1 hour.
  • Hybridization can be performed overnight, if necessary.
  • the hybridization solution is poured off, and the nylon membrane is rinsed in 2X SSC, 0.1 % SDS for 1 minute with gentle swirling by hand.
  • the rinse is poured off and the membrane is washed in 2X SSC, 0.1 % SDS at room temperature for 20 minutes with shaking.
  • the nylon membrane is removed and placed on a dry piece of 3MM WHATMAN filter paper.
  • the nylon membrane is then wrapped in one layer of plastic wrap and placed on autoradiography film. The exposure is for at least 1 hour.
  • The pattern of hybridization using the probes from the panel according to Tables 9-12 determines the haplotype of the patient sample when compared to the known haplotypes.
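  • Comparing a hybridization pattern against known haplotypes can be sketched in Python as follows. The sketch assumes each polymorphic site has been scored as the set of allele-specific probes that gave a signal, and that the known haplotypes are available as strings over the same sites; the haplotype names and sequences shown are hypothetical placeholders, not the contents of Tables 9-12.

```python
from itertools import combinations_with_replacement


def consistent_haplotype_pairs(signals, known_haplotypes):
    """List pairs of known haplotypes that explain an ASO pattern.

    signals: list of sets, the alleles whose probes hybridized at each site.
    known_haplotypes: dict mapping haplotype name to a string of alleles.
    """
    pairs = []
    items = sorted(known_haplotypes.items())
    for (name1, hap1), (name2, hap2) in combinations_with_replacement(items, 2):
        # A diploid sample shows exactly the alleles carried by its two copies.
        if all({a, b} == set(seen) for a, b, seen in zip(hap1, hap2, signals)):
            pairs.append((name1, name2))
    return pairs


# Hypothetical three-site panel and reference haplotypes.
known = {"hapA": "AGA", "hapB": "GAG", "hapC": "AGG"}
pattern = [{"A", "G"}, {"A", "G"}, {"A", "G"}]
print(consistent_haplotype_pairs(pattern, known))  # [('hapA', 'hapB')]
```

  • An empty result would suggest a haplotype not yet in the reference set, and more than one consistent pair would indicate that the pattern alone does not resolve phase.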
  • The degree of breast, ovarian and other cancer prevention with and without tamoxifen and the degree of prevention of reoccurrence of breast and ovarian cancer with and without tamoxifen are compared for patients grouped by BRCA1, BRCA2, PTEN, BAP1, BARD1, hRAD51 haplotype separately and in all possible combinations using various proprietary data mining techniques similar to the Recognizer™ methodology described in U.S. Patent 5,642,936. Appropriate recommendations regarding the use of tamoxifen for patients of different haplotypes are then made for patients with and without a history of breast or ovarian cancer.
  • While this example is a retrospective study and thus unacceptable for proof of efficacy for the U.S. Food and Drug Administration, prospective studies are also part of the present invention. In a prospective study, the test individuals have their haplotypes determined for each pertinent gene prior to determining whether or not they will be accepted for the drug trial or initiate tamoxifen therapy.
  • Example 6 Pharmacogenetic Analysis of a p53 polymorphism and the Appropriateness of the Human Papilloma Virus Vaccine
  • Many strains of the virus cause venereal warts, vulval, penile and perianal cancers.
  • HPV-16 is believed to be responsible for about half of all cases of cervical cancer.
  • Three other strains are responsible for another 35% of all cervical cancer cases, with HPV-18 causing malignant tumors while HPV-6 and HPV-11 usually form benign lesions.
  • HPV vaccines are made by MedImmune, Inc. (Gaithersburg, Maryland) and Merck & Co. Clinical trials have already begun.
  • HPV may induce cancer by interacting with p53 in a manner which inhibits the action of p53 to prevent runaway cell growth. It has been known that HPV protein E6 inactivates only p53 proteins from some individuals and not other individuals. Medcalf et al, Oncogene, 8: 2847-2851 (1993). Therefore, determining the haplotype(s) of the p53 gene is believed to indicate who is susceptible to cervical cancer induced by HPV and is therefore a candidate for an HPV vaccine.
  • Previous commercial p53 gene testing of patient samples performed by Oncormed, Inc. involved various sequencing techniques and functional assays for prognostic testing on various tumor samples and susceptibility testing of genomic samples in patients with an inherited mutant p53 gene (Li-Fraumeni Syndrome). While apparent single nucleotide polymorphisms were noticed, such results were not reported as the samples are suspected to contain p53 mutations and do not originate from healthy individuals without a genetic history indicating inheritance of two functional p53 alleles.
  • Blood samples are from 53 healthy individuals having a history of venereal warts or at risk from exposure to HPV. Exposure is defined as an individual having regular sexual contact with an infected individual without a barrier preventing transmission of HPV. These individuals have either stage I (normal) or stage II (inflammation) PAP smears. Some of the individuals had been previously treated for venereal warts with one or more of the following treatments: podophyllin, trichloroacetic acid, cryosurgery, cauterization or interferon. Also, blood samples are from 12 patients with a history of cervical cancer as defined by a stage IV (carcinoma in-situ) or greater PAP smear result.
  • the genomic DNA is used as a template to amplify a DNA fragment encompassing the site of the mutation to be tested.
  • the 25 μl PCR reaction contains the following components: 1 μl template (100 ng/μl) DNA, 2.5 μl 10X PCR Buffer (PERKIN-ELMER), 1.5 μl dNTP (2 mM each dATP, dCTP, dGTP, dTTP), 1.5 μl Forward Primer (10 μM), 1.5 μl Reverse Primer (10 μM), 0.5 μl (2.5 U total) AMPLITAQ GOLD™ TAQ DNA POLYMERASE or AMPLITAQ® TAQ DNA POLYMERASE (PERKIN-ELMER), 1.0 to 5.0 μl (25 mM) MgCl2 (depending on the primer) and distilled water (dH2O) up to 25 μl. All reagents for each exon except the genomic DNA can be combined in a master mix and aliquoted into the reaction
  • INTRON refers to the location in the intron where the primer anneals.
  • Exons 2 and 3 may be amplified together with the primers: p53-2/3F 5'GAAGCGTCTCATGCTGGAT 3' p53-2/3R 5'GGGGACTGTAGATGGGTGAA 3'
  • For each exon analyzed, the following control PCRs are set up:
  • PCR for all exons is performed using the following thermocycling conditions:
  • the quality of the PCR products is examined prior to further analysis by electrophoresing an aliquot of each PCR reaction sample on an agarose gel. 5 μl of each PCR reaction is run on an agarose gel alongside a 100 BP DNA LADDER (Gibco BRL cat# 15628-019).
  • the electrophoresed PCR products are analyzed according to the following criteria: Each patient sample must show a single band of the size corresponding to the number of base pairs expected from the length of the PCR product from the forward primer to the reverse primer. If a patient sample demonstrates smearing or multiple bands, the PCR reaction must be repeated until a clean, single band is detected. If no PCR product is visible or if only a weak band is visible, but the control reactions with placental DNA template produced a robust band, the patient sample should be re-amplified with 2X as much template DNA.
  • the optimum amount of PCR product on the gel should be between 50 and 100 ng, which can be determined by comparing the intensity of the patient sample PCR products with that of the DNA ladder. If the patient sample PCR products contain less than 50 to 100 ng, the PCR reaction should be repeated until sufficient quantity is obtained.
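  • The acceptance criteria above can be written as a small decision function. This Python sketch is illustrative only: the function and parameter names are hypothetical, a "weak" band is assumed to mean an estimated yield below 50 ng, and the outcome when the placental DNA control also fails is an assumption, since the text does not address that case.

```python
def pcr_qc(band_count, smeared, estimated_ng, control_band_robust, min_ng=50):
    """Decide the next step for one patient PCR product on the check gel.

    band_count:          number of distinct bands seen in the lane
    smeared:             True if the lane shows smearing
    estimated_ng:        yield estimated against the DNA ladder
    control_band_robust: True if the placental DNA control gave a strong band
    """
    if smeared or band_count > 1:
        return "repeat PCR"
    if band_count == 0 or estimated_ng < min_ng:
        if control_band_robust:
            return "re-amplify with 2X template"
        # Assumption: a failed control points to the reaction, not the sample.
        return "repeat PCR"
    return "proceed to sequencing"


print(pcr_qc(1, False, 80, True))   # proceed to sequencing
print(pcr_qc(2, False, 80, True))   # repeat PCR
print(pcr_qc(1, False, 20, True))   # re-amplify with 2X template
```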
  • double stranded PCR products are labeled with four different fluorescent dyes, one specific for each nucleotide, in a cycle sequencing reaction.
  • In Dye Terminator Chemistry, when one of these nucleotides is incorporated into the elongating sequence it causes a termination at that point. Over the course of the cycle sequencing reaction, the dye-labeled nucleotides are incorporated along the length of the PCR product, generating many different length fragments.
  • the dye-labeled PCR products will separate according to size when electrophoresed through a polyacrylamide gel.
  • the fragments pass through a region where a laser beam continuously scans across the gel.
  • the laser excites the fluorescent dyes attached to the fragments causing the emission of light at a specific wavelength for each dye.
  • Either a photomultiplier tube (PMT) detects the fluorescent light and converts it into an electrical signal (ABI 373) or the light is collected and separated according to wavelength by a spectrograph onto a cooled, charge coupled device (CCD) camera (ABI 377). In either case the data collection software will collect the signals and store them for subsequent sequence analysis.
  • PCR products are first purified for sequencing using a QIAQUICK-SPIN PCR PURIFICATION KIT (QIAGEN #28104).
  • the purified PCR products are labeled by adding primers, fluorescently tagged dNTPs and Taq Polymerase FS in an ABI Prism Dye Terminator Cycle Sequencing Kit (PERKIN ELMER/ ABI catalog #02154) in a PERKIN ELMER GENEAMP 9600 thermocycler.
  • thermocycling conditions are: Temperature Time # of Cycles
  • the product is then loaded into a gel and placed into an ABI DNA Sequencer (Models 373A & 377) and run.
  • the sequence obtained is analyzed by comparison to the wild type (reference) sequence using SEQUENCE NAVIGATOR software. When a sequence does not align, it indicates a possible mutation or polymorphism.
  • the DNA sequence is determined in both the forward and reverse directions. All results are provided to a second reader for review.
  • the genomic DNA is used as a template to amplify a separate DNA fragment encompassing the site of the mutation to be tested.
  • the 50 µl PCR reaction contains the following components: 1 µl template (100 ng/µl) DNA, 5.0 µl 10X PCR Buffer (PERKIN-ELMER), 2.5 µl dNTP (2 mM each dATP, dCTP, dGTP, dTTP), 2.5 µl Forward Primer (10 µM), 2.5 µl Reverse Primer (10 µM), 0.5 µl (2.5 U total) AMPLITAQ® TAQ DNA POLYMERASE or AMPLITAQ GOLD™ DNA POLYMERASE (PERKIN-ELMER), 1.0 to 5.0 µl (25 mM) MgCl2 (depending on the primer) and distilled water (dH2O) up to 50 µl. All reagents for each exon except the genomic DNA can be combined in a master mix and aliquoted into the reaction tubes as
  • control PCRs For each exon analyzed, the following control PCRs are set up:
  • thermocycling conditions PCR for all exons is performed using the following thermocycling conditions:
  • the PCR products are denatured no more than 30 minutes prior to binding the PCR products to the nylon membrane.
  • the remaining PCR reaction (45 µl) and the appropriate positive control polymorphism gene amplification product are diluted to 200 µl final volume with PCR Diluent Solution (500 mM NaOH, 2.0 M NaCl, 25 mM EDTA) and mixed thoroughly.
  • the mixture is heated to 95°C for 5 minutes, and immediately placed on ice and held on ice until loaded onto dot blotter, as described below.
  • the PCR products are bound to 9 cm by 13 cm nylon ZETA PROBE BLOTTING MEMBRANE (BIO-RAD, Hercules, CA, catalog number 162-0153) using a BIO-RAD dot blotter apparatus.
  • Pieces of 3MM filter paper [WHATMAN®, Clifton, NJ] and nylon membrane are pre-wet in 10X SSC prepared fresh from 20X SSC buffer stock.
  • the vacuum apparatus is rinsed thoroughly with dH2O prior to assembly with the membrane.
  • 100 µl of each denatured PCR product is added to the wells of the blotting apparatus.
  • Each row of the blotting apparatus contains a set of reactions for a single exon to be tested, including a placental DNA (negative) control, a synthetic oligonucleotide with the desired mutation or a PCR product from a known polymorphic sample (positive control), and three no template DNA controls.
  • the nylon filter is placed DNA side up on a piece of 3MM filter paper saturated with denaturing solution (1.5 M NaCl, 0.5 M NaOH) for 5 minutes.
  • the membrane is transferred to a piece of 3MM filter paper saturated with neutralizing solution (1 M Tris-HCl, pH 8, 1.5 M NaCl) for 5 minutes.
  • the neutralized membrane is then transferred to a dry 3MM filter DNA side up, and exposed to ultraviolet light (STRATALINKER, STRATAGENE, La Jolla, CA) for exactly 45 seconds to fix the DNA to the membrane. This UV crosslinking should be performed within 30 minutes of the denaturation/neutralization steps.
  • the nylon membrane is then cut into strips such that each strip contains a single row of blots of one set of reactions for a single exon.
  • the strip is prehybridized at 52°C incubation using the HYBAID® (SAVANT INSTRUMENTS, INC., Holbrook, NY) hybridization oven.
  • 2X SSC (15 to 20 ml) is preheated to 52°C in a water bath.
  • a single piece of nylon mesh cut slightly larger than the nylon membrane strip (approximately 1" x 5") is pre-wet with 2X SSC.
  • Each single nylon membrane is removed from the prehybridization solution and placed on top of the nylon mesh. The membrane/mesh "sandwich" is then transferred onto a piece of Parafilm™.
  • the membrane/mesh sandwich is rolled lengthwise and placed into an appropriate HYBAID® bottle, such that the rotary action of the HYBAID® apparatus causes the membrane to unroll.
  • the bottle is capped and gently rolled to cause the membrane/mesh sandwich to unroll and to evenly distribute the 2X SSC, making sure that no air bubbles form between the membrane and mesh or between the mesh and the side of the bottle.
  • the 2X SSC is discarded and replaced with 5 ml TMAC Hybridization Solution, which contains 3 M TMAC (tetramethylammonium chloride - SIGMA T-3411), 100 mM Na3PO4 (pH 6.8), 1 mM EDTA, 5X Denhardt's (1% Ficoll, 1% polyvinylpyrrolidone, 1% BSA (fraction V)), 0.6% SDS, and 100 mg/ml Herring Sperm DNA.
  • the filter strips are prehybridized at 52°C with medium rotation (approx. 8.5 setting on the HYBAID® speed control) for at least one hour. Prehybridization can also be performed overnight.
  • the DNA sequences of the numerous oligonucleotide probes are used to detect the p53 mutation. For each mutation, a polymorphic and a normal oligonucleotide must be labeled. While only five pairs of oligonucleotide probes are listed below, corresponding oligonucleotides for each mutation may be prepared and used in the same manner.
  • Each labeling reaction contains 2 µl 5X Kinase buffer (or 1 µl of 10X Kinase buffer), 5 µl gamma-32P ATP (not more than one week old), 1 µl T4 polynucleotide kinase, 3 µl oligonucleotide (20 µM stock), sterile H2O to 10 µl final volume if necessary.
  • the reactions are incubated at 37°C for 30 minutes, then at 65°C for 10 minutes to heat inactivate the kinase.
  • the kinase reaction is diluted with an equal volume (10 µl) of sterile dH2O (distilled water).
  • the oligonucleotides are purified on STE MICRO SELECT-D, G-25 spin columns (catalog no. 5303-356769), according to the manufacturer's instructions.
  • the amount of radioactivity in the oligonucleotide sample is determined by measuring the radioactive counts per minute (cpm). The total radioactivity must be at least 2 million cpm. For any samples containing less than 2 million cpm total, the labeling reaction is repeated.
  • Approximately 2-5 million cpm of the labeled polymorphic oligonucleotide probe is diluted into 5 ml of TMAC hybridization solution, containing 40 µl of 20 µM stock of unlabeled normal oligonucleotide.
  • the probe mix is preheated to 52°C in the hybridization oven.
  • the pre-hybridization solution is removed from each bottle and replaced with the probe mix.
  • the filter is hybridized for 1 hour at 52°C with moderate agitation. Following hybridization, the probe mix is decanted into a storage tube and stored at -20°C.
  • the filter is rinsed by adding approximately 20 ml of 2x SSC + 0.1 % SDS at room temperature and rolling the capped bottle gently for approximately 30 seconds and pouring off the rinse. The filter is then washed with 2x SSC + 0.1 % SDS at room temperature for 20 to 30 minutes, with shaking.
  • the membrane is removed from the wash and placed on a dry piece of 3MM WHATMAN filter paper then wrapped in one layer of plastic wrap, placed on the autoradiography film, and exposed for about five hours depending upon a survey meter indicating the level of radioactivity.
  • the film is developed in an automatic film processor.
  • the purpose of this step is to ensure that the PCR products are transferred efficiently to the nylon membrane.
  • each nylon membrane is washed in 2X SSC, 0.1% SDS for 20 minutes at 65°C to melt off the polymorphic oligonucleotide probes.
  • the nylon strips are then prehybridized together in 40 ml of TMAC hybridization solution for at least 1 hour at 52°C in a shaking water bath. 2-5 million counts of each of the normal labeled oligonucleotide probes plus 40 µl of 20 µM stock of unlabeled normal oligonucleotide are added directly to the container containing the nylon membranes and the prehybridization solution.
  • the filter and probes are hybridized at 52°C with shaking for at least 1 hour.
  • Hybridization can be performed overnight, if necessary.
  • the hybridization solution is poured off, and the nylon membrane is rinsed in 2X SSC, 0.1% SDS for 1 minute with gentle swirling by hand.
  • the rinse is poured off and the membrane is washed in 2X SSC, 0.1 % SDS at room temperature for 20 minutes with shaking.
  • the nylon membrane is removed and placed on a dry piece of 3MM WHATMAN filter paper.
  • the nylon membrane is then wrapped in one layer of plastic wrap and placed on autoradiography film, and exposure is for at least 1 hour.
  • Aspirin has been a standard anticoagulant therapy for patients who have had a heart attack.
  • aspirin therapy has been extended to individuals with a history or at risk for stroke (apoplexy) and phlebitis. It has even been proposed that every person over 50 years of age should take aspirin.
  • Platelet aggregation is recognized as an important step in the formation of a blockage which will cause a myocardial infarction and unstable angina.
  • Platelet aggregation is based on glycoprotein gpIIb/IIIa. Different forms of this glycoprotein are known. Weiss et al, Tissue Antigens. 46: 374-381 (1995), Kunicki et al, Molecular Immunology 16: 353-60 (1979). Methods for determining various polymorphisms may be done by DNA analysis. Newman et al, Journal of Clinical Investigation 83:1778-81 (1989).
  • Blood samples are taken from 50 healthy individuals ages 50-55. Family history and personal histories of heart disease and other thrombogenic disorders are recorded. White blood cells are collected and genomic DNA is extracted from the white blood cells, PCR amplified and the sequence determined by ASO or sequenced as in the Examples above using different primers and probes. Newman et al, Journal of Clinical Investigation 83:1778-81 (1989). As before, PCR primers and ASO probes are designed to type these individuals for exon 2 to determine which base exists at nucleotide position 1565: a T or a C. At the amino acid level, codon 33 is changed from a leucine to a proline.
  • haplotype PlA2 Individuals having haplotype PlA2 either in homozygous or heterozygous form are instructed to either take high dosages of aspirin (2000 mg per day) or not take aspirin and are given other medication appropriate for their individual needs. Individuals homozygous for haplotype PlA1 are instructed to take aspirin at low dosages (350 mg per day).
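
The allele-specific oligonucleotide (ASO) typing described in the items above probes each sample twice, once with the polymorphic probe and once with the normal probe. A minimal sketch of how the two autoradiography results for one dot might be combined into a genotype call is given here in Python. The names `Genotype` and `call_genotype` are hypothetical, and the sketch assumes that each dot has already been scored by a reader as signal or no signal; it is an illustration, not the procedure of the source.

```python
from enum import Enum


class Genotype(Enum):
    """Possible calls for one sample at one ASO-typed site (hypothetical labels)."""
    HOMOZYGOUS_NORMAL = "homozygous normal"
    HETEROZYGOUS = "heterozygous"
    HOMOZYGOUS_POLYMORPHIC = "homozygous polymorphic"
    NO_CALL = "no call"


def call_genotype(normal_probe_signal: bool, polymorphic_probe_signal: bool) -> Genotype:
    """Combine the two probe results for one dot into a genotype call.

    Assumption: each argument is True when the dot gave a clear signal with
    the corresponding labeled probe, and False otherwise.
    """
    if normal_probe_signal and polymorphic_probe_signal:
        return Genotype.HETEROZYGOUS
    if normal_probe_signal:
        return Genotype.HOMOZYGOUS_NORMAL
    if polymorphic_probe_signal:
        return Genotype.HOMOZYGOUS_POLYMORPHIC
    # Neither probe hybridized: the PCR product may not have bound to the membrane.
    return Genotype.NO_CALL


if __name__ == "__main__":
    print(call_genotype(True, True).value)
```

A dot that gives no signal with either probe is reported as a no call rather than guessed, which matches the role of the positive, negative and no-template controls loaded on each row.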

Abstract

Methods for identifying functional allele profiles of a given gene are disclosed. Functional allele profiles comprise the commonly occurring alleles in a population, and the relative frequencies at which such alleles of a given gene occur. Functional allele profiles are useful in treatment and diagnosis of diseases, for genetic and pharmacogenetic applications and for evaluating the degree to which the gene(s) are under selective pressure.

Description

DETERMINING COMMON FUNCTIONAL ALLELES IN A POPULATION
AND USES THEREFOR
This application is a continuation-in-part of co-pending application number 08/905,772, filed August 4, 1997, and is also a continuation-in-part of co-pending U.S. Patent application number 09/084,471, filed May 22, 1998, each of which is hereby incorporated by reference in its entirety.
FIELD OF THE INVENTION
The invention relates to methods for identifying functional alleles commonly occurring in a population, for finding new functional alleles, and for determining the relative frequencies at which such alleles occur, and to genetic and pharmacogenetic applications of the methods and products produced thereby.
BACKGROUND OF THE INVENTION
An increasing number of genes which play a role in many different diseases are being identified. Detection of mutations in such genes is instrumental in determining susceptibility to or diagnosing these diseases. Some diseases, such as sickle cell disease, are known to be monomorphic; i.e., the disease is generally caused by a single mutation present in the population. In such cases where one or only a few known mutations are responsible for the disease, methods for detecting the mutations are targeted to the site within the gene at which they are known to occur. However, the mutation responsible for such a monomorphic disease can only be established in the first instance if there exists an accurate reference sequence for the non-pathological state.
In many other cases individuals affected by a given disease display extensive allelic heterogeneity. For example, more than 125 mutations in the human BRCA1 gene have been reported (Breast Cancer Information Core world wide web site at http://www.nchgr.nih.gov/dir/lab_transfer/bic, which became publicly available on November 1, 1995; Friend, S. et al., 1995, Nature Genetics 11:238). Mutations in the BRCA1 gene are thought to account for roughly 45% of inherited breast cancer and 80-90% of families with increased risk of early onset breast and ovarian cancer (Easton et al., 1993, American Journal of Human Genetics 52: 678-701).
Other examples of genes for which the population displays extensive allelic heterogeneity and which have been implicated in disease include CFTR (cystic fibrosis), dystrophin (Duchenne muscular dystrophy, and Becker muscular dystrophy), and p53 (Li-Fraumeni syndrome).
Breast cancer is also an example of a disease in which, in addition to allelic heterogeneity, there is genetic heterogeneity. In addition to BRCA1, the BRCA2 and BRCA3 genes have been linked to breast cancer. Similarly, the NFI and NFII genes are involved in neurofibromatosis (types I and II, respectively). Furthermore, hereditary non-polyposis colorectal cancer (HNPCC) is a disease in which four genes, MSH2, MLH1, PMS1, and PMS2, have been implicated. It is yet another example of a disease in which there is both allelic and genetic heterogeneity of mutations. A cDNA sequence for MSH2 has been deposited in GenBank as Accession No. U03911; and a cDNA sequence for MLH1 has been deposited in GenBank as Accession No. U40978.
Additionally, disease or disease susceptibility also results from the interaction of more than one gene or the interaction of an environmental, chemical or biological influence on one or more genes. For example, measles virus infects many people; some are immune due to vaccination or previous infection, some are infected but asymptomatic, some become sick with a rash, some develop an encephalitis and some die. Genetic susceptibility and many other factors are involved in the outcome.
A common misconception in the field of molecular genetics is that for any given gene there exists a single "normal" or "wild-type" sequence. Often, research into such wild-type sequences ends once a single sequence associated with normal function is identified. For example, information in GenBank concerning the BRCA1 sequence represented by GenBank Accession No. U14680 does not indicate a basis for whether this sequence is representative of the population at large. Even when polymorphisms of the BRCA1 gene were identified, no analysis was provided of the arrangement of such sequence variations in a given allele (i.e., the haplotype) (Miki et al., 1994, Science 266: 66-71).
In the fields of plant and animal breeding, the "wild-type" may not be the desirable form or may be one of several possibilities. For some domesticated plants and animals, the "wild-type" of any gene may not even be known. In the Brassica family, debate exists as to exactly what is a wild cabbage plant, much less which of the many genes or traits constitutes a "wild-type". By definition, a wild-type is not pathological but sometimes this definition seems inappropriate. For example, the Macintosh apple is propagated exclusively asexually. An inability to reproduce naturally may be considered the result of pathological mutation(s) but is nonetheless the desired trait. In other situations, different strains of a plant are cross-bred where each set of genes from each parent strain may be considered "wild-type".
Identification of a mutation provides for early diagnosis which is essential for effective treatment of many diseases. However, in order to identify a mutation, it is necessary to have an accurate understanding of the proper reference sequences which encode the non-pathological functional gene products occurring in the population. Prior research efforts and publications have neither suggested nor taught a systematic approach to both identify a functional allele of a given gene and determine the relative frequency with which the allele occurs in the population.
Certain wild-type sequences of a gene may be otherwise indistinguishable from others except under certain circumstances. For example, a gene involved in resistance or susceptibility to a certain infectious agent is only recognized when the individual plant or animal is exposed to the infectious agent. Likewise chemical sensitivity may be a wild-type which is pathological under only certain circumstances which may never occur in the individual. Drought tolerance traits are significant only under environmental stress which may or may not occur. Therefore, the type of wild-type sequence is of importance.
SUMMARY OF THE INVENTION
It is an object of the invention to provide an integrated, systematic process for determining the functional allele profile for a given gene in a population. In accordance with the invention, a functional allele profile contains 1) the identity of the key functional allele or alleles for a given gene in the population, including the "consensus" sequence, and 2) the relative frequency with which these functional alleles occur in the population. Thus, the functional allele profile includes the identification of the consensus normal sequence, i.e., the most commonly occurring functional allele.
The present invention, therefore, provides a normal sequence which is the most likely sequence to be found in the majority of the normal population (i.e., the "consensus normal DNA sequence"). A consensus normal allele sequence of a gene more accurately reflects the most likely sequence to be found in the population. Determining the consensus sequence is useful in both the diagnosis and treatment of disease. For example, use of the consensus normal gene sequence reduces the likelihood of misinterpreting a "sequence variation" found in the normal population as a pathologic "mutation" (i.e., one that causes disease in the individual or puts the individual at a high risk of developing the disease). A consensus normal DNA sequence makes it possible for true pathological mutations to be easily identified or differentiated from polymorphisms.
With large interest in mutation and polymorphism testing such as cancer predisposition testing, misinterpretation of sequence data is a particular concern. Individuals diagnosed with cancer want to know their prognosis and whether their disease is caused by a heritable genetic mutation. The same holds for other diseases and traits and for those who manage or manipulate these traits. Relatives of those with cancer who have not yet been diagnosed with the disease are also concerned whether they carry such a heritable mutation. Carrying such a mutation may increase risk of contracting the disease sufficiently to warrant an aggressive surveillance program. Accurate and efficient identification of mutations in genes linked to disease is crucial for widespread diagnostic screening for hereditary diseases.
In addition, the consensus sequence, or other sequences identified in the functional allele profile, allow for the selection of therapeutically optimal nucleotide sequences to be administered in gene therapy or gene replacement, or optimal amino acid sequence in the therapeutic administration of active proteins or peptides. The consensus sequence is generally the easiest target for various agonists, antagonists and measuring interactions with the gene or expression product appropriate for pharmacogenetic analysis. Moreover, determining a functional allele profile of genes allows for an evaluation of the degree to which the gene is under selective pressure.
It is another embodiment of the present invention to find a new allele having a different wild-type haplotype from that previously known.
It is another embodiment of the present invention to determine the haplotype of a sample by determining the polymorphisms constituting the haplotype. Such a technique applies to one and plural genes, especially genes which interact or express products which interact with each other directly, interact with the same or similar other compound or are along the same metabolic pathway. As such, the method of the present invention determines combinations of haplotypes in different genes.
Another embodiment of the present invention is determining how an individual will react to a particular chemical, environmental or biological influence. It is a premise of the present invention that different wild-type genes or their expression products interact differently in some circumstances.
Another embodiment of the present invention is the determination of traits and susceptibilities of plants and animals during breeding experiments by detecting the polymorphisms constituting the gene haplotype associated with the trait or susceptibility of interest.
BRIEF DESCRIPTION OF THE FIGURE
FIG. 1: Figure 1 shows alternative alleles containing polymorphic (non-mutation causing variations) sites along the BRCA1 gene, represented as individual "haplotypes" of the BRCA1 gene. The alternative allelic variations occurring at nucleotide positions 2201, 2430, 2731, 3232, 3667, 4427, and 4956 are shown. The BRCA1(omi1) haplotype is indicated with dark shading. For comparison, the haplotype available in GenBank is completely unshaded and designated as "GB". Two additional haplotypes, BRCA1(omi2) and BRCA1(omi3), are represented with mixed shaded and unshaded positions (numbers 7 and 9 from left to right).
DETAILED DESCRIPTION OF THE INVENTION
The invention provides an integrated, systematic process for determining the functional allele profile for a given gene or combination of genes in a population. In accordance with the invention, a functional allele profile contains 1) the identity of the key functional allele or alleles for a given gene in the population, including the "consensus" sequence, and 2) the relative frequency with which these functional alleles occur in the population. Thus, the functional allele profile includes the identification of the consensus normal sequence, i.e., the most commonly occurring functional allele.
The present invention, therefore, provides a normal sequence which is the most likely sequence to be found in the majority of the normal population (i.e., the "consensus normal DNA sequence"). A consensus normal allele sequence of a gene more accurately reflects the most likely sequence to be found in the population. In the process for determining functional alleles or afterward, one may search for and discover or synthesize a heretofore unknown or "new" allele.
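
The two parts of a functional allele profile described above, the identity of the functional alleles and their relative frequencies, can be illustrated with a short tally. The following Python sketch is only an illustration under stated assumptions: the function name `functional_allele_profile` and the labels `H1`, `H2` and `H3` are hypothetical, and each observed chromosome is assumed to have already been assigned to one functional haplotype.

```python
from collections import Counter


def functional_allele_profile(observed_haplotypes):
    """Return (consensus, frequencies) for a list of observed haplotype labels.

    Assumption: each entry is the haplotype of one chromosome sampled from an
    individual screened as carrying functional alleles.
    """
    counts = Counter(observed_haplotypes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one observed haplotype is required")
    # Relative frequency of each functional allele, most common first.
    frequencies = {label: n / total for label, n in counts.most_common()}
    # The consensus is the most commonly occurring functional allele.
    consensus = counts.most_common(1)[0][0]
    return consensus, frequencies


if __name__ == "__main__":
    # Hypothetical sample of ten chromosomes.
    sample = ["H1", "H1", "H2", "H1", "H3", "H2", "H1", "H1", "H2", "H1"]
    consensus, frequencies = functional_allele_profile(sample)
    print(consensus, frequencies)
```

In this made-up sample the label `H1` accounts for six of the ten chromosomes and is therefore reported as the consensus; real frequencies would of course come from the sequencing data.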
A functional allele profile can be determined for any gene in which an altered or deficient function produces a recognizable, phenotypic trait, including, but not limited to, pathology. The invention is set forth for the purpose of illustration, and not by way of limitation, for determining the functional allele profile of three different genes associated with disease - for example, the MSH2 and MLH1 genes, each associated with hereditary non-polyposis colorectal cancer (HNPCC), and the BRCA1 gene, associated with breast, ovarian, prostate and other cancers.
The following terms as used herein are defined as follows:
"Allele" refers to an alternative version (i.e., nucleotide sequence) of a gene or DNA sequence at a specific chromosomal locus.
"Allelic variation" or "sequence variation" refers to a particular alternative nucleotide or nucleotide sequence at a position within a gene (e.g., a polymorphic site or mutation) whose sequence varies from one allele to another.
"Coding sequence" or "DNA coding sequence" refers to those portions of a gene which, taken together, code for a peptide (protein), or which nucleic acid itself has function.
"Composite genomic sequence" refers to the combination of the two allelic nucleotide sequences (i.e., maternal and paternal) obtained from sequencing a diploid genomic sample.
"Consensus" refers to that which occurs most commonly in the population.
"Functional allele" refers to an allele which is naturally transcribed and translated into a functioning protein.
"Functional Allele Profile" refers to a set of functional alleles which are representative of the most common alleles occurring in a population, wherein the functional alleles are identified by nucleotide sequence and the relative frequencies with which the functional alleles occur in the population.
"Haplotype" refers to a set of nucleotides or nucleotide sequences occurring at sites of allelic variation occurring within a locus on a single chromosome (of either maternal or paternal origin). The "locus" includes the entire coding sequence.
"Mutation" refers to a base change or a gain or loss of base pair(s) in a DNA sequence, which results in a DNA sequence which codes for a non-functioning protein or a protein with substantially reduced or altered function.
"Agent for polymerization" refers to an enzyme which may be heat stable, e.g. Taq polymerase, or function at lower temperatures, e.g., room temperature, that effects an extension of DNA from a short primer sequence annealed to the target DNA of interest.
"Polymorphism" refers to an allelic variation which occurs in greater than or equal to 1% of the normal healthy population.
"Single nucleotide polymorphism" (SNP) refers to an allelic variation which is defined by two (and only two) alternative bases found at a specific and particular nucleotide in genomic DNA. It may be within a gene (i.e., exonic or intronic), outside of a gene (such as in a promoter or other regulatory structure), or between genes.
"Individual" refers to a single organism which may be human, plant or non-human animal. The individual may be intact or a biological sample taken from the individual which contains sufficient substances or information regarding the individual.
"Protein variant" and "variant amino acid sequence" refer to an amino acid sequence that differs from that of one naturally occurring wild-type protein but is generally considered the same protein. Some different haplotypes have variant amino acid sequences.
"Expression product" refers to an RNA, spliced or unspliced, a pre-, pro-, prepro- or a peptide which alone or in conjunction with other peptides constitutes a protein.
"Pharmaceutical" refers to any bio-effecting chemical drug or biological agent which alters or induces an alteration in the metabolism of an "individual". Pharmaceuticals include compositions for use on veterinary animals and agricultural and ornamental plants.
"Trait" refers to a phenotypically determinable characteristic resulting from the influence of one or more genes, alone or in conjunction with an environmental condition or exposure to other agents. Traits include susceptibilities to chemicals, infectious agents and environmental conditions (temperature, drought etc.).
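
Two of the definitions above lend themselves to a small worked illustration: a haplotype as the set of nucleotides found at the sites of allelic variation on one chromosome, and a polymorphism as a variation occurring in at least 1% of the normal healthy population. The Python sketch below is offered under assumptions that are not in the source: the function names are hypothetical, positions are taken to be 1-based, and the frequency is taken to be the fraction of sampled healthy individuals carrying the variation.

```python
from fractions import Fraction


def haplotype(allele_sequence, variable_positions):
    """Return the bases found at the given 1-based positions of one allele."""
    return tuple(allele_sequence[p - 1] for p in sorted(variable_positions))


def is_polymorphism(carriers, sampled, threshold=Fraction(1, 100)):
    """Apply the 1% definition to a count of carriers among sampled individuals.

    Fractions are used so that the boundary case of exactly 1% is not
    affected by floating-point rounding.
    """
    if sampled <= 0:
        raise ValueError("sampled must be a positive number of individuals")
    return Fraction(carriers, sampled) >= threshold


if __name__ == "__main__":
    # Toy six-base allele with variable sites at positions 2 and 5.
    print(haplotype("ACGTAC", [2, 5]))
    print(is_polymorphism(1, 100), is_polymorphism(1, 200))
```

The toy sequence and counts are made up for illustration; in practice the variable positions would be those identified by sequencing the screened population.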
Utility of the Invention
A person skilled in the art of genetic testing will find the present invention useful for diagnosis and treatment of diseases and susceptibility thereto. The invention is especially useful for establishing the "standard" (i.e., consensus normal DNA sequence) and new haplotypes for clinical diagnostic, therapeutic, genetic testing and breeding uses.
Diagnostics
The diagnostic applications for which determining a functional allele profile in accordance with the invention is useful include, but are not limited to, the following:
a) identifying individuals having a gene with no coding mutations, which individuals are therefore not at risk or have no increased susceptibility to the pathology(s) associated with a mutation in the gene in question;
b) avoiding misinterpretation of functional polymorphisms detected in the gene as mutations;
c) identifying individuals having a potentially abnormal gene that does not match the Consensus Normal DNA sequence;
d) determining ethnic founder haplotypes so that clinical analysis is appropriate for an individual from this ethnic group;
e) determining a sequence under strongest selective pressure;
f) determining an amino acid and/or short nucleic acid sequence which may be derived from the consensus normal DNA sequence to make diagnostic probes and antibodies. Labeled diagnostic probes may be used by any hybridization method to determine the level of protein in serum or lysed cell suspension of a patient, or solid surface cell sample such as for immunohistochemical analysis;
g) detecting a new haplotype and determining the polymorphisms constituting the new haplotype;
h) detecting a new protein variant type and determining the variant amino acids constituting the new protein variant;
i) determining the combination of one haplotype or polymorphism for one gene and the haplotype or polymorphism for another different gene in the same individual. Generally, the genes or their expression products interact with each other directly, e.g., bind to each other, or indirectly by functioning with each other on the same substrate, are in different stages in a metabolic pathway, or are related to the same disease, susceptibility, condition or trait;
j) determining whether to administer a bioeffecting composition to an individual wherein individuals with different haplotypes for one or more genes respond differently to the composition;
k) determining susceptibility to disease or other pathology to decide on prophylaxis, therapy or differential monitoring;
l) determining a trait by quick assay of a genetically engineered or selectively bred individual. This permits one to determine the trait without actually measuring the trait phenotypically; and
m) developing probe chips and panels of allele-specific oligonucleotide(s) to assay for the haplotypes or polymorphisms in one or more genes.
Therapeutics
Certain "normal" alleles may be more functional or hyper-functional than the minimum needed to maintain a normal phenotype in an individual, particularly when stressed. By determining the most common allele in a population one may be observing empiric data for such suitability for survival (the effects may be so subtle that scientists have not determined the basis of this selection). For example, alleles with longer mRNA or protein half-lives (i.e., stability) may produce healthier cells, and, thus, healthier people. Conversely, there may also be a selective advantage to a very short RNA half-life such as in proteins involved in the cell cycle pathway. Furthermore, proteases are known to have favored cutting sites which may be present or absent in different normal alleles leading to peptides that have intrinsic activity themselves.
Thus the determination of the functional allele profile or a new functional allele in accordance with the invention is useful in clinical therapy for: a) selecting optimal alleles for performing gene repair or gene therapy; and b) selecting optimal amino acid sequence for administration of functional protein in treatment or prevention of diseases.
Evolution and Population Genetics Analysis
The determination of the functional allele profile or a new functional allele in accordance with the invention is useful for: a) determining whether a particular gene is under strong selective pressure; and b) determining which of two or more genes which encode proteins with similar functions represents a redundant, or back-up copy of the gene.
Stepwise Process For Determining Functional Allele Profile
For the purpose of illustration, and not by way of limitation, the invention is described below for determining the functional allele profile of three cancer genes. However, the same principles can be applied in accordance with the invention to any gene in which a sequence variation results in a phenotypic trait, in any population within any species.
Screening for Individuals with Functional Allele Phenotype
In accordance with the invention, a group of individuals determined to be at low risk for carrying a mutation in the gene of interest is used as a source for genetic material. Any standard method known in the art for performing pedigree analysis can be used for this selection process. See, for example, Harper, P.S., Practical Genetic Counseling, 3d. ed., 1988 (Wright/Butterworth & Co. Ltd.: Boston), especially at pages 4-7. For example, individuals can be screened in order to identify those with no disease history in their immediate family, i.e., among their first and second degree relatives. A first degree relative is a parent, sibling, or offspring. A second degree relative is an aunt, uncle, grandparent, grandchild, niece, nephew, or half-sibling.
In a preferred embodiment for when a functional allele profile of an autosomal dominant disorder with relatively high penetrance (e.g., greater than 50%) is desired, each person is asked to fill out a hereditary cancer prescreening questionnaire. More preferably, when an autosomal dominant cancer gene with such relatively high penetrance is the gene of interest, the questionnaire set forth in Table 1, below, is used.
Table 1 Hereditary Cancer Pre-Screening Questionnaire
Part A: Answer the following questions about your family
1. To your knowledge, has anyone in your family been diagnosed with a very specific hereditary colon disease called Familial Adenomatous Polyposis (FAP)?
2. To your knowledge, have you or any aunt had breast cancer diagnosed before the age of 35?
3. Have you had Inflammatory Bowel Disease, also called Crohn's Disease or Ulcerative Colitis, for more than 7 years?
Part B: Refer to the list of cancers below for your responses only to questions in Part B
Bladder Cancer, Lung Cancer, Pancreatic Cancer, Breast Cancer, Gastric Cancer, Prostate Cancer, Colon Cancer, Malignant Melanoma, Renal Cancer, Endometrial Cancer, Ovarian Cancer, Thyroid Cancer
4. Have your mother or father, your sisters or brothers or your children had any of the listed cancers?
5. Have there been diagnosed in your mother's brothers or sisters, or your mother's parents more than one of the cancers in the above list?
6. Have there been diagnosed in your father's brothers or sisters, or your father's parents more than one of the cancers in the above list?
Part C: Refer to the list of relatives below for responses only to questions in Part C
You, Your mother, Your sisters or brothers, Your mother's sisters or brothers (maternal aunts and uncles), Your children, Your mother's parents (maternal grandparents)
7. Have there been diagnosed in these relatives 2 or more identical types of cancer?
Do not count "simple" skin cancer, also called basal cell or squamous cell skin cancer.
8. Is there a total of 4 or more of any cancers in the list of relatives above other than "simple" skin cancers?
Part D: Refer to the list of relatives below for responses only to questions in Part D.
You, Your father, Your sisters or brothers, Your father's sisters or brothers (paternal aunts and uncles), Your children, Your father's parents (paternal grandparents)
9. Have there been diagnosed in these relatives 2 or more identical types of cancer?
Do not count "simple" skin cancer, also called basal cell or squamous cell skin cancer.
10. Is there a total of 4 or more of any cancers in the list of relatives above other than "simple" skin cancers?
© Copyright 1996, OncorMed, Inc.
Individuals who answer no to all questions in Table 1 are designated as low risk of being carriers of mutations in the gene of interest and, therefore, in accordance with the invention, are candidates for further analysis set forth below.
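For purposes of illustration only, the designation rule above may be sketched as follows. The function name and the representation of the answers as a list of ten "yes"/"no" strings are assumptions made for this sketch and are not part of the questionnaire itself.

```python
def is_low_risk(answers):
    """Return True when all ten Table 1 answers are "no".

    `answers` is a hypothetical list of ten strings, one per question,
    in the order in which the questions appear in Table 1.
    """
    if len(answers) != 10:
        raise ValueError("Table 1 contains 10 questions")
    # A single "yes" (or any answer other than "no") excludes the individual.
    return all(answer.strip().lower() == "no" for answer in answers)
```

An individual for whom the function returns True would be a candidate for the sequence analysis described below.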
Sequencing
From the group of individuals determined to have a low risk of being carriers for a mutant allele of the gene of interest, a group is selected for genomic DNA sequence analysis. Any number of samples may be analyzed. Preferably, a number of samples which is small enough for convenient, accurate sequence analysis, but large enough to provide a reliable representation of the population is analyzed. Most preferably, initial sequencing may be performed on ten different chromosomes by analyzing samples from five unrelated individuals.
Preferably, sequencing template is obtained by amplifying the coding region and optionally one or more related sequences (e.g. splice site junctions, enhancers, introns, promoters and other regulatory elements) of the gene of interest. Any nucleic acid specimen, in purified or non-purified form, can be utilized as the starting nucleic acid or acids, providing it contains, or is suspected of containing, the specific nucleic acid sequence containing a polymorphic locus. Thus, the process may amplify, for example, DNA or RNA, including messenger RNA, wherein DNA or RNA may be single stranded or double stranded. In the event that RNA is to be used as a template, enzymes, and/or conditions optimal for reverse transcribing the template to DNA would be utilized. In addition, a DNA-RNA hybrid which contains one strand of each may be utilized. A mixture of nucleic acids may also be employed, or the nucleic acids produced in a previous amplification reaction herein, using the same or different primers may be so utilized. The specific nucleic acid sequence to be amplified, i.e., the polymorphic locus, may be a fraction of a larger molecule or can be present initially as a discrete molecule, so that the specific sequence constitutes the entire nucleic acid. It is not necessary that the sequence to be amplified be present initially in a pure form; it may be a minor fraction of a complex mixture, such as contained in whole human DNA.
While more primer pairs were used than are required to amplify the particular polymorphisms, the primer set actually used is listed below. For larger scale testing of polymorphisms for haplotype determination, only the primer pairs actually amplifying the polymorphism are required. Additionally, primers which amplify a shorter region, as short as the single-nucleotide polymorphism itself, may be used.
When a gene containing exons is analyzed, preferably the exonic sequences are individually amplified from genomic template DNA using a pair of primers specific for the intronic regions proximally bordering each individual exon. DNA utilized herein may be extracted from a body sample, such as blood, tissue material and the like by a variety of techniques such as that described by Maniatis, et al. in Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, NY, pp. 280-281, 1982. If the extracted sample is impure, it may be treated before amplification with an amount of a reagent effective to open the cells, or animal cell membranes of the sample, and to expose and/or separate the strand(s) of the nucleic acid(s). This lysing and nucleic acid denaturing step to expose and separate the strands will allow amplification to occur much more readily.
The deoxyribonucleotide triphosphates dATP, dCTP, dGTP, and dTTP are added to the synthesis mixture, either separately or together with the primers, in adequate amounts and the resulting solution is heated to about 90°-100°C for about 1 to 10 minutes, preferably from 1 to 4 minutes. After this heating period, the solution is allowed to cool, which is preferable for the primer hybridization. To the cooled mixture is added an appropriate agent for effecting the primer extension reaction (called herein "agent for polymerization"), and the reaction is allowed to occur under conditions known in the art. The agent for polymerization may also be added together with the other reagents if it is heat stable. This synthesis (or amplification) reaction may occur at room temperature up to a temperature above which the agent for polymerization no longer functions. Thus, for example, if DNA polymerase is used as the agent, the temperature is generally no greater than about 40°C. Most conveniently the reaction occurs at room temperature.
The primers used to carry out this invention embrace oligonucleotides of sufficient length and appropriate sequence to provide initiation of polymerization. Environmental conditions conducive to synthesis include the presence of nucleoside triphosphates and an agent for polymerization, such as DNA polymerase, and a suitable temperature and pH. The primer is preferably single stranded for maximum efficiency in amplification, but may be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent for polymerization. The exact length of primer will depend on many factors, including temperature, buffer, and nucleotide composition. The oligonucleotide primer typically contains 12-20 or more nucleotides, although it may contain fewer nucleotides.
Primers used to carry out this invention are designed to be substantially complementary to each strand of the genomic locus to be amplified. This means that the primers must be sufficiently complementary to hybridize with their respective strands under conditions which allow the agent for polymerization to perform. In other words, the primers should have sufficient complementarity with the 5' and 3' sequences flanking the mutation to hybridize therewith and permit amplification of the genomic locus.
Oligonucleotide primers of the invention are employed in the amplification process which is an enzymatic chain reaction that produces exponential quantities of polymorphic locus relative to the number of reaction steps involved. Typically, one primer is complementary to the negative (-) strand of the polymorphic locus and the other is complementary to the positive (+) strand. Annealing the primers to denatured nucleic acid followed by extension with an enzyme, such as the large fragment of DNA polymerase I (Klenow) and nucleotides, results in newly synthesized + and - strands containing the target polymorphic locus sequence. Because these newly synthesized sequences are also templates, repeated cycles of denaturing, primer annealing, and extension result in exponential production of the region (i.e., the target polymorphic locus sequence) defined by the primers. The product of the chain reaction is a discrete nucleic acid duplex with termini corresponding to the ends of the specific primers employed.
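By way of a non-limiting sketch, the region defined by a primer pair may be located on a known (+) strand sequence as shown below. The sketch assumes exact matching of each primer to the template and an unambiguous A/C/G/T alphabet; real primers need only be substantially complementary, and the function names and example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an A/C/G/T sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def amplicon(plus_strand, forward_primer, reverse_primer):
    """Return the (+) strand region bounded by the two primers, or None.

    The forward primer is assumed to match the (+) strand exactly; the
    reverse primer is assumed to match the (-) strand, so its reverse
    complement is searched for downstream on the (+) strand.
    """
    start = plus_strand.find(forward_primer)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = plus_strand.find(reverse_site, start + len(forward_primer))
    if end < 0:
        return None
    # The product termini correspond to the ends of the two primers.
    return plus_strand[start:end + len(reverse_site)]
```

The returned string corresponds to one strand of the duplex whose termini are set by the primers.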
The oligonucleotide primers of the invention may be prepared using any suitable method, such as conventional phosphotriester and phosphodiester methods or automated embodiments thereof. In one such automated embodiment, diethylphosphoramidites are used as starting materials and may be synthesized as described by Beaucage, et al., Tetrahedron Letters. 22:1859-1862, (1981). One method for synthesizing oligonucleotides on a modified solid support is described in U.S. Patent No. 4,458,066.
The agent for polymerization may be any compound or system which will function to accomplish the synthesis of primer extension products, including enzymes. Suitable enzymes for this purpose include, for example, E. coli DNA polymerase I, Klenow fragment of E. coli DNA polymerase, polymerase muteins, reverse transcriptase, and other enzymes, including heat-stable enzymes (i.e., those enzymes which perform primer extension after being subjected to temperatures sufficiently elevated to cause denaturation), such as Taq polymerase. A suitable enzyme will facilitate combination of the nucleotides in the proper manner to form the primer extension products which are complementary to each polymorphic locus nucleic acid strand. Generally, the synthesis will be initiated at the 3' end of each primer and proceed in the 5' direction along the template strand, until synthesis terminates, producing molecules of different lengths.
The newly synthesized strand and its complementary nucleic acid strand will form a double-stranded molecule under hybridizing conditions described above and this hybrid is used in subsequent steps of the process. In the next step, the newly synthesized double-stranded molecule is subjected to denaturing conditions using any of the procedures described above to provide single-stranded molecules.
The steps of denaturing, annealing, and extension product synthesis can be repeated as often as needed to amplify the target polymorphic locus nucleic acid sequence to the extent necessary for detection. The amount of the specific nucleic acid sequence produced will accumulate in an exponential fashion. Amplification is described in PCR: A Practical Approach, IRL Press, Eds. M. J. McPherson, P. Quirke, and G. R. Taylor, 1992.
The amplification products may be detected by Southern blot analysis, without using radioactive probes. In such a process, for example, a small sample of DNA containing a very low level of the nucleic acid sequence of the polymorphic locus is amplified, and analyzed via a Southern blotting technique or similarly, using dot blot analysis. The use of non-radioactive probes or labels is facilitated by the high level of the amplified signal. Alternatively, probes used to detect the amplified products can be directly or indirectly detectably labeled, for example, with a radioisotope, a fluorescent compound, a bioluminescent compound, a chemiluminescent compound, a metal chelator or an enzyme. Those of ordinary skill in the art will know of other suitable labels for binding to the probe, or will be able to ascertain such, using routine experimentation.
Sequences amplified by the methods of the invention can be further evaluated, detected, cloned, sequenced, and the like, either in solution or after binding to a solid support, by any method usually applied to the detection of a specific DNA sequence such as PCR, oligomer restriction (Saiki, et al., Bio/Technology, 3:1008-1012, (1985)), allele-specific oligonucleotide (ASO) probe analysis (Conner, et al., Proc. Natl. Acad. Sci. U.S.A., 80:278, (1983)), oligonucleotide ligation assays (OLAs) (Landgren, et al., Science, 241:1007, (1988)), heteroduplex analysis, chromatographic separation and the like. Molecular techniques for DNA analysis have been reviewed (Landgren, et al., Science, 242:229-237, (1988)).
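The exponential accumulation noted above may be illustrated, under idealized assumptions, by the following sketch. The per-cycle efficiency parameter is an assumption of this sketch (1.0 denotes perfect doubling in every cycle); actual reactions reach a plateau and yield less than the ideal value.

```python
def ideal_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of amplification cycles.

    Each cycle multiplies the target by (1 + efficiency), so an
    efficiency of 1.0 doubles the target every cycle. Reagent depletion
    and plateau effects are deliberately ignored.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles
```

Under these assumptions a single template molecule would give 2^10 = 1024 copies after ten cycles.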
Preferably, the method of amplifying is by PCR, as described herein and as is commonly used by those of ordinary skill in the art. Alternative methods of amplification have been described and can also be employed as long as the genetic locus amplified by PCR using primers of the invention is similarly amplified by the alternative means. Such alternative amplification systems include but are not limited to self-sustained sequence replication, which begins with a short sequence of RNA of interest and a T7 promoter. Reverse transcriptase copies the RNA into cDNA and degrades the RNA, followed by reverse transcriptase polymerizing a second strand of DNA. Another nucleic acid amplification technique is nucleic acid sequence-based amplification (NASBA) which uses reverse transcription and T7 RNA polymerase and incorporates two primers to target its cycling scheme. NASBA can begin with either DNA or RNA and finish with either, and amplifies to 10^8 copies within 60 to 90 minutes. Alternatively, nucleic acid can be amplified by ligation activated transcription (LAT). LAT works from a single-stranded template with a single primer that is partially single-stranded and partially double-stranded. Amplification is initiated by ligating a cDNA to the promoter oligonucleotide and within a few hours, amplification is 10^8 to 10^9 fold. Another amplification system useful in the method of the invention is the Qβ Replicase System. The Qβ replicase system can be utilized by attaching an RNA sequence called MDV-1 to RNA complementary to a DNA sequence of interest. Upon mixing with a sample, the hybrid RNA finds its complement among the specimen's mRNAs and binds, activating the replicase to copy the tag-along sequence of interest.
Another nucleic acid amplification technique, ligase chain reaction (LCR), works by using two differently labeled halves of a sequence of interest which are covalently bonded by ligase in the presence of the contiguous sequence in a sample, forming a new target. The repair chain reaction (RCR) nucleic acid amplification technique uses two complementary and target-specific oligonucleotide probe pairs, thermostable polymerase and ligase, and DNA nucleotides to geometrically amplify targeted sequences. A 2-base gap separates the oligonucleotide probe pairs, and the RCR fills and joins the gap, mimicking DNA repair. Nucleic acid amplification by strand displacement amplification (SDA) utilizes a short primer containing a recognition site for HincII with a short overhang on the 5' end which binds to target DNA. A DNA polymerase fills in the part of the primer opposite the overhang with sulfur-containing adenine analogs. HincII is added but only cuts the unmodified DNA strand. A DNA polymerase that lacks 5' exonuclease activity enters at the site of the nick and begins to polymerize, displacing the initial primer strand downstream and building a new one which serves as more primer. SDA produces greater than 10^7-fold amplification in 2 hours at 37°C. Unlike PCR and LCR, SDA does not require instrumented temperature cycling.
Another method is a process for amplifying nucleic acid sequences from a DNA or RNA template which may be purified or may exist in a mixture of nucleic acids. The resulting nucleic acid sequences may be exact copies of the template, or may be modified. The process has advantages over PCR in that it increases the fidelity of copying a specific nucleic acid sequence, and it allows one to more efficiently detect a particular point mutation in a single assay. A target nucleic acid is amplified enzymatically while avoiding strand displacement. Three primers are used. A first primer is complementary to the first end of the target. A second primer is complementary to the second end of the target. A third primer is similar to the first end of the target and is substantially complementary to at least a portion of the first primer, such that when the third primer is hybridized to the first primer, the position of the third primer complementary to the base at the 5' end of the first primer contains a modification which substantially avoids strand displacement. This method is detailed in U.S. Patent 5,593,840 to Bhatnagar et al., 1997. Although PCR is the preferred method of amplification of the invention, these other methods can also be used to amplify the gene of interest.
A number of methods well-known in the art can be used to carry out the sequencing reactions. Preferably, enzymatic sequencing based on the Sanger dideoxy method is used. Mass spectroscopy may also be used.
The sequencing reactions can be analyzed using methods well-known in the art, such as polyacrylamide gel electrophoresis. In a preferred embodiment for efficiently processing multiple samples, the sequencing reactions are carried out and analyzed using a fluorescent automated sequencing system such as the Applied Biosystems, Inc. ("ABI", Foster City, CA) system. For example, PCR products serving as templates are fluorescently labeled using the Taq Dye Terminator® Kit (Perkin-Elmer cat# 401628). Dideoxy DNA sequencing is performed in both forward and reverse directions on an ABI automated Model 377® sequencer. The resulting data can be analyzed using "Sequence Navigator®" software available through ABI.
Alternatively, large numbers of samples can be prepared for and analyzed by capillary electrophoresis, as described, for example, in Yeung et al., U.S. Patent No. 5,498,324.
Initial and Companion Haplotype Determination
The functional allele profiles identified in accordance with the invention may contain different alleles. Furthermore, each allele may contain multiple allelic variations, such as multiple polymorphisms. In other words, two different alleles may differ in sequence from one another at multiple nucleotide positions. Moreover, two such multiply polymorphic alleles may be present in the same individual, i.e., a heterozygote. When the genomic sample of the gene of such a heterozygous individual is sequenced, the variations at each position can be detected. They are the alternative sequences present at particular positions in the composite sequence obtained from the diploid genome. However, at this stage, which variations are grouped together in each individual haplotype or allele, i.e., the phase of the variations, cannot be determined.
For example, genomic sequence analysis of a hypothetical gene from a heterozygous individual may reveal that polymorphic positions 1, 2, and 3 each contain either an A or a G. However, it cannot be determined from this information alone whether the variations are distributed between the two alleles as: allele 1 = A1A2A3 and allele 2 = G1G2G3; or allele 1 = A1A2G3 and allele 2 = G1G2A3; or allele 1 = A1G2G3 and allele 2 = G1A2A3, etc.
In accordance with the invention, such heterozygous genomic sequences obtained for the purpose of determining a functional allele profile are compared to an initial haplotype sequence. Some haplotypes can also be determined upon sequencing chromosomal samples from a homozygous individual according to the methods above. Such homozygous sequence analyses contain no ambiguities in sequence between the two alleles because they are identical.
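The phase ambiguity in the hypothetical example above may be made concrete by enumerating every allele pair consistent with a composite genomic sequence. The following is a minimal sketch which assumes that every listed position is heterozygous and that at least one position is supplied; the function name and data layout are hypothetical.

```python
from itertools import product


def possible_phasings(composite):
    """List every (allele 1, allele 2) pair consistent with a composite.

    `composite` is a non-empty list of two-base tuples, one per
    heterozygous position, e.g. [("A", "G"), ("A", "G"), ("A", "G")].
    The base at the first position is fixed on allele 1 so that each
    pair is reported once rather than twice with the alleles swapped.
    """
    first, *rest = composite
    phasings = []
    for choice in product((0, 1), repeat=len(rest)):
        allele1 = first[0] + "".join(pair[c] for pair, c in zip(rest, choice))
        allele2 = first[1] + "".join(pair[1 - c] for pair, c in zip(rest, choice))
        phasings.append((allele1, allele2))
    return phasings
```

For n heterozygous positions the sketch returns 2^(n-1) candidate pairs, which is four pairs for the three positions of the example.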
Preferably, an initial haplotype sequence is obtained by determining the cDNA sequence of an individual identified as being at low risk for carrying a mutation as described above. Because the full-length of a cDNA of the gene of interest is derived from a single mRNA transcript, it contains the allelic variations of a single haplotype. It contains all of the allelic variations present in a single allele of the individual from which it was obtained. Thus, the cDNA sequence contains half of the allelic variations present in the composite genomic sequence of a heterozygous individual containing that allele. Moreover, unlike sequence information from a heterozygous chromosomal sample, such cDNA sequence indicates which of the allelic variations are grouped together in one allele, i.e., the phase of the variations.
By determining an initial haplotype, the companion haplotype present in a heterozygote can be determined by subtracting this sequence from the composite genomic sequence. For example, if in the illustration set forth above, the cDNA sequenced has an A in position 1, a G in position 2 and an A in position 3, then the initial haplotype is A1G2A3. This sequence is then subtracted from the composite genomic sequence to yield the companion haplotype, namely G1A2G3.
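The subtraction step may be sketched as follows, for illustration only. The sketch assumes a diploid composite in which each position is given as a string of the one or two bases observed there; the function name and this representation are assumptions of the sketch.

```python
def companion_haplotype(composite, initial):
    """Subtract an initial haplotype from a composite genomic sequence.

    `composite` lists, per polymorphic position, the bases observed in
    the diploid sample (e.g. "AG" for a heterozygous site, "AA" or "A"
    for a homozygous one). `initial` is the phased haplotype, one base
    per position, for example as read from a cDNA sequence.
    """
    if len(composite) != len(initial):
        raise ValueError("composite and initial haplotype differ in length")
    companion = []
    for position, (observed, base) in enumerate(zip(composite, initial), start=1):
        bases = set(observed)
        if len(bases) > 2 or base not in bases:
            raise ValueError(f"initial haplotype incompatible at position {position}")
        remaining = bases - {base}
        # At a homozygous site both alleles carry the same base.
        companion.append(remaining.pop() if remaining else base)
    return "".join(companion)
```

With A or G observed at each of three positions and an initial haplotype of A, G, A, the sketch yields G, A, G as the companion haplotype, in agreement with the example above.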
In general, the initial haplotype identified in a given individual also can be used to determine the presence of the haplotype in other individuals by comparing the initial haplotype sequence to the composite genomic sequence from such other individuals. When the number of allelic variations detected within a gene is four or greater, and especially when the number of allelic variations is five or greater, this method of subtracting the initial haplotype sequence from the composite genomic sequence of other individuals readily provides recognizably distinct haplotypes which are independent of each other. See, for example, the OMI1 and GB haplotypes in FIG. 1, which differ from each other in each of seven sites of allelic variation.
When a haplotype determined in one individual is used to determine the haplotypes present in the composite genomic sequence of other individuals, the presence of that particular haplotype, and its companion haplotype as determined by subtraction from a composite genomic sequence, should be confirmed. Such confirmation of the occurrence of a given haplotype in the population can be carried out, for example, by 1) sequencing cDNA samples, as described in this section, from such other heterozygous individuals; or 2) identifying individuals homozygous for the haplotype either among the initial set of sequenced chromosomal samples or by additional confirmatory sequencing of chromosomal samples as described below.
If an initial haplotype is not represented in any heterozygous composite genomic sequences obtained, one or more additional haplotypes should be obtained from such a heterozygous individual or from different individuals screened as above.
cDNA sequences for determining the initial haplotype can be obtained using standard techniques well known in the art. First, mRNA is isolated from an individual, for example, from blood or skin cells. The mRNA is initially reverse-transcribed into double stranded cDNA and then amplified according to the well known technique of RT-PCR (see, for example, U.S. Patent No. 5,561,058 by Gelfand et al.).
The resulting cDNA, whose sequence represents a single haplotype, can be sequenced according to the methods above.
Determining the Relative Frequencies of the Haplotype
After all haplotypes have been identified in the study population, their relative frequencies are determined. For example, if five chromosomes out of a total of ten chromosomes are of one haplotype, then its frequency is 50%. Subsequently, each haplotype is ranked in order from the most frequent to the least frequent to yield the functional allele profile.
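The frequency determination and ranking may be sketched as follows. The sketch assumes one haplotype label per sequenced chromosome; the labels and the function name are hypothetical.

```python
from collections import Counter


def functional_allele_profile(chromosome_haplotypes):
    """Rank haplotypes from most to least frequent.

    `chromosome_haplotypes` holds one haplotype label per chromosome
    analyzed. Returns (label, count, percent) tuples, most frequent first.
    """
    counts = Counter(chromosome_haplotypes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no chromosomes supplied")
    return [(label, n, 100.0 * n / total) for label, n in counts.most_common()]
```

Five chromosomes of one haplotype among ten analyzed are reported at 50%, as in the example above.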
Confirmatory Analysis of Additional Samples
As described above, initial sequence analysis is performed on a small group of individuals, most preferably five individuals, screened according to the methods described above.
After identifying the haplotypes and determining their relative frequencies among the initial set of alleles analyzed, it may be desirable, in accordance with the invention, to perform follow-up, confirmatory sequencing on additional individuals who are also screened according to the methods described above. Confirmatory sequencing can be carried out as above.
The haplotypes found occurring in the population are used as references to interpret the haplotypes present in any heterozygous individuals encountered during the confirmatory sequencing analysis of additional individuals.
By sequencing such additional samples, additional data points can be added to the functional allele profile to provide more precise frequencies of occurrence of each allele in the population. Furthermore, additional samples may contain a new functional allele with a new haplotype. This is particularly likely to be found for uncommon (<10%) or rare (<1%) haplotypes.
Furthermore, confirmatory sequence analysis ensures that the haplotypes determined by subtracting an initial haplotype from a composite heterozygous sequence are indeed represented in the population. Such techniques may also be used when multiple common haplotypes exist for the gene and it is uncertain which to use for subtraction.
When no sequence variation is found in the initial set of chromosomes, this indicates that polymorphism in the gene of interest is uncommon (e.g., polymorphisms occur in <10% of the alleles in the population studied). In such situations, identification of uncommon alleles and determination of their frequencies requires a confirmatory sequence analysis of samples from additional individuals. This method was used to detect such an uncommon polymorphism in exon 8 of the MLH1 gene, in Example 2 below.
Such confirmatory sequencing analysis also resulted in the identification and determination of relative frequency of occurrence of polymorphisms in intronic sequences, bordering exonic regions, of both the MSH2 and MLH1 genes, as detailed in Examples 1 and 2, respectively, below. The invention is illustrated by way of the Examples below.
EXAMPLE 1: Determining the Functional Allele Profile for MSH2
Approximately 150 volunteers are screened in order to identify individuals with no cancer history in their immediate family (i.e. first and second degree relatives). Each person is asked to fill out the hereditary cancer prescreening questionnaire shown in Table 1, above. A first degree relative is a parent, sibling, or offspring. A second degree relative is an aunt, uncle, grandparent, grandchild, niece, nephew, or half-sibling. Among those individuals who answered "no" to all questions, five individuals are randomly chosen for end-to-end sequencing of their MSH2 gene.
Genomic DNA (100 nanograms) is extracted from white blood cells of five individuals designated as low risk of being carriers of mutations in the MSH2 gene from analysis of their answers to the questionnaire set forth in Table 1 above. The MSH2 coding region in each of the five samples is sequenced end-to-end by amplifying each exon individually. Each sample is amplified in a final volume of 25 microliters containing 1 microliter (100 nanograms) genomic DNA, 2.5 microliters 10X PCR buffer (100 mM Tris, pH 8.3, 500 mM KCl, 1.2 mM MgCl2), 2.5 microliters 10X dNTP mix (2 mM each nucleotide), 2.5 microliters forward primer, 2.5 microliters reverse primer, 1 microliter Taq polymerase (5 units), and 13 microliters of water.
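As a simple arithmetic check, the component volumes recited above may be totalled and a final concentration derived by dilution. The dictionary keys and function names in the following sketch are illustrative assumptions; the volumes and the 2 mM dNTP stock concentration are those recited above.

```python
# Component volumes in microliters, as recited for one 25 microliter reaction.
REACTION_MIX_UL = {
    "genomic DNA (100 ng)": 1.0,
    "10X PCR buffer": 2.5,
    "10X dNTP mix": 2.5,
    "forward primer": 2.5,
    "reverse primer": 2.5,
    "Taq polymerase (5 units)": 1.0,
    "water": 13.0,
}


def total_volume_ul(mix):
    """Sum the component volumes of a reaction mix."""
    return sum(mix.values())


def final_concentration(stock, volume_ul, total_ul):
    """Concentration after diluting `volume_ul` of stock into `total_ul`."""
    return stock * volume_ul / total_ul
```

The components sum to the stated 25 microliters, and 2.5 microliters of the 2 mM dNTP stock corresponds to 0.2 mM of each nucleotide in the final reaction.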
The primers in Table 2, below, are used to carry out amplification of the various sections of the MSH2 gene samples. The primers are synthesized on a DNA/RNA Synthesizer Model 394®.
Table 2
MSH2 PRIMER SEQUENCES
Exon  Primer             Sequence
1     MSH1F-1            5'-CGC GTC TGC TTA TGA TTG G-3'
      MSH1R-1            5'-TCT CTG AGG CGG GAA AGG-3'
2     MSH2-2F-2-INSIDE   5'-TTT TTT TTT TTT TAA GGA GC-3'
      MSH2-2R-FULL       5'-CAC ATT TTT ATT TTT CTA CTC-3'
3     MSH3F              5'-GCT TAT AAA ATT TTA AAG TAT GTT C-3'
      MSH3R-2            5'-CTG GAA TCT CCT CTA TCA C-3'
4     MSH4F              5'-TTC ATT TTT GCT TTT CTT ATT CC-3'
      MSH4R              5'-ATA TGA CAG AAA TAT CCT TC-3'
5     MSH2-5F-1          5'-CAG TGG TAT AGA AAT CTT CGA-3'
      MSH2-5R-2-INSIDE   5'-TTT TTT TTT TTT TTA CCT GA-3'
6     MSH6F-1            5'-ACT AAT GAG CTT GCC ATT CT-3'
      MSH6R-1            5'-TGG GTA ACT GCA GGT TAC A-3'
7     MSH7F              5'-GAC TTA CGT GCT TAG TTG-3'
      MSH7R              5'-AGT ATA TAT TGT ATG AGT TGA AGG-3'
8     MSH8F              5'-GAT TTG TAT TCT GTA AAA TGA GAT C-3'
      MSH8R              5'-GGC CTT TGC TTT TTA AAA ATA AC-3'
9     MSH9F              5'-GTC TTT ACC CAT TAT TTA TAG G-3'
      MSH9R              5'-GTA TAG ACA AAA GAA TTA TTC C-3'
10    MSH10F             5'-GGT AGT AGG TAT TTA TGG AAT AC-3'
      MSH10R             5'-CAT GTT AGA GCA TTT AGG G-3'
11    MSH11F             5'-CAC ATT GCT TCT AGT ACA C-3'
      MSH11R             5'-CCA GGT GAC ATT CAG AAC-3'
12    MSH12F             5'-ATT CAG TAT TCC TGT GTA C-3'
      MSH12R             5'-CGT TAC CCC CAC AAA GC-3'
13    MSH13F-1           5'-ATG CTA TGT CAG TGT AAA CC-3'
      MSH13R-1           5'-CCA CAG GAA AAC AAC TAT TA-3'
14    MSH14F             5'-TAC CAC ATT TTA TGT GAT GG-3'
      MSH14R             5'-GGG GTA GTA AGT TTC CC-3'
15    MSH15F             5'-CTC TTC TCA TGC TGT CCC-3'
      MSH15R             5'-ATA GAG AAG CTA AGT TAA AC-3'
16    MSH16F             5'-TAA TTA CTC ATG GGA CAT TC-3'
      MSH16R-1           5'-GGC ACT GAC AGT TAA CAC TA-3'
NOTE: These MSH2 primers are M13-tailed:
M13 tail for F: 5'-TGT AAA ACG ACG GCC AGT-3' added to 5' end of primer above
M13 tail for R: 5'-CAG GAA ACA GCT ATG ACC-3' added to 5' end of primer above
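The M13 tailing noted above amounts to prepending the appropriate tail to the 5' end of each exon-specific primer. A minimal sketch follows; the function name and the "F"/"R" direction argument are assumptions of the sketch, while the tail and primer sequences are those recited in Table 2.

```python
M13_FORWARD_TAIL = "TGTAAAACGACGGCCAGT"
M13_REVERSE_TAIL = "CAGGAAACAGCTATGACC"


def add_m13_tail(primer, direction):
    """Return the full 5'->3' sequence of an M13-tailed primer.

    `primer` is the exon-specific sequence written 5'->3' (spaces
    between triplets are ignored); `direction` is "F" or "R".
    """
    core = primer.replace(" ", "").upper()
    if not core or set(core) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G and T")
    tails = {"F": M13_FORWARD_TAIL, "R": M13_REVERSE_TAIL}
    if direction not in tails:
        raise ValueError('direction must be "F" or "R"')
    # The tail is added to the 5' end, i.e. in front of the primer.
    return tails[direction] + core
```

For example, tailing the exon 1 forward primer MSH1F-1 gives a 37-nucleotide oligonucleotide beginning with the 18-nucleotide forward tail.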
Thirty-five cycles are performed, each consisting of denaturing (95°C; 30 seconds), annealing (55°C; 1 minute), and extension (72°C; 90 seconds), except during the first cycle in which the denaturing time was increased to 5 minutes, and during the last cycle in which the extension time was increased to 5 minutes.
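The cycling conditions above may be written out as an explicit schedule, for illustration only. The tuple layout and function names are assumptions of this sketch, and the total computed counts only the programmed hold times, not the time the instrument spends changing temperature.

```python
def cycling_program(cycles=35, denature_s=30, anneal_s=60, extend_s=90,
                    first_denature_s=300, last_extend_s=300):
    """Build the per-cycle schedule as ((temp_C, seconds), ...) tuples.

    The first cycle uses the lengthened denaturing time and the last
    cycle uses the lengthened extension time.
    """
    program = []
    for cycle in range(1, cycles + 1):
        denature = first_denature_s if cycle == 1 else denature_s
        extend = last_extend_s if cycle == cycles else extend_s
        program.append(((95, denature), (55, anneal_s), (72, extend)))
    return program


def total_hold_seconds(program):
    """Sum all programmed hold times, ignoring temperature ramping."""
    return sum(seconds for cycle in program for _, seconds in cycle)
```

With the default values the programmed holds total 6780 seconds, or 113 minutes.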
PCR products are purified using Qia-quick® PCR purification kits (Qiagen®, cat# 28104; Chatsworth, CA). Yield and purity of the PCR product are determined spectrophotometrically at OD260 on a Beckman DU 650 spectrophotometer.
All exons of the MSH2 gene are subjected to direct dideoxy sequence analysis by asymmetric amplification using the polymerase chain reaction (PCR) to generate a single stranded product amplified from this DNA sample. Shuldiner, et al, Handbook of Techniques in Endocrine Research, p. 457-486, DePablo, F., Scanes, C, eds., Academic Press, Inc., 1993. Fluorescent dye is attached to PCR products for automated sequencing using the Taq Dye Terminator Kit (Perkin-Elmer® cat# 401628). DNA sequencing is performed in both forward and reverse directions on an Applied Biosystems, Inc. (ABI) Foster City, CA., automated sequencer (Model 377). The software used for analysis of the resulting data is "Sequence Navigator®" purchased through ABI.
Results
No differences in nucleotide sequence are observed among the coding exons of the five normal individuals (10 chromosomes), nor between these 10 chromosomal sequences and the sequence published in GenBank (Accession No. U03911) for MSH2. Thus, all five individuals are homozygous for the same allele. An additional sixty-two normal individuals are sequenced end-to-end to confirm this result. Once again no sequence variation is found within the exons. However, minor variation at three single nucleotide polymorphisms is found in non-coding intronic sequences (IVS9-9; IVS10+6; IVS10+12). The results are summarized in Table 3, below.
Table 3
MSH2 HAPLOTYPES
                             Allelic Variations
Haplotype                    IVS9-9   IVS10+6   IVS10+12   Number of Chromosomes
GenBank sequence (U03911)    T        T         A          98 (73%)
Variant #1                   A        C         G          28 (21%)
Variant #2                   A        C         A*         6 (4.5%)
Variant #3                   T        Q**       A          2 (1.5%)
* Variant #2 is an uncommon derivative chromosome of variant #1
** Variant #3 is a rarer derivative chromosome of GenBank cDNA
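The percentages in Table 3 follow directly from the chromosome counts (5 + 62 individuals, i.e. 134 chromosomes). A minimal sketch, with a hypothetical helper name:

```python
# Chromosome counts per haplotype, taken from Table 3.
msh2_counts = {
    "GenBank (U03911)": 98,
    "Variant #1": 28,
    "Variant #2": 6,
    "Variant #3": 2,
}


def haplotype_frequencies(counts):
    """Return each haplotype's share of all chromosomes typed."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


freqs = haplotype_frequencies(msh2_counts)
for name, f in freqs.items():
    print(f"{name}: {100 * f:.1f}%")
```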
Since the exonic coding sequence is maintained on all 4 haplotypes, such non-coding sequence variation does not result in any new "normal" coding consensus sequence of the MSH2 gene.
These results demonstrate that the sequence in the GenBank Repository is the "consensus normal DNA sequence" that should be used for comparison in all clinical applications to determine an individual with a hereditary susceptibility to HNPCC. In addition, these results indicate that normal MSH2 protein function, i.e., mismatch repair function, is under a large degree of selective pressure to maintain viability in the human population. Very little if any variation in the activity of the MSH2 protein's mismatch repair function is tolerated, as reflected by the extraordinarily high degree of conservation of the normal sequence.
EXAMPLE 2: Determining the Functional Allele Profile for MLH1
All procedures (e.g., selection of five individuals at low risk of being carriers for MLH1 mutations, isolation of genomic DNA, amplification of exons, sequencing of amplified exons, and analysis of sequence data) are carried out as described in Example 1, above, except that the amplification is carried out using primers specific to the MLH1 exons as set forth in Table 4, below.
Table 4
MLH1 PRIMER SEQUENCES
Exon Primer Sequence
1 MLHAF 5'-AGG CAC TGA GGT GAT TGG C-3'
MLHAR 5'-TCG TAG CCC TTA AGT GAG C-3'
2 MLHBF-2 5'-TGA GGC ACT ATT GTT TGT ATT T-3'
MLHBR-2 5'-TGT TGG TGT TGA ATT TTT CAG T-3'
3 MLHCF 5'-AGA GAT TTG GAA AAT GAG TAA C-3'
MLHCR 5'-ACA ATG TCA TCA CAG GAG G-3'
4 MLHDF-1 5'-TGA GGT GAC AGT GGG TGA-3'
MLHCR 5'-GAT TAC TCT GAG ACC TAG GC-3'
5 MLHEF 5'-GAT TTT CTC TTT TCC CCT TGG G-3'
MLHER 5'-CAA ACA AAG CTT CAA CAA TTT AC-3'
6 MLHFF 5'-GGG TTT TAT TTT CAA GTA CTT CTA TG-3'
MLHFR 5'-GCT CAG CAA CTG TTC AAT GTA TGA GC-3'
7 MLHGF 5'-CTA GTG TGT GTT TTT GGC-3'
MLHGR 5'-CAT AAC CTT ATC TCC ACC-3'
8 MLHHF 5'-CTC AGC CAT GAG ACA ATA AAT CC-3'
MLHHR 5'-GGT TCC CAA ATA ATG TGA TGG-3'
9 MLHIF-1 5'-GTT TAT GGG AAG GAA CCT TGT-3'
MLHIR-1 5'-TGG TCC CAT AAA ATT CCC TGT-3'
10 MLHJF 5'-CAT GAC TTT GTG TGA ATG TAC ACC-3'
MLHJR 5'-GAG GAG AGC CTG ATA GAA CAT CTG-3'
11 MLHKF 5'-GGG CTT TTT CTC CCC CTC CC-3'
MLHKR 5'-AAA ATC TGG GCT CTC ACG-3'
12 MLH1-LAF-2-INSIDE 5'-TTT AAT ACA GAC TTT GCT AC-3'
MLH1-LBR 5'-GAA AAG CCA AAG TTA GAA GG-3'
13 MLHMF 5'-TGC AAC CCA CAA AAT TTG GC-3'
MLHMR 5'-CTT TCT CCA TTT CCA AAA CC-3'
14 MLHNF 5'-TGG TGT CTC TAG TTC TGG-3'
MLHNR 5'-CAT TGT TGT AGT AGC TCT GC-3'
15 MLHOF-2* 5'-GCA GAA CTA TGT CTG TCT CAT-3'
MLHOR 5'-CGG TCA GTT GAA ATG TCA G-3'
16 MLHPF 5'-CAT TTG GAT CCG TTA AAG C-3'
MLHPR 5'-CAC CCG GCT GGA AAT TTT ATT TG-3'
17 MLHQF 5'-GGA AAG GCA CTG GAG AAA TGG G-3'
MLHQR 5'-CCC TCC AGC ACA CAT GCA TGT ACC G-3'
18 MLHRF 5'-TAA GTA GTC TGT GAT CTC CG-3'
MLHRR 5'-ATG TAT GAG GTC CTG TCC-3'
19 MLHSF 5'-GAC ACC AGT GTA TGT TGG-3'
MLHSR* 5'-GAG AAA GAA GAA CAC ATC CC-3'
NOTE: MLH1 primers are M13 tailed,
*EXCEPT for MLH1 primers MLHOF-2, MLHOR & MLHSR:
M13 tail for F: 5'-TGT AAA ACG ACG GCC AGT-3' added to 5' end of primer above
M13 tail for R: 5'-CAG GAA ACA GCT ATG ACC-3' added to 5' end of primer above
Results
No differences are observed among the coding exons of the five normal individuals (10 chromosomes), nor between these 10 chromosomal sequences and the sequence published in GenBank (Accession No. U40978) for the MLH1 gene. To confirm these findings, sequencing is performed on an additional 62 samples. Among these sixty-two samples, variations are identified at only two positions, as summarized in Table 5, below.
Table 5
MLH1 HAPLOTYPES
                             Allelic Variation
Haplotype                    Exon 8, codon 219   IVS14-19   Number of Chromosomes
GenBank sequence (U40978)    A                   A          124 (92.5%)
Variant #1                   A                   G          5 (3.7%)
Variant #2                   G                   G          4 (3.1%)
Variant #3                   G                   A          1 (0.7%)
Total                                                       134 (100%)
One sequence variation is within exon 8, where a single nucleotide change from A to G in the first position of codon 219 (ATC --> GTC) changes the amino acid from Ile to Val. This sequence variation occurs approximately 3.7% of the time in this population. The second sequence variation is deep within an intron (IVS14-19) and is found to segregate independently of the exon 8 polymorphism. While there are two "normal" exonic haplotypes identified in MLH1 (A versus G at codon 219), the most commonly found haplotype (i.e., the consensus normal DNA sequence), having an A at the first position of codon 219, is the sequence currently in the GenBank database, which should be used as the standard for clinical comparisons. In addition, this analysis demonstrates that there is less selective pressure on the MLH1 gene (since codon 219 can have two forms) than on the MSH2 gene, where no exonic sequence variation is tolerated. Given that these two genes are both mismatch repair genes, this observation indicates that the degree of redundancy of function (i.e., the level of hierarchy between these proteins) is MSH2 as the primary system, with MLH1 only as a secondary or backup system when MSH2 is dysfunctional (i.e., mutant). While empiric data from other studies proposed such a relationship, only determining the actual functional allele profiles for these two genes provides an accurate understanding of the basis of previous observations from population studies.
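The codon 219 change can be verified against the standard genetic code. The sketch below builds the standard codon table from the conventional TCAG ordering; the helper names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (one-letter amino acids).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def substitute(codon: str, position: int, base: str) -> str:
    """Return the codon with the base at a 1-based position replaced."""
    i = position - 1
    return codon[:i] + base + codon[i + 1:]


reference = "ATC"                          # codon 219 in the GenBank sequence
variant = substitute(reference, 1, "G")    # A to G at the first position
print(reference, CODON_TABLE[reference], "-->", variant, CODON_TABLE[variant])
```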
EXAMPLE 3: Determining the Functional Allele Profile for BRCA1
All procedures (e.g., selection of five individuals at low risk of being carriers for BRCA1 mutations, isolation of genomic DNA, amplification of exons, sequencing of amplified exons, and analysis of sequence data) are carried out as described in Example 1, above, except that the amplification is carried out using primers specific to the BRCA1 exons as set forth in Table 6, below.
Table 6
BRCA1 PRIMERS FOR SEQUENCING TEMPLATES
Exon Primer Sequence Mg++ Size
2 2F 5' GAAGTTGTCATTTTATAAACCTTT-3' 1.6 ~275
2R 5' TGTCTTTTCTTCCCTAGTATGT-3'
3 3F 5' TCCTGACACAGCAGACATTA-3' 1.4 ~375
3R 5' TTGGATTTCGTTCTCACTTA-3'
5 5F 5' CTCTTAAGGGCAGTTGTGAG-3' 1.2 ~275
5R 5' TTCCTACTGTGGTTGCTTCC-3'
6 6/7F 5' CTTATTTTAGTGTCCTTAAAAGG-3' 1.6 ~250
6R 5' TTTCATGGACAGCACTTGAGTG-3'
7 7F 5' CACAACAAAGAGCATACATAGGG-3' 1.6 ~275
6/7R 5' TCGGGTTCACTCTGTAGAAG-3'
8 8F1 5' TTCTCTTCAGGAGGAAAAGCA-3' 1.2 ~270
8R1 5' GCTGCCTACCACAAATACAAA-3'
9 9F 5' CCACAGTAGATGCTCAGTAAATA-3' 1.2 ~250
9R 5' TAGGAAAATACCAGCTTCATAGA-3'
10 10F 5' TGGTCAGCTTTCTGTAATCG-3' 1.6 ~250
10R 5' GTATCTACCCACTCTCTTCTTCAG-3'
11A 11AF 5' CCACCTCCAAGGTGTATCA-3' 1.2 372
11AR 5' TGTTATGTTGGCTCCTTGCT-3'
11B 11BF1 5' CACTAAAGACAGAATGAATCTA-3' 1.2 ~400
11BR1 5' GAAGAACCAGAATATTCATCTA-3'
11C 11CF1 5' TGATGGGGAGTCTGAATCAA-3' 1.2 ~400
11CR1 5' TCTGCTTTCTTGATAAAATCCT-3'
11D 11DF1 5' AGCGTCCCCTCACAAATAAA-3' 1.2 ~400
11DR1 5' TCAAGCGCATGAATATGCCT-3'
11E 11EF 5' GTATAAGCAATATGGAACTCGA-3' 1.2 388
11ER 5' TTAAGTTCACTGGTATTTGAACA-3'
11F 11FF 5' GACAGCGATACTTTCCCAGA-3' 1.2 382
11FR 5' TGGAACAACCATGAATTAGTC-3'
11G 11GF 5' GGAAGTTAGCACTCTAGGGA-3' 1.2 423
11GR 5' GCAGTGATATTAACTGTCTGTA-3'
11H 11HF 5' TGGGTCCTTAAAGAAACAAAGT-3' 1.2 366
11HR 5' TCAGGTGACATTGAATCTTCC-3'
11I 11IF 5' CCACTTTTTCCCATCAAGTCA-3' 1.2 377
11IR 5' TCAGGATGCTTACAATTACTTC-3'
11J 11JF 5' CAAAATTGAATGCTATGCTTAGA-3' 1.2 377
11JR 5' TCGGTAACCCTGAGCCAAAT-3'
11K 11KF 5' GCAAAAGCGTCCAGAAAGGA-3' 1.2 396
11KR-1 5' TATTTGCAGTCAAGTCTTCCAA-3'
11L 11LF-1 5' GTAATATTGGCAAAGGCATCT-3' 1.2 360
11LR 5' TAAAATGTGCTCCCCAAAAGCA-3'
12 12F 5' GTCCTGCCAATGAGAAGAAA-3' 1.2 ~300
12R 5' TGTCAGCAAACCTAAGAATGT-3'
13 13F 5' AATGGAAAGCTTCTCAAAGTA-3' 1.2 ~325
13R 5' ATGTTGGAGCTAGGTCCTTAC-3'
14 14F 5' CTAACCTGAATTATCACTATCA-3' 1.2 ~310
14R 5' GTGTATAAATGCCTGTATGCA-3'
15 15F 5' TGGCTGCCCAGGAAGTATG-3' 1.2 ~375
15R 5' AACCAGAATATCTTTATGTAGGA-3'
16 16F 5' AATTCTTAACAGAGACCAGAAC-3' 1.6 ~550
16R 5' AAAACTCTTTCCAGAATGTTGT-3'
17 17F 5' GTGTAGAACGTGCAGGATTG-3' 1.2 ~275
17R 5' TCGCCTCATGTGGTTTTA-3'
18 18F 5' GGCTCTTTAGCTTCTTAGGAC-3' 1.2 ~350
18R 5' GAGACCATTTTCCCAGCATC-3'
19 19F 5' CTGTCATTCTTCCTGTGCTC-3' 1.2 ~250
19R 5' CATTGTTAAGGAAAGTGGTGC-3'
20 20F 5' ATATGACGTGTCTGCTCCAC-3' 1.2 ~425
20R 5' GGGAATCCAAATTACACAGC-3'
21 21F 5' AAGCTCTTCCTTTTTGAAAGTC-3' 1.6 ~300
21R 5' GTAGAGAAATAGAATAGCCTCT-3'
22 22F 5' TCCCATTGAGAGGTCTTGCT-3' 1.6 ~300
22R 5' GAGAAGACTTCTGAGGCTAC-3'
23 23F-1 5' TGAAGTGACAGTTCCAGTAGT-3' 1.2 ~250
23R-1 5' CATTTTAGCCATTCATTCAACAA-3'
24 24F 5' ATGAATTGACACTAATCTCTGC-3' 1.4 ~285
24R 5' GTAGCCAGGACAGTAGAAGGA-3'
1 M13 tailed
Results
Differences in the nucleotide sequences of the five normal individuals are found at seven locations in the gene. The data show that, for each of the samples, the BRCA1 gene is identical except at seven single nucleotide polymorphisms. The changes and their positions are depicted in schematic form in FIG. 1. The alternative alleles containing polymorphic sites (allelic variations that do not cause mutation) along the BRCA1 gene are represented in FIG. 1 as individual "haplotypes" of the BRCA1 gene. The BRCA1(omi1) haplotype is shown in FIG. 1 and indicated with dark shading. The alternative allelic variations occurring at nucleotide positions 2201, 2430, 2731, 3232, 3667, 4427, and 4956 are shown. For comparison, the haplotype previously available in GenBank (as Accession No. U14680) is completely unshaded and designated "GB". As can be seen, the most common, "consensus" haplotype occurs in five separate chromosomes labeled with the OMI symbol (haplotypes 1-5 from left to right). Two additional haplotypes, BRCA1(omi2) and BRCA1(omi3), are represented with mixed shaded and unshaded positions (numbers 7 and 9 from left to right). In total, 7 of the ten haplotypes identified in the group of five individuals tested are not the haplotype available in GenBank. The changes, their positions, and their frequencies among the five individuals (ten chromosomes) initially analyzed are summarized in Table 7, below.
Table 7
NORMAL PANEL TYPING
Amino acid change (codon)   Exon   1     2     3     4     5     Frequency
SER(SER) (694)              11E    C/C   C/T   C/T   T/T   T/T   0.4 C, 0.6 T
LEU(LEU) (771)              11F    T/T   C/T   C/T   C/C   C/C   0.4 T, 0.6 C
PRO(LEU) (871)              11G    C/T   C/T   C/T   T/T   T/T   0.3 C, 0.7 T
GLU(GLY) (1038)             11I    A/A   A/G   A/G   G/G   G/G   0.4 A, 0.6 G
LYS(ARG) (1183)             11J    A/A   A/G   A/G   G/G   G/G   0.4 A, 0.6 G
SER(SER) (1436)             13     T/T   T/T   T/C   C/C   C/C   0.5 T, 0.5 C
SER(GLY) (1613)             16     A/A   A/G   A/G   G/G   G/G   0.4 A, 0.6 G
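The frequency column of Table 7 is a simple allele count over the five genotypes (ten chromosomes) in each row. A minimal sketch, using a hypothetical helper and two rows of the table as input:

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Count alleles across diploid genotypes written as "X/Y"."""
    counts = Counter(allele for g in genotypes for allele in g.split("/"))
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Genotypes of individuals 1-5 at codon 871 (exon 11G) and codon 1436 (exon 13).
codon_871 = allele_frequencies(["C/T", "C/T", "C/T", "T/T", "T/T"])
codon_1436 = allele_frequencies(["T/T", "T/T", "T/C", "C/C", "C/C"])
print(codon_871)
print(codon_1436)
```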
Note that there is no requirement to sequence the additional normal individuals available, as has been done for MSH2 (Example 1, above) and MLH1 (Example 2, above), to more accurately determine the frequencies of uncommon polymorphisms. A common haplotype (the "consensus") is readily evident as different from the GenBank sequence (FIG. 1, "GB") in 50% of chromosomes and indeed is homozygous in two normal individuals. Thus, the "consensus" sequence BRCA1(omi1) should be used as the only true standard for clinical diagnostic analysis in order to avoid misinterpreting polymorphisms as pathologic mutations.
In the alternative, one could compare the test sequence against all four of the BRCA1 functional haplotypes.
Example 4: Pharmacogenetic Analysis of Sulfa Drug Sensitivity
The glucose-6-phosphate dehydrogenase gene is located on the X chromosome. Individuals with certain sequence variations in the G6PDH gene lead relatively normal lives unless they are exposed to certain chemicals found in fava beans, primaquine and sulfonamide antibiotics (sulfisoxazole, sulfamethoxazole, sulfathiazole, sulfacetamide, etc.). Upon administration of such compounds, severe reactions including hemolytic anemia occur in individuals having certain haplotype(s) of the G6PDH gene. These individuals are generally of African and Mediterranean heritage. Because these sequence variations are otherwise of little importance, they have been called both polymorphisms and mutations in the literature. For the purposes of this application, they are called mutations to distinguish them from clear polymorphisms. Genetic analysis in chimpanzees and various human populations indicates that the probable natural "wild-type" is found in individuals sensitive to sulfonamide antibiotics. Beutler et al, Blood 74: 2550-2555 (1989).
A number of apparently inconsequential single nucleotide polymorphisms (SNPs) in the G6PDH gene are known, including at intron 5 (PvuII site), nucleotides 202 (Nla III site), 376 (Fok I site), 1311 and 1116 (Pst I sites). These constitute and define the haplotype. Missense mutations occur at amino acids 32, 48, 58, 68, 106, 126, 131, 156, 163, 165, 181, 182, 188, 198, 213, 216, 227, 282, 285, 291, 317, 323, 335, 342, 353, 363, 385, 386, 387, 393, 394, 398, 410, 439, 447, 454, 459, 463 and amino acid 35 deleted. Many mutations are restricted to certain haplotypes. Thus, haplotype determination provides an indication of whether the individual is sensitive to the drugs listed above.
Experimental
Blood is drawn from 30 individuals of African-American heritage with urinary tract infections having bacteria sensitive to sulfa antibiotics and for whom treatment with trimethoprim-sulfamethoxazole is otherwise deemed appropriate. 1 μg of genomic DNA from each individual is isolated from peripheral blood lymphocytes and amplified by PCR using the primers listed in Hirono et al, Proc. Natl. Acad. Sci. USA 85:3951-3954 (1988) and Beutler et al, Human Genetics 87:462-464 (1990) according to the methods in Example 1 above. Amplified fragments are divided into five aliquots, four of which are each cleaved by a restriction enzyme, either PvuII, Nla III, Fok I or Pst I, according to the manufacturer's (Stratagene and New England Biolabs) instructions. The digests are electrophoresed in a 4% agarose gel (NuSieve, FMC) with 10 μl of ethidium bromide (10 mg/ml) and the number of bands counted under ultraviolet light. The number of bands indicates the presence or absence of restriction enzyme cleavage and thus the presence of a particular nucleotide at the polymorphic site.
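The band-counting logic can be illustrated by scanning an amplicon for each enzyme's recognition sequence. The recognition sequences below are the standard ones for these enzymes; the amplicon strings and function names are hypothetical, and the band estimate assumes complete digestion of a linear fragment with all products resolved on the gel.

```python
# Standard recognition sequences (5'->3') for the four enzymes named in the text.
RECOGNITION_SITES = {
    "PvuII": "CAGCTG",
    "NlaIII": "CATG",
    "FokI": "GGATG",   # non-palindromic, so both strands must be searched
    "PstI": "CTGCAG",
}


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_sites(amplicon: str, enzyme: str) -> int:
    """Count recognition sites for an enzyme on either strand of the amplicon."""
    seq = amplicon.upper()
    site = RECOGNITION_SITES[enzyme]
    patterns = {site, reverse_complement(site)}  # one entry if palindromic
    return sum(
        1
        for pattern in patterns
        for i in range(len(seq) - len(pattern) + 1)
        if seq.startswith(pattern, i)
    )


def expected_bands(amplicon: str, enzyme: str) -> int:
    """A linear fragment cut at n sites yields n + 1 fragments."""
    return count_sites(amplicon, enzyme) + 1


# Hypothetical alleles differing by one base inside a PvuII site.
print(expected_bands("AAACAGCTGTTT", "PvuII"))  # site present
print(expected_bands("AAACAGTTGTTT", "PvuII"))  # site destroyed
```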
An oligonucleotide probe for determining the polymorphic site at nucleotide 1311 is listed in Beutler et al, Human Genetics 87:462-464 (1990). The fifth aliquot is immobilized on a membrane and an ASO (allele specific oligonucleotide) hybridization assay is performed according to the method of Example 5 below. The presence or absence of the label indicating hybridization is considered indicative of the presence of a particular nucleotide at the polymorphic site.
Individuals having a haplotype, particularly the polymorphism at nucleotide 1116, indicative of a very low likelihood of a G6PDH mutation conferring sensitivity to sulfamethoxazole are given 160 mg trimethoprim with 800 mg sulfamethoxazole (SEPTRA DS). Individuals having a haplotype or polymorphism indicative of the possible presence of a G6PDH mutation conferring sensitivity to sulfamethoxazole are given a different antibiotic (varied with the patient) to which their infecting organism is susceptible.
Confirmatory sequencing of both alleles (60 chromosomes) of the coding region of the G6PDH gene is later performed by the techniques of Example 1 to determine the presence of a sensitizing mutation. The haplotype(s) associated with a mutation and those not associated with a mutation are recorded. A panel of oligonucleotides bound to a membrane or other solid phase such as a DNA chip distinguishing the haplotypes and/or the common mutations also is to become part of the present invention.
Example 5: Pharmacogenetic Analysis of BRCA1, BRCA2, PTEN, BAP1, BARD1 and hRAD51 Haplotypes and the Use of Tamoxifen to Prevent Breast Cancer
While every step in carcinogenesis is not known, the BRCA1, BRCA2, PTEN, BAP1, BARD1 and hRAD51 proteins are either involved in breast, ovarian, prostate and other cancer susceptibility, lie in the metabolic pathway of such proteins, or interact with them. It was determined that the most common form of hereditary breast and ovarian cancer, the BRCA1 185delAG mutation, was found essentially exclusively in one haplotype, namely haplotype OMI1 as defined in Example 3, FIG. 1 and U.S. Patent 5,654,155. As such, it was applicant's hypothesis that the haplotypes of other related and similar genes, alone or in certain combinations, provide an indication of association with breast and other cancers associated with these genes, e.g., ovarian, pancreatic, prostate, colon, etc.
The various treatments and prophylactics useful against the disease are also believed to be related to the haplotypes. It is already known that certain mutant genes result in different presentations of cancers and call for different treatment. For example, BRCA1 mutations in the early part of the coding sequence generally give rise to cancers at a younger age than mutations in the later part of the coding sequence. Likewise, breast cancers arising from BRCA2 mutations are typically more sensitive to radiation treatment than other breast cancers. Since some of these proteins actually bind to each other, different combinations of haplotypes may bind with different avidity to each other and operate slightly differently under certain circumstances. The same may hold for proteins which act at separate reactions within the tumor-suppressing mechanisms.
Experimental
Blood samples are drawn from 47 women prescribed tamoxifen to prevent breast cancer or, having had breast cancer, to prevent its recurrence. The DNA sequence for BRCA1 is determined in the regions of the single nucleotide polymorphic sites which constitute the haplotype using the primers according to U.S. Patent 5,654,155. Those of BRCA2 are determined by using the primers of U.S. Patent application 09/084,471 filed May 22, 1998 or using the primers:
TABLE 8 BRCA2 PRIMERS
EXON SEQUENCE POLYMORPHISM
10AF 5'GAATAATATAAATTATATGGCTTA 3' 1093
10AR 5'CCTAGTCTTGCTAGTTCTT 3' 1093
10BF 5'ARCTGAAGTGGAACCAAATGATAC 3' 1593
10BR 5'ACGTGGCAAAGAATTCTCTGAAGTAA 3' 1593
11BF 5'AAGAAGCAAAATGTAATAAGGA 3' 2457
11BR 5'CATTTAAAGCACATACATCTTG 3' 2457
11CF 5'TCTAGAGGCAAAGAATCATAC 3' 2908
11CR 5'CAAGATTATTCCTTTCATTAGC 3' 2908
11DF 5'AACCAAAACACAAATCTAAGAG 3' 3199
11DR 5'GTCATTTTTATATGCTGCTTTAC 3' 3199
11EF 5'GGTTTTATATGGAGACACAGG 3' 3624
11ER 5'GTATTTACAATTTCAACACAAGC 3' 3624
11FF 5'ATCACAGTTTTGGAGGTAGC 3' 4035
11FR 5'CTGACTTCCTGATTCTTCTAA 3' 4035
14F 5'ACCATGTAGCAAATGAGGGTCT 3' 7470
14R 5'GCTTTTGTCTGTTTTCCTCCAA 3' 7470
22F 5'AACCACACCCTTAAGATGA 3' 9079
22R 5'GCATAAGTAGTGGATTTTGC 3' 9079
The DNA sequences for haplotypes of PTEN are determined by using the published primers of Table 3, Liaw et al, Nature Genetics. 16(1): p. 64-67 (1997).
The primers for amplifying hRAD51 are: 5'GGGCCCGGATCCATGGCAATGCAGATGCAGC 3' and 5'GGGCCCCAATGGATATCATTCAGTCTTTGGCATCTCCCACTCC 3'
The primers for amplifying BAP1 are:
PRIMER SEQUENCE
BAP1A-F 5' CACGAGGCATGGCGCTGAGG 3'
BAP1A-R 5' CCGGGCCTTGTCTGTCCACT 3'
BAP1B-F 5' GTCTACCCCATTGACCATGG 3'
BAP1B-R 5' TCATCATCTGAGTACTGCTG 3'
BAP1C-F 5' TGCAGGAGGAAGAAGACCTG 3'
BAP1C-R 5' TCTGTCAGCGCCAGGGGACT 3'
BAP1D-F 5' AGCACAGGCCTGCTGCACCT 3'
BAP1D-R 5' GAAAAGGGGAAGTGGGGCAG 3'
The primers for amplifying BAP1 for polymorphism detection in the 3' UTR are:
BAP1-PF 5'AGCCCAGGCCCCAACACAGCCCCATGGCCTCT 3'
BAP1-PR 5'CTTAGGAGAGTTTTATTCATTCATTGATCCAG 3'
The primers for amplifying BARD1 are: 5'AACAGTACAATGACTGGGCTC 3' and 5'CAGCGCTTCTGCACACAGT 3'
In the cases of BARD1 and hRAD51, the PCR products are sequenced in their entirety. All procedures (e.g., isolation of genomic DNA, amplification, sequencing, and analysis of sequence data) are carried out as described in Example 1. The method as described in Examples 1-3 is used to determine the common haplotypes in these genes.
Once standardized by sequencing, the amplified fragments of BRCA1, BRCA2, PTEN and BAP1 produced by PCR are assayed by hybridization to allele-specific oligonucleotides (ASO) which distinguish the polymorphic site directly. The ASO assay is performed as described in the following experiment.
Binding PCR Products to Nylon Membrane
The PCR products are denatured no more than 30 minutes prior to binding the PCR products to the nylon membrane. To denature the PCR products, the remaining PCR reaction (45 μl) and the appropriate positive control mutant gene amplification product are diluted to 200 μl final volume with PCR Diluent Solution (500 mM NaOH, 2.0 M NaCl, 25 mM EDTA) and mixed thoroughly. The mixture is heated to 95°C for 5 minutes, immediately placed on ice, and held on ice until loaded onto the dot blotter, as described below.
The PCR products are bound to 9 cm by 13 cm nylon ZETA PROBE BLOTTING MEMBRANE (BIO-RAD, Hercules, CA, catalog number 162-0153) using a BIO-RAD dot blotter apparatus. Forceps and gloves are used at all times throughout the ASO analysis to manipulate the membrane, with care taken never to touch the surface of the membrane with bare hands or latex gloves.
Pieces of 3MM filter paper [WHATMAN®, Clifton, NJ] and nylon membrane are pre-wet in 10X SSC prepared fresh from 20X SSC buffer stock. The vacuum apparatus is rinsed thoroughly with dH2O prior to assembly with the membrane. 100 μl of each denatured PCR product is added to the wells of the blotting apparatus. Each row of the blotting apparatus contains a set of reactions for a single exon to be tested, including a placental DNA (negative) control, a synthetic oligonucleotide with the desired mutation or a PCR product from a known mutant sample (positive control), and three no-template DNA controls.
After the PCR products are applied, the nylon filter is placed DNA side up on a piece of 3MM filter paper saturated with denaturing solution (1.5 M NaCl, 0.5 M NaOH) for 5 minutes. The membrane is transferred to a piece of 3MM filter paper saturated with neutralizing solution (1 M Tris-HCl, pH 8, 1.5 M NaCl) for 5 minutes. The neutralized membrane is then transferred to a dry 3MM filter, DNA side up, and exposed to ultraviolet light (STRATALINKER, STRATAGENE, La Jolla, CA) for exactly 45 seconds to fix the DNA to the membrane. This UV crosslinking should be performed within 30 minutes of the denaturation/neutralization steps. The nylon membrane is then cut into strips such that each strip contains a single row of blots of one set of reactions for a single exon.
Hybridizing Labeled Oligonucleotides to the Nylon Membrane
Prehybridization
The strip is prehybridized at 52°C using the HYBAID® (SAVANT INSTRUMENTS, INC., Holbrook, NY) hybridization oven. 2X SSC (15 to 20 ml) is preheated to 52°C in a water bath. For each nylon strip, a single piece of nylon mesh cut slightly larger than the nylon membrane strip (approximately 1" x 5") is pre-wet with 2X SSC. Each single nylon membrane is removed from the prehybridization solution and placed on top of the nylon mesh. The membrane/mesh "sandwich" is then transferred onto a piece of Parafilm™. The membrane/mesh sandwich is rolled lengthwise and placed into an appropriate HYBAID® bottle, such that the rotary action of the HYBAID® apparatus causes the membrane to unroll. The bottle is capped and gently rolled to cause the membrane/mesh to unroll and to evenly distribute the 2X SSC, making sure that no air bubbles form between the membrane and mesh or between the mesh and the side of the bottle. The 2X SSC is discarded and replaced with 5 ml TMAC Hybridization Solution, which contains 3 M TMAC (tetramethylammonium chloride, SIGMA T-3411), 100 mM Na3PO4 (pH 6.8), 1 mM EDTA, 5X Denhardt's (1% Ficoll, 1% polyvinylpyrrolidone, 1% BSA (fraction V)), 0.6% SDS, and 100 μg/ml Herring Sperm DNA. The filter strips are prehybridized at 52°C with medium rotation (approx. 8.5 setting on the HYBAID® speed control) for at least one hour. Prehybridization can also be performed overnight.
Labeling Oligonucleotides
The DNA sequences of the oligonucleotide probes used to detect the BRCA1, BRCA2, PTEN, and BAP1 single nucleotide polymorphisms (SNPs) are as follows (for each polymorphism, both options for the oligonucleotide are given below). The complements of these probes may also be used. Preliminary laboratory data indicate that probes with either greater specificity or sensitivity can be prepared by slightly varying the length and the amount of overlap on each side of the polymorphic region. It is expected that better probes will be prepared by routine experimentation.
TABLE 9 - BRCA1
2201 C 5' ACATGACAGCGATACTT 3'    2201 T 5' ACATGACAGTGATACTT 3'
2430 T 5' AGTATTTCATTGGTACC 3'    2430 C 5' AGTATTTCACTGGTACC 3'
2731 C 5' CATTTGCTCCGTTTTCA 3'    2731 T 5' CATTTGCTCTGTTTTCA 3'
3232 A 5' TTTTTAAAGAAGCCAGC 3'    3232 G 5' TTTTTAAAGGAGCCAGC 3'
3667 A 5' GCGTCCAGAAAGGAGAG 3'    3667 G 5' GCGTCCAGAGAGGAGAG 3'
4427 T 5' AAGTGACTCTTCTGCCC 3'    4427 C 5' AAGTGACTCCTCTGCCC 3'
4956 A 5' TGTGCCCAGAGTCCAGC 3'    4956 G 5' TGTGCCCAGGGTCCAGC 3'
1186 A 5' GGAATAAGCAGAAACTG 3'    1186 G 5' GGAATAAGCGGAAACTG 3'
2196 G 5' AAAAGACATGACAGCGA 3'    2196 A 5' AAAAGACATAACAGCGA 3'
3238 G 5' AAGAAGCCAGCTCAAGC 3'    3238 A 5' AAGAAGCCAACTCAAGC 3'
2202 G 5' CATGACAGTGATACTTT 3'    2202 A 5' CATGACAGTAATACTTT 3'
TABLE 10 - BRCA2
PROBE SEQUENCE
1093 A 5' TAGGACATTGGCATTGA 3'     1093 C 5' TAGGACATGTGGCATTGA 3'
1342 A 5' CTTCTGATTTGCTACATT 3'    1342 C 5' CTTCTGATGTGCTACATT 3'
1593 A 5' GGCTTCTCTGATTTTGGT 3'    1593 G 5' GGCTTCTCGGATTTTGGT 3'
2457 T 5' TTTTGAATATTGTACTGG 3'    2457 C 5' TTTTGAATGTTGTACTGG 3'
2908 G 5' ATTAGCTACTTGGAAGAC 3'    2908 A 5' ATTAGCTATTTGGAAGAC 3'
3199 A 5' CCATTTGTTCATGTAATC 3'    3199 G 5' CCATTTGTCCATGTAATC 3'
3624 A 5' TAGCTTGGTTTTCTAAAC 3'    3624 G 5' TAGCTTGGCTTTCTAAAC 3'
4035 T 5' ATTGAAACAACAGAATCA 3'    4035 C 5' ATTGAAACGACAGAATCA 3'
7470 A 5' TGAAAATGTGATTTAGTT 3'    7470 G 5' TGAAAATGCGATTTAGTT 3'
9079 G 5' TTCCATGGCCTTCCTAAT 3'    9079 A 5' TCCATGGTCTTCCTAAT 3'
TABLE 11 - PTEN
132 C 5' CTTGAAGGCGTATACAGG 3'     132 T 5' CTTGAAGGTGTATACAGG 3'
TABLE 12 - BAP1
+1102 5' ATGGCCTCTACCAGATGGC 3'
+1102 5' ATGGCCTCTCCCAGATGGC 3'
+1102 5' ATGGCCTCTGCCAGATGGC 3'
+1102 5' ATGGCCTCTTCCAGATGGC 3'
+1116 5' CAGATGGCTTTGAAAAAGG 3'
+1116 5' CAGATGGCTTTGCAAAAGG 3'
+1116 5' CAGATGGCTTTGGAAAAGG 3'
+1116 5' CAGATGGCTTTGTAAAAGG 3'
+1131 5' GATCCAAACAGGCCCCTTT 3'
+1131 5' GATCCAACCAGGCCCCTTT 3'
+1131 5' GATCCAAGCAGGCCCCTTT 3'
+1131 5' GATCCAATCAGGCCCCTTT 3'
+1233 5' CCCTGTAAAAACTGGATCA 3'
+1233 5' CCCTGTAAACACTGGATCA 3'
+1233 5' CCCTGTAAAGACTGGATCA 3'
+1233 5' CCCTGTAAATACTGGATCA 3'
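The BAP1 probe sets above share one pattern: a fixed flanking sequence with each of the four bases substituted at the polymorphic position. A short sketch of that pattern, where the `N` placeholder and the function name are illustrative conventions rather than anything specified in the text:

```python
def allele_probes(template: str, placeholder: str = "N") -> dict:
    """Expand a probe template into one probe per base at the marked position."""
    if template.count(placeholder) != 1:
        raise ValueError("template must mark exactly one polymorphic position")
    return {base: template.replace(placeholder, base) for base in "ACGT"}


# The +1102 BAP1 probe set, with the polymorphic base written as N.
probes_1102 = allele_probes("ATGGCCTCTNCCAGATGGC")
for base, probe in probes_1102.items():
    print(base, "5'", probe, "3'")
```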
Each labeling reaction contains 2 μl 5X Kinase buffer (or 1 μl of 10X Kinase buffer), 5 μl gamma-32P ATP (not more than one week old), 1 μl T4 polynucleotide kinase, 3 μl oligonucleotide (20 μM stock), and sterile H2O to 10 μl final volume if necessary. The reactions are incubated at 37°C for 30 minutes, then at 65°C for 10 minutes to heat inactivate the kinase. The kinase reaction is diluted with an equal volume (10 μl) of sterile dH2O (distilled water).
The oligonucleotides are purified on STE MICRO SELECT-D, G-25 spin columns (catalog no. 5303-356769), according to the manufacturer's instructions. The 20 μl synthetic oligonucleotide eluate is diluted with 80 μl dH2O (final volume = 100 μl). The amount of radioactivity in the oligonucleotide sample is determined by measuring the radioactive counts per minute (cpm). The total radioactivity must be at least 2 million cpm. For any sample containing less than 2 million cpm in total, the labeling reaction is repeated.
Hybridization with Oligonucleotides
Approximately 2-5 million counts of the labeled oligonucleotide probe are diluted into 5 ml of TMAC hybridization solution containing 40 μl of 20 μM stock of unlabeled alternative polymorphism oligonucleotide. The probe mix is preheated to 52°C in the hybridization oven. The pre-hybridization solution is removed from each bottle and replaced with the probe mix. The filter is hybridized for 1 hour at 52°C with moderate agitation. Following hybridization, the probe mix is decanted into a storage tube and stored at -20°C. The filter is rinsed by adding approximately 20 ml of 2X SSC + 0.1% SDS at room temperature, rolling the capped bottle gently for approximately 30 seconds, and pouring off the rinse. The filter is then washed with 2X SSC + 0.1% SDS at room temperature for 20 to 30 minutes, with shaking.
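For orientation, the working concentration of the unlabeled competitor oligonucleotide in the probe mix can be estimated by simple dilution arithmetic. The function name is hypothetical and the calculation ignores the small volume contributed by the labeled probe.

```python
def final_concentration_uM(stock_uM: float, stock_ul: float, diluent_ml: float) -> float:
    """Concentration after adding stock_ul of stock to diluent_ml of solution."""
    total_ul = stock_ul + diluent_ml * 1000.0
    return stock_uM * stock_ul / total_ul


# 40 ul of 20 uM unlabeled oligonucleotide in 5 ml of TMAC hybridization solution.
competitor = final_concentration_uM(stock_uM=20.0, stock_ul=40.0, diluent_ml=5.0)
print(f"{competitor:.3f} uM")  # roughly 0.16 uM (800 pmol in about 5 ml)
```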
The membrane is removed from the wash and placed on a dry piece of 3MM WHATMAN filter paper then wrapped in one layer of plastic wrap, placed on the autoradiography film, and exposed for about five hours depending upon a survey meter indicating the level of radioactivity. The film is developed in an automatic Film processor.
Control Hybridization with Normal Oligonucleotides
The purpose of this step is to ensure that the PCR products are transferred efficiently to the nylon membrane.
Following hybridization with the bound oligonucleotide, as described above, each nylon membrane is washed in 2X SSC, 0.1% SDS for 20 minutes at 65°C to melt off the bound oligonucleotide probes. The nylon strips are then prehybridized together in 40 ml of TMAC hybridization solution for at least 1 hour at 52°C in a shaking water bath. 2-5 million counts of each of the normal labeled oligonucleotide probes plus 40 μl of 20μM stock of unlabeled normal oligonucleotide are added directly to the container containing the nylon membranes and the prehybridization solution. The filter and probes are hybridized at 52°C with shaking for at least 1 hour. Hybridization can be performed overnight, if necessary. The hybridization solution is poured off, and the nylon membrane is rinsed in 2X SSC, 0.1 % SDS for 1 minute with gentle swirling by hand. The rinse is poured off and the membrane is washed in 2X SSC, 0.1 % SDS at room temperature for 20 minutes with shaking.
The nylon membrane is removed and placed on a dry piece of 3MM WHATMAN filter paper. The nylon membrane is then wrapped in one layer of plastic wrap and placed on autoradiography film. The exposure is for at least 1 hour.
For each sample, adequate transfer to the membrane is indicated by a strong autoradiographic hybridization signal. An absent or weak signal when a sample is hybridized with its normal oligonucleotide indicates an unsuccessful transfer of PCR product, and any negative result for that sample is a false negative. The ASO analysis must be repeated for any sample that did not successfully transfer to the nylon membrane.
The pattern of hybridization using the probes from the panel according to Tables 9-12 determines the haplotype of the patient sample when compared to the known haplotypes.
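The comparison to known haplotypes can be sketched as a search for pairs of haplotypes that explain the alleles detected at each site. Because the defining alleles of the BRCA1, BRCA2, PTEN and BAP1 haplotypes are not tabulated here, the sketch borrows the three fully legible MSH2 haplotypes of Table 3 purely as stand-in data; the function and variable names are hypothetical.

```python
from itertools import combinations_with_replacement

# Stand-in haplotypes: alleles at IVS9-9, IVS10+6, IVS10+12 (from Table 3).
KNOWN_HAPLOTYPES = {
    "GenBank": ("T", "T", "A"),
    "Variant 1": ("A", "C", "G"),
    "Variant 2": ("A", "C", "A"),
}


def compatible_pairs(observed, known=KNOWN_HAPLOTYPES):
    """Return haplotype pairs whose combined alleles match the ASO result.

    observed: one set per polymorphic site holding the alleles whose
    probes gave a hybridization signal (one allele if homozygous, two if not).
    """
    pairs = []
    for (name1, hap1), (name2, hap2) in combinations_with_replacement(
            sorted(known.items()), 2):
        if all({a, b} == set(site) for a, b, site in zip(hap1, hap2, observed)):
            pairs.append((name1, name2))
    return pairs


# Signals for A and T at site 1, C and T at site 2, and only A at site 3.
print(compatible_pairs([{"A", "T"}, {"C", "T"}, {"A"}]))
```

An empty result would flag a sample carrying a haplotype outside the known panel, and more than one compatible pair would mean the sites typed do not resolve phase unambiguously.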
The degree of breast, ovarian and other cancer prevention with and without tamoxifen and the degree of prevention of recurrence of breast and ovarian cancer with and without tamoxifen are compared for patients grouped by BRCA1, BRCA2, PTEN, BAP1, BARD1 and hRAD51 haplotype, separately and in all possible combinations, using various proprietary data mining techniques similar to the Recognizer™ methodology described in U.S. Patent 5,642,936. Appropriate recommendations regarding the use of tamoxifen for patients of different haplotypes are then made for patients with and without a history of breast or ovarian cancer.
While this example is a retrospective study and thus unacceptable for proof of efficacy for the U.S. Food and Drug Administration, p rospective studies are also part of the present invention. In a prospective study, the test individuals have their haplotypes determined for each pertinent gene prior to determining whether or not they will be accepted for the drug trial or initiate tamoxifen therapy.
Example 6: Pharmacogenetic Analysis of a p53 Polymorphism and the Appropriateness of the Human Papilloma Virus Vaccine
Human papilloma virus (HPV) currently infects up to 40 million Americans with at least one of about 80 different strains. Many strains of the virus cause venereal warts and vulval, penile and perianal cancers. One strain in particular, HPV-16, is believed to be responsible for about half of all cases of cervical cancer. Three other strains are responsible for another 35% of all cervical cancer cases, with HPV-18 causing malignant tumors while HPV-6 and HPV-11 usually form benign lesions. HPV vaccines are made by MedImmune, Inc. (Gaithersburg, Maryland) and Merck & Co. Clinical trials have already begun. While applicant does not wish to be bound by any theory, it is believed that HPV may induce cancer by interacting with p53 in a manner which inhibits the action of p53 to prevent runaway cell growth. It has been known that HPV protein E6 inactivates only p53 proteins from some individuals and not other individuals. Medcalf et al, Oncogene, 8: 2847-2851 (1993). Therefore, determining the haplotype(s) of the p53 gene is believed to indicate who is susceptible to cervical cancer induced by HPV and is therefore a candidate for an HPV vaccine.
Previous commercial p53 gene testing of patient samples performed by Oncormed, Inc. (the owner of this application) involved various sequencing techniques and functional assays for prognostic testing on various tumor samples and susceptibility testing of genomic samples in patients with an inherited mutant p53 gene (Li-Fraumeni Syndrome). While apparent single nucleotide polymorphisms were noticed, such results were not reported as the samples are suspected to contain p53 mutations and do not originate from healthy individuals without a genetic history indicating inheritance of two functional p53 alleles.
Only polymorphisms in the coding region are analyzed because women having cervical cancers are believed to have a p53 protein which is "in-activatable" because the coding sequence for p53 is usually not mutated in cervical cancers. Vogelstein et al, Cell, 70: 523-526 (1992). Thus, the haplotypes were determined based on the single nucleotide polymorphisms at codon 21 (which may be either GAC or GAT), codon 36 (which may be either CCG or CCA), codon 47 (which may be either CCG or TCG), codon 72 (which may be either CGC or CCC) and codon 213 (which may be either CGA or CGG).
Experimental protocol
Blood samples are from 53 healthy individuals having a history of venereal warts or at risk from exposure to HPV. Exposure is defined as an individual having regular sexual contact with an infected individual without a barrier preventing transmission of HPV. These individuals have either stage I (normal) or stage II (inflammation) PAP smears. Some of the individuals had been previously treated for venereal warts with one or more of the following treatments: podophyllin, trichloroacetic acid, cryosurgery, cauterization or interferon. Also, blood samples are from 12 patients with a history of cervical cancer as defined by a stage IV (carcinoma in-situ) or greater PAP smear result. Note that individuals having a stage III PAP smear (dysplasia) are not included in this study. White blood cells are collected and genomic DNA is extracted from the white blood cells according to well-known methods (Sambrook, et al, Molecular Cloning, A Laboratory Manual, 2nd Ed., 1989, Cold Spring Harbor Laboratory Press, at 9.16 - 9.19).
PCR Amplification for Sequencing
The genomic DNA is used as a template to amplify a DNA fragment encompassing the site of the mutation to be tested. The 25 μl PCR reaction contains the following components: 1 μl template (100 ng/μl) DNA, 2.5 μl 10X PCR Buffer (PERKIN-ELMER), 1.5 μl dNTP (2 mM each dATP, dCTP, dGTP, dTTP), 1.5 μl Forward Primer (10 μM), 1.5 μl Reverse Primer (10 μM), 0.5 μl (2.5 U total) AMPLITAQ GOLD™ TAQ DNA POLYMERASE or AMPLITAQ® TAQ DNA POLYMERASE (PERKIN-ELMER), 1.0 to 5.0 μl (25 mM) MgCl2 (depending on the primer) and distilled water (dH2O) up to 25 μl. All reagents for each exon except the genomic DNA can be combined in a master mix and aliquoted into the reaction tubes as a pooled mixture. The primers are listed below.
NAME SEQUENCE LENGTH INTRON
2F 5'-TCATGCTGGATCCCCACTTTTCCTCTTG-3' 28 31
2R 5'-GGTGGCCTGCCCTTCCAATGGATCCACT-3' 28 3
3F 5'-AATTCATGGGACTGACTTTCTGCTCTTGTC-3' 30 6
3R 5'-TCCAGGTCCCAGCCCAACCCTTGTCC-3' 26 4
4F 5'-GTCCTCTGACTGCTCTTTTCACCCATCTAC-3' 30 2
4R 5'-GGGATACGGCCAGGCATTGAAGTCTC-3' 26 29
5F 5'-CTTGTGCCCTGACTTTCAACTCTGTCTC-3' 28 16
5R 5'-TGGGCAACCAGCCCTGTCGTCTCTCCA-3' 27 15
6F 5'-CCAGGCCTCTGATTCCTCACTGATTGCTC-3' 29 4
6R 5'-GCCACTGACAACCACCCTTAACCCCTC-3' 27 29
7F 5'-GCCTCATCTTGGGCCTGTGTTATCTCC-3' 27 3
7R 5'-GGCCAGTGTGCAGGGTGGCAAGTGGCTC-3' 28 5
8F 5'-GTAGGACCTGATTTCCTTACTGCCTCTTGC-3' 30 23
8R 5'-ATAACTGCACCCTTGGTCTCCTCCACCGC-3' 29 20
9F 5'-CACTTTTATCACCTTTCCTTGCCTCTTTCC-3' 30 3
9R 5'-AACTTTCCACTTGATAAGAGGTCCCAAGAC-3' 30 7
10F 5'-ACTTACTTCTCCCCCTCCTCTGTTGCTGC-3' 29
10R 5'-ATGGAATCCTATGGCTTTCCAACCTAGGAAG-3' 31 39
11F 5'-CATCTCTCCTCCCTGCTTCTGTCTCCTAC-3' 29 2
11R 5'-CTGACGCACACCTATTGCAAGCAAGGGTTC-3' 30 80
The term "INTRON" refers to the location in the intron where the primer anneals.
Alternatively, exons 2 and 3 may be amplified together with the following primers:
p53-2/3F 5'GAAGCGTCTCATGCTGGAT 3'
p53-2/3R 5'GGGGACTGTAGATGGGTGAA 3'
For each exon analyzed, the following control PCRs are set up:
(1) "Negative" DNA control (100 ng placental DNA (SIGMA CHEMICAL CO., St. Louis, MO))
(2) Three "no template" controls
PCR for all exons is performed using the following thermocycling conditions:
Temperature Time Number of Cycles
95°C 5 min. (AMPLITAQ) or 10 min. (GOLD) 1
95°C 30 sec. \
55°C 30 sec. } 30 cycles
72°C 1 min. /
72°C 5 min. 1
4°C hold 1
Quality control agarose gel of PCR amplification:
The quality of the PCR products is examined prior to further analysis by electrophoresing an aliquot of each PCR reaction sample on an agarose gel. 5 μl of each PCR reaction is run on an agarose gel alongside a 100 BP DNA LADDER (Gibco BRL cat# 15628-019). The electrophoresed PCR products are analyzed according to the following criteria: Each patient sample must show a single band of the size corresponding to the number of base pairs expected from the length of the PCR product from the forward primer to the reverse primer. If a patient sample demonstrates smearing or multiple bands, the PCR reaction must be repeated until a clean, single band is detected. If no PCR product is visible or if only a weak band is visible, but the control reactions with placental DNA template produced a robust band, the patient sample should be re-amplified with 2X as much template DNA.
All three "no template" reactions must show no amplification products. Any PCR product present in these reactions is the result of contamination. If any one of the "no template" reactions shows contamination, all PCR products should be discarded and the entire PCR set of reactions should be repeated after the appropriate PCR decontamination procedures have been taken.
The optimum amount of PCR product on the gel should be between 50 and 100 ng, which can be determined by comparing the intensity of the patient sample PCR products with that of the DNA ladder. If the patient sample PCR products contain less than 50 to 100 ng, the PCR reaction should be repeated until sufficient quantity is obtained.
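The acceptance criteria above can be summarized as a small decision routine. The following Python sketch is illustrative only: the function name, the argument names and the returned strings are hypothetical, and treating a "weak" band as one estimated below 50 ng is an assumption made for the example rather than a definition taken from the protocol.

```python
def gel_qc(n_bands, smeared, est_ng, control_robust, ntc_bands):
    """Suggest the next step for one patient lane on the QC agarose gel.

    n_bands        -- number of distinct bands seen in the patient lane
    smeared        -- True if the lane shows smearing
    est_ng         -- product amount estimated against the DNA ladder (ng)
    control_robust -- True if the placental DNA control gave a robust band
    ntc_bands      -- booleans, True where a "no template" lane shows product
    """
    # Any product in a "no template" lane means contamination of the whole set.
    if any(ntc_bands):
        return "discard all products, decontaminate, repeat the PCR set"
    # Smearing or multiple bands: repeat until a clean single band is seen.
    if smeared or n_bands > 1:
        return "repeat PCR"
    # No band or a weak band (assumed here to mean under 50 ng).
    if n_bands == 0 or est_ng < 50:
        if control_robust:
            return "re-amplify with 2X template"
        return "repeat PCR"
    return "proceed"
```

For example, a clean single band estimated at 80 ng with three empty "no template" lanes returns "proceed", while a 20 ng band alongside a robust placental control returns "re-amplify with 2X template".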
DNA Sequencing
For DNA sequencing, double stranded PCR products are labeled with four different fluorescent dyes, one specific for each nucleotide, in a cycle sequencing reaction. With Dye Terminator Chemistry, when one of these nucleotides is incorporated into the elongating sequence it causes a termination at that point. Over the course of the cycle sequencing reaction, the dye-labeled nucleotides are incorporated along the length of the PCR product, generating many different length fragments.
The dye-labeled PCR products will separate according to size when electrophoresed through a polyacrylamide gel. At the lower portion of the gel on an ABI automated sequencer, the fragments pass through a region where a laser beam continuously scans across the gel. The laser excites the fluorescent dyes attached to the fragments, causing the emission of light at a specific wavelength for each dye. Either a photomultiplier tube (PMT) detects the fluorescent light and converts it into an electrical signal (ABI 373) or the light is collected and separated according to wavelength by a spectrograph onto a cooled, charge coupled device (CCD) camera (ABI 377). In either case the data collection software will collect the signals and store them for subsequent sequence analysis.
PCR products are first purified for sequencing using a QIAQUICK-SPIN PCR PURIFICATION KIT (QIAGEN #28104). The purified PCR products are labeled by adding primers, fluorescently tagged dNTPs and Taq Polymerase FS in an ABI Prism Dye Terminator Cycle Sequencing Kit (PERKIN ELMER/ ABI catalog #02154) in a PERKIN ELMER GENEAMP 9600 thermocycler.
The amounts of each component are:
For Samples | For Controls
Reagent Volume | Reagent Volume
Dye mix 8.0 μL | PGEM 2.0 μL
Primer (1.6 μM) 2.0 μL | M13 2.0 μL
PCR product 2.0 μL | Dye mix 8.0 μL
sdH2O 8.0 μL | sdH2O 8.0 μL
The thermocycling conditions are:
Temperature Time # of Cycles
96°C 15 sec. \
50°C 5 sec. } 25
60°C 4 min. /
4°C hold 1
The product is then loaded into a gel and placed into an ABI DNA Sequencer (Models 373A & 377) and run. The sequence obtained is analyzed by comparison to the wild type (reference) sequence using SEQUENCE NAVIGATOR software. When a sequence does not align, it indicates a possible mutation or polymorphism. The DNA sequence is determined in both the forward and reverse directions. All results are provided to a second reader for review.
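The comparison to the reference sequence, and the confirmation of a difference in both read directions, can be illustrated with a short Python sketch. It is not a description of the SEQUENCE NAVIGATOR software: it assumes reads that are already aligned to the reference and of equal length, it ignores heterozygous (mixed) base calls, and all function names are hypothetical. The example sequences are the two codon 72 probe sequences given in this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(reference, read):
    """List (1-based position, reference base, read base) for each difference."""
    if len(reference) != len(read):
        raise ValueError("sketch assumes pre-aligned sequences of equal length")
    return [
        (i + 1, ref_base, read_base)
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base != read_base
    ]


def confirmed_variants(reference, forward_read, reverse_read):
    """Keep only differences seen in both the forward and the reverse read."""
    forward = mismatches(reference, forward_read)
    reverse = mismatches(reference, reverse_complement(reverse_read))
    return [m for m in forward if m in reverse]


reference = "GCTCCCCCCGTGGCCCCT"   # codon 72 probe, first form
variant = "GCTCCCCGCGTGGCCCCT"     # codon 72 probe, second form
print(mismatches(reference, variant))
```

A difference reported by only one read direction is dropped by `confirmed_variants`, which mirrors the requirement that the sequence be determined in both directions before a possible polymorphism is passed to a second reader.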
PCR Amplification for ASO
The genomic DNA is used as a template to amplify a separate DNA fragment encompassing the site of the mutation to be tested. The 50 μl PCR reaction contains the following components: 1 μl template (100 ng/μl) DNA, 5.0 μl 10X PCR Buffer (PERKIN-ELMER), 2.5 μl dNTP (2 mM each dATP, dCTP, dGTP, dTTP), 2.5 μl Forward Primer (10 μM), 2.5 μl Reverse Primer (10 μM), 0.5 μl (2.5 U total) AMPLITAQ® TAQ DNA POLYMERASE or AMPLITAQ GOLD™ DNA POLYMERASE (PERKIN-ELMER), 1.0 to 5.0 μl (25 mM) MgCl2 (depending on the primer) and distilled water (dH2O) up to 50 μl. All reagents for each exon except the genomic DNA can be combined in a master mix and aliquoted into the reaction tubes as a pooled mixture. The primers described above are used.
For each exon analyzed, the following control PCRs are set up:
(1) "Negative" DNA control (100 ng placental DNA (SIGMA CHEMICAL CO., St. Louis, MO))
(2) Three "no template" controls.
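The master mix for one exon can be scaled from the 50 μl recipe and the control set described above. The Python sketch below is a worked illustration under stated assumptions: the function name and dictionary keys are hypothetical, and the `overage` allowance for pipetting loss is an addition for the example that does not appear in the protocol. Template DNA is left out of the mix because it is added to each tube separately.

```python
# Per-reaction volumes (in microlitres) taken from the 50 ul ASO PCR recipe.
PER_REACTION_UL = {
    "10X PCR buffer": 5.0,
    "dNTP mix (2 mM each)": 2.5,
    "forward primer": 2.5,
    "reverse primer": 2.5,
    "Taq polymerase": 0.5,
}
TEMPLATE_UL = 1.0        # genomic DNA, added per tube, not part of the mix
FINAL_VOLUME_UL = 50.0


def master_mix(n_samples, mgcl2_ul, overage=0.10):
    """Return total master mix volumes (ul) for one exon.

    n_samples counts patient samples only; one placental DNA control and
    three "no template" controls are added, as in the protocol.
    overage is a hypothetical pipetting allowance (fraction of the total).
    """
    if not 1.0 <= mgcl2_ul <= 5.0:
        raise ValueError("MgCl2 volume must be 1.0 to 5.0 ul per reaction")
    n_tubes = n_samples + 1 + 3
    per_tube = dict(PER_REACTION_UL)
    per_tube["MgCl2 (25 mM)"] = mgcl2_ul
    # Water makes up the remainder once the template volume is reserved.
    per_tube["dH2O"] = FINAL_VOLUME_UL - TEMPLATE_UL - sum(per_tube.values())
    scale = n_tubes * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in per_tube.items()}
```

With eight patient samples, 2.0 μl MgCl2 per reaction and no overage, twelve tubes are prepared and each receives 49 μl of mix before the 1 μl of template (or water, for the "no template" controls) is added.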
PCR for all exons is performed using the following thermocycling conditions:
Temperature Time Number of Cycles
95°C 5 min. (AMPLITAQ) or 10 min. (GOLD) 1
95°C 30 sec. \
55°C 30 sec. } 30 cycles
72°C 1 min. /
72°C 5 min. 1
4°C hold 1
The quality control agarose gel of PCR amplification is performed as above.
Binding PCR Products to Nylon Membrane
The PCR products are denatured no more than 30 minutes prior to binding the PCR products to the nylon membrane. To denature the PCR products, the remaining PCR reaction (45 μl) and the appropriate positive control polymorphism gene amplification product are diluted to 200 μl final volume with PCR Diluent Solution (500 mM NaOH, 2.0 M NaCl, 25 mM EDTA) and mixed thoroughly. The mixture is heated to 95°C for 5 minutes, and immediately placed on ice and held on ice until loaded onto the dot blotter, as described below. The PCR products are bound to 9 cm by 13 cm nylon ZETA PROBE BLOTTING MEMBRANE (BIO-RAD, Hercules, CA, catalog number 162-0153) using a BIO-RAD dot blotter apparatus.
Pieces of 3MM filter paper [WHATMAN®, Clifton, NJ] and nylon membrane are pre-wet in 10X SSC prepared fresh from 20X SSC buffer stock. The vacuum apparatus is rinsed thoroughly with dH2O prior to assembly with the membrane. 100 μl of each denatured PCR product is added to the wells of the blotting apparatus. Each row of the blotting apparatus contains a set of reactions for a single exon to be tested, including a placental DNA (negative) control, a synthetic oligonucleotide with the desired mutation or a PCR product from a known polymorphic sample (positive control), and three no template DNA controls.
After applying PCR products, the nylon filter is placed DNA side up on a piece of 3MM filter paper saturated with denaturing solution (1.5 M NaCl, 0.5 M NaOH) for 5 minutes. The membrane is transferred to a piece of 3MM filter paper saturated with neutralizing solution (1 M Tris-HCl, pH 8, 1.5 M NaCl) for 5 minutes. The neutralized membrane is then transferred to a dry 3MM filter DNA side up, and exposed to ultraviolet light (STRATALINKER, STRATAGENE, La Jolla, CA) for exactly 45 seconds to fix the DNA to the membrane. This UV crosslinking should be performed within 30 min. of the denaturation/neutralization steps. The nylon membrane is then cut into strips such that each strip contains a single row of blots of one set of reactions for a single exon.
Hybridizing Labeled Oligonucleotides to the Nylon Membrane
Prehybridization
The strip is prehybridized at 52°C using the HYBAID® (SAVANT INSTRUMENTS, INC., Holbrook, NY) hybridization oven. 2X SSC (15 to 20 ml) is preheated to 52°C in a water bath. For each nylon strip, a single piece of nylon mesh cut slightly larger than the nylon membrane strip (approximately 1" x 5") is pre-wet with 2X SSC. Each single nylon membrane is removed from the prehybridization solution and placed on top of the nylon mesh. The membrane/mesh "sandwich" is then transferred onto a piece of Parafilm™. The membrane/mesh sandwich is rolled lengthwise and placed into an appropriate HYBAID® bottle, such that the rotary action of the HYBAID® apparatus causes the membrane to unroll. The bottle is capped and gently rolled to cause the membrane/mesh sandwich to unroll and to evenly distribute the 2X SSC, making sure that no air bubbles form between the membrane and mesh or between the mesh and the side of the bottle. The 2X SSC is discarded and replaced with 5 ml TMAC Hybridization Solution, which contains 3 M TMAC (tetramethylammonium chloride - SIGMA T-3411), 100 mM Na3PO4 (pH 6.8), 1 mM EDTA, 5X Denhardt's (1% Ficoll, 1% polyvinylpyrrolidone, 1% BSA (fraction V)), 0.6% SDS, and 100 μg/ml Herring Sperm DNA. The filter strips are prehybridized at 52°C with medium rotation (approx. 8.5 setting on the HYBAID® speed control) for at least one hour. Prehybridization can also be performed overnight.
Labeling Oligonucleotides
Numerous oligonucleotide probes are used to detect the p53 mutation. For each mutation, a polymorphic and a normal oligonucleotide must be labeled. While only five pairs of oligonucleotide probes are listed below, corresponding oligonucleotides for each mutation may be prepared and used in the same manner.
Polymorphism in codon 21 wild-type 5'TTTTCAGACCTATGGAAAC 3' other wt 5'TTTTCAGATCTATGGAAAC 3'
Polymorphism in codon 36 wild-type 5'CCCTTGCCGTCCCAAGCA 3' other wt 5'CCCTTGCCATCCCAAGCA 3'
Polymorphism in codon 47 wild-type 5'CTGTCCCCGGACGATATT 3' other wt 5'CTGTCCCCAGACGATATT 3'
Polymorphism in codon 72 wild-type 5'GCTCCCCCCGTGGCCCCT 3' other wt 5'GCTCCCCGCGTGGCCCCT 3'
Polymorphism in codon 213 wild-type 5'ACTTTTCGACATAGTGTG 3' other wt 5'ACTTTTCGGCATAGTGTG 3'
Each labeling reaction contains 2 μl 5X Kinase buffer (or 1 μl of 10X Kinase buffer), 5 μl gamma-ATP 32P (not more than one week old), 1 μl T4 polynucleotide kinase, 3 μl oligonucleotide (20 μM stock), and sterile H2O to 10 μl final volume if necessary. The reactions are incubated at 37°C for 30 minutes, then at 65°C for 10 minutes to heat inactivate the kinase. The kinase reaction is diluted with an equal volume (10 μl) of sterile dH2O (distilled water).
The oligonucleotides are purified on STE MICRO SELECT-D, G-25 spin columns (catalog no. 5303-356769), according to the manufacturer's instructions. The 20 μl synthetic oligonucleotide eluate is diluted with 80 μl dH2O (final volume = 100 μl). The amount of radioactivity in the oligonucleotide sample is determined by measuring the radioactive counts per minute (cpm). The total radioactivity must be at least 2 million cpm. For any samples containing less than 2 million cpm total, the labeling reaction is repeated.
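The radioactivity check reduces to simple arithmetic, sketched below in Python. The sketch assumes that a small aliquot of the 100 μl diluted eluate is counted and that the result scales linearly to the whole volume; the aliquot size, the function name and the default target are hypothetical choices for illustration, and the volume removed for counting is ignored. The target corresponds to the 2-5 million cpm called for in the hybridization step.

```python
ELUATE_UL = 100.0           # diluted eluate volume stated in the protocol
MIN_TOTAL_CPM = 2_000_000   # minimum total activity stated in the protocol


def probe_plan(aliquot_cpm, aliquot_ul=1.0, target_cpm=2_000_000):
    """Return (total_cpm, volume_ul_for_target), or None if labeling must be repeated.

    aliquot_cpm -- counts per minute measured on the counted aliquot
    aliquot_ul  -- volume of that aliquot (assumed 1 ul here)
    target_cpm  -- activity wanted in one hybridization (2 to 5 million cpm)
    """
    if aliquot_cpm < 0 or aliquot_ul <= 0:
        raise ValueError("counts must be non-negative and volume positive")
    cpm_per_ul = aliquot_cpm / aliquot_ul
    total_cpm = cpm_per_ul * ELUATE_UL
    if total_cpm < MIN_TOTAL_CPM:
        return None  # below 2 million cpm total: repeat the labeling reaction
    return total_cpm, target_cpm / cpm_per_ul
```

For instance, a 1 μl aliquot reading 40,000 cpm implies 4 million cpm in total, so 50 μl of the probe supplies 2 million cpm for one hybridization.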
Hybridization with Oligonucleotides
Approximately 2-5 million cpm of the labeled polymorphic oligonucleotide probe is diluted into 5 ml of TMAC hybridization solution, containing 40 μl of 20 μM stock of unlabeled normal oligonucleotide. The probe mix is preheated to 52°C in the hybridization oven. The pre-hybridization solution is removed from each bottle and replaced with the probe mix. The filter is hybridized for 1 hour at 52°C with moderate agitation. Following hybridization, the probe mix is decanted into a storage tube and stored at -20°C. The filter is rinsed by adding approximately 20 ml of 2X SSC + 0.1% SDS at room temperature and rolling the capped bottle gently for approximately 30 seconds and pouring off the rinse. The filter is then washed with 2X SSC + 0.1% SDS at room temperature for 20 to 30 minutes, with shaking.
The membrane is removed from the wash and placed on a dry piece of 3MM WHATMAN filter paper, then wrapped in one layer of plastic wrap, placed on the autoradiography film, and exposed for about five hours depending upon a survey meter indicating the level of radioactivity. The film is developed in an automatic film processor.
Control Hybridization with Normal Oligonucleotides
The purpose of this step is to ensure that the PCR products are transferred efficiently to the nylon membrane.
Following hybridization with the polymorphic oligonucleotide, each nylon membrane is washed in 2X SSC, 0.1% SDS for 20 minutes at 65°C to melt off the polymorphic oligonucleotide probes. The nylon strips are then prehybridized together in 40 ml of TMAC hybridization solution for at least 1 hour at 52°C in a shaking water bath. 2-5 million counts of each of the normal labeled oligonucleotide probes plus 40 μl of 20 μM stock of unlabeled normal oligonucleotide are added directly to the container containing the nylon membranes and the prehybridization solution. The filter and probes are hybridized at 52°C with shaking for at least 1 hour. Hybridization can be performed overnight, if necessary. The hybridization solution is poured off, and the nylon membrane is rinsed in 2X SSC, 0.1% SDS for 1 minute with gentle swirling by hand. The rinse is poured off and the membrane is washed in 2X SSC, 0.1% SDS at room temperature for 20 minutes with shaking.
The nylon membrane is removed and placed on a dry piece of 3MM WHATMAN filter paper. The nylon membrane is then wrapped in one layer of plastic wrap and placed on autoradiography film, and exposure is for at least 1 hour.
For each sample, adequate transfer to the membrane is indicated by a strong autoradiographic hybridization signal. For each sample, an absent or weak signal when hybridized with its normal oligonucleotide, indicates an unsuccessful transfer of PCR product, and it is a false negative. The ASO analysis must be repeated for any sample that did not successfully transfer to the nylon membrane.
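The way the two hybridizations are read together can be sketched as follows. This Python fragment is only an illustration of the rule stated above: the signal categories, the function name and the returned strings are hypothetical, and treating a weak signal with the polymorphic probe as "not detected" is an assumption of the sketch rather than a criterion given in the protocol.

```python
VALID_SIGNALS = {"strong", "weak", "absent"}


def call_aso(polymorphic_signal, normal_signal):
    """Interpret one dot from the polymorphic and the normal (control) films."""
    if polymorphic_signal not in VALID_SIGNALS or normal_signal not in VALID_SIGNALS:
        raise ValueError("signals must be 'strong', 'weak' or 'absent'")
    # An absent or weak signal with the normal probe indicates failed transfer,
    # so any result for this sample would be a false negative.
    if normal_signal != "strong":
        return "repeat ASO"
    if polymorphic_signal == "strong":
        return "polymorphic allele detected"
    return "polymorphic allele not detected"
```

A sample is therefore scored only when the normal oligonucleotide confirms that its PCR product reached the membrane.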
Homozygous individuals having haplotypes with the single nucleotide polymorphism (SNP) arginine at codon 72 are overrepresented in the genomic alleles of cervical cancer patients. In addition, it was recently published that cervical tumors have the SNP arginine at codon 72 at significantly higher frequency than normal tissue. Storey et al, Nature, 393: 229-234 (1998). Healthy women having such haplotypes are candidates for the HPV vaccines to prevent HPV infection, treat venereal warts, treat cervical and other related cancers, and prevent reoccurrence of venereal warts previously treated.
Example 7: Pharmacogenetic Analysis of PI Haplotype and Platelet Sensitivity to Aspirin
Aspirin has been a standard anticoagulant therapy for patients who have had a heart attack. In recent years, aspirin therapy has been extended to individuals with a history of or at risk for stroke (apoplexy) and phlebitis. It has even been proposed that every person over 50 years of age should take aspirin.
However, some people cannot take aspirin due to allergy, erosion of the stomach lining etc. Furthermore, research has shown that aspirin prevents heart attacks in about 40 percent of patients taking aspirin. Thus, it is desirable to determine which people will respond to aspirin and which will not in order to administer other anticoagulant or antiplatelet medication.
Platelet aggregation is recognized as an important step in the formation of a blockage which will cause a myocardial infarction and unstable angina. Platelet aggregation is based on glycoprotein gpIIb/IIIa. Different forms of this glycoprotein have been known. Weiss et al, Tissue Antigens, 46: 374-381 (1995), Kunicki et al, Molecular Immunology 16: 353-60 (1979). Various polymorphisms may be determined by DNA analysis. Newman et al, Journal of Clinical Investigation 83:1778-81 (1989). It has been reported that patients having one polymorphic form of the PI gene have a higher incidence of acute coronary thrombosis, particularly patients younger than 60. Weiss et al, New England Journal of Medicine 334(17):1090-1094 (1996). However, these findings were contradicted by Ridker, et al, Lancet 349: 385-388 (1997) with comments in Lancet on pages 370-371, 1099-1100 and 1100-1. Adding to the debate, it was recently published that platelet aggregation in individuals carrying haplotype PIA2 is less inhibited by aspirin at certain concentrations than in individuals homozygous for haplotype PIA1. Cooke et al, Lancet 351: 1253 (1998).
Resolving the issue for people at risk of heart attacks, stroke and other thrombogenic disorders is desirable, particularly in distinguishing between those who can take aspirin and those who should take other medication which is more costly and has greater side effects.
Experimental protocol
Blood samples are taken from 50 healthy individuals ages 50-55. Family history and personal histories of heart disease and other thrombogenic disorders are recorded. White blood cells are collected and genomic DNA is extracted from the white blood cells, PCR amplified and the sequence determined by ASO or sequenced as in the Examples above using different primers and probes. Newman et al, Journal of Clinical Investigation 83:1778-81 (1989). As before, PCR primers and ASO probes are designed to type these individuals for exon 2 to determine which base exists at nucleotide position 1565: a T or a C. At the amino acid level, codon 33 is changed from a leucine to a proline.
Individuals having haplotype PIA2 in either homozygous or heterozygous form are instructed either to take high dosages of aspirin (2000 mg per day) or not to take aspirin and are given other medication appropriate for their individual needs. Individuals homozygous for haplotype PIA1 are instructed to take aspirin at low dosages (350 mg per day).
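The grouping rule of this example can be written as a short lookup, shown here as a Python sketch. It assumes that a T at nucleotide position 1565 (leucine at codon 33) corresponds to haplotype PIA1 and a C (proline at codon 33) to haplotype PIA2; the function names and returned strings are hypothetical and serve only to illustrate how the study groups are assigned.

```python
# Assumed mapping: T encodes leucine at codon 33 (PIA1), C encodes proline (PIA2).
HAPLOTYPE_BY_BASE = {"T": "PIA1", "C": "PIA2"}


def pi_genotype(base1, base2):
    """Return the sorted pair of haplotypes for the two bases found at position 1565."""
    return tuple(sorted(HAPLOTYPE_BY_BASE[b.upper()] for b in (base1, base2)))


def study_group(genotype):
    """Assign the instruction given to an individual in this example."""
    if "PIA2" in genotype:
        # Homozygous or heterozygous PIA2 individuals.
        return "high-dose aspirin (2000 mg/day) or other medication"
    return "low-dose aspirin (350 mg/day)"
```

An individual typed T/C is placed in the PIA2 group, while an individual typed T/T is placed in the low-dosage group.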
The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description and accompanying figure. Such modifications are intended to fall within the scope of the appended claims.
Various publications are cited herein, the disclosures of which are incorporated by reference in their entireties.

Claims

WHAT IS CLAIMED IS:
1. A method of determining a functional allele profile of a gene in a population, comprising:
(a) identifying the nucleotide sequence of a gene of interest out of genomic DNA from each of a population of individuals identified as having a family history which indicates inheritance of functional alleles of the gene of interest;
(b) identifying the haplotype sequence of at least one individual identified as having a family history which indicates inheritance of only functional alleles of a gene of interest;
(c) if any heterozygous sequence is identified in step (a), then subtracting the haplotype sequence identified in (b) from said heterozygous sequence to identify the companion haplotype to the haplotype identified in step (b);
(d) determining the frequency of occurrence of the haplotypes determined in steps (b) and (c); and
(e) rank ordering the frequency of occurrence of each haplotype, whereby the identity of the alleles containing each haplotype and the determination of their relative frequencies constitutes the functional allele profile of the gene of interest in the population.
2. The method of claim 1 wherein a haplotype is identified in step (b) by determining the sequence of a homozygous individual.
3. The method of claim 2 wherein the sequence of the homozygous individual is identified in step (a).
4. The method of claim 2 wherein the sequence of the homozygous individual is obtained from an individual not among the individuals identified in step (a).
5. The method of claim 1 wherein a haplotype is identified in step (b) by sequencing analysis of a cDNA sample.
6. The method of claim 5 wherein the cDNA sample is obtained from an individual whose sequence is identified in step (a).
7. The method of claim 5 wherein the cDNA sample is obtained from an individual not among the population in step (a).
8. The method of claim 1 wherein at least one family history in step (a) is determined by pedigree analysis.
9. The method of claim 1 wherein at least one family history in step (a) is determined by questionnaire.
10. The method of claim 1 wherein at least one genomic sequence of step (a) contains all exons of the gene.
11. The method of claim 1 wherein at least one genomic sequence of step (a) contains intronic sequences.
12. The method of claim 10 or 11 wherein at least one genomic sequence of individual amplified exons is identified.
13. A method of determining the consensus functional sequence of a gene in a population, comprising:
(a) identifying the sequence of a gene of interest out of genomic DNA from each of a population of individuals identified as having a family history which indicates inheritance of functional alleles of the gene of interest;
(b) identifying the haplotype sequence of at least one individual identified as having a family history which indicates inheritance of only functional alleles of a gene of interest;
(c) if any heterozygous sequence is identified in step (a), then subtracting the haplotype sequence identified in (b) from said heterozygous sequence to identify the companion haplotype to the haplotype identified in step (b);
(d) determining the frequency of occurrence of the haplotypes determined in steps (b) and (c); and
(e) rank ordering the frequency of occurrence of each haplotype, whereby the most frequently occurring sequence is the consensus functional sequence of the gene in the population.
14. The method of claim 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 13 wherein the population in step (a) contains at least five individuals.
15. A method of determining a functional allele profile comprising, in this order:
(a) determining the nucleotide sequence of at least one allele containing the nucleotide sequence of an isolated coding region of a gene of interest from a single individual identified as having a family history which indicates inheritance of functional alleles of the gene of interest;
(b) determining the genomic sequence of the same gene of interest inclusive of any naturally occurring polymorphisms from a subpopulation of at least five unrelated individuals, other than the individual in step (a), identified as having a family history which indicates inheritance of functional alleles of the gene of interest;
(c) subtracting the sequence in (a) from the sequence identified in (b) for all individuals tested, such that the sequence remaining after subtraction determines the companion allele to the allele in (a);
(d) if the sequence determined in (a) is not present among the sequences determined in (b), determining an alternative allele having a haplotype for comparison and subtraction in the individuals in (b) by identifying at least one individual among the population in (b) homozygous for the allele having said haplotype;
(e) determining the frequency of occurrence of the allele determined in (a), (c) and (d) among the samples in (b); and
(f) rank ordering the frequency of occurrence of the alleles to obtain a "functional allele profile" for the gene of interest.
16. A method of determining the consensus functional sequence of a gene in a population, comprising in this order:
(a) determining the nucleotide sequence of at least one allele containing the nucleotide sequence of an isolated coding region of a gene of interest from a single individual identified as having a family history which indicates inheritance of functional alleles of the gene of interest;
(b) determining the genomic sequence of the same gene of interest inclusive of any naturally occurring polymorphisms from a subpopulation of at least five unrelated individuals, other than the individual in step (a), identified as having a family history which indicates inheritance of functional alleles of the gene of interest;
(c) subtracting the sequence in (a) from the sequence identified in (b) for all individuals tested, such that the sequence remaining after subtraction determines the companion allele to the allele in (a);
(d) if the sequence determined in (a) is not present among the sequences determined in (b), determining an alternative allele having a haplotype for comparison and subtraction in the individuals in (b) by identifying at least one individual among the population in (b) homozygous for the allele having said haplotype;
(e) determining the frequency of occurrence of the allele determined in (a), (c) and (d) among the samples in (b); and
(f) rank ordering the frequency of occurrence of the alleles, whereby the most frequently occurring sequence is the consensus functional sequence of the gene in the population.
17. The method of claim 1, 13, 15, or 16 wherein the gene of interest has at least 5 naturally occurring polymorphisms of frequencies of at least 10% in the population.
18. A method for determining a new haplotype of a gene of interest where at least one wild-type nucleotide sequence of said gene of interest is known comprising the steps of:
(a) selecting at least one individual having a genetic history which indicates inheritance of functional alleles of the gene of interest,
(b) determining a nucleotide sequence of said gene, or a fragment thereof, in at least one allele of said individual,
(c) comparing each nucleotide sequence from said individual to that of each wild-type nucleotide sequence, wherein the presence of at least one nucleotide sequence different from each known wild-type nucleotide sequence indicates the presence of said new haplotype and if said new haplotype is not determined by step (c), repeating steps (a), (b) and (c) with a different individual until said new haplotype is determined.
19. The method according to claim 18 wherein at least five individuals are selected.
20. The method according to claim 18 wherein said individual is a human.
21. The method according to claim 18 wherein said new haplotype encodes a protein having at least one amino acid difference in its deduced amino acid sequence from a protein encoded by said at least one wild-type nucleotide sequence.
22. An isolated protein encoded by said new haplotype determinable by the method of claim 21.
23. An isolated DNA comprising the nucleotide sequence of said new haplotype of said gene of interest determinable by the method of claim 18.
24. A DNA comprising the nucleotide sequence of said new haplotype of said gene of interest discovered by the method of claim 18.
25. An isolated DNA comprising a fragment of the nucleotide sequence of said new haplotype of said gene of interest determinable by the method of claim 18, wherein said fragment contains a nucleotide sequence having at least one polymorphic nucleotide.
26. A method for determining a new wild-type amino acid sequence of a protein of interest where at least one wild-type amino acid sequence of said protein of interest is known comprising the steps of:
(a) selecting at least one individual having a genetic history which indicates inheritance of functional alleles of a gene encoding said protein of interest,
(b) determining or deducing at least one amino acid sequence of said protein produced by said individual,
(c) comparing each of said amino acid sequence from said individual to that of each wild-type amino acid sequence, wherein the presence of at least one amino acid difference from each known wild-type amino acid sequence indicates the presence of said new wild-type amino acid sequence and if said new wild-type amino acid sequence is not determined by step (c), repeating steps (a), (b) and (c) with a different individual until said new amino acid sequence for said protein of interest is determined.
27. The method according to claim 26 wherein at least five individuals are selected.
28. The method according to claim 26 wherein said individual is a human.
29. An isolated protein having said new amino acid sequence of said protein of interest determinable by the method of claim 26.
30. A method for determining a haplotype of a gene of interest for an individual comprising:
(a) determining a nucleotide sequence of at least a portion of an allele of said gene of interest in regions of said gene containing all polymorphic nucleotides constituting the haplotype in a sample from said individual,
(b) comparing the nucleotide sequence, or the polymorphic nucleotides, to at least two haplotypes of said gene in the sample, and
(c) determining the haplotype of said allele of said gene of interest from said sample.
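The comparison in steps (b) and (c) amounts to matching the nucleotides observed at the polymorphic positions against each known haplotype pattern. The sketch below assumes a haplotype is represented as a mapping from position to nucleotide; the positions, haplotype names and the function `call_haplotype` are hypothetical and chosen only for illustration.

```python
def call_haplotype(observed, haplotypes):
    """Return the name of the first haplotype whose nucleotide at every
    polymorphic position matches the observed allele, or None."""
    for name, pattern in haplotypes.items():
        if all(observed.get(pos) == base for pos, base in pattern.items()):
            return name
    return None

# Hypothetical haplotype definitions: position -> nucleotide
haplotypes = {
    "hapA": {101: "A", 250: "C", 777: "G"},
    "hapB": {101: "G", 250: "C", 777: "T"},
}

# Nucleotides read from one allele of the individual's sample
observed = {101: "G", 250: "C", 777: "T"}
called = call_haplotype(observed, haplotypes)
```

An allele matching no listed pattern returns `None`, which under the discovery claims above would be a candidate new haplotype.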
31. The method according to claim 30 wherein said individual is a human.
32. A method for determining a wild-type amino acid sequence for a protein of interest for an individual comprising:
(a) determining or deducing at least one amino acid sequence of said protein of interest from a sample from said individual, and
(b) comparing the amino acid sequence obtained to at least two known wild-type amino acid sequences, thereby determining the amino acid sequence present in the individual.
33. The method according to claim 32 wherein said individual is a human.
34. A method for determining a new polymorphism of a gene of interest where at least one wild-type nucleotide sequence of said gene of interest is known comprising the steps of:
(a) selecting at least one individual having a genetic history which indicates inheritance of functional alleles of the gene of interest,
(b) determining a nucleotide sequence of said gene, or a fragment thereof, in at least one allele of said individual,
(c) comparing each nucleotide sequence from said individual to that of each wild-type nucleotide sequence, wherein the presence of at least one nucleotide sequence different from each known wild-type nucleotide sequence indicates the presence of said new polymorphism, and if said new polymorphism is not determined by step (c), repeating steps (a), (b) and (c) with a different individual until said new polymorphism is determined.
35. The method according to claim 34 wherein at least five individuals are selected.
36. The method according to claim 34 wherein said individual is a human.
37. The method according to claim 34 wherein said gene having the new polymorphism encodes a protein having at least one amino acid difference in its deduced amino acid sequence from a protein encoded by said at least one wild-type nucleotide sequence.
38. An isolated protein encoded by said gene having a new polymorphism determinable by the method of claim 37.
39. An isolated DNA comprising the nucleotide sequence of said gene having the new polymorphism determinable by the method of claim 34.
40. A DNA comprising the nucleotide sequence of said gene having the new polymorphism discovered by the method of claim 34.
41. A method for determining a combination of a haplotype for a first gene of interest and at least one polymorphism in a second gene of interest comprising:
(a) determining the haplotype of an allele of said first gene of interest in an individual,
(b) determining the nucleotide sequence of each polymorphism of an allele of the second gene of interest in the same individual, and
(c) identifying a combination of the haplotype of said first gene and the polymorphism of the second gene for the same individual.
42. The method according to claim 41 wherein said at least one polymorphism determines a haplotype for said second gene of interest.
43. The method according to claim 41 wherein said combination indicates a condition or a susceptibility to a condition.
44. An antibody capable of binding to either said isolated protein having said new amino acid sequence according to claim 29 or a protein having a known wild-type amino acid sequence, but not both under the same binding conditions.
45. An antibody according to claim 44 bound to a label.
46. An immunoassay capable of distinguishing between a protein having one wild-type amino acid sequence and a protein having a variant wild-type amino acid sequence comprising:
(a) contacting the antibody according to claim 44 with a sample containing at least one of said proteins, and
(b) detecting the presence or absence of binding between said antibody and said protein.
47. A method for determining whether to administer a composition to an individual for a particular use comprising:
(a) determining the nucleotide sequence of at least one polymorphism of a gene of interest, and
(b) reporting a result indicating the appropriateness of administering the composition to the individual for the particular use, wherein the presence of at least one polymorphic form of the gene of interest determines the need for or provides a different response to the composition for a particular use from at least one other polymorphic form of the gene of interest.
48. The method according to claim 47 wherein said at least one polymorphism defines a haplotype.
49. The method according to claim 47 wherein said composition is a pharmaceutical.
50. A method for determining a trait, condition or susceptibility to a condition associated with a gene of interest comprising:
(a) determining the nucleotide sequence of at least one polymorphism of the gene of interest, and
(b) reporting a result indicating the presence of said trait, condition or susceptibility to the condition, wherein the presence of at least one polymorphic form of the gene of interest determines the trait, condition or susceptibility for the condition from at least one other polymorphic form of the gene of interest.
51. The method according to claim 47 wherein said at least one polymorphism defines a haplotype.
52. An oligonucleotide, or its complement, capable of recognizing a polymorphism in a gene of interest by hybridizing to one polymorphic form of said gene but not another polymorphic form under the same hybridizing conditions.
53. A panel of oligonucleotides according to claim 52 wherein the panel comprises at least one oligonucleotide for each polymorphism constituting a haplotype for said gene of interest.
54. A probe chip for determining the presence or absence of a particular nucleotide at a particular polymorphism determined by the method of claim 34 in a desired gene or fragment thereof, comprising:
(a) a solid phase and
(b) a plurality of oligonucleotide probes, wherein the probes are immobilized on the solid phase, wherein the probes comprise at least "n" groups of oligonucleotide probes, wherein each unique probe within the group of oligonucleotide probes is complementary to said desired gene or fragment thereof, and contains a nucleotide complementary to said particular polymorphism in said desired polynucleotide at a different position within said unique probe, and wherein "n" is an integer greater than 0.
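One way to picture a single group of probes as recited in claim 54 is a set of fixed-length windows sliding across the polymorphic site, so that the nucleotide complementary to the site falls at a different position in each probe. The sketch below is only an assumption about how such a group could be enumerated; the target sequence, probe length and function names are hypothetical, and real array design would also account for melting temperature and cross-hybridization.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def tiled_probes(target, site, length):
    """Return (position_in_probe, probe) pairs for every window of `length`
    bases in `target` that covers index `site` (0-based)."""
    probes = []
    first = max(0, site - length + 1)
    last = min(site, len(target) - length)
    for start in range(first, last + 1):
        window = target[start:start + length]
        # After reverse complementing, the site's offset is mirrored
        position = length - 1 - (site - start)
        probes.append((position, reverse_complement(window)))
    return probes

# Hypothetical 10-base target with the polymorphic nucleotide at index 4
target = "ACGTACGTAC"
group = tiled_probes(target, site=4, length=5)
```

Each probe in `group` carries the base complementary to `target[4]` at the reported position, and no two probes share that position.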
55. The probe chip according to claim 54, further comprising an additional group of complementary probes which are complementary to said group of probes and capable of hybridizing to a complementary strand of said desired gene or fragment thereof.
56. A method for determining a new polymorphism in a gene comprising the steps of:
(a) selecting at least one individual having a genetic history which indicates inheritance of functional alleles of the gene,
(b) obtaining a sample of genomic DNA from said at least one individual,
(c) determining a nucleotide sequence of at least a fragment of at least one allele of said gene in said individual,
(d) comparing each nucleotide sequence from said at least a fragment of said allele to that of at least one wild-type nucleotide sequence of said gene or fragment, wherein the presence of at least one nucleotide difference from each known wild-type nucleotide sequence indicates the presence of said new polymorphism, and if said new polymorphism is not determined by step (c), repeating steps (a), (b) and (c) with a different individual until said new polymorphism is determined.
57. The method according to claim 56 wherein at least five individuals are selected.
58. The method according to claim 56 wherein said individual is a human.
59. The method according to claim 56 wherein said gene having the new polymorphism encodes a protein having at least one amino acid difference in its deduced amino acid sequence from a protein encoded by said at least one wild-type nucleotide sequence.
60. An isolated protein encoded by said gene having a new polymorphism determinable by the method of claim 59.
61. An isolated DNA comprising the nucleotide sequence of said gene having the new polymorphism determinable by the method of claim 56.
62. A DNA comprising the nucleotide sequence of said gene having the new polymorphism discovered by the method of claim 56.
63. A method for determining a new haplotype of a gene of interest wherein said haplotype comprises at least two single nucleotide polymorphisms which identify an allele that occurs in the total normal population, comprising the steps of:
(a) selecting at least one individual having a genetic history which indicates inheritance of functional alleles of the gene,
(b) obtaining a sample of genomic DNA from at least one said individual,
(c) determining a nucleotide sequence of at least a fragment of at least one allele of said gene in said individual,
(d) comparing each nucleotide sequence from at least a fragment of said allele to that of at least one wild-type nucleotide sequence of said gene or fragment, wherein the presence of at least two nucleotide differences from each known wild-type nucleotide sequence indicates the presence of said new haplotype, and if said new haplotype is not determined by step (c), repeating steps (a), (b) and (c) with a different individual until said new haplotype is determined.
64. A method according to claim 63 wherein the allele occurs in at least 10% of the total normal population.
65. A method according to claim 63 wherein the allele occurs in at least 10% of the normal Caucasian population.
66. A method according to claim 63 wherein the allele occurs in at least 10% of the normal Black/African American population.
67. A method according to claim 63 wherein the allele occurs in at least 10% of the normal Asian population.
68. A method according to claim 63, 64, 65, 66 or 67 wherein said new haplotype encodes a protein having at least one amino acid difference in its deduced amino acid sequence from a protein encoded by at least one wild-type nucleotide sequence.
69. An isolated protein encoded by said new haplotype determinable by the method of claim 63, 64, 65, 66 or 67.
70. An isolated DNA comprising the nucleotide sequence of said new haplotype of said gene of interest determinable by the method of claim 63, 64, 65, 66 or 67.
71. An isolated DNA comprising a fragment of the nucleotide sequence of said new haplotype of said gene of interest determinable by the method of claim 63, 64, 65, 66 or 67, wherein said fragment contains a nucleotide sequence having at least two polymorphic nucleotides.
72. A method for determining a new haplotype of a gene of interest wherein said haplotype comprises at least two single nucleotide polymorphisms which identify an allele which is associated with an increased risk of identified disease in the total normal population, comprising the steps of:
(a) selecting at least one individual having a genetic history which indicates inheritance of functional alleles of the gene,
(b) obtaining a sample of genomic DNA from at least one said individual,
(c) determining a nucleotide sequence of at least a fragment of at least one allele of said gene in said individual,
(d) comparing each nucleotide sequence from at least a fragment of said allele to that of at least one wild-type nucleotide sequence of said gene or fragment, wherein the presence of at least two nucleotide differences from each known wild-type nucleotide sequence indicates the presence of said new haplotype, and if said new haplotype is not determined by step (c), repeating steps (a), (b) and (c) with a different individual until said new haplotype is determined.
73. A method according to claim 72 wherein the allele occurs in at least 10% of the total normal population.
74. A method according to claim 72 wherein the allele occurs in at least 10% of the normal Caucasian population.
75. A method according to claim 72 wherein the allele occurs in at least 10% of the normal Black/African American population.
76. A method according to claim 72 wherein the allele occurs in at least 10% of the normal Asian population.
77. A method according to claim 72, 73, 74, 75 or 76 wherein said new haplotype encodes a protein having at least one amino acid difference in its deduced amino acid sequence from a protein encoded by at least one wild-type nucleotide sequence.
78. An isolated protein encoded by said new haplotype determinable by the method of claim 72, 73, 74, 75 or 76.
79. An isolated DNA comprising the nucleotide sequence of said new haplotype of said gene of interest determinable by the method of claim 72, 73, 74, 75 or 76.
80. An isolated DNA comprising a fragment of the nucleotide sequence of said new haplotype of said gene of interest determinable by the method of claim 72, 73, 74, 75 or 76, wherein said fragment contains a nucleotide sequence having at least two polymorphic nucleotides.
81. A method for determining a plurality of haplotypes of a gene of interest wherein said haplotypes collectively define the alleles of said gene in a normal population, and wherein each said allele comprises at least two single nucleotide polymorphisms, comprising the steps of:
(a) selecting at least one individual having a genetic history which indicates inheritance of functional alleles of the gene,
(b) obtaining a sample of genomic DNA from at least one said individual,
(c) determining a nucleotide sequence of at least a fragment of at least one allele of said gene in said individual,
(d) comparing each nucleotide sequence from at least a fragment of said allele to that of at least one wild-type nucleotide sequence of said gene or fragment, wherein the presence of at least two nucleotide differences from each known wild-type nucleotide sequence indicates the presence of said new haplotype, and if said new haplotype is not determined by step (c), repeating steps (a), (b) and (c) with a different individual until said new haplotype is determined.
82. A method according to claim 81 wherein the population is the normal Caucasian population.
83. A method according to claim 81 wherein the population is the normal Black/African American population.
84. A method according to claim 81 wherein the population is the normal Asian population.
85. A method for determining a set of haplotypes of defined alleles from contiguous genes within the same region of a chromosome in the total population:
(a) selecting at least one individual having a genetic history which indicates inheritance of functional alleles of the gene,
(b) obtaining a sample of genomic DNA from at least one said individual,
(c) determining a nucleotide sequence of at least a fragment of at least one allele of said gene in said individual,
(d) comparing each nucleotide sequence from at least a fragment of said allele to that of at least one wild-type nucleotide sequence of said gene or fragment, wherein the presence of at least two nucleotide differences from each known wild-type nucleotide sequence indicates the presence of said new haplotype, and if said new haplotype is not determined by step (c), repeating steps (a), (b) and (c) with a different individual until said new haplotype is determined.
86. A method according to claim 85 wherein the population is the normal Caucasian population.
87. A method according to claim 85 wherein the population is the normal Black/African American population.
88. A method according to claim 85 wherein the population is the normal Asian population.
89. A method for determining a set of haplotypes of defined alleles from noncontiguous genes from different regions of a chromosome in the total population, comprising the steps of:
(a) selecting at least one individual having a genetic history which indicates inheritance of functional alleles of the gene,
(b) obtaining a sample of genomic DNA from at least one said individual,
(c) determining a nucleotide sequence of at least a fragment of at least one allele of said gene in said individual,
(d) comparing each nucleotide sequence from at least a fragment of said allele to that of at least one wild-type nucleotide sequence of said gene or fragment, wherein the presence of at least two nucleotide differences from each known wild-type nucleotide sequence indicates the presence of said new haplotype, and if said new haplotype is not determined by step (c), repeating steps (a), (b) and (c) with a different individual until said new haplotype is determined.
90. A method according to claim 89 wherein the population is the normal Caucasian population.
91. A method according to claim 89 wherein the population is the normal Black/African American population.
92. A method according to claim 89 wherein the population is the normal Asian population.
93. A method for determining a set of haplotypes of defined alleles from noncontiguous genes from different chromosomes in the total population, said method comprising the steps of:
(a) selecting at least one individual having a genetic history which indicates inheritance of functional alleles of the gene,
(b) obtaining a sample of genomic DNA from at least one said individual,
(c) determining a nucleotide sequence of at least a fragment of at least one allele of said gene in said individual,
(d) comparing each nucleotide sequence from at least a fragment of said allele to that of at least one wild-type nucleotide sequence of said gene or fragment, wherein the presence of at least two nucleotide differences from each known wild-type nucleotide sequence indicates the presence of said new haplotype, and if said new haplotype is not determined by step (c), repeating steps (a), (b) and (c) with a different individual until said new haplotype is determined.
94. A method according to claim 89 wherein the population is the normal Caucasian population.
95. A method according to claim 89 wherein the population is the normal Black/African American population.
96. A method according to claim 89 wherein the population is the normal Asian population.
PCT/US1998/016574 1997-08-04 1998-08-04 Determining common functional alleles in a population and uses therefor WO1999006598A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU87768/98A AU8776898A (en) 1997-08-04 1998-08-04 Determining common functional alleles in a population and uses therefor

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US90577297A 1997-08-04 1997-08-04
US08/905,772 1997-08-04
US8447198A 1998-05-22 1998-05-22
US09/084,471 1998-05-22

Publications (2)

Publication Number Publication Date
WO1999006598A2 true WO1999006598A2 (en) 1999-02-11
WO1999006598A3 WO1999006598A3 (en) 1999-04-29

Family

ID=26771013

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/016574 WO1999006598A2 (en) 1997-08-04 1998-08-04 Determining common functional alleles in a population and uses therefor

Country Status (2)

Country Link
AU (1) AU8776898A (en)
WO (1) WO1999006598A2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001075163A2 (en) * 2000-04-04 2001-10-11 Polygenyx, Inc. High throughput methods for haplotyping
WO2004079004A1 (en) * 2003-03-07 2004-09-16 Istituto Oncologico Romagnolo Cooperativa Sociale A R.L. Method for the identification of colorectal tumors
US6844154B2 (en) 2000-04-04 2005-01-18 Polygenyx, Inc. High throughput methods for haplotyping
US8026064B2 (en) 2002-12-31 2011-09-27 Metamorphix, Inc. Compositions, methods and systems for inferring bovine breed

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5654155A (en) * 1996-02-12 1997-08-05 Oncormed, Inc. Consensus sequence of the human BRCA1 gene
WO1998005677A1 (en) * 1996-08-05 1998-02-12 Oncormed, Inc. Susceptibility mutations for breast and ovarian cancer
WO1998044157A2 (en) * 1997-03-28 1998-10-08 Oncormed, Inc. Methods for identifying variations in polynucleotide sequences

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
AL MOUDALLAL Z ET AL. : "Monoclonal antibodies as probes of the antigenic structure of tobacco mosaic virus" THE EMBO JOURNAL, vol. 1, no. 8, 1982, pages 1005-1010, XP002095330 *
HACIA J G ET AL: "DETECTION OF HETEROZYGOUS MUTATIONS IN BRCA1 USING HIGH DENSITY OLIGONUCLEOTIDE ARRAYS AND TWO-COLOUR FLUORESCENCE ANALYSIS" NATURE GENETICS, vol. 14, no. 4, December 1996, pages 441-447, XP000615885 *
MADSEN H O ET AL: "A NEW FREQUENT ALLELE IS THE MISSING LINK IN THE STRUCTURAL POLYMORPHISM OF THE HUMAN MANNAN-BINDING PROTEIN" IMMUNOGENETICS, vol. 40, no. 1, 1994, pages 37-44, XP000609909 *
MATTILA P S ET AL.: "Extensive allele sequence variation in the J region of the human immunoglobin heavy chain gene locus" EUROPEAN JOURNAL OF IMMUNOLOGY, vol. 25, 1995, pages 2578-2582, XP002095329 *
MERAJVER ET AL: "Risk assessment and presymptomatic molecular diagnosis in hereditary breast cancer" CLINICS IN LABORATORY MEDICINE, vol. 1, no. 16, March 1996, page 139 139 XP002079197 *
MORRISON N ET AL: "Frequent alleles of the human vitamin D receptor gene are functionally distinct" JOURNAL OF CELLULAR BIOCHEMISTRY. SUPPLEMENT, no. SUPPL. 16C, 21 February 1992, page 20 XP002080808 *
WAINSCOAT J S ET AL.: "Evolutionary relationships of human populations from an analysis of nuclear DNA polymorphisms" NATURE, vol. 319, 1986, pages 491-493, XP002095332 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001075163A2 (en) * 2000-04-04 2001-10-11 Polygenyx, Inc. High throughput methods for haplotyping
WO2001075163A3 (en) * 2000-04-04 2003-01-09 Polygenyx Inc High throughput methods for haplotyping
US6844154B2 (en) 2000-04-04 2005-01-18 Polygenyx, Inc. High throughput methods for haplotyping
US8026064B2 (en) 2002-12-31 2011-09-27 Metamorphix, Inc. Compositions, methods and systems for inferring bovine breed
US8669056B2 (en) 2002-12-31 2014-03-11 Cargill Incorporated Compositions, methods, and systems for inferring bovine breed
US9982311B2 (en) 2002-12-31 2018-05-29 Branhaven LLC Compositions, methods, and systems for inferring bovine breed
US10190167B2 (en) 2002-12-31 2019-01-29 Branhaven LLC Methods and systems for inferring bovine traits
US11053547B2 (en) 2002-12-31 2021-07-06 Branhaven LLC Methods and systems for inferring bovine traits
WO2004079004A1 (en) * 2003-03-07 2004-09-16 Istituto Oncologico Romagnolo Cooperativa Sociale A R.L. Method for the identification of colorectal tumors
US8343722B2 (en) 2003-03-07 2013-01-01 Istituto Oncologico Romagnolo Cooperativa Sociale a.r.l. Method for the identification of colorectal tumors

Also Published As

Publication number Publication date
WO1999006598A3 (en) 1999-04-29
AU8776898A (en) 1999-02-22

Similar Documents

Publication Publication Date Title
US5654155A (en) Consensus sequence of the human BRCA1 gene
Li et al. Carrier Frequency of the Bloom Syndrome blmAsh Mutation in the Ashkenazi Jewish Population
US20060177863A1 (en) Biallelic markers for use in constructing a high density disequilibrium map of the human genome
JP2005529592A5 (en)
US6083698A (en) Cancer susceptibility mutations of BRCA1
US6051379A (en) Cancer susceptibility mutations of BRCA2
Keating Linkage analysis and long QT syndrome. Using genetics to study cardiovascular disease.
JP2009520460A (en) Genetic polymorphism associated with myocardial infarction, detection method and use thereof
US20040235006A1 (en) Chemical compounds
US6951721B2 (en) Method for determining the haplotype of a human BRCA1 gene
Jirapongsananuruk et al. CYBB mutation analysis in X-linked chronic granulomatous disease
CA2324866A1 (en) Biallelic markers for use in constructing a high density disequilibrium map of the human genome
US20030170674A1 (en) Nitric oxide synthase gene diagnostic polymorphisms
WO1999006598A2 (en) Determining common functional alleles in a population and uses therefor
US6232063B1 (en) Co-dominant genetic diagnosis test
KR101100437B1 (en) A polynucleotide associated with a colon cancer comprising single nucleotide polymorphism, microarray and diagnostic kit comprising the same and method for diagnosing a colon cancer using the polynucleotide
JP2007516719A (en) Polynucleotide involved in type 2 diabetes including single nucleotide polymorphism, microarray and diagnostic kit containing the same, and method for analyzing polynucleotide using the same
US20070128602A1 (en) Polynucleotide associated with a colon cancer comprising single nucleotide poylmorphism, microarray and diagnostic kit comprising the same and method for diagnosing a colon cancer using the polynucleotide
US20040209259A1 (en) Methods of determining genotypes in duplicated genes and genomic regions
US20050084849A1 (en) Diagnostic polymorphisms for the ecnos promoter
JP3682688B2 (en) Osteoporosis drug sensitivity prediction method and reagent kit therefor
US20030235819A1 (en) Mutations in the BRCA1 gene
WO2005059104A2 (en) Slc5a7 genetic markers associated with age of onset of Alzheimer's disease
US20040048265A1 (en) Obesity associated biallelic marker maps
WO2001053522A2 (en) (CA)n POLYMORPHISMS IN AN INTRON OF THE ENDOTHELIAL NITRIC OXIDE SYNTHASE GENE AND THEIR USE IN DIAGNOSTIC AND THERAPEUTIC APPLICATIONS

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase in:

Ref country code: KR

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase in:

Ref country code: CA

122 Ep: pct application non-entry in european phase