THE DETERMINATION OF FUNCTION OF MAMMALIAN PROTEINS BY ANALYSIS OF GLOBAL GENE EXPRESSION PROFILES
FIELD OF THE INVENTION
The invention relates to determination of the function and analysis of the activity of a mammalian protein, by expressing the protein in cells of a eukaryotic organism and by analyzing changes in transcriptional levels of genes of the organism due to the activity of the protein expressed in the cells. The invention also relates to the use of the observed transcriptional changes for developing assays for modulators (inhibitors and activators) of the protein activity and function and to high throughput screens for identification of such modulators.
BACKGROUND OF THE INVENTION
There is a substantial conservation of biochemical functions in many intracellular pathways between cells of higher and lower eukaryotic organisms, for example between human and yeast cells. This means that eukaryotic cells carry out similar biochemical processes with similar proteins and that this similarity is shared between human cells and those of other eukaryotic organisms, such as yeast. This similarity may extend, for example, to the number of proteins used in a pathway, their enzymatic or other function and their interactions. The similarity may also extend to the level of the DNA sequences coding for these proteins. For example, many yeast genes find their homologues in mammalian cells. However, there are many more genes in the human genome (currently estimated at 32,000) than in yeast (about 6,200 ORFs). Many mammalian genes thus have no homologues in yeast, and even in yeast many of the ORFs have no known function. The present invention uses yeast cells to determine the unknown function of mammalian proteins and/or to assay their activity, the latter with a view to identifying activators and inhibitors of the activity of mammalian proteins, by transcriptionally profiling the consequences of their expression in yeast.
The yeast Saccharomyces cerevisiae is the first eukaryote whose entire genome has been sequenced (see, for example, http://www.nature.com/genomics/papers/s_cerevisiae.html). Yeasts, such as Saccharomyces cerevisiae and Schizosaccharomyces pombe, are particularly useful for the purposes of the present invention, as a wealth of genetic tools is available for these organisms, including their complete genome sequence, gene chips corresponding to each identified ORF, a number of sophisticated databases describing gene functions and global gene expression under a variety of conditions, and the ability to heterologously express mammalian genes using yeast transcriptional and translational signals.
The conservation of biochemical functions in eukaryotic cells also means that non-mammalian eukaryotic cells with genetic deficiencies that disrupt some phase of a metabolic or regulatory pathway can often be complemented by corresponding mammalian proteins expressed in the deficient cells. Such functional expression of foreign genes in various mutant yeast strains has been an important tool in molecular biological research. For example, complementation of yeast mutations with mammalian expression libraries has allowed the identification of new classes of human G1 Cyclins (Lew D.J. et al., Cell, 66, 1197-1206 (1991)) and the RGS family of proteins that activate the GTPase of heterotrimeric G protein (Druey et al., Nature, 379, 742-746 (1996)). Similarly, functions of many (but not all) genes from organisms less amenable to genetic analysis and molecular manipulation than S. cerevisiae, such as Candida albicans, have been identified by direct complementation of a gene function in S. cerevisiae (Leberer E. et al., Curr. Biol., 7, 539-546 (1997)). In a more expansive strategy pertinent to the present invention, it is also possible, using a well defined yeast phenotype, to select for genes from other organisms that interfere with function of a yeast gene. Using the yeast pheromone response pathway, it was previously shown that one can select from a library of genes of Candida albicans expressed in yeast those which interfere with the function of the pathway (Whiteway M. et al., Proc. Natl. Acad. Sci. U.S.A., 89, 9410-9414 (1992)). Analysis showed that some of these Candida albicans genes coded for the cognate proteins in S. cerevisiae, such as kinases, but could not complement their function. An explanation of this finding was that the Candida albicans proteins can perform some but not all of the cellular functions of their cognate S. cerevisiae genes, and that their expression in yeast creates a "dominant negative" effect (Herskowitz, Nature 329, 219-222 (1987)) that interferes with the normal function of a pathway. This presented a new and interesting strategy for the preliminary characterization of the function of foreign genes in yeast.
Expression of a foreign gene in yeast may create a readily assessable phenotype in the yeast cells. This phenotype can result from a genetic manipulation of the yeast strain, to allow detection of the expected function of the mammalian gene. For example, yeast strains constructed to express chimeric yeast/mammalian α-subunits of heterotrimeric G protein were used to detect functional expression in the same cell of human 7TMD receptor proteins (Price L.A. et al., Mol. Cell. Biol., 15, 6188-6195 (1995)). Expression of a mammalian protein in yeast can create a distinct phenotype even when the yeast pathway affected by the expressed protein is unknown. Examples of this are a lethal phenotype in S. pombe (Superti-Furga G. et al., Nat. Biotechnol., 14, 600-605 (1996)) and growth defects in S. cerevisiae (Murphy S. et al., Mol. Cell Biol., 13, 5290-5300 (1993)) created as a result of expression of mammalian Src kinase in the yeast cells, even though no Src-like proteins exist in these yeasts. An easily measurable, usually visually discernible phenotype created by the expression of a foreign protein in yeast cells can often provide a useful tool for drug discovery, by allowing the screening of potential modulators (e.g., inhibitors) of the expressed foreign protein (see, for example, Murphy S. et al., supra).
The screening of pharmacologically active substances in eukaryotic cells has relied so far mostly on visible phenotypes that are occasionally created by affecting the function of protein or proteins of a known or unknown metabolic or signaling pathway. The advent of array display technologies has opened up the possibility of detecting a "global phenotype" in cells of organisms, such as yeast cells. The response of all the genes in an organism to external or internal changes can be assayed using DNA chips that include all the known genes of that organism. In one relevant application of this analysis, changes can be detected at the molecular level, even when there are no changes in any of the usually observed phenotypes such as growth rate, morphology, or ability to grow on certain substrates.
DNA microarrays (also known as DNA chips) are ordered arrays of oligonucleotide probes immobilized on a solid support, such as a glass slide or silicon chip. Each different nucleotide probe is localized in a predetermined region of the support and is designed to hybridize with a specific nucleic acid sequence in a target nucleic acid (see, for example, WO 89/10977 and WO 89/11548). This allows a simultaneous detection of the presence of a multiplicity of specific sequences in the target nucleic acid.
In particular types of DNA microarrays, the DNA probes, usually fixed to a glass substrate (slide), may be cDNA sequences or, as in the case of yeast, DNA sequences corresponding to complete open reading frames (ORFs) of the yeast genome. The usual method of using such a DNA microarray is to isolate RNA from the organism of interest cultured under predetermined experimental conditions, usually differing by a single experimental factor from some standard conditions, and from the same organism cultured under the standard conditions (a control culture). cDNA is then prepared from both RNA samples, thus providing two cDNA samples. The cDNA samples are then labeled with two different dyes, for example Cy3 and Cy5 dyes, mixed, and hybridized to the DNA chip. The cDNA hybridized to specific probes may then be detected and quantified using some observable and measurable property of the labels, such as their fluorescence (see, for example, WO 97/10365). This approach was recently used to create a catalog of 800 yeast genes whose transcription levels vary periodically with the cell cycle and to study the effect of induction of two G1 Cyclins on the transcription levels of these genes (Spellman P. et al., Mol. Biol. Cell, 9, 3273-3297 (1998)). Further transcriptional profiles of yeast genes from cells grown under a variety of conditions are available, for example at http://genome-www.stanford.edu/Saccharomyces/ and http://transcriptome.ens.fr/ymgv/index.html, and a list of publications at http://genome-www4.stanford.edu/cgi-bin/SGD/reference/geneinfo.pl?topic=Genome-wide+Analysis. DNA chips comprising many hundreds or even thousands of probes of many genes of various eukaryotic species are presently available.
The monitoring of changes in the transcription level of genes of eukaryotic cells for the purpose of drug screening using DNA microarrays has been disclosed in US patent No. 5,569,588. According to this patent, cells of a eukaryotic organism are exposed to a candidate drug, mRNA transcripts are isolated from the cells, and contacted directly or after reverse transcription into cDNA with an ordered DNA microarray of gene probes, each probe being specific for a different gene or ORF of the organism's genome. The hybridization signal is then detected and quantified for each probe and the transcription profile (i.e., the level of transcription of the organism's genes) of the drug-stimulated cells is compared with the transcription profile of cDNA from control cells. After obtaining a specific response profile for a pharmacological compound, the likely pharmacological activity of the compound can be inferred by comparing its response profile with similar profiles of other compounds of known pharmacological activity.
DNA microarrays consisting of probes complementary to all genes or a major subset of genes of a eukaryotic organism, such as yeast, provide a powerful tool for assessing the consequences of any modification of the intracellular or extracellular environment of an organism. Changes in the expression of genes of a eukaryotic organism arising from the expression and activity of a foreign protein in the organism's cells are of particular importance and pertinence to this invention. Such changes provide an assessment of the protein's function or activity and its involvement in metabolic and signaling pathways of the organism of origin. This function or activity may not be apparent from the sequence of the gene and protein.
The present invention provides a new method of determining the function of a mammalian protein, by comparison of its transcriptional profile with transcriptional profiles of proteins of known functions. In addition, the invention provides a method of assaying the activity of the protein, even when the activity measured in the heterologous hosts is unrelated to the protein's function in its native host.
The observed transcriptional changes due to the expression of a foreign protein can also be used to screen for substances, such as low molecular weight compounds, peptides, or proteins, that modulate the activity of the proteins, in particular human proteins associated with various pathological conditions, such as infectious diseases, degenerative diseases and cancer. Once identified, such modulators have the potential of being developed into specific drugs for the treatment of these pathological conditions. Modulating proteins identified in this way can themselves be targets for the development of new therapeutics. This approach may further lead, for example, to reconstruction of complete mammalian signaling pathways and to examining the influence of other proteins and modulators on these pathways. Regardless of the real function of the mammalian protein in its native environment, the transcriptional profiling (or "transcriptional bar coding") according to the present invention can be used for the screening of pharmacologically active substances. To make the screening system even more versatile and efficient, the control elements of the yeast genes that show the most extreme transcriptional response can be used to construct gene fusions with reporter genes, such as LacZ or endogenous genes for which a simple phenotypic response is available. The present invention recognizes all the above aspects and shows that measuring the transcriptional effects of a mammalian protein expressed in yeast, irrespective of the protein's function in its native host and environment, provides an assay that can be developed into a high throughput screen (HTS) format for screening of modulators (activators and inhibitors) of that protein.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a graph showing transcription profiles of yeast genes in response to the expression of various human genes in yeast. The profiles of a selected gene implicated in the pheromone response pathway (Fus1) are shown in white; those of other genes are shown in grey, above the zero line if activated and below the zero line if repressed. Expression of Grb10 protein for 20 hours does not significantly affect global yeast gene expression; few genes show any difference from the control conditions. Expression of the MEKK3 protein for 5 hours also showed little effect on yeast gene transcription, with no genes expressed at a level 2-fold above or below that of the controls. Significant modulation of yeast gene transcription, both induction and repression, was observed for the cells expressing MEKK3 protein for 16 or 24 hours. In particular, selected pheromone response genes (Fus1, Aga1, Fig2, Fus2, Kar4) are induced dramatically at both time points. This induction is dependent on the kinase activity of the MEKK3 protein, because expression of the catalytically defective KR substitution eliminates the induction of the pheromone response genes.
Fig. 2 is a photograph showing the cellular morphology of yeast cells expressing either the GFP protein alone (top panels), or the GFP protein fused to MEKK3 protein (bottom panels). After 24 hours, the MEKK3 expressing cells have clear morphological aberrations relative to the round control cells.
Fig. 3 is a photograph showing a time course of the induction of the MEKK3 specific cellular abnormalities. Large aberrant cells appear within 240 minutes after the induction of MEKK3.
Fig. 4 is a photograph showing the time course of the overproduction of the MEKK3 fusion protein. The top panel shows the expression of the myc tag itself, the bottom panel shows the expression of the MEKK3 myc tagged protein. After 24 hours, the level of protein expression is very high, but protein accumulation is evident after 3 hours and pronounced after 5 hours.
Fig. 5 is a graph showing the expression of selected genes in the presence of the hyperactive human Ras val12 allele. Genes that are up-regulated in the presence of Ras are above the zero line, genes that are down-regulated are below. The Snz1 gene selected to serve as a marker is shown in white and is noted specifically.
SUMMARY OF THE INVENTION
In its broadest aspect, the invention provides a new method for the evaluation of activity and function of a mammalian protein, by expressing the protein in another eukaryotic organism, in particular in yeast, and observing and analyzing transcriptional changes of the organism's genes (a transcriptional profiling approach).
In one specific aspect, the invention provides a method of evaluation of function of mammalian proteins, such as human proteins implicated in pathological conditions, signal transduction and cell cycle progression, by expressing these proteins in cells of another eukaryotic organism, in particular yeast, and analyzing changes in the transcriptional pattern of the cells caused by the activity of the protein.
In another specific aspect, the invention provides a method of screening for potential modulators of function of mammalian proteins, in particular human proteins, such as signaling and cell cycle proteins.
In yet another specific aspect, the invention provides new assays and high throughput screens for identifying modulators (low molecular weight compounds, peptides, or other proteins) of the function of mammalian proteins.
In a preferred embodiment, the above methods rely on expressing in yeast cells the cDNA for the human protein of interest using a yeast promoter and examining the changes in transcriptional pattern of yeast genes by using a DNA microarray composed of DNA probes complementary to a majority of genes of the yeast genome. Yeast genes showing a significant change (increase or decrease) in transcription in the presence of the human protein are then used for the construction of reporter genes for screening assays.
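By way of illustration only, the selection of yeast genes showing a significant change in transcription can be sketched as a fold-change filter over paired hybridization signals. The function name, the 2-fold default threshold, the ACT1 entry and all intensity values below are assumptions made for this sketch and are not taken from the examples.

```python
import math


def select_reporter_candidates(expt, control, min_fold=2.0):
    """Return (gene, log2_ratio) pairs changed at least min_fold-fold,
    sorted by the magnitude of the change (largest first)."""
    threshold = math.log2(min_fold)
    hits = []
    for gene, signal in expt.items():
        base = control.get(gene)
        if base is None or base <= 0 or signal <= 0:
            continue  # no usable signal in both samples
        log_ratio = math.log2(signal / base)
        if abs(log_ratio) >= threshold:
            hits.append((gene, log_ratio))
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)


# Made-up signal intensities, for illustration only.
expt = {"FUS1": 1600.0, "AGA1": 900.0, "ACT1": 1000.0, "SNZ1": 100.0}
control = {"FUS1": 100.0, "AGA1": 300.0, "ACT1": 1000.0, "SNZ1": 400.0}
candidates = select_reporter_candidates(expt, control)
```

Genes at the top of such a list, whether induced or repressed, would be the natural first choices for the construction of reporter genes.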
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention provides a method of evaluation of function of a mammalian, in particular human protein, by expressing the cDNA encoding that protein in cells of another eukaryotic organism and by analyzing changes in the transcriptional pattern of the cells caused by the activity of the expressed foreign (heterologous) protein. Such changes in the transcriptional pattern, when compared with the normal transcriptional profile of the organism's cells, provide a fingerprint (transcriptional profile, transcriptional bar code) of the presence and/or activity of that mammalian protein. This information may be used in several manners to determine the function of the mammalian protein, for example by comparing the transcriptional profile of the protein with transcriptional profiles of other proteins of known functions. The logic behind this approach is that similar changes in the normal transcriptional profile generated by different heterologous proteins imply that these proteins affect the same metabolic or signaling pathway. Combined with information on the modifications of transcriptional profiles resulting from manipulation of known regulatory pathways of the eukaryotic host organism, this provides insight into the regulatory mechanisms of specific pathways and into specific targets affected by the expressed product of the foreign gene. Such information may be stored in a database for future reference and with increasing amount of information on modification of transcriptional profiles by foreign proteins expressed in cells of the eukaryotic organism, patterns of modified gene expression will begin to show linkages of regulatory circuits and the observed effects will become more informative and effectively enable the determination of function of other proteins by their transcriptional profiling alone.
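The comparison of a transcriptional profile with stored profiles of proteins of known function can be sketched, under stated assumptions, as a ranking by Pearson correlation over the genes common to both profiles. The choice of Pearson correlation, the function names, the reference names and the values below are illustrative assumptions; the specification does not prescribe a particular similarity measure.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no information
    return cov / math.sqrt(var_x * var_y)


def rank_reference_profiles(query, references):
    """Rank reference profiles by correlation with the query profile.

    Profiles are dicts mapping gene name to log2 expression ratio."""
    scores = []
    for name, profile in references.items():
        shared = sorted(set(query) & set(profile))
        if len(shared) < 2:
            continue  # too few common genes to compare
        r = pearson([query[g] for g in shared], [profile[g] for g in shared])
        scores.append((name, r))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical profiles, for illustration only.
query = {"g1": 3.0, "g2": 2.0, "g3": -1.0, "g4": 0.0}
references = {
    "kinase_A": {"g1": 2.5, "g2": 2.2, "g3": -0.8, "g4": 0.1},
    "stress_B": {"g1": -1.0, "g2": 0.0, "g3": 2.0, "g4": 0.5},
}
ranked = rank_reference_profiles(query, references)
```

A high-ranking reference would suggest, but not establish, that the query protein affects the same pathway as the reference protein.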
The term "activity of a protein" as used herein includes any activity influencing a gene expression pathway. This may be, for example, an enzymatic function, but the term also includes functions of proteins which have no enzymatic activity as classically defined, but are known, for example, to modify the behaviour or location of other proteins in the cell, usually by binding to them or by covalently modifying them. It should be stressed that according to the invention the activity of a mammalian protein is determined and measured in a eukaryotic organism (heterologous host) other than the native mammalian organism (native host) and that this activity may be unrelated to the protein's function in the native host.
The effects of activity of the mammalian protein on transcriptional levels of endogenous genes of a eukaryotic organism are monitored in cells of the organism transformed with a recombinant construct comprising a DNA sequence coding for the foreign (mammalian) protein, such as cDNA of this protein, that is linked to endogenous yeast transcriptional regulatory elements or to any other regulatory element that ensures reliable, preferably regulatable expression of the DNA sequence in yeast. Such an expression cassette is preferably enclosed in a plasmid that replicates in yeast and that can be selected for in yeast cells. The expression of a foreign protein in yeast cells is not necessarily limited to a single mammalian protein. The cells may be transformed with an expression vector or vectors expressing several proteins, under control of a single or several different regulatory elements. This embodiment is preferred to establish possible interactions between the expressed proteins. Genetically transformed cells capable of the expression of the foreign proteins are prepared by methods well known to those skilled in the art.
The genetically transformed cells are cultured under conditions that ensure the maintenance of the plasmids, their mRNA transcripts are isolated, reverse transcribed into cDNA and hybridized with a microarray of DNA probes complementary to mRNA transcripts of the organism. The cDNA hybridized with the probes of the microarray is labeled prior to hybridization, to facilitate easy detection and quantification of hybridization to each of the probes of the microarray. Microarrays of probes corresponding to a major part of the organism's genome, in particular of the entire genome, are preferred for initial screens. However, as the database of transcriptional profiles becomes more densely populated, subsets of genes of the organism may ensure better screening and subarrays of the original array may be used. It would be apparent to those skilled in the art that even though microarrays of DNA probes derived from the ORFs of the yeast genome are preferred for the purpose of the present invention, microarrays or subarrays of DNA probes derived from genomes of organisms other than yeast may also be used.
In a preferred embodiment, the eukaryotic organism for the practice of the invention is a strain of yeast. The choice of a suitable yeast strain, either wild or genetically modified, may depend on a number of factors, such as permeability to small molecules through manipulation of the drug efflux system, or attenuation of the ubiquitination system to allow better expression of peptides and mammalian proteins. The yeast cells are genetically transformed with a suitable expression vector comprising the gene of an exogenous protein operatively linked to an endogenous regulatory sequence. The preparation of such expression vectors, transformation of the yeast cells therewith, selection of the transformed cells and their amplification are carried out using standard methods of recombinant DNA technology well known to those skilled in the art as in Current Protocols in Molecular Biology (see, for example, http://www.wiley.com/legaCy/cp/cpmb/).
In a preferred embodiment, the heterologous gene or cDNA functionally and physically linked to endogenous (yeast) regulatory sequence is inserted into a plasmid, with which the yeast cells are subsequently transformed. A marker gene facilitating the process of selection of the transformed cells is usually inserted into the plasmid together with the gene coding for the mammalian protein. Examples of suitable plasmids include the pVT series (Vernet T. et al., Gene 52, 225-233 (1987)), the pRS series (Sikorski et al., Genetics 122, 19-27 (1989)), and the pGREG series (in preparation). Examples of suitable selectable genes include TRP1, URA3, LEU2, HIS3, and KanR. The reporter construct can be either replicated as a plasmid or directed to and integrated by homologous recombination at specific sites in the genome to allow stable integration. After selection, transformed yeast cells are amplified and cultured according to standard procedures.
The endogenous regulatory sequence consists of a yeast promoter and transcriptional terminator. Regulatable promoters, which make expression of the foreign protein in the yeast cells dependent on the presence or absence of a specific substance in the culturing medium, are preferred but not essential. An inducible system may be preferable if the foreign gene's expression causes slow growth of the yeast cells. In such a case, it may be preferred to turn on the gene's expression for a limited time prior to profiling, as expressing the gene for a long time could cause suppressors to develop. Regulatable yeast promoters, such as Gal1 or Gal10, are particularly preferred, as they are well characterized. These promoters allow inducible expression of the gene under their control, which expression is limited (at least at the high level) to the period of time when the inducer (galactose) is present in the culturing medium.
The transformed yeast cells are cultured under conditions ensuring the expression of the foreign gene, harvested, and polyA+ mRNA is isolated from the sample. The cDNA derived from this sample by reverse transcription following priming with oligodT and/or random hexamer primers, is then hybridized with a suitable microarray of oligonucleotide probes. The hybridization is preferably carried out at low stringency, and may include subsequent washes at progressively increasing stringency, until a desired level of hybridization specificity is achieved. The specificity of hybridization may also be increased by washing the hybridized microarray with lower ionic strength buffers, either at ambient or at higher temperatures, and/or with buffers containing non-specific nucleic acids to increase stringency and reduce background.
The pool of nucleic acids to be hybridized with the microarray of oligonucleotide probes is labeled prior to hybridization. Fluorescent labeling methods using fluorescent dyes, such as Cy3 and Cy5 dyes, are preferred. Quantification of the hybridized nucleic acids is carried out by measurement of fluorescence from the hybridized, fluorescently labeled nucleic acids, preferably using a DNA chip reader that scans the DNA chip, recording the fluorescence intensity and wavelength at each spot. The reader is normally capable of automatic scanning of the array, and may be further equipped with a data acquisition system, for the automated recording and subsequent integration and processing of the fluorescence intensity measurements. The subsequent data analysis, assessment of the significance of the transcriptional signals, and display of the data are achieved by processing the data with appropriate software, usually a combination of software packages.
The accuracy of the determination of function of the foreign protein expressed in cells of a eukaryotic organism depends on completeness of the observed transcriptional response of the cells. Thus the DNA microarray used for hybridization should comprise as comprehensive a collection of probes as is available for the eukaryotic organism in question, preferably corresponding to a majority of the organism's genes, most preferably, but not necessarily, to essentially all genes of the organism. For organisms such as yeast, this requires microarrays comprising at least several thousands of probes. Preferred DNA microarrays for yeast are DNA chips containing PCR-generated DNA probes representing each of the approximately 6000 open reading frames (ORFs) of the yeast genome. An example may be chips available from Canadian Microarray Consortium (Princess Margaret Hospital, Toronto, Canada).
The transcriptional response of cells is quantified as the value of hybridization signal for each or selected genes of the eukaryotic organism, using all or selected probes of the microarray. The quantitative hybridization data are preferably evaluated by measuring the difference in intensity of the hybridization signal between cells in which the foreign protein was expressed and negative control cells (wild type cells) cultured under the same conditions. The hybridization data can be further corrected using hybridization signals from other controls applied to ensure that the observed changes in transcriptional response of the cells are due to the activity of the foreign protein and not to some unrelated factors. For example, the transcriptional profile of cells transformed with a recombinant vector bearing an expression cassette for the foreign protein may be corrected using the transcriptional profile of cells containing the vector alone (i.e., without the expression cassette). Similarly, an inactive mutant of the protein, such as an inactive kinase mutant, may be used to determine if the transcriptional changes are due to the protein's activity and not only to its presence in the host cells. Generalized transcriptional responses resulting from the presence of large amounts of a foreign protein (that is, cellular stress responses, such as heat shock proteins and molecular chaperones) may be identified and added to the database and removed from consideration. Alternatively, the DNA probes corresponding to such proteins can be excluded from the DNA microarray.
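The control corrections described above can be sketched as simple operations on per-gene log2 ratios. This is a minimal sketch under stated assumptions: subtraction of log2 ratios is one possible form of correction, and the function names, the HSP104 and ACT1 entries, the threshold and all values are hypothetical. The inactive-mutant profile is assumed to have been corrected in the same way beforehand.

```python
def corrected_profile(protein_log2, vector_log2, stress_genes=frozenset()):
    """Subtract the vector-only log2 ratio per gene and drop stress genes."""
    corrected = {}
    for gene, value in protein_log2.items():
        if gene in stress_genes:
            continue  # generalized stress response, removed from consideration
        corrected[gene] = value - vector_log2.get(gene, 0.0)
    return corrected


def activity_dependent_genes(active, inactive, min_diff=1.0):
    """Genes whose response differs between active protein and inactive mutant."""
    return sorted(
        gene
        for gene, value in active.items()
        if abs(value - inactive.get(gene, 0.0)) >= min_diff
    )


# Made-up log2 ratios (sample versus wild type control), for illustration only.
protein = {"FUS1": 4.0, "HSP104": 2.5, "ACT1": 0.25}
vector_only = {"FUS1": 0.5, "ACT1": 0.25}
inactive_mutant = {"FUS1": 0.5, "ACT1": 0.0}

profile = corrected_profile(protein, vector_only, stress_genes={"HSP104"})
dependent = activity_dependent_genes(profile, inactive_mutant)
```

In this toy data set only FUS1 remains as a response that depends on the activity, rather than the mere presence, of the foreign protein.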
It is possible to modify, genetically or chemically, the cells in which the foreign protein is expressed to permit the detection of a wider range of effects of the foreign gene expression. For example, removing desensitizing elements of the signaling pathway affected by the foreign protein may enhance the effects of the protein on the pathway. In particular, the consequences of expressing foreign kinases in yeast cells may be amplified by deletion of gene or genes of one or more phosphatases. Efforts that have created systematic disruptions of all the yeast genes will provide tools for constructing yeast strains with enhanced sensitivities for specific classes of expression products of foreign genes.
It is also possible to modify the foreign gene to be expressed in eukaryotic cells. Genes for proteins that require phosphorylation for their activity can be mutated to replace the normally phosphorylated amino acid residue with an amino acid having a negatively charged side chain (i.e., replacing serine, threonine or tyrosine with aspartic or glutamic acid). This creates a pseudo-phosphorylated residue that mimics the activated form and cannot be dephosphorylated, so that the protein remains permanently active. Proteins that require an association, either physical or functional, with specific proteins to become active can be co-expressed with such proteins. For example, cyclin-dependent protein kinases can be co-expressed with various cyclins, to generate kinases of different specificities. Similarly, negative regulatory regions of proteins can be removed by deletion of the respective coding sequences from their genes, to create constitutively active versions of the proteins. Similar modifications may be applied to other categories of proteins to be characterized by the methods of the present invention and such modifications would be apparent to those skilled in the art.
The identification of a gene of a eukaryotic organism whose transcription is regulated by the expression and activity of a mammalian protein in yeast provides a unique tool for developing assays and high throughput screens for modulators of function of the mammalian protein. If the expression of the mammalian protein leads to a significant increase or decrease in the expression of one or several genes of the yeast cell, as measured by its changed transcription level, the regulatory sequences of such a gene may be operatively linked to a suitable reporter gene whose expression can be easily detected and quantified, such as a gene coding for a protein that cleaves a substrate to yield a coloured product (for example LacZ or MEL1) or a reporter gene that permits growth of yeast (for example HIS3 or URA3). A suitable DNA construct (expression cassette) containing the reporter gene is introduced into yeast cells to create a reporter strain. The yeast expression vector capable of expressing the mammalian cDNA is then introduced into the cells of the reporter strain. The cells are then cultured in the presence of a potential modulator of the protein activity (either a small molecule added exogenously, or a peptide or protein expressed endogenously), and the expression of the reporter gene is detected and quantified by a method depending on the reporter gene used. The level of expression of the reporter gene in the presence of the potential modulator, compared with the level of expression of the reporter gene in the absence of the modulator, provides a measure of modulation (e.g. inhibition) of the protein activity. This approach reduces the global response of cells to a measurement of the activity of one or a few affected genes and is thus amenable to high throughput methods that would be difficult to perform with DNA microarrays.
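The comparison of reporter expression with and without a potential modulator reduces to simple arithmetic on two readings. The following Python sketch is illustrative only: the function name, the optional background argument and the readings used in the usage note are assumptions introduced here and are not part of the described method.

```python
def percent_modulation(signal_with, signal_without, background=0.0):
    """Change in reporter signal caused by a candidate modulator,
    as a percentage of the signal measured without the modulator.

    Negative values indicate inhibition, positive values activation.
    `background` is a hypothetical reading from a strain lacking the
    mammalian protein; it is subtracted from both signals.
    """
    untreated = signal_without - background
    if untreated <= 0:
        raise ValueError("untreated signal must exceed background")
    treated = signal_with - background
    return 100.0 * (treated - untreated) / untreated
```

With hypothetical readings of 28.0 units without a compound and 7.0 units with it, `percent_modulation(7.0, 28.0)` gives -75.0, i.e. 75% inhibition of the reporter signal.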
The above approach to developing assays and high throughput screens (HTS) for protein activity modulators is particularly useful when the eukaryotic organism is a strain of yeast, owing to the large number and precision of the genetic methods available for this organism. In particular, the complete yeast genome has been sequenced, DNA chips with probes derived from the genome are available, and it is relatively straightforward to clone the upstream regulatory sequences of any gene whose transcription is significantly affected by the presence of the foreign protein.
Various convenient reporter gene systems exist which can be operatively linked to such regulatory sequences, to provide a readily assayable readout. Examples of such reporter genes include LacZ, encoding β-galactosidase, for which well-established colorimetric assays are available; HIS3, a gene enabling yeast cells to grow on medium lacking histidine; and URA3, which allows growth of yeast cells on medium lacking uracil and which can be selected both positively and negatively (the latter in the presence of 5-fluoroorotic acid). The choice of reporter genes suitable for the practice of the invention would be apparent to those of ordinary skill in the art.
EXPERIMENTAL
The invention will now be explained in more detail with reference to several non-limiting examples that further illustrate the practicality and versatility of the invention. It should be noted that the methods of the present invention can be applied to a wide variety of classes of mammalian proteins, even those whose expression in eukaryotic host cells does not result in a changed transcriptional profile of the cells. Such "non-responders" may still be useful, in that host cells expressing them can be transformed with clone banks of other mammalian genes and new transcriptional profiles found.
Example 1 MEKK3
MEKK3 is a member of the MEKK super-family of MEK activating kinases. These kinases represent the upstream activators of MAP kinase pathways and are conserved in structure and function from yeast to man. The specific function of mammalian MEKK3 is not known, but recent evidence implicates this protein in the control of cell cycle progression; overproduction of MEKK3 causes arrest of the cell cycle of mammalian cells (Ellinguer-Ziegelbauer H. et al., Mol. Cell Biol. 19, 3857-3868 (1999)).
The human MEKK3 was expressed in yeast as both a GFP fusion protein and a 13myc fusion protein, under control of the galactose-inducible promoter, using the pGREG vector system. Intracellular localization of the protein through GFP fluorescence showed the protein to be cytoplasmic and concentrated in discrete spots within the cells. Analysis of Western blots using the 13myc-tagged version showed that the protein was expressed in a galactose-inducible manner (Figure 4). Both expression constructs caused significant morphological and growth effects in the yeast strain W303-1A used as the host; the cells grew poorly and exhibited morphological aberrancies consisting of extended buds and enlarged cells (Figures 2, 3).
The profile of gene expression in the presence of MEKK3-GFP was checked using gene chips at 5, 16 and 24 hours after switching to galactose-containing growth medium. Each experiment involved two chips. For one chip, the control RNA was labeled with Cy5 and compared with Cy3-labeled RNA from the MEKK3-containing strain. For the second chip, the same RNA was labeled in a reciprocal way, with the control RNA receiving the Cy3 label and the MEKK3 RNA receiving the Cy5 label. Each gene was duplicated on the chip. Two independent RNA preparations were tested at 5 hours and 16 hours post-induction, and one preparation was analyzed at 24 hours.
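One common way to combine such a reciprocally labeled pair of chips is to average the log ratios so that dye-specific bias cancels. The Python sketch below illustrates that idea only; it is not a description of the QuantArray or GeneSpring processing actually used, and the function name, the input layout and the intensities in the usage note are assumptions.

```python
import math
from statistics import mean

def dye_swap_log2_ratio(chip_a_spots, chip_b_spots):
    """Combined log2(MEKK3 / control) ratio for one gene.

    chip_a_spots: (cy3, cy5) intensities for each duplicate spot on the
        chip where Cy3 labels the MEKK3 sample and Cy5 the control.
    chip_b_spots: (cy3, cy5) intensities on the reciprocal chip, where
        Cy3 labels the control and Cy5 the MEKK3 sample.
    """
    # Chip A: MEKK3 signal is in the Cy3 channel.
    logs = [math.log2(cy3 / cy5) for cy3, cy5 in chip_a_spots]
    # Chip B: labels are swapped, so MEKK3 signal is in the Cy5 channel.
    logs += [math.log2(cy5 / cy3) for cy3, cy5 in chip_b_spots]
    # Averaging over both orientations cancels a constant dye bias.
    return mean(logs)
```

For example, if a gene is truly induced 4-fold but the Cy3 channel reads twice as bright as it should, chip A might report (800, 100) and chip B (100, 200); `dye_swap_log2_ratio([(800, 100)], [(100, 200)])` returns 2.0, the log2 of the true 4-fold change.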
Experiments were repeated several times for each time point (Fig. 1). These data sets were checked for internal consistency of the signals and then analyzed for significant changes in gene expression generated by the presence of the MEKK3 protein. The data from the microarray were quantified using QuantArray software (supplied with the array reader used) and transferred into the GeneSpring program for gene expression profile analysis. Few genes changed significantly after 5 hours, but several genes were observed whose expression was increased after 16 hours and further increased after 24 hours (Fig. 1). These induced genes were compared with controls, which included the kinase-dead version of MEKK3, as well as the expression of other mammalian genes such as Grb10, MEK1 and RasVal12. The best candidate genes were those that showed significant changes only in the 16 and 24 hour samples for MEKK3 expression and not in the other conditions; the Fus1 expression profile is highlighted as a white trace in Fig. 1.
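The selection rule for candidate genes (changed in the 16 and 24 hour MEKK3 samples, unchanged elsewhere) can be expressed as a simple filter. The Python sketch below is a hedged illustration: the 2-fold cutoff (log2 ratio of 1.0), the condition labels, the gene labels and all values are hypothetical, and a fold-change cutoff stands in for whatever significance criterion was applied in GeneSpring.

```python
def select_candidates(log2_ratios, test_conditions, control_conditions,
                      threshold=1.0):
    """Return genes changed in every test condition and in no control.

    log2_ratios: dict mapping gene -> {condition: log2 ratio}.
    threshold: absolute log2 ratio treated as a change (1.0 = 2-fold);
        this cutoff is an assumption, not a value taken from the text.
    """
    selected = []
    for gene, by_condition in log2_ratios.items():
        changed = all(abs(by_condition[c]) >= threshold
                      for c in test_conditions)
        quiet = all(abs(by_condition[c]) < threshold
                    for c in control_conditions)
        if changed and quiet:
            selected.append(gene)
    return sorted(selected)

# Hypothetical values for three placeholder genes.
EXAMPLE = {
    "GENE_A": {"MEKK3_16h": 1.8, "MEKK3_24h": 2.6, "MEKK3_5h": 0.2,
               "kinase_dead": 0.1, "Grb10": -0.3},
    "GENE_B": {"MEKK3_16h": 1.5, "MEKK3_24h": 2.0, "MEKK3_5h": 0.1,
               "kinase_dead": 1.4, "Grb10": 0.0},   # also changed in a control
    "GENE_C": {"MEKK3_16h": 0.4, "MEKK3_24h": 1.2, "MEKK3_5h": 0.0,
               "kinase_dead": 0.0, "Grb10": 0.1},   # not changed at 16 h
}
```

Calling `select_candidates(EXAMPLE, ["MEKK3_16h", "MEKK3_24h"], ["MEKK3_5h", "kinase_dead", "Grb10"])` keeps only `GENE_A`, the one placeholder gene that responds specifically to functional MEKK3 at the later time points.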
Among the genes significantly induced by expression of functional MEKK3 were a number of genes also known to be induced by treatment of cells with mating factor, or by expression of a hyper-activated Ste11 (MEKK) protein (Roberts C.J. et al., Science 287, 873-880 (2000)). We therefore tested whether a pheromone-induced reporter gene was also activated by expression of MEKK3. The FUS1 gene promoter fused to the LacZ reporter was used to test the induction of Fus1::LacZ in response to galactose-induced expression of MEKK3. Cells containing the galactose-induced MEKK3-GFP as well as a Fus1::LacZ reporter plasmid were grown in either glucose- or galactose-containing medium for 24 hours, and the expression of β-galactosidase was monitored using the CPRG colour shift from yellow to purple. Fus1::LacZ was induced in a galactose-dependent, MEKK3-dependent manner (Table 1), indicating that Fus1::LacZ is a suitable reporter of MEKK3 function in yeast.
Table 1. Induction of β-galactosidase reporter gene in response to the expression of the human MEKK3 protein.
The MEKK3 protein alone (pGreg506) or fused to GFP (GFP) was expressed under control of the galactose promoter. Expression of the Fus1::LacZ reporter gene, monitored by β-galactosidase activity, was seen only when the MEKK3 protein or fusion protein was induced. The measurements were made after 5 hours of induction.
Vector      Insert   Galactose   β-Gal. activity
pGreg506    None     -           0.0
pGreg506    None     +           0.2
pGreg506    MEKK3    -           0.0
pGreg506    MEKK3    +           16.2
GFP         None     -           0.0
GFP         None     +           0.2
GFP         MEKK3    -           0.0
GFP         MEKK3    +           28.2
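The galactose-induced values in Table 1 can be summarized as fold induction over the corresponding empty vector. The short Python sketch below does this arithmetic; the dictionary layout and function name are introduced here for illustration, and pairing each "+" row with the vector and insert named above it is an assumption about the table layout.

```python
# Galactose-induced β-galactosidase activities copied from Table 1.
# Keys are (vector, insert).
INDUCED_ACTIVITY = {
    ("pGreg506", "None"): 0.2,
    ("pGreg506", "MEKK3"): 16.2,
    ("GFP", "None"): 0.2,
    ("GFP", "MEKK3"): 28.2,
}

def fold_over_empty_vector(vector):
    """Induced activity with MEKK3 divided by that of the empty vector."""
    background = INDUCED_ACTIVITY[(vector, "None")]
    induced = INDUCED_ACTIVITY[(vector, "MEKK3")]
    # Round to one decimal place to hide floating-point noise.
    return round(induced / background, 1)
```

On these figures the reporter signal is roughly 81-fold above the empty-vector background for the untagged construct and roughly 141-fold for the GFP fusion.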
MEKK3 is an example of a protein for which there is both a transcriptional profile response and a microscopically observable phenotype.
Example 2 Grb10
Grb10 family members are adapter proteins implicated in a variety of mammalian signaling pathways and contain SH3, SH2 and PH domains (Nantel, A. et al., J. Biol. Chem. 273: 10475-10484 (1998); Nantel, A. et al., J. Biol. Chem. 274: 35719-35724 (1999)). No related proteins are found in yeast, although the SH3 domains of Grb10 are related to the SH3 domains of yeast proteins. Thus by sequence analysis there are no yeast homologues of Grb10 proteins. More importantly, since there are no specific tyrosine kinases in yeast, the SH2 domain would not interact with any yeast proteins.
Grb10 was expressed in yeast as a GFP fusion protein; this protein was cytoplasmically localized. The 13myc fusion protein was also expressed in yeast under galactose control and Western blots showed strong expression at 20 hours. There was no visible phenotype from the Grb10-13myc expression. Microarray analysis at 20 hours after the shift to galactose showed no significant changes in gene expression (this was confirmed with reciprocal chips for 3 independent RNA preparations). Consequently, there was no opportunity to develop a reporter gene system for the presence of Grb10.
Grb10 is an example of a gene that is expected to have no function in yeast, as there is no tyrosine phosphorylation for its SH2 domain to interact with. However, this gene constitutes a good starting point for identifying the interacting partners, by introducing mammalian tyrosine kinases and other mammalian cDNAs in expression vectors.
Example 3
Rasval12
The Ras family of GTPases are highly conserved eukaryotic signaling molecules. The hyperactive human Ras gene, Rasval12, was transformed into yeast as both a GFP fusion protein and as a 13myc fusion protein. There was no significant effect of expression of this protein on growth or cellular morphology. Galactose-induced expression of the 13myc-Rasval12 gene for 24 hours caused changes in gene expression. The affected genes show some similarity to those involved in pathways using the yeast Ras2Val19, including sporulation and glycogen accumulation.
A total of 13 genes showed a statistically significant (p<0.05) average increase of at least 2-fold, while 37 genes showed a significant reduction. Four independent RNA preparations were tested in 6 chips. The SNZ1 gene showed a very consistent 4-fold reduction in expression and thus would make a good candidate reporter gene, as it showed little variation in other experiments (Fig. 5). Rasval12 is an example of a mammalian protein that causes a significant transcriptional profile change with no significant morphological change, and for which there is a suitable reporter system. The utility of this example might be in identifying interacting proteins (activating and downstream pathways) and in identifying inhibitors of interaction with the Ras binding domain.
Example 4
Combined signaling modules
A mammalian MAP kinase module consists of the Raf, MEK and ERK kinases. Related MAP kinase modules exist in yeast to control response to mating pheromones, to osmotic stress and to cell wall damage. Expression of individual components of the mammalian module in S. cerevisiae has little effect.
The Mek1 gene was fused to GFP and 13myc tags and expressed under control of the galactose promoter. The GFP construct generated cytoplasmic staining, while the 13myc construct showed high levels of protein expression after overnight growth in galactose medium. The cells expressing the fusion constructs exhibited normal morphologies and growth rates. Microarray analysis was performed on 2 independent RNA preparations at 24 hours after the shift to galactose medium, at a point when high levels of protein had been present in the cells for several hours, but few significant changes in gene transcription were noted.
When expressed alone, neither Raf1-CT (amino acids 330-642, which encode the activated kinase domain) nor the full length MEK1 kinase induced any strong (i.e. 2-fold or greater) changes in gene expression in duplicate experiments performed with two independent RNA samples from separate cultures. These results were not entirely surprising, as both kinases have very stringent substrate specificity: it appears that no substrate other than MEK1 and MEK2 was ever found for Raf1, while MEK1 can only phosphorylate the ERK MAP kinases. Furthermore, the full length MEK1 has little kinase activity in its inactive unphosphorylated form.
We therefore reconstructed the entire RAF-MEK-ERK pathway in yeast. It has already been demonstrated that reconstitution of an active MAP kinase cascade, leading to activated ERK, is possible in bacterial cells (see http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=9110999&dopt=Abstract).
Raf1-CT and MEK1 were co-expressed in yeast cells. Using a commercially available antibody that specifically recognizes dual-phosphorylated MEK1, we demonstrated that the Raf1-CT kinase can phosphorylate and activate MEK1. Again, several microarray experiments failed to detect any significantly modulated genes in co-transformed cells treated with galactose, possibly due to the high substrate specificity of MEK1, as described previously. We then produced a triply-transformed yeast strain that expresses Raf1-CT, MEK1 and the MAP kinase ERK1. Yeast cells expressing these three kinases, but not those expressing only Raf+MEK or ERK1 alone, have a budding and morphological phenotype suggesting that the ERK1 kinase has been activated and was able to phosphorylate yeast proteins. Microarray analysis of the triply-transformed strain identified over 20 genes that were induced more than 2-fold when the expression of the mammalian genes was turned on.
Example 5
Low molecular weight compounds
Strains expressing the MEKK3 protein were screened against a library of low molecular weight compounds. A screen has been run using the MEKK3 construct, and candidate inhibitors of the MEKK3 phenotype have been identified.
Example 6 Peptide inhibitors
Using the reporter constructs outlined in Example 1, peptide modulators of the expressed mammalian proteins can also be screened. Intracellular display of random peptides can be achieved by cloning oligonucleotides encoding short peptides (typically 12-30 amino acid residues) into a "scaffold protein". The region of these proteins chosen to display the random sequences typically provides an underlying tertiary structure to the displayed random peptides. This "scaffold" probably also protects the peptides from intracellular proteolysis and ensures their presentation to binding surfaces. A Staphylococcal nuclease displayed library (Norman T.C. et al., Science, 285, 591-595 (1999)) has been screened for inhibitors of the MEKK3-mediated arrest and candidate inhibitors have been identified. The peptide leads discovered in this manner can provide information on the pathways themselves and can also be used for further drug development strategies.
Although various particular embodiments of the present invention have been described hereinbefore for the purpose of illustration, it would be apparent to those skilled in the art that numerous variations may be made thereto without departing from the spirit and scope of the invention, as defined in the appended claims.