WO2006008575A2 - Construction of a comparative database and identification of virulence factors through comparison of polymorphic regions in clinical isolates of infectious organisms
- Publication number
- WO2006008575A2 (PCT/IB2004/002598)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- seq
- nos
- strain
- tuberculosis
- strains
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Definitions
- the present invention is directed to novel nucleotide sequences to be used for diagnosis, identification of the strain, typing of the strain and giving orientation to its potential degree of virulence, infectivity and/or latency for all infectious diseases including tuberculosis.
- the present invention also includes a method for the identification and selection of polymorphisms associated with virulence and/or infectivity in infectious diseases by a comparative genomic analysis of the sequences of different clinical isolates/strains of infectious organisms.
- the regions of polymorphisms can also act as potential drug targets and vaccine targets.
- the invention also relates to identifying virulence factors of M. tuberculosis strains and other infectious organisms to be included in a diagnostic DNA chip allowing identification of the strain, typing of the strain and finally giving orientation to its potential degree of virulence.
- Microbial pathogens use a variety of complex strategies to subvert host cellular functions to ensure their multiplication and survival. Some pathogens that have co-evolved or have had a long-standing association with their hosts utilize finely tuned host-specific strategies to establish a pathogenic relationship.
- During infection, pathogens encounter different conditions, and respond by expressing virulence factors that are appropriate for the particular environment, host, or both.
- the mycobacteria are rod-shaped, acid-fast, aerobic bacilli that do not form spores.
- Some mycobacteria are pathogenic to humans and/or animals, and the factors associated with their virulence are of particular interest.
- Tuberculosis is a worldwide health problem, which causes approximately 3 million deaths each year, yet little is known about the molecular basis of tuberculosis pathogenesis.
- the disease is caused by infection with Mycobacterium tuberculosis; tubercle bacilli are inhaled and then ingested by alveolar macrophages. As is the case with most pathogens, infection with M. tuberculosis does not always result in disease.
- CMI: cell-mediated immunity
- the tuberculosis complex is a group of four mycobacterial species that are so closely related genetically that it has been proposed that they be combined into a single species.
- Three important members of the complex are Mycobacterium tuberculosis, the major cause of human tuberculosis; Mycobacterium africanum, a major cause of human tuberculosis in some populations; and Mycobacterium bovis, the cause of bovine tuberculosis. None of these mycobacteria is restricted to being pathogenic for a single host species.
- M. bovis causes tuberculosis in a wide range of animals including humans in which it causes a disease that is clinically indistinguishable from that caused by M. tuberculosis.
- Antibiotic treatment of tuberculosis is very expensive and requires prolonged administration of a combination of several anti-tuberculosis drugs. Treatment with single antibiotics is not advisable as tuberculosis organisms can develop resistance to the therapeutic levels of all antibiotics that are effective against them. Strains of M. tuberculosis that are resistant to one or more anti-tuberculosis drugs are becoming more frequent and treatment of patients infected with such strains is expensive and difficult. In a small but increasing percentage of human tuberculosis cases the tuberculosis organisms have become resistant to the two most useful antibiotics, isoniazid and rifampicin. Treatment of these patients presents extreme difficulty and in practice is often unsuccessful. In the current situation there is clearly an urgent need to develop new methods for detecting virulent strains of mycobacteria and to develop tuberculosis therapies.
- Improvements in the specific recognition and characterization of mycobacteria may also increase in relevance if current evidence linking diseases such as rheumatoid arthritis to mycobacterial antigens is substantiated. Emerging drug resistance in mycobacteria, including M. avium isolates from AIDS patients and Mycobacterium tuberculosis from TB patients, is an increasing problem.
- a method of using DNA probes for the precise identification of mycobacteria and discrimination between closely related mycobacterial strains and species by genotype characterization is essential.
- the method of genotypic analysis is further applicable to the rapid identification of phenotypic properties such as drug resistance and pathogenicity.
- the invention aids in fulfilling these needs in the art.
- the method according to the invention has the advantage of drastically reducing the number of potential new targets and protective antigens by giving, for the first time, an exhaustive description of conserved SNPs in M. tuberculosis.
- the isolated polynucleotides described in the present invention, which are highly conserved in genomic sequences of both virulent and avirulent strains, are by this characteristic essential for the survival or the virulence of these mycobacteria in the host.
- the identification of antigens and potentially therapeutic targets has been made by a method of comparative genomic analysis.
- Patent application WO 02074903 describes a method of selecting purified nucleotide sequences or polynucleotides encoding proteins, or parts of proteins, that carry at least one function essential for the survival or the virulence of Mycobacterium species, by a comparative genomic analysis in which the genome sequence of M. tuberculosis is aligned with the genome sequence of M. leprae. M. tuberculosis and M. leprae marker polypeptides, nucleotides encoding the polypeptides, and methods for using the nucleotides and the encoded polypeptides are also disclosed.
- US patent no. 6,228,575 provides oligonucleotide-based arrays and methods for speciating and phenotyping organisms, for example, using oligonucleotide sequences based on the Mycobacterium tuberculosis rpoB gene.
- the groups or species to which an organism belongs may be determined by comparing hybridization patterns of target nucleic acid from the organism to hybridization patterns in a database.
- Patent application no. WO9954487 and US patent no. 6,492,506 describe a method for isolating a polynucleotide of interest that is present or is expressed in a genome of a first mycobacterium strain and that is absent or altered in a genome of a second mycobacterium strain which is different from the first mycobacterium strain, using a bacterial artificial chromosome (BAC) vector.
- BAC: bacterial artificial chromosome
- This invention further relates to a polynucleotide isolated by this method and the recombinant BAC vector used in this method.
- the present invention comprises a method and kit for detecting the presence of mycobacteria in a biological sample.
- US patent no. 5,783,386 describes polynucleotides associated with virulence in mycobacteria, and particularly a fragment of DNA isolated from M. bovis that contains a region encoding a putative sigma factor. Also provided are methods for a DNA sequence or sequences associated with virulence determinants in mycobacteria, and particularly in M. tuberculosis and M. bovis. In addition, the invention provides a method for producing strains with altered virulence or other properties, which can themselves be used to identify and manipulate individual genes.
- US patent no. 5,955,077 relates to novel antigens from mycobacteria capable of evoking early (within 4 days) immunological responses from T-helper cells in the form of gamma-interferon release in memory immune animals after rechallenge infection with mycobacteria of the tuberculosis complex.
- the antigens of the invention are believed useful especially in vaccines, but also in diagnostic compositions, especially for diagnosing infection with virulent mycobacteria.
- Also disclosed are nucleic acid fragments encoding the antigens, as well as methods of immunizing animals/humans and methods of diagnosing tuberculosis.
- US patent no. 6,596,281 describes two genes for proteins of M. tuberculosis that have been sequenced.
- the DNAs and their encoded polypeptides can be used for immunoassays and vaccines.
- Cocktails of at least three purified recombinant antigens, and cocktails of at least three DNAs encoding them can be used for improved assays and vaccines for bacterial pathogens and parasites.
- US patent no. 5,700,683 provides specific genetic deletions that result in an avirulent phenotype of a mycobacterium. These deletions may be used as phenotypic markers, providing a means for distinguishing between disease-producing and non-disease-producing mycobacteria.
- US patent no. 5,225,324 relates to a family of DNA insertion sequences (ISMY) of mycobacterial origin and other DNA probes which may be used as probes in assay methods for the identification of mycobacteria and the differentiation between closely related mycobacterial strains and species.
- ISMY: DNA insertion sequences
- the use of ISMY, and of proteins and peptides encoded by ISMY, in vaccines, pharmaceutical preparations and diagnostic test kits is also disclosed.
- WO0066157 patent application provides for polypeptides encoded by open reading frames present in the genome of Mycobacterium tuberculosis but absent from the genome of BCG and diagnostic and prophylactic methodologies using these polypeptides.
- US 6,458,366 discloses compounds and methods for diagnosing tuberculosis.
- the compounds provided include polypeptides that contain at least one antigenic portion of one or more M. tuberculosis proteins, and DNA sequences encoding such polypeptides.
- Diagnostic kits containing such polypeptides or DNA sequences and a suitable detection reagent may be used for the detection of M. tuberculosis infection in patients and biological samples.
- Antibodies directed against such polypeptides are also provided.
- S. T. Cole et al. sequenced the complete genome of the best-characterized strain of Mycobacterium tuberculosis, H37Rv. The sequence has been analyzed in order to improve our understanding of the biology of this slow-growing pathogen and to help the conception of new prophylactic and therapeutic interventions. [Nature 393, 537 - 544 (1998)]
- polymorphisms constitute a set of putative virulence markers that are being validated in 120 clinical isolates of tuberculosis. The study results in a set of virulence markers, which could be used in predicting the degree of virulence and infectivity of Mycobacterium infections.
- the object of the present invention is to identify genes which encode essential proteins, or regulatory nucleotide sequences, involved in the survival or infection of Mycobacterium species and of other infectious organisms, and which could be useful for the design of drugs and vaccines based on the knowledge of comparative genomics.
- Yet another object of the present invention is to provide for the identification of strains including mycobacterium in disease samples, for the specific recognition of pathogenic strains, for precisely distinguishing closely related strains including mycobacterial strains and for defining virulence and resistance patterns.
- the method according to the invention has the advantage of drastically reducing the number of potential new targets and protective antigens by giving, for the first time, an exhaustive description of conserved SNPs in different M. tuberculosis strains, which cause tuberculosis.
- the isolated polynucleotides described in the present invention which are highly conserved in genomic sequences of virulent strains are essential for the survival or the virulence of these strains, in particular mycobacteria, in the host.
- the identification of antigens and potentially therapeutic targets has been made by a method of comparative genomic analysis.
- the invention is directed to identifying virulence factors in M. tuberculosis & other infectious diseases, using both strands of DNA, RNA and/or proteins associated with the virulence factors, allowing identification of the strain, typing of the strain and finally giving orientation to its potential degree of virulence, infectivity and/or latency.
- this invention provides nucleotide sequences, having SEQ ID Nos. 1 to 2531, for diagnosis, identification of the strain, typing of the strain and giving orientation to its potential degree of virulence, infectivity and/or latency in all infectious diseases.
- the invention is further directed to a method comprising of aligning the genomic sequences of different mycobacteria species to a.
- testing the polynucleotide selected for its capacity of virulence or involved in the survival of a mycobacterium species said testing being based on the activation or inactivation of said polynucleotide in a bacterial host or said testing being based on the activity of the product of expression of said polynucleotide in vivo or in vitro.
- the invention further comprises the identification of the following polymorphisms, having potential to be used as reagents and in diagnostics, drug and vaccine development for infectious diseases: i. Identical nucleotide in virulent strains/species, but a different nucleotide in avirulent strains/species at the same position. ii. Some of the virulent strains differ in the nucleotide sequence at specific positions and share the nucleotide sequence with that of avirulent strains.
- the invention relates to the identification and analysis of non-synonymous SNPs to predict conservative and non-conservative amino acid substitutions. The effect of the substitution on the function of the encoded proteins provides a powerful insight in predicting SNPs correlating with virulence and infectivity in infectious diseases, for example M. tuberculosis.
- the invention further relates to proteins, RNA, DNA and metabolites encoded by the region carrying the polymorphisms in tuberculosis and other infectious disease-causing organisms; these can be utilized for developing drugs and vaccines effective against tuberculosis and other infectious diseases, and play an important role in gene therapy, RNAi technology and imaging.
- the invention is also directed to a process for the production of recombinant polypeptides and chimeric polypeptides comprising them, antibodies generated against these polypeptides, immunogenic or vaccine compositions comprising at least one polypeptide useful as a protective antigen or capable of inducing a protective response in vivo or in vitro against Mycobacterium infections, immunotherapeutic compositions comprising at least one such polypeptide according to the invention, and the use of such nucleic acids and polypeptides in diagnostic methods, vaccines, kits, or antimicrobial therapy.
- SEQ ID Nos. 1 to 1829 are single nucleotide polymorphisms.
- SEQ ID Nos. 1830 to 2286 are insertions/deletions (indels).
- SEQ ID Nos. 2287 to 2531 are regions of long polymorphism.
- the present invention also includes primer sequences for amplifying the region around the polymorphism SEQ ID nos 1 to 2531
- nucleotide sequences flanking the polymorphisms of SEQ ID Nos. 1 to 2531 to a length of 35 nucleotides on either side are used in reagents and in diagnostics, drug development, RNAi, gene therapy and other such technologies.
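The flanking-sequence idea can be illustrated with a short Python sketch that returns up to 35 nucleotides on either side of a polymorphic position. The function name `flanking_region`, the 0-based coordinate convention and the toy sequence are assumptions made for illustration only; none of them comes from the patent.

```python
def flanking_region(genome: str, pos: int, flank: int = 35) -> tuple[str, str, str]:
    """Return (left flank, polymorphic base, right flank) around a 0-based position.

    Flanks are clipped at the sequence ends, so they may be shorter than `flank`.
    """
    if not 0 <= pos < len(genome):
        raise ValueError("position outside the sequence")
    left = genome[max(0, pos - flank):pos]
    right = genome[pos + 1:pos + 1 + flank]
    return left, genome[pos], right


# Toy sequence (hypothetical, not taken from any SEQ ID).
toy = "ACGT" * 25  # 100 nt
left, base, right = flanking_region(toy, 50)
print(len(left), base, len(right))
```

Clipping at the sequence ends is a design choice of this sketch; a real pipeline might instead reject polymorphisms that lie too close to a contig boundary.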
- SEQ ID Nos 1 to 2531 are used as targets for drug design using bioinformatics and other tools, drug development, for gene therapy and vaccine development.
- This invention also includes the use of proteins, RNA, DNA and metabolites encoded by the region carrying the polymorphisms having a SEQ ID Nos. 1 to 2531 for RNAi technology and antisense technologies.
- This invention also includes a database for identification and selection of the polymorphisms having SEQ ID Nos. 1 to 2531.
Brief description of the figures and tables:
- Fig 1 describes Entity Relationship Model.
- Fig 2 illustrates the identification of SNPs in M. tuberculosis strains H37Rv, CDC1551 and M. bovis BCG.
- a total of 1829 SNP's have been identified in the three genomes. Of these 1825 SNPs are identical in H37Rv and CDC1551, with a different nucleotide in BCG. 1579 of these are in ORFs while the rest (246) are in non-coding regions.
- the SNPs in the ORF are categorized into synonymous, non-synonymous SNPs. The latter are further categorized on the basis of the change in primary structure of the protein that results - conservative for no-change and non-conservative for changed primary structure of protein encoded.
- Figure 3 illustrates the identification of indels in M. tuberculosis strains H37Rv, CDC1551 and M. bovis BCG. A total of 794 indels have been identified in the three genomes. Of these, 237 are present in both H37Rv and CDC1551 with respect to BCG, 178 in ORF and 59 are outside the ORF.
- Figure 4 illustrates the identification of long polymorphisms in M. tuberculosis strains H37Rv, CDC1551 and M. bovis BCG. 136 polymorphisms are present in the three genomes, 30 of them being identical in CDC1551 and H37Rv. 22 of these polymorphisms are present in ORFs while 8 are outside ORFs.
- Figure 5 shows a region of 10 kb of the BCG genome with three types of annotations: BCG ORFs, SNPs in H37Rv, and SNPs in CDC1551.
- Figure 6 shows the comparative genomics browser displaying BCG in the upper panel and H37Rv in the bottom panel.
- the segments labeled MUM-* are the perfect matches generated by the MUMmer tool, and the vertical lines show the alignment of the MUM segments in both genomes.
- the color coding of the ORFs is used to indicate the length of the ORF. This is very helpful to researchers because if an ORF in H37Rv aligns with an ORF in BCG but they have different colors, then there is a mutation that makes them have different lengths (see for example the genes in the MUM-1280 region).
- Figures 7.1 to 7.25 list the primers used for amplification of the regions encompassing the polymorphisms.
- Table 1 gives the list of Single Nucleotide Polymorphisms in Mycobacterium tuberculosis/ M. bovis BCG.
- Table 2 gives the list of Insertions/deletions (Indels) in Mycobacterium tuberculosis/ M. bovis BCG.
- Table 3 gives the list of long polymorphisms in Mycobacterium tuberculosis/ M. bovis BCG.
- Table 4 lists Polymorphisms in genes involved in cell wall synthesis.
- Table 5 lists Polymorphisms in transcription factors.
- Table 6 lists Polymorphisms in genes involved in lipid metabolism
- Table 8 lists Polymorphisms in genes implicated in virulence
- the Mycobacterium tuberculosis complex consists of six species - M. tuberculosis, M. bovis, M. canettii, M. microti and M. africanum. Of these, the genomes of two different strains of M. tuberculosis, which are virulent and infective to humans, have been completely sequenced, and the complete genome of M. bovis BCG, which is non-virulent and non-infective, has also been sequenced. Only partial sequences are available for the other species. All Mycobacterium sequences available in the NCBI, EMBL, GenBank, Sanger and TIGR databases were retrieved and compiled.
- H37Rv: Mycobacterium tuberculosis strain H37Rv
- CDC1551: Mycobacterium tuberculosis strain CDC1551
- BCG: Mycobacterium bovis BCG
- BCG was chosen as the reference genome, and the two tuberculosis strains, CDC1551 and H37Rv, were compared against the reference.
- MUMmer uses FASTA files as input and was run using the following command line: run-mummer1 bovis.fasta cdc1551.fasta BCG-CDC, which takes the format: program <reference> <query> <output>
- the BCG-CDC parameter provides the file name prefix for the output files, the bovis.fasta parameter is the reference FASTA file, and the cdc1551.fasta parameter is the name of the query FASTA sequence file.
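For readers who script such runs, the argument order described above (program, reference, query, output prefix) can be sketched in Python as follows. The helper name `mummer_command` is hypothetical, and the program name `run-mummer1` is an assumption about the wrapper script shipped with MUMmer; the sketch only builds and prints the command and does not execute it.

```python
import shlex


def mummer_command(reference: str, query: str, prefix: str) -> list[str]:
    """Build the argument list: program <reference> <query> <output prefix>."""
    # "run-mummer1" is assumed to be the MUMmer wrapper script on PATH.
    return ["run-mummer1", reference, query, prefix]


cmd = mummer_command("bovis.fasta", "cdc1551.fasta", "BCG-CDC")
print(shlex.join(cmd))
# To execute (requires MUMmer to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```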
- the database is generated using the following scripts:
- Parsing the MUMmer .align file to extract polymorphism data
- To establish whether SNPs are synonymous or non-synonymous, it is first determined whether they lie within or outside an open reading frame. All SNPs that lie within an ORF are taken and the amino acid for the codon containing the SNP is determined.
- SNP_analysis table: to insert data into the SNP_analysis table, the SNP data from the SNP, SEQ_SNP and gene ontology tables is fetched and entered into the SNP_analysis table. This step also identifies the conservative and non-conservative amino acid substitutions.
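The codon-level step described above can be sketched in Python as follows. This is a minimal illustration, not the patent's scripts: the function name `classify_snp` is hypothetical, the codon table is the standard genetic code, and the four physicochemical amino acid groups used to call a substitution conservative are an assumption, since the text does not state the grouping that was actually applied.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed grouping: nonpolar, polar uncharged, basic, acidic.
GROUPS = ["AVLIMFWPG", "STCYNQ", "KRH", "DE"]
AA_GROUP = {aa: i for i, group in enumerate(GROUPS) for aa in group}


def classify_snp(ref_codon: str, offset: int, alt_base: str) -> tuple[str, str, str]:
    """Return (reference amino acid, altered amino acid, label) for a SNP in a codon."""
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        label = "synonymous"
    elif alt_aa == "*":
        label = "truncating"          # a stop codon is introduced
    elif ref_aa == "*":
        label = "non-conservative"    # a stop codon is lost
    elif AA_GROUP[ref_aa] == AA_GROUP[alt_aa]:
        label = "conservative"
    else:
        label = "non-conservative"
    return ref_aa, alt_aa, label


for offset, base in [(2, "G"), (2, "T"), (1, "T"), (0, "T")]:
    print(offset, base, classify_snp("GAA", offset, base))
```

The "truncating" label corresponds to the SNPs that the text reports as resulting in truncated proteins.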
- SNPs identified were of two kinds:
- i. Identical nucleotide in CDC1551 and H37Rv, but a different nucleotide in BCG.
- ii. CDC1551 and H37Rv are different from each other and one of them is identical to the BCG sequence at identical positions.
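The two kinds of SNP described above can be told apart with a simple three-way base comparison. The sketch below is illustrative only: the function name `snp_kind` and the returned labels are hypothetical, and the "other" case (all three bases different) is added for completeness rather than taken from the text.

```python
from typing import Optional


def snp_kind(bcg: str, h37rv: str, cdc1551: str) -> Optional[str]:
    """Classify one aligned position across the three genomes."""
    if bcg == h37rv == cdc1551:
        return None                      # no polymorphism at this position
    if h37rv == cdc1551:
        return "i"                       # both M. tuberculosis strains agree, BCG differs
    if bcg in (h37rv, cdc1551):
        return "ii"                      # the strains differ, one matches BCG
    return "other"                       # all three bases differ


print(snp_kind("A", "G", "G"), snp_kind("A", "A", "G"), snp_kind("A", "A", "A"))
```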
- SNPs thus identified were categorized according to their location in Open Reading Frames. SNPs falling within the ORF of both BCG and H37Rv were identified. The results were validated by determining if the SNPs were present in the ORFs of BCG and CDC1551.
- SNPs falling in ORFs were further categorized into synonymous and non-synonymous SNPs.
- a SNP was said to cause a non-synonymous change if:
- a SNP can be in one ORF in the reference sequence but in another ORF in the comparison sequence, e.g. due to a frame-shift mutation earlier in the sequence.
- Before assigning SNPs to the 'Non Synonymous' or 'Synonymous' groupings, all SNPs which either did not fall in an ORF, or fell into different ORFs on the reference and comparison sequences, were eliminated.
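The elimination step can be pictured as a filter over SNP records. In the Python sketch below, the `Snp` record, the `orf_pairs` mapping from a reference ORF to its corresponding comparison ORF, and all identifiers are hypothetical; how ORF correspondence was actually established is not specified here.

```python
from typing import NamedTuple, Optional


class Snp(NamedTuple):
    ref_pos: int
    ref_orf: Optional[str]   # ORF containing the SNP on the reference, if any
    qry_orf: Optional[str]   # ORF containing the SNP on the comparison sequence, if any


def keep_comparable(snps: list[Snp], orf_pairs: dict[str, str]) -> list[Snp]:
    """Keep SNPs that lie in an ORF on both sequences, where the two ORFs correspond."""
    return [
        s for s in snps
        if s.ref_orf is not None
        and s.qry_orf is not None
        and orf_pairs.get(s.ref_orf) == s.qry_orf
    ]


pairs = {"ref_orf_1": "qry_orf_1", "ref_orf_2": "qry_orf_2"}
snps = [
    Snp(100, "ref_orf_1", "qry_orf_1"),  # kept: same ORF on both sequences
    Snp(200, None, None),                # dropped: not in an ORF
    Snp(300, "ref_orf_2", "qry_orf_1"),  # dropped: falls into a different ORF
]
print([s.ref_pos for s in keep_comparable(snps, pairs)])
```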
- the BCG and H37 genomes have been annotated with respect to one another.
- CDC1551 has not been so thoroughly annotated, so it was not possible to immediately assess if an ORF in BCG was the corresponding ORF in CDC1551. Therefore, a metric was devised to eliminate spurious comparisons.
- the non-synonymous SNPs thus identified were analysed to predict conservative and non-conservative amino acid substitutions.
- the effect of the substitution on the function of the proteins encoded was predicted. This provides a powerful insight in predicting SNPs correlating with virulence and infectivity in M. tuberculosis.
- Bovis_ORF - Yes indicates that the SNP in bovis is in a bovis ORF. No indicates not in an ORF.
- Bovis_base - Indicates the SNP with respect to the SNP position in bovis.
- Bovis_AA - Displays the bovis amino acid after the codon translation.
- Qry_pos - Displays the position of a SNP in either CDC1551 or H37Rv with respect to the bovis SNP position.
- Is_nsSNP - Displays SNPs as synonymous (S), non-synonymous (NS) or in a non-coding region (NC).
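To make the field list above concrete, the following Python sketch builds a small in-memory table with those five columns and queries it for non-synonymous SNPs. SQLite is used purely as a stand-in, since the database engine behind the original work is not named here; the column types, the rows and the helper names `build_demo_db` and `ns_positions` are invented for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory SNP_analysis table holding toy rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE SNP_analysis (
               Bovis_ORF  TEXT,     -- 'Yes' or 'No'
               Bovis_base TEXT,
               Bovis_AA   TEXT,
               Qry_pos    INTEGER,
               Is_nsSNP   TEXT      -- 'S', 'NS' or 'NC'
           )"""
    )
    rows = [
        ("Yes", "A", "K", 101, "NS"),
        ("Yes", "C", "L", 250, "S"),
        ("No", "G", None, 900, "NC"),
        ("Yes", "T", "F", 1200, "NS"),
    ]
    conn.executemany("INSERT INTO SNP_analysis VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def ns_positions(conn: sqlite3.Connection) -> list[int]:
    """Return query positions of non-synonymous SNPs that lie in a bovis ORF."""
    cur = conn.execute(
        "SELECT Qry_pos FROM SNP_analysis "
        "WHERE Is_nsSNP = ? AND Bovis_ORF = ? ORDER BY Qry_pos",
        ("NS", "Yes"),
    )
    return [row[0] for row in cur]


print(ns_positions(build_demo_db()))
```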
- a total of 1829 SNPs have been identified in the three genomes. Of these, 1825 SNPs have the same nucleotide in H37Rv and CDC1551, with a different nucleotide in BCG. Of these 1825 SNPs, 1579 are in ORFs while the rest (246) are in non-coding regions. 811 H37Rv SNPs and 810 CDC1551 SNPs are synonymous while 1282 H37Rv and 1219 CDC1551 SNPs are non-synonymous. Out of 1219 CDC1551 nsSNPs, 312 have a conservative amino acid substitution, 888 have a non-conservative substitution and 19 result in truncated proteins. Out of 1282 H37Rv non-synonymous SNPs, 304 have a conservative amino acid substitution, 954 have a non-conservative substitution and 24 result in truncated proteins. (Figure 2)
- Indels are insertions and deletions in the sequence with respect to the BCG sequence. These indels could be of one or more nucleotides. Considering BCG as the reference sequence, the indels in both strains of M. tuberculosis, H37Rv and CDC1551, were identified.
- Long polymorphs are insertions or deletions of long stretches of nucleotides with respect to BCG sequence.
- the EMBL sequence DB has made putative GO assignments to most of the ORF's in the three TB genomes, so a local installation of GO was used together with the EMBL cross reference tables to identify TB polymorphisms based on their putative functional classification.
- the annotation table consisting of the genbank features of the genes such as coding region, database reference and product information to name a few was constructed.
- Gene_start - This indicates the start of the coding region
- Locus_tag -
- db_xref - This indicates the gene indices representation of the gene
- db_xref_GOA - This indicates the gene ontology identity of the gene product id - This indicates the gene annotation type
- strand - This indicates the forward or reverse strand of the sequence that is stored in GenBank
- gene_name - This indicates the gene name
- gene_link - This provides a hyperlink to the gene features from GenBank
- note - This provides the general information and the protein information of the gene.
- a front-end was constructed as an essential part of the database.
- Front end of the database:
- the front-end displays the results of alignment as follows:
- the annotation table consists of genbank annotation about the genes in bovis, H37Rv and CDC1551. It specifies details including the coding region of a gene and its database reference.
- the database is made queryable to retrieve the required features of SNPs, indels and long polymorphs respectively.
- the main options to query the SNP information are: Select SNPs
- the query has been designed in a similar way for both indels and long polymorphs.
- the SNP analysis includes functional annotation id, which is hyperlinked to the functional annotation of the gene carrying the polymorphism.
- the functional annotation id consists of either one of the Swiss Prot, SPTREMBL or gene ontology id's. Similarly the indels and long polymorphs are also functionally annotated.
- Genes with known involvement in virulence of Mycobacterium tuberculosis can also be accessed from the SNP database query or from the Long polymorphs database query respectively.
- the first tool was based on the Generic Genome Browser developed at Cold Spring Harbor Lab (CSHL). This visualization tool could show a single TB genome along with any annotations, e.g. SNP locations for all other genomes.
- CSHL: Cold Spring Harbor Lab
- the output can be obtained by specifying the region of interest in the text box labeled as "landmark or region".
- the gene start and the gene end have to be specified; in the case of indels or long polymorphs, the BCG start and BCG end must be specified.
- the display can also be zoomed in or out by selecting the required number of base pairs in the scroll down menu.
- Figure 4 shows a region of 10 kb of the BCG genome with three types of annotations: BCG ORFs, SNPs in H37Rv, and SNPs in CDC1551.
- Figure 5 shows the comparative genomics browser displaying BCG in the upper panel and H37Rv in the bottom panel.
- the segments labeled MUM-* are the perfect matches generated by the MUMmer tool, and the vertical lines show the alignment of the MUM segments in both genomes.
- the color coding of the ORF's is used to indicate the length of the ORF. This is very helpful to researchers because if an ORF in H37 aligns with an ORF in BCG but they have different colors, then there is a mutation that makes them have different lengths (see for example the genes in the MUM-1280 region).
- a set of five Mycobacterium tuberculosis strains with known virulence is being screened for the polymorphisms identified above.
- Primers have been designed to encompass the regions of polymorphisms.
- the list of the primers used for the amplification is given in Fig. 6.1-6.25.
- Amplification and sequencing of regions around the polymorphisms: DNA from the five strains has been amplified under optimal conditions determined for each primer pair. The amplified fragments have been sequenced and the sequences obtained from different strains compared.
- BCG M.bovis BCG
- H37Rv M.tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- NINF non-lethal North Indian strain
- BS Beijing strain
- NIF Lethal North Indian strain.
- the gene encoding an oxidoreductase is a virulence gene that shows no differences among the M. tuberculosis strains but carries a conservative polymorphism relative to M. bovis BCG.
- BCG M.bovis BCG
- H37Rv M.tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- BS Beijing strain
- NINF non-lethal North Indian strain
- NIF Lethal North Indian strain.
- the insertion in BCG leads to a shorter protein with a different carboxyl terminus compared to the transcription factor encoded by the tuberculosis strains.
- the SNP common to all the tuberculosis strains results in a conservative substitution in the PPE33b gene and does not affect the function of this gene.
- the A to G substitution results in the truncation of the protein encoded by BCG.
- BCG M.bovis BCG
- H37Rv M. tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- BS Beijing strain
- NINF non-lethal North Indian strain
- NIF Lethal North Indian strain.
- a single nucleotide polymorphism occurring in the proton transport gene PPE33b results in the introduction of a stop codon and hence truncation of the protein in BCG.
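How a single base change can truncate a protein is shown in the sketch below, which translates a coding sequence up to the first stop codon before and after a substitution. It assumes the standard genetic code; the coding sequence, the SNP position and the helper names `translate` and `apply_snp` are invented for illustration and are not taken from PPE33b.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_snp(cds: str, position: int, alt_base: str) -> str:
    """Return `cds` with the base at 1-based `position` replaced by `alt_base`."""
    index = position - 1
    return cds[:index] + alt_base + cds[index + 1:]


# Toy example: TGG (Trp) becomes TGA (stop) after a change at position 6.
reference = "ATGTGGAAATAA"
variant = apply_snp(reference, 6, "A")
print(translate(reference), translate(variant))
```

Comparing the lengths of the two translations gives a direct measure of how much of the protein is lost.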
- BCG M.bovis BCG
- H37Rv M.tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- BS Beijing strain
- NINF non-lethal North Indian strain.
- the region is characterized by the occurrence of two indels and two SNPs in a transcription regulator. All the tuberculosis strains appear to be identical in this region, while BCG has a different amino-acid sequence in the region.
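A region containing both indels and SNPs can be summarised from a pairwise gapped alignment. The sketch below walks two aligned strings and reports SNPs, insertions and deletions relative to the first sequence, merging consecutive gap columns into one event. The function name, the tuple layout and the toy alignment are assumptions for this example.

```python
def call_differences(ref_aln: str, alt_aln: str):
    """Report differences between two aligned sequences ('-' marks a gap).

    Returns tuples (kind, ref_position, ref_bases, alt_bases), where
    ref_position is 1-based on the ungapped reference. For insertions it is
    the reference base after which the extra bases occur. Columns that are
    gaps in both sequences are assumed not to occur.
    """
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must have equal length")
    events = []
    ref_pos = 0  # number of reference bases consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        r, a = ref_aln[i], alt_aln[i]
        if r == "-":
            j = i
            while j < n and ref_aln[j] == "-":
                j += 1
            events.append(("insertion", ref_pos, "", alt_aln[i:j]))
            i = j
        elif a == "-":
            j = i
            while j < n and alt_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            events.append(("deletion", ref_pos + 1, ref_aln[i:j], ""))
            ref_pos += j - i
            i = j
        else:
            ref_pos += 1
            if r != a:
                events.append(("SNP", ref_pos, r, a))
            i += 1
    return events


# Toy alignment, not real sequence data.
print(call_differences("ACG-TACGT", "ACGGTTC-T"))
```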
- BCG M.bovis BCG
- H37Rv M.tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- BS Beijing strain
- NINF non-lethal North Indian strain
- NIF Lethal North Indian strain
- the C to T transition occurs in a gene of unknown function and results in a synonymous substitution.
- the C to A change occurs in a transcription factor (Mb0393) and is a non-conservative substitution resulting in a slightly different protein in BCG.
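The terms synonymous, conservative and non-conservative used above can be made concrete with a small classifier for single-codon changes. It assumes the standard genetic code, and the amino-acid property groups used to decide "conservative" are a crude convention chosen for this sketch; the source does not define the criterion it applied.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed property groups; other groupings are equally defensible.
GROUPS = ["AVLIMFWPG", "STCYNQ", "DE", "KRH"]


def group_of(aa: str) -> int:
    """Return the index of the property group containing amino acid `aa`."""
    for index, members in enumerate(GROUPS):
        if aa in members:
            return index
    raise ValueError(f"unknown amino acid: {aa!r}")


def classify(ref_codon: str, alt_codon: str) -> str:
    """Classify the effect of replacing `ref_codon` with `alt_codon`."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-lost"
    if group_of(ref_aa) == group_of(alt_aa):
        return "conservative"
    return "non-conservative"


# Toy codon pairs chosen to show each outcome.
for ref, alt in [("CTT", "CTC"), ("GAT", "GAA"), ("GAT", "GTT"), ("TGG", "TGA")]:
    print(ref, alt, classify(ref, alt))
```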
- BCG M.bovis BCG
- H37Rv M.tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- BS Beijing strain
- NINF non-lethal North Indian strain
- NIF Lethal North Indian strain.
- a synonymous SNP occurs in a virulence gene and is identical in all the tuberculosis strains.
- BCG M.bovis BCG
- H37Rv M.tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- BS Beijing strain
- NINF non-lethal North Indian strain
- NIF Lethal North Indian strain.
- the SNP in BCG results in splitting the gene PE-PGRS32 into two parts with the latter being truncated.
- BCG M.bovis BCG
- H37Rv M.tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- BS Beijing strain
- NINF non-lethal North Indian strain.
- the polymorphism observed occurs in the pks12 gene and results in a non-conservative substitution.
- BCG M.bovis BCG
- H37Rv M.tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- BS Beijing strain
- NINF non-lethal North Indian strain.
- the deletion in BCG occurs in the region corresponding to a gene with putative enzyme activity and results in a loss of function in BCG.
- BCG M.bovis BCG
- H37Rv M.tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- BS Beijing strain
- NINF non-lethal North Indian strain
- NIF North Indian Fatal.
- the first T to C transition results in the truncation of the bacterial regulatory protein in BCG.
- BCG M.bovis BCG
- H37Rv M.tuberculosis strain H37Rv sequence from NCBI database
- CDC CDC1551
- S.I South Indian strain A2313
- BS Beijing strain
- NINF non-lethal North Indian strain
- NIF North Indian Fatal strain.
- a total of 2755 polymorphisms including 1779 in ORFs and 313 in regions outside the ORF are being screened for association to virulence and/or infectivity in tuberculosis.
- a multicomponent analysis to determine the association of each polymorphism with the degree of virulence and infectivity is in progress.
- the polymorphisms which constitute a set of virulence markers are further being validated in 120 clinical isolates of tuberculosis.
- the virulence factors thus identified could be used as: i. diagnostic markers for predicting disease and its progression in the patient; ii. drug targets for the development of new and effective treatments for TB; iii. candidate genes/sequences in DNA vaccines; iv. in the development of siRNA technology for combating tuberculosis.
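The source does not state which statistical method the association analysis uses. Purely as an illustration of how one polymorphism could be tested against a binary virulence label across clinical isolates, the sketch below computes a two-sided Fisher exact test on a 2x2 table of counts. The function name and the counts are invented for this example.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: allele present / allele absent.
    Columns: virulent / non-virulent isolates.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Invented counts: allele seen in 3 of 3 virulent and 0 of 3 non-virulent isolates.
print(fisher_exact_two_sided(3, 0, 0, 3))
```

With thousands of polymorphisms screened, any per-site p-value of this kind would also need a multiple-testing correction, which is outside the scope of this sketch.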
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IB2004/002598 WO2006008575A2 (en) | 2004-07-12 | 2004-07-12 | Construction of a comparative database and identificaiton of virulence factors through comparison of polymorphic regions in clinical isolates of infectious organisms |
CN200480043971.9A CN101421415A (en) | 2004-07-12 | 2004-07-12 | Construction of a comparative database and identification of virulence factors through comparison of polymorphic regions in clinical isolates of infectious organisms |
EP04744233A EP1789577A4 (en) | 2004-07-12 | 2004-07-12 | Construction of a comparative database and identificaiton of virulence factors through comparison of polymorphic regions in clinical isolates of infectious organisms |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IB2004/002598 WO2006008575A2 (en) | 2004-07-12 | 2004-07-12 | Construction of a comparative database and identificaiton of virulence factors through comparison of polymorphic regions in clinical isolates of infectious organisms |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2006008575A2 (en) | 2006-01-26 |
WO2006008575A9 (en) | 2010-08-12 |
Family
ID=35785594
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IB2004/002598 WO2006008575A2 (en) | 2004-07-12 | 2004-07-12 | Construction of a comparative database and identificaiton of virulence factors through comparison of polymorphic regions in clinical isolates of infectious organisms |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1789577A4 (en) |
CN (1) | CN101421415A (en) |
WO (1) | WO2006008575A2 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106048019A (en) * | 2016-06-13 | 2016-10-26 | 遵义医学院附属医院 | Antituberculous drug drug-resistance gene and screening method thereof |
CN108165560B (en) * | 2017-12-01 | 2021-06-08 | 北京蛋白质组研究中心 | Mycobacterium tuberculosis H37Rv encoding gene and application thereof |
CN108165563B (en) * | 2017-12-01 | 2021-02-19 | 北京蛋白质组研究中心 | Mycobacterium tuberculosis H37Rv encoding gene and application thereof |
CN108165561B (en) * | 2017-12-01 | 2021-06-18 | 北京蛋白质组研究中心 | Mycobacterium tuberculosis H37Rv encoding gene and application thereof |
CN110408629B (en) * | 2018-04-28 | 2020-11-20 | 北京蛋白质组研究中心 | Mycobacterium tuberculosis H37Rv encoding gene and application thereof |
CN110408632B (en) * | 2018-04-28 | 2021-01-19 | 北京蛋白质组研究中心 | Mycobacterium tuberculosis H37Rv encoding gene and application thereof |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2002306849A1 (en) * | 2001-03-21 | 2002-10-08 | Elitra Pharmaceuticals, Inc. | Identification of essential genes in microorganisms |
EP1573038A2 (en) * | 2002-07-19 | 2005-09-14 | Arizona Board Of Regents | A high resolution typing system for pathogenic mycobacterium tuberculosum |
2004
- 2004-07-12 EP EP04744233A patent/EP1789577A4/en not_active Withdrawn
- 2004-07-12 CN CN200480043971.9A patent/CN101421415A/en active Pending
- 2004-07-12 WO PCT/IB2004/002598 patent/WO2006008575A2/en active Application Filing
Non-Patent Citations (1)
Title |
---|
See references of EP1789577A2 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2947158A1 (en) * | 2010-05-25 | 2015-11-25 | National University of Ireland, Galway | Diagnostic method |
US9994918B2 (en) | 2010-05-25 | 2018-06-12 | National University Of Ireland, Galway | Diagnostic method |
CN104131100A (en) * | 2014-07-31 | 2014-11-05 | 深圳市亿立方生物技术有限公司 | Fluorescent PCR reaction liquid and kit for mycobacterium parting identification |
CN104131100B (en) * | 2014-07-31 | 2016-03-23 | 深圳市亿立方生物技术有限公司 | Mycobacterium Classification Identification Fluorescence PCR liquid and test kit |
Also Published As
Publication number | Publication date |
---|---|
EP1789577A4 (en) | 2010-11-17 |
CN101421415A (en) | 2009-04-29 |
EP1789577A2 (en) | 2007-05-30 |
WO2006008575A9 (en) | 2010-08-12 |
Legal Events
Code | Title | Description |
---|---|---|
WWE | Wipo information: entry into national phase | Ref document number: 200480043971.9 Country of ref document: CN |
AK | Designated states | Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2 Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
NENP | Non-entry into the national phase in: | Ref country code: DE |
WWW | Wipo information: withdrawn in national office | Ref document number: DE |
WWE | Wipo information: entry into national phase | Ref document number: 2004744233 Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 674/CHENP/2007 Country of ref document: IN |
WWP | Wipo information: published in national office | Ref document number: 2004744233 Country of ref document: EP |
WWP | Wipo information: published in national office | Ref document number: 11632108 Country of ref document: US |