WO2013073929A1 - Method and apparatus for detecting nucleic acid variation(s) - Google Patents
- Publication number
- WO2013073929A1 (PCT/MY2012/000273)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nucleic acid
- acid variation
- detecting
- signal intensity
- snp
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
- C12Q1/6827—Hybridisation assays for detection of mutation or polymorphism
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/20—Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
Definitions
- The corrected signal intensities XA and YB for alleles A and B, respectively, are as follows:
- Corrected signal intensities XA and YB for A and B, respectively, against the background should be larger than 0. If XA or YB is found to be less than 0, it is assigned a value of 0. When the corrected signal of one allele is 0, that allele is taken to be negligible or absent, and the other allele is called as the genotype. In other words, if only one dye has significant intensity after background subtraction, the allele it represents is taken as representing the SNP. If both XA and YB are found to be 0, the genotype is called as a No Call (NC). In other words, if the signal intensities of both dyes fall below the background intensity, it is accepted that the signal failed or that the SNP alleles being tested are not present.
- The present method uses the principle that the differences between the dyes and the bias in the corrected signal ratio reflect the correct genotype to be called (Figure 5).
- Validation of the calls was made by capillary sequencing with 40 SNPs on 27 plant samples.
- The 40 SNPs were a set of SNPs assayed on oil palm samples for which both GoldenGate assay signals and sequencing data were available for analysis.
- Figure 8 illustrates the flow diagram of an example of the method according to the invention.
Abstract
The invention relates to a method for detecting at least one nucleic acid variation based on the ratio of corrected signal intensities of at least two differentially labelled probes capable of detecting the nucleic acid variation. The invention also relates to an apparatus for performing the method.
Description
METHOD AND APPARATUS FOR DETECTING NUCLEIC ACID VARIATION(S)
Field of the invention
The present invention relates to the field of detecting nucleic acid variations, for example in genotyping. More specifically, the invention relates to detecting a nucleic acid variation of a locus present in a sample. In particular, allelic variations or single nucleotide polymorphisms (SNPs) may be detected.
Background of the invention
In a population, the genetic makeup or genotype of individuals varies. Genotyping generally refers to identifying the genetic makeup of an individual organism. Genotyping may detect nucleic acid variations in an individual, for example allelic variations or single nucleotide polymorphisms (SNPs). Recent developments have led to robust genotyping platforms. Examples of genotyping platforms include the Invader assay (Olivier et al., 2005), array-based methods (Perkel, 2008) and arrayed primer extension or APEX (Kurg et al., 2000). The abundant data generated with these genotyping platforms need to be analysed, typically using computerised methods. Examples of data analysis methods using diverse algorithms have been reported in US 2009/0062138, Ritchie et al., 2009, and Takitoh et al., 2005. It is desirable to develop methods capable of genotyping with high accuracy.
Summary of the invention
The present invention relates to detecting nucleic acid variation(s). According to a first aspect, the present invention provides a method for detecting a nucleic acid variation of a locus in a sample, comprising the steps of:
(i) contacting at least two differentially labelled probes with the sample; wherein the first labelled probe is capable of detecting a first nucleic acid variation A and the second labelled probe is capable of detecting a second nucleic acid variation B;
(ii) detecting a first signal intensity X for the first labelled probe and a second signal intensity Y for the second labelled probe on a support; wherein the first signal intensity X correlates to the presence of the first nucleic acid variation A and the second signal intensity Y correlates to the presence of the second nucleic acid variation B;
(iii) performing background corrections on the first and second signal intensities to give background corrected first signal intensity XA and background corrected second signal intensity YB;
(iv) expressing XA:YB as a ratio (Sr), wherein if XA:YB ≥ C:1 given XA, YB > 0, or if XA > 0 and YB ≤ 0, then the nucleic acid variation is A:A; if XA:YB ≤ 1:C given XA, YB > 0, or if YB > 0 and XA ≤ 0, then the nucleic acid variation is B:B; if XA:YB is between C:1 and 1:C, then the nucleic acid variation is A:B; wherein C is a real number; and if both XA and YB ≤ 0, either both A and B are not present or the nucleic acid variation cannot be determined.
For example, the method may be for detecting different alleles. In particular, the method may be for detecting single nucleotide polymorphisms (SNPs).
Brief description of the figures
Figure 1 depicts the hybridisation of allele-specific probes and a locus-specific oligonucleotide (probe) comprising an IllumiCode region (a step from the Illumina GoldenGate assay).
Figure 2 depicts the hybridisation of PCR products to beads on an array.
Figure 3 shows an example of the raw intensities output in graphical format from Genome Studio.
Figure 4 shows further examples of raw signal intensity distributions output in graphical format from Genome Studio.
Figure 5 shows a graphical representation of the background corrected signal intensities against the expected genotype.
Figure 6 shows the error margins of the present corrected signal ratio algorithm for detecting nucleic acid variation(s).
Figure 7 shows a Venn diagram comparing the SNP calls from the corrected signal ratio algorithm (Poh), Genome Studio (GS) and sequencing.
Figure 8 shows the flow diagram of an example of the method of the present invention.
Definitions
An array refers to a support including a slide, chip, membrane, bead, or microtiter plate, with a plurality of elements bound or immobilised at defined locations. The elements may comprise molecules (e.g. nucleic acid molecules). In particular, a microarray refers to a high density array. For example, a microarray may have a density of 120 or more elements per cm².
A "double nucleotide polymorphism (DNP)" refers to two single nucleotide polymorphisms, and includes the circumstances when the two SNPs are positioned next to each other, separated by other nucleotides, on different strands of the same nucleic acid molecule, or on different nucleic acid molecules.
Differentially labelled probes also include the situation when one probe is labelled and the other probe is not.
A primer refers to an oligonucleotide to which deoxyribonucleotides may be added by a DNA polymerase. A single primer may be used to amplify a DNA or RNA region, for example, for sequencing.
A primer pair usually comprises a first primer complementary to one strand of a DNA or RNA molecule and a second primer complementary to a second strand of a DNA or RNA molecule, with both primers flanking a target DNA or RNA region, to be amplified by a DNA polymerase.
A probe refers to any molecule used to locate and/or identify a target DNA or RNA sequence. Probes may usually be labelled by standard methods, for example, radioactively or with fluorescent markers. For example, probes may be used to detect differences in DNA or RNA sequences, including single nucleotide polymorphism(s). The differentially labelled probes also include at least two differentially labelled nucleotides, wherein one nucleotide is incorporated by a polymerase to a polynucleotide being extended, depending on the SNP present.
"Nucleic acid variation" includes, but is not limited to allelic variations, a single nucleotide polymorphism (SNP), a double nucleotide polymorphism (DNP), a deletion, an insertion, a substitution, a nucleic acid amplification, a rearrangement of a nucleic acid sequence or a gene and/or its corresponding transcriptional and/or translational product, and/or alternative splicing of the transcriptional and/or translational product.
A single nucleotide polymorphism (SNP) refers to a DNA and/or RNA sequence variation in which a single nucleotide in an organism's genetic material differs between members of the species (or between paired chromosomes in the organism). SNP includes substitution, deletion or insertion of a single nucleotide.
Detailed description of the invention
The invention relates to a method for detecting nucleic acid variations. As herein described, the method comprises the steps of:
(i) contacting at least two differentially labelled probes with the sample; wherein the first labelled probe is capable of detecting a first nucleic acid variation A and the second labelled probe is capable of detecting a second nucleic acid variation B;
(ii) detecting a first signal intensity X for the first labelled probe and a second signal intensity Y for the second labelled probe on a support; wherein the first signal intensity X correlates to the presence of the first nucleic acid variation A and the second signal intensity Y correlates to the presence of the second nucleic acid variation B;
(iii) performing background corrections on the first and second signal intensities to give background corrected first signal intensity XA and background corrected second signal intensity YB;
(iv) expressing XA:YB as a ratio (Sr), wherein if XA:YB ≥ C:1 given XA, YB > 0, or if XA > 0 and YB ≤ 0, then the nucleic acid variation is A:A; if XA:YB ≤ 1:C given XA, YB > 0, or if YB > 0 and XA ≤ 0, then the nucleic acid variation is B:B; if XA:YB is between C:1 and 1:C, then the nucleic acid variation is A:B; wherein C is a real number; and if both XA and YB ≤ 0, either both A and B are not present or the nucleic acid variation cannot be determined.
In particular, the sample comprises an isolated sample.
Any labelling means as is known in the art may be used in the practice of the invention. Detection of the nucleic acid variation is based on the ratio of background corrected signal intensities of at least two differentially labelled probes. The signal intensity X of the first probe capable of detecting the first nucleic acid variation A is background corrected to give XA. The signal intensity Y of the second probe capable of detecting the second nucleic acid variation B is background corrected to give YB.
Background correction may be performed by any suitable method. For example, background correction may be made by subtracting the background intensity (BI) from the signals X and Y to give XA and YB respectively. The background intensity (BI) may be determined by measuring signal intensity in the absence of any probe (negative control).
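The subtraction described above can be sketched as follows. This is only a minimal illustration, assuming a single background intensity shared by both channels; the function name and the numeric values are hypothetical and are not taken from the specification.

```python
def background_correct(x, y, bi):
    """Subtract the background intensity BI from the raw signals X and Y.

    Returns the background corrected intensities (XA, YB).
    The name and signature are illustrative assumptions.
    """
    return x - bi, y - bi


# Illustrative values only (not measured data).
xa, yb = background_correct(x=5200.0, y=450.0, bi=300.0)
print(xa, yb)  # 4900.0 150.0
```

A corrected value at or below zero simply means that the channel did not rise above the negative control.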
If XA and YB are both > 0, the nucleic acid variation present is determined based on the corrected signal ratio (Sr) XA:YB. The nucleic acid variation is determined as A:A if XA:YB ≥ C:1. Conversely, the nucleic acid variation is determined as B:B if XA:YB ≤ 1:C. The nucleic acid variation is determined as A:B if XA:YB is between C:1 and 1:C. In particular, C (cut-off) is a real number. For example, C may be any value ≥ 2. In particular, C may be 2 or 3. More in particular, C = 3. As an example, consider the situation where C = 3. If XA:YB ≥ 3:1, the nucleic acid variation is determined as A:A. If XA:YB ≤ 1:3, the nucleic acid variation is determined as B:B. If XA:YB is between 3:1 and 1:3, the nucleic acid variation is determined as A:B.
Corrected signal intensities for nucleic acid variations A and B respectively against the background should be larger than 0. If XA or YB ≤ 0, the signal is taken to be negligible or absent. Accordingly, if XA:YB ≥ C:1 given XA, YB > 0, or if XA > 0 and YB ≤ 0, then the nucleic acid variation present is A:A. If XA:YB ≤ 1:C given XA, YB > 0, or if YB > 0 and XA ≤ 0, then the nucleic acid variation present is B:B. If both XA and YB are ≤ 0, it would be considered that either both the nucleic acid variations A and B are not present or the signal failed (i.e. the nucleic acid variation cannot be determined).
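The decision rule above can be written compactly as a function. The following is a sketch under stated assumptions, not the applicant's implementation: the function name, the default cut-off C = 3 (one of the values mentioned above) and the label "NC" for the undeterminable case are illustrative choices.

```python
def call_genotype(xa, yb, c=3.0):
    """Call A:A, B:B or A:B from background corrected intensities XA and YB.

    c is the cut-off C. "NC" is a hypothetical label for the case in which
    the nucleic acid variation cannot be determined.
    """
    if xa <= 0 and yb <= 0:
        return "NC"   # neither signal rises above background
    if yb <= 0:
        return "A:A"  # only the A signal is present
    if xa <= 0:
        return "B:B"  # only the B signal is present
    # Both signals are positive: compare the ratio XA:YB with C:1 and 1:C.
    # The comparisons are written as products to avoid a division.
    if xa >= c * yb:
        return "A:A"
    if c * xa <= yb:
        return "B:B"
    return "A:B"


print(call_genotype(900.0, 100.0))  # A:A
print(call_genotype(100.0, 900.0))  # B:B
print(call_genotype(500.0, 400.0))  # A:B
print(call_genotype(-5.0, 0.0))     # NC
```

With C = 3, a ratio of exactly 3:1 is called A:A and a ratio of exactly 1:3 is called B:B, which follows the "≥" and "≤" in the rule.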
The signal intensities X and Y are detected on a support. The support may comprise an array or, more particularly, a microarray.
Step (i) may further comprise an amplification step. In particular, the amplification is with a polymerase chain reaction (PCR). In particular, the PCR is with the first labelled probe, the second labelled probe and a locus-specific oligonucleotide as primers to give PCR products. The PCR products are then hybridised to locus-specific nucleic acid immobilised on the support. For instance, the Illumina GoldenGate technology includes an amplification step (see Examples below).
Typically, two-channel detection may be used to detect the signal intensities X and Y. However, one-channel detection may be used if only one probe is labelled. Further, the method according to any aspect of the present invention may be adapted to detect more than two nucleic acid variations. For example, if there are three nucleic acid variants A, B, C, the analysis can be performed for A & B, A & C and B & C. If there are four nucleic acid variants A, B, C, D, the analysis can be performed for A & B, A & C, A & D, B & C, B & D and C & D. The present method may be adapted accordingly to detect any number of nucleic acid variants. The present method may be adapted to any suitable array platform for detecting nucleic acid variations and/or genotyping known in the art, including but not limited to Illumina GoldenGate, Illumina Infinium, Affymetrix platform or Invader assay. For example, the present method may also be used with the Illumina Infinium platform (Steemers et al., 2006) where a differentially labelled nucleotide corresponding to the SNP is incorporated during amplification.
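The pairwise extension to more than two variants can be illustrated with a short sketch. It assumes that each variant already has a background corrected intensity and applies the two-variant rule to every unordered pair. The function names, the "NC" label and the example intensities are hypothetical, and the text does not specify how the pairwise results should be combined into a final call.

```python
from itertools import combinations


def call_pair(name_a, xa, name_b, yb, c=3.0):
    """Apply the two-variant ratio rule to one pair of corrected intensities."""
    if xa <= 0 and yb <= 0:
        return "NC"  # hypothetical label: cannot be determined
    if yb <= 0 or (xa > 0 and xa >= c * yb):
        return f"{name_a}:{name_a}"
    if xa <= 0 or c * xa <= yb:
        return f"{name_b}:{name_b}"
    return f"{name_a}:{name_b}"


def pairwise_calls(corrected, c=3.0):
    """Return a call for every pair of variants, e.g. A & B, A & C, B & C."""
    return {
        (a, b): call_pair(a, corrected[a], b, corrected[b], c)
        for a, b in combinations(corrected, 2)
    }


# Illustrative corrected intensities for three variants (not measured data).
calls = pairwise_calls({"A": 900.0, "B": 100.0, "C": -10.0})
for pair, call in calls.items():
    print(pair, call)
```

For four variants, `combinations` yields the six pairs listed above in the same order: A & B, A & C, A & D, B & C, B & D and C & D.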
The invention also includes an apparatus for performing the invention. The apparatus includes the support system and/or associated computer system. The support system includes the array system.
The computer system may be used to process and/or analyse the signal intensities from the array. According to a further aspect, the invention relates to a computer system programmed to perform steps (iii) and (iv) of the method of the invention. The computer system may in principle be any general computer, such as a personal computer, although in practice it is more typically a workstation or a mainframe computer.
The invention also relates to software executable by a computer system to cause the computer system to perform steps (iii) and (iv) of the method. The invention also includes a computer program product comprising the software. In particular, the computer program product is tangible. A computer program product includes, for example, a tangible recording, storage and/or computer-readable medium. Examples of such media include but are not limited to a computer hard drive, a compact disc, a flash memory device (e.g. memory cards, USB flash drives, solid state drives) and a floppy disk. Other suitable media known in the art may also be used.
Accordingly, the method of the invention comprises a computer-implemented method. Further, the present method is capable of being automated.
Having now generally described the invention, the same will be more readily understood through reference to the following examples which are provided by way of illustration, and are not intended to be limiting of the present invention.
EXAMPLES
Standard molecular biology techniques known in the art and not specifically described were generally followed as described in Sambrook and Russell (2001).
Example 1 Illumina GoldenGate Assay
SNP analysis was performed using the Illumina GoldenGate Assay according to the manufacturer's instructions. Basically, this genotyping platform uses differentially labelled allele-specific probes and a locus-specific oligonucleotide (or probe) for detecting the SNP (Figure 1). The allele-specific probes are labelled with different fluorescent dyes. Each locus-specific oligonucleotide comprises a specific IllumiCode region which is unique to the locus. Depending on the SNP variant present, the corresponding allele-specific probe will specifically bind to the DNA template of the sample and be extended via PCR to the locus-specific oligonucleotide. After the extension, the PCR products flanked by the allele-specific probe and the locus-specific oligonucleotide are hybridised to a set of beads via the IllumiCode region. Each specific IllumiCode region represents a specific locus, and the position of the bead with the corresponding complementary IllumiCode region on the array is tracked and used to aid identification of the expected SNPs associated with the locus. The PCR products bound to the beads, which are localised to specific locations on the array, are then scanned for the presence or absence of each differentially labelled probe, which indicates the SNP present, either homozygous or heterozygous (Figure 2). In array-based SNP genotyping platforms, each SNP locus is in general represented by two dyes, where each dye represents one of the SNP alleles and both dyes in combination represent the presence of both alleles (heterozygous). The signal intensities of each dye are collected by instruments and analysed using software provided by the manufacturer. For each SNP locus, there is associated information such as the identity of the SNP, the SNP alleles and the dye representing each allele.
For example, the software Genome Studio, provided by Illumina, is capable of processing and displaying the signal intensities and associated information in graphical format (Figures 3 and 4). Genome Studio also includes a proprietary clustering algorithm which analyses the signal intensities to determine (or "call") the SNP genotype. However, it was found that the algorithm provided in Genome Studio was not satisfactory: it often called SNPs wrongly compared to capillary sequencing of the SNPs, or did not call them at all (uncalled) even though the signal was available and callable. The miscalled SNPs would mislead the analysis and the uncalled SNPs would be excluded from it.
Example 2 Ratio of signal intensities
The present method may be adapted to any suitable array platform for detecting nucleic acid variations and/or genotyping known in the art.
The present method was tested on the Illumina GoldenGate platform. First, the following associated information was extracted from Illumina Genome Studio with a series of preprocessing scripts.
(i) The "Full data table.txt", which carries the name of the subject/sample that was genotyped, the IllumiCode address, the locus and position of the SNP and the raw intensities.
(ii) The "Sample Table.txt", which carries information on the sample and the distribution of intensities of the sample.
(iii) The "Paired sample table.txt", which carries the information of the SNP and bead type and the SNPs in the X and Y, including background intensities of the negative control.
(iv) The "SNPtable.txt", which carries the orientation of the SNP using the TOP/BOT convention of the Illumina GoldenGate Array and the primer sequences.
Consider, for example, alleles A and B, where X and Y represent the raw signal intensities of the respective alleles. Each array carries negative controls comprising beads or points that are not bound to any PCR product, and the intensities of these points may be used as the background intensity (BI). In the case of Illumina GoldenGate assays, the background intensity was derived from a few blank beads which are part of the negative control. This could be substituted with signal intensities from control experiments in any SNP genotyping assay.
The corrected signal intensities XA and YB for alleles A and B, respectively, are as follows:
XA = X - BI
YB = Y - BI
The corrected signal intensities XA and YB for A and B, respectively, should be larger than 0 after background subtraction. If XA or YB is found to be less than 0, it is set to 0. When the corrected signal of one allele is 0, that allele is taken to be negligible or absent, and the other allele is called as the genotype. In other words, if only one dye has a significant intensity after background subtraction, the allele it represents is taken as representing the SNP. If both XA and YB are found to be 0, the genotype is called as a No Call (NC). In other words, if the signal intensities of both dyes fall below the background intensity, it is accepted that the signal failed or that the SNP alleles being tested are not present.
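The background correction and the zero-signal rules described above can be sketched in Python as follows. This is only an illustrative sketch and not the inventors' implementation; the function names `background_correct` and `single_allele_call` and the numeric intensities are hypothetical.

```python
from typing import Optional


def background_correct(raw: float, background: float) -> float:
    """Subtract the background intensity (BI) and set negative results to 0."""
    return max(0.0, raw - background)


def single_allele_call(xa: float, yb: float) -> Optional[str]:
    """Apply the zero-signal rules to corrected intensities XA and YB.

    Returns "NC" when both signals are 0, "AA" or "BB" when exactly one
    signal is 0, and None when both signals are positive, in which case
    the signal ratio still has to be evaluated.
    """
    if xa == 0.0 and yb == 0.0:
        return "NC"
    if yb == 0.0:
        return "AA"
    if xa == 0.0:
        return "BB"
    return None


# Made-up raw intensities and background, for illustration only.
bi = 250.0
xa = background_correct(1800.0, bi)  # allele A dye
yb = background_correct(200.0, bi)   # allele B dye, below background
print(xa, yb, single_allele_call(xa, yb))
```

Setting negative values to 0 before testing for zero keeps the three special cases (only A, only B, neither) separate from the ratio test, which is only meaningful when both corrected signals are positive.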
For example, if the ratio 1:3 is used (i.e. C = 3), when the signal ratio (Sr) of the corrected signal intensities of alleles A and B (XA to YB, or XA:YB) is ≥ 3:1, the genotype in the sample would be A:A; if the XA:YB signal ratio is < 1:3, the SNP would be B:B; and if the signal ratio XA:YB is between 3:1 and 1:3, it would be A:B.
The calculation and genotype calling are represented as follows.
\[ X_A = X - \mathit{BI} \]
Where,
$X_A$ = Corrected signal intensity for A, $X_A \ge 0$
$X$ = Raw signal intensity of allele A
$\mathit{BI}$ = Background intensity
\[ Y_B = Y - \mathit{BI} \]
Where,
$Y_B$ = Corrected signal intensity for B, $Y_B \ge 0$
$Y$ = Raw signal intensity of allele B
$\mathit{BI}$ = Background intensity
\[
S_r = X_A : Y_B = \frac{X_A}{Y_B}, \qquad
\text{Genotype} =
\begin{cases}
AA, & S_r \ge 2.0000 \\
AB, & 0.5000 < S_r < 2.0000 \\
BB, & S_r \le 0.5000 \\
BB, & X_A = 0.0000,\ Y_B > 0.0000 \\
AA, & Y_B = 0.0000,\ X_A > 0.0000 \\
NC, & X_A = 0.0000,\ Y_B = 0.0000
\end{cases}
\]
The first three cases apply when both $X_A$ and $Y_B$ are greater than 0.
Alternatively, a ratio of 1:2 could also be applied.
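The ratio-based calling rule can be sketched in Python as follows, with the threshold C as a parameter so that both the 1:3 and the 1:2 ratios can be tried. This is a minimal sketch and not the GGGTSNPcaller.pl script; the function name `call_genotype` is hypothetical, and treating both thresholds as inclusive (Sr ≥ C gives AA, Sr ≤ 1/C gives BB) is an assumption.

```python
def call_genotype(xa: float, yb: float, c: float = 3.0) -> str:
    """Call AA, AB, BB or NC from corrected signal intensities XA and YB.

    c is the ratio threshold C. Inclusive thresholds are an assumption
    made for this sketch.
    """
    if c <= 1.0:
        raise ValueError("c must be greater than 1")
    # Zero-signal rules: neither dye, or only one dye, above background.
    if xa <= 0.0 and yb <= 0.0:
        return "NC"
    if yb <= 0.0:
        return "AA"
    if xa <= 0.0:
        return "BB"
    sr = xa / yb  # signal ratio Sr = XA:YB
    if sr >= c:
        return "AA"
    if sr <= 1.0 / c:
        return "BB"
    return "AB"


# Made-up corrected intensities, for illustration only.
for xa, yb in [(900.0, 300.0), (600.0, 300.0), (100.0, 400.0), (0.0, 0.0)]:
    print(xa, yb, call_genotype(xa, yb, c=3.0), call_genotype(xa, yb, c=2.0))
```

With these made-up values, the pair (600, 300) has Sr = 2.0 and is therefore called AB when C = 3 but AA when C = 2, which shows how the choice of C moves calls across the border.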
Accordingly, the present method relies on the principle that the difference between the dye signals, expressed as the bias in the corrected signal ratio, reflects the genotype that is expected to be called (Figure 5).
Validation of the calls was made by capillary sequencing of 40 SNPs on 27 plant samples. The 40 SNPs were a set of SNPs assayed on oil palm samples for which both GoldenGate assay signals and sequencing data were available for analysis.
After comparing false positive calls with the results of capillary sequencing, it was found that the error margin was 15% (Figure 6). In other words, given an XA:YB ratio of 3:1, the ratio is 3.000 and false positive calls would be expected within 15% of that value, i.e. within 3.000 +/- 15% (Figure 6). A similar range could be applied for the ratio 1:2. Not all calls within the error margin were false. The terms "error margin" and "signal border" are used interchangeably.
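The 15% error margin can be sketched in Python as a check of how far a signal ratio lies from the nearest calling threshold. This is only a sketch under stated assumptions: the function name `border_status` is hypothetical, the margin is assumed to apply symmetrically around both thresholds C and 1/C, and the percentage variation is assumed to be measured relative to the nearest threshold.

```python
from typing import Tuple


def border_status(sr: float, c: float = 3.0,
                  margin_pct: float = 15.0) -> Tuple[float, float, bool]:
    """Return (nearest border, percentage variation, within margin?).

    The borders are assumed to be the thresholds C and 1/C. The variation
    is the signed percentage difference between Sr and the nearest border.
    """
    if sr <= 0.0 or c <= 1.0:
        raise ValueError("sr must be positive and c must be greater than 1")
    borders = (c, 1.0 / c)
    # Pick the border with the smallest relative distance to Sr.
    nearest = min(borders, key=lambda b: abs(sr - b) / b)
    varpct = (sr - nearest) / nearest * 100.0
    return nearest, varpct, abs(varpct) <= margin_pct


# Made-up signal ratios, for illustration only.
for sr in (3.2, 4.0, 0.35, 1.0):
    print(sr, border_status(sr))
```

A call flagged as within the margin is not necessarily wrong; the flag only marks ratios close enough to a threshold that a false positive is plausible.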
Implementation of the present corrected signal ratio algorithm for calling SNPs was done in a Perl script called GGGTSNPcaller.pl. This code could be ported to other programming languages. The script can output essential data as shown in Table 1, indicating the summary of the sample, the SNP, the SNP address in the array, the expected SNP1 and SNP2 alleles, the corrected signal intensities XA and YB, the genotype call made (homozygous or heterozygous), the SNP called, the corrected signal ratio (Sr), the percentage variation (VARPCT) from the signal borders and the Borders field, which indicates whether the signal is within the 15% borders. Refined data with more details of the call could also be produced.
Table 1 Sample output from signal algorithm analysis
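As an illustration of how the listed output fields might be assembled into a tab-separated record, a Python sketch is given below. The column names, the helper names `zygosity`, `snp_called` and `format_rows`, and all values in the example row are hypothetical; they are not taken from the output of GGGTSNPcaller.pl.

```python
import csv
import io

# Hypothetical column names modelled on the fields listed in the text.
FIELDS = ["Sample", "SNP", "Address", "SNP1", "SNP2", "XA", "YB",
          "Call", "SNPcalled", "Sr", "VARPCT", "Borders"]


def zygosity(genotype: str) -> str:
    """Describe a genotype call as homozygous, heterozygous or no call."""
    if genotype in ("AA", "BB"):
        return "homozygous"
    if genotype == "AB":
        return "heterozygous"
    return "no call"


def snp_called(genotype: str, snp1: str, snp2: str) -> str:
    """Translate AA/AB/BB into the alleles expected for SNP1 and SNP2."""
    mapping = {"AA": snp1 + snp1, "AB": snp1 + snp2, "BB": snp2 + snp2}
    return mapping.get(genotype, "NC")


def format_rows(rows) -> str:
    """Write a header line and one tab-separated line per record."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up record, for illustration only.
example = {
    "Sample": "sample_01", "SNP": "snp_0001", "Address": "1234",
    "SNP1": "C", "SNP2": "T", "XA": "1550.0", "YB": "500.0",
    "Call": zygosity("AA"), "SNPcalled": snp_called("AA", "C", "T"),
    "Sr": "3.1000", "VARPCT": "3.33", "Borders": "within",
}
print(format_rows([example]))
```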
By observing the few good and successful calls in Genome Studio, it was found that corrected signal intensities, after deduction of the background intensities from the dyes, in the ratio ranges 1:2 or 1:3 indicated the correct calls with relatively higher confidence. Benchmarking was done against sequenced genotypes. Referring to Table 2 and Figure 7, which compare the calls from the corrected signal ratio algorithm (Poh), Genome Studio (GS) and sequencing (Seq), it can be seen that the corrected signal ratio algorithm had a cumulative 56.2% (15.09% + 41.11%) of calls matching sequencing, while Genome Studio had a cumulative 48.6% (41.11% + 7.5%) of calls matching sequencing. This shows that the present corrected signal ratio algorithm correctly identifies the SNP in a higher percentage of cases than Genome Studio on validation with sequencing.
Table 2 Comparison of the present invention (Poh) with capillary sequencing (Seq) and Genome Studio (GS)
Type | Description |
---|---|
AllSame | Sr (Poh), GS and Sequencing call the same result. |
PohSeqSame | Sr (Poh) and Sequencing call the same result. GS calls a different result. |
PohGSSame | Sr (Poh) and GS call the same result. Sequencing calls a different result. |
GSseqSame | GS and Sequencing call the same result. Sr (Poh) calls a different result. |
AllDiff | Sr (Poh), GS and Sequencing call different results. |
AllF | Sr (Poh), GS and Sequencing failed to call a result. |
GS-seqF | GS and Sequencing failed to call a result. Sr (Poh) calls a result. |
Poh-seqF | Sr (Poh) and Sequencing failed to call a result. GS calls a result. |
Poh-GSF | Sr (Poh) and GS failed to call a result. Sequencing calls a result. |
seqF-PohGSSame | Sr (Poh) and GS call the same result. Sequencing failed to call a result. |
seqF-PohGSDiff | Sr (Poh) and GS call different results. Sequencing failed to call a result. |
GSF-PohseqSame | Sr (Poh) and Sequencing call the same result. GS failed to call a result. |
GSF-PohSeqDiff | Sr (Poh) and Sequencing call different results. GS failed to call a result. |
PohF-GSseqSame | GS and Sequencing call the same result. Sr (Poh) failed to call a result. |
PohF-GSseqDiff | GS and Sequencing call different results. Sr (Poh) failed to call a result. |
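The comparison types of Table 2 can be assigned mechanically from the three calls made for a SNP. The Python sketch below is illustrative only: the function name `concordance_type` is hypothetical, a failed call (including a No Call) is assumed to be represented by `None`, and the order in which the partial agreements are tested is an assumption.

```python
from typing import Optional


def concordance_type(poh: Optional[str], gs: Optional[str],
                     seq: Optional[str]) -> str:
    """Classify one SNP by agreement between Sr (Poh), GS and sequencing.

    Each argument is a genotype string, or None when that method failed
    to call a result.
    """
    failed = (poh is None, gs is None, seq is None)
    if all(failed):
        return "AllF"
    if failed == (False, True, True):
        return "GS-seqF"
    if failed == (True, False, True):
        return "Poh-seqF"
    if failed == (True, True, False):
        return "Poh-GSF"
    if failed == (False, False, True):
        return "seqF-PohGSSame" if poh == gs else "seqF-PohGSDiff"
    if failed == (False, True, False):
        return "GSF-PohseqSame" if poh == seq else "GSF-PohSeqDiff"
    if failed == (True, False, False):
        return "PohF-GSseqSame" if gs == seq else "PohF-GSseqDiff"
    # All three methods produced a call.
    if poh == gs == seq:
        return "AllSame"
    if poh == seq:
        return "PohSeqSame"
    if poh == gs:
        return "PohGSSame"
    if gs == seq:
        return "GSseqSame"
    return "AllDiff"


# Made-up calls, for illustration only.
print(concordance_type("AA", "AA", "AA"))
print(concordance_type("AB", None, "AB"))
```

Counting the types over all SNP and sample pairs and dividing by the total would give percentages of the kind compared in the text.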
Figure 8 illustrates the flow diagram of an example of the method according to the invention.
References
Kurg et al. (2000) Arrayed primer extension: Solid phase four color DNA resequencing and mutation detection technology. Genet. Test. 4:1-7.
Olivier (2005) The Invader assay for SNP genotyping. Mutat. Res. 573(1-2):103-110.
Perkel (2008) SNP genotyping: six technologies that keyed a revolution. Nat. Methods 5:447-454.
Ritchie et al. (2009) Bioinformatics 25(19):2621-2623.
Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York.
Steemers et al. (2006) Whole-genome genotyping with the single-base extension assay. Nat. Methods 3(1):31-33.
Takitoh et al. (2007) Genome Analysis 23(4):408-
US 2009/0062138
Claims
1. A method for detecting a nucleic acid variation of a locus in a sample, comprising the steps of:
(i) contacting at least two differentially labelled probes with the sample; wherein the first labelled probe is capable of detecting a first nucleic acid variation A and the second labelled probe is capable of detecting a second nucleic acid variation B;
(ii) detecting a first signal intensity X for the first labelled probe and a second signal intensity Y for the second labelled probe on a support; wherein the first signal intensity X correlates to the presence of the first nucleic acid variation A and the second signal intensity Y correlates to the presence of the second nucleic acid variation B;
(iii) performing background corrections on the first and second signal intensities to give background corrected first signal intensity XA and background corrected second signal intensity YB;
(iv) expressing XA:YB as a ratio (Sr), wherein if XA:YB > C:1 given XA, YB > 0, or if XA > 0 and YB ≤ 0, then the nucleic acid variation is A:A; if XA:YB < 1:C given XA, YB > 0, or if YB > 0 and XA ≤ 0, then the nucleic acid variation is B:B; if XA:YB is between C:1 and 1:C, then the nucleic acid variation is A:B; wherein C is a real number; and if both XA and YB ≤ 0, either both A and B are not present or the nucleic acid variation cannot be determined.
2. The method according to claim 1, wherein C ≥ 2.
3. The method according to claim 1 or 2, wherein C = 2 or 3.
4. The method according to any one of the preceding claims, wherein C = 3.
5. The method according to any one of the preceding claims, wherein step (iii) comprises subtracting the background intensity (BI) from X and Y.
6. The method according to any one of the preceding claims, step (i) further comprising performing an amplification.
7. The method according to claim 6, wherein amplification is with a polymerase chain reaction (PCR).
8. The method according to claim 7, wherein the PCR is with the first labelled probe, the second labelled probe and a locus specific oligonucleotide as primers to give PCR products.
9. The method according to claim 8, wherein the PCR products are hybridized to locus specific nucleic acid immobilised on the support.
10. The method according to any one of the preceding claims, wherein the method is for detecting different alleles.
11. The method according to any one of the preceding claims, wherein the method is for detecting single nucleic acid polymorphisms (SNP).
12. The method according to any one of the preceding claims, wherein the method comprises a computer-implemented method.
13. A computer system, programmed to perform steps (iii) and (iv) of any one of the preceding claims.
14. A computer program product comprising a software executable by a computer system to cause the computer system to perform steps (iii) and (iv) of any one of claims 1 to 12.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
MYPI2011005518 | 2011-11-15 | ||
MYPI2011005518 | 2011-11-15 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2013073929A1 (en) | 2013-05-23 |
WO2013073929A8 (en) | 2014-01-09 |
Family
ID=48429916
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/MY2012/000273 WO2013073929A1 (en) | 2011-11-15 | 2012-11-14 | Method and apparatus for detecting nucleic acid variation(s) |
Country Status (3)
Country | Link |
---|---|
AR (1) | AR088867A1 (en) |
TW (1) | TW201323615A (en) |
WO (1) | WO2013073929A1 (en) |
-
2012
- 2012-11-14 AR ARP120104287A patent/AR088867A1/en not_active Application Discontinuation
- 2012-11-14 WO PCT/MY2012/000273 patent/WO2013073929A1/en active Application Filing
- 2012-11-14 TW TW101142478A patent/TW201323615A/en unknown
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090246792A1 (en) * | 1999-06-17 | 2009-10-01 | Becton, Dickinson And Company | Methods for detecting nucleic acid sequence variations |
US20080070253A1 (en) * | 2005-01-13 | 2008-03-20 | Progenika Biopharma, S.A. | Methods and products for in vitro genotyping |
US20090011944A1 (en) * | 2007-02-21 | 2009-01-08 | Valtion Teknillinen Tutkimuskekus | Method and test kit for detecting nucleotide variations |
US20090062138A1 (en) * | 2007-08-31 | 2009-03-05 | Curry Bo U | Array-based method for performing SNP analysis |
US20110003301A1 (en) * | 2009-05-08 | 2011-01-06 | Life Technologies Corporation | Methods for detecting genetic variations in dna samples |
Non-Patent Citations (1)
Title |
---|
TWYMAN: "Single Nucleotide Polymorphism (SNP) Genotyping Techniques - An Overview", ENCYCLOPEDIA OF DIAGNOSTIC GENOMICS AND PROTEOMICS, December 2004 (2004-12-01), pages 1202 - 1207 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20170083088A (en) * | 2014-11-10 | 2017-07-17 | 티이 커넥티비티 코포레이션 | Bus bar for a battery connector system |
KR101975976B1 (en) | 2014-11-10 | 2019-05-09 | 티이 커넥티비티 코포레이션 | Bus bar for a battery connector system |
WO2016157473A1 (en) * | 2015-04-01 | 2016-10-06 | 株式会社 東芝 | Genotype determination device and method |
GB2551091A (en) * | 2015-04-01 | 2017-12-06 | Toshiba Kk | Genotype determination device and method |
JPWO2016157473A1 (en) * | 2015-04-01 | 2017-12-21 | 株式会社東芝 | Genotyping apparatus and method |
Also Published As
Publication number | Publication date |
---|---|
WO2013073929A8 (en) | 2014-01-09 |
TW201323615A (en) | 2013-06-16 |
AR088867A1 (en) | 2014-07-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12849891 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 12849891 Country of ref document: EP Kind code of ref document: A1 |