
CN111793695A - Molecular typing system for intestinal cancer - Google Patents

Info

Publication number
CN111793695A
Authority
CN
China
Prior art keywords
sequencing
cms
molecular typing
typing
molecular
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010866899.0A
Other languages
Chinese (zh)
Inventor
盛伟琪
彭俊杰
彭海翔
黄凯
陆彬彬
陈丽萌
黄丹
许蜜蝶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Puen Haihui Medical Laboratory Co ltd
Fudan University Shanghai Cancer Center
Original Assignee
Shanghai Puen Haihui Medical Laboratory Co ltd
Fudan University Shanghai Cancer Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Puen Haihui Medical Laboratory Co ltd, Fudan University Shanghai Cancer Center
Priority to CN202010866899.0A
Publication of CN111793695A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/16 Primer sets for multiplex assays

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Analytical Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Immunology (AREA)
  • Genetics & Genomics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Pathology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Hospice & Palliative Care (AREA)
  • Oncology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The application discloses a molecular typing system for intestinal cancer. The system comprises: (1) a sequencing library construction module for constructing a sequencing library from the sample RNA; (2) a quantitative sequencing module for quantifying and sequencing the sequencing library; (3) a normalization module for normalizing the quantification and sequencing results; and (4) a CMS typing module for molecular typing of the normalized results.

Description

Molecular typing system for intestinal cancer
Technical Field
The present application relates to a molecular typing system for intestinal cancer. The system can accurately quantify the specific gene expression profile of an intestinal cancer patient without bias, accurately predict the patient's postoperative recurrence-free survival rate, and help guide personalized treatment of intestinal cancer.
Background
Colorectal cancer (CRC) is currently the third most common malignant tumor in China, with a crude incidence rate of 29.44 per 100,000, and its incidence is increasing year by year. For colorectal cancer treated by surgery, postoperative pathological evaluation is the most important prognostic indicator and the basis for postoperative treatment. Patients with advanced CRC should receive adjuvant treatment after surgery, but the decision on adjuvant treatment for stage II/III patients remains considerably controversial. Retrospective studies have found that approximately 10-20% of stage II patients experience postoperative recurrence and metastasis. In addition, high-risk stage II patients may benefit from adjuvant chemotherapy. At present, whether a stage II colon cancer patient needs adjuvant chemotherapy is still evaluated mainly histologically: depth of invasion, degree of differentiation, presence or absence of lymphatic invasion, presence or absence of perineural invasion, total number of lymph nodes, number of positive lymph nodes, and pathological status of the tumor resection margin. Molecular-level studies suggest that patients with MSI-H (high microsatellite instability) or dMMR (mismatch repair deficiency) derive reduced benefit from 5-FU chemotherapy. The duration of chemotherapy for stage III CRC patients also remains controversial. Recent findings from the IDEA study suggest that the difference in 5-year survival between 3 months and 6 months of adjuvant chemotherapy in stage III patients is small; for low-risk stage III patients, the 5-year survival rate with 4 cycles of XELOX is better than with 8 cycles, and for high-risk stage III patients, the 5-year survival rate with 4 cycles of XELOX is only 1% lower than with 8 cycles, while toxic side effects are greatly reduced. Furthermore, current clinical treatment outcomes indicate that it is not only high-risk stage II/III patients who can benefit from adjuvant chemotherapy.
The key to these problems is the lack of specific molecular markers to define or identify "high risk". Therefore, how to screen specific molecular markers, perform accurate molecular feature and subtype analysis on patients with stage II/III CRC, identify the truly "high-risk" patients, and enable them to benefit from adjuvant chemotherapy is a problem that urgently needs to be solved in the clinic.
Currently, methods for molecular typing include NanoString technology and next-generation sequencing (NGS) technology. NanoString technology is based on the principle of molecular hybridization and is affected by hybridization efficiency; low-expression genes cannot be captured effectively, which impairs detection sensitivity and accuracy. In addition, the instruments and reagents are expensive, and ordinary medical institutions are not equipped to run the assay. Next-generation sequencing technology includes whole-transcriptome sequencing, target gene hybridization capture sequencing, and target gene amplification sequencing. Whole-transcriptome sequencing can detect gene expression at the whole-transcriptome level, but the volume of sequencing data is large, the cost is high, most of the sequence data is consumed by medium-abundance and high-abundance transcripts, and the coverage of low-abundance transcripts is often insufficient. Target gene hybridization capture sequencing is complex to operate and costly to develop. Existing targeted amplification, library preparation and sequencing steps all rely on DNA polymerase and amplification, which introduces a large amount of bias. PCR amplification bias significantly affects quantitative accuracy, so the final sequencing data cannot accurately represent the relative abundance of the original fragments.
The colorectal cancer Consensus Molecular Subtype (CMS) classification, published in Nature Medicine in 2015, is considered the most robust molecular classification system for colorectal cancer (CRC) at present. It has a clear biological interpretation and is important for understanding tumor molecular characteristics, the differences in biological behavior between left-sided and right-sided colorectal cancer, and the prediction of therapeutic efficacy. However, since CMS is directed not to a single gene or marker but to a large class of genes, which genes to detect, how to detect them, and how to perform the classification analysis remain problems to be solved. Moreover, CMS typing is based solely on gene expression profiling of colorectal cancer and has not been matched to the postoperative disease-free survival of patients. In addition, existing studies are based on European and American populations, up to 13% of patients cannot be classified, and no study on CMS typing of intestinal cancer in the Chinese population is currently available.
Clearly, there is still a need in the art for a molecular typing system that can accurately quantify, without bias, the expression profile of specific genes in intestinal cancer patients, especially Chinese intestinal cancer patients, and accurately predict the postoperative recurrence-free survival rate of patients.
Disclosure of Invention
In order to solve the above technical problem, the present application provides a molecular typing system for intestinal cancer, comprising: (1) a sequencing library construction module for constructing a sequencing library from the sample RNA; (2) a quantitative sequencing module for quantifying and sequencing the sequencing library; (3) a normalization module for normalizing the quantification and sequencing results; and (4) a CMS typing module for molecular typing of the normalized results.
In one embodiment of the present application, constructing a sequencing library from sample RNA comprises reverse transcribing the sample RNA into cDNA and performing multiplex PCR targeted amplification of the cDNA. In one embodiment of the present application, constructing a sequencing library from the sample RNA further comprises labeling the target gene cDNA with a molecular tag and purifying the tagged product. In one embodiment of the present application, the target gene is amplified in a targeted manner by multiplex PCR, and the amplification product is purified to obtain an amplification product of the expressed target gene. In one embodiment of the present application, constructing the sequencing library further comprises performing a second round of PCR amplification using the multiplex PCR amplification product as a template to add a sequencing adapter sequence. In one embodiment of the present application, the target genes for multiplex PCR targeted amplification comprise one or more genes selected from Table 1. In a preferred embodiment of the present application, the target genes for multiplex PCR targeted amplification include all the genes of Table 1.
TABLE 1 List of 782 genes for CMS typing
[Table 1 appears as images in the original publication: Figures BDA0002648949250000021 to BDA0002648949250000061]
In one embodiment of the present application, the primers for multiplex PCR targeted amplification include primers for target genes for said multiplex PCR targeted amplification. In one embodiment of the present application, the multiplex PCR targeted amplification comprises the use of one or more pairs of primers selected from table 2. In a preferred embodiment of the present application, the multiplex PCR targeted amplification comprises the use of all primers of table 2.
TABLE 2 Forward and reverse primers for amplification of target genes
[Table 2 appears as images in the original publication: Figures BDA0002648949250000071 to BDA0002648949250000131]
Quantification may include any suitable quantification technique known to those skilled in the art. In one embodiment of the present application, the quantifying comprises qPCR. In one embodiment of the present application, qPCR comprises real-time fluorescent quantitative PCR.
Sequencing may comprise any suitable sequencing technique known to those skilled in the art. In one embodiment of the present application, sequencing comprises high-throughput sequencing.
Normalization may comprise any suitable normalization method known to those skilled in the art. In one embodiment of the present application, the normalization includes quantile normalization. In one embodiment of the present application, quantile normalization includes calculating the LOG2 value of the raw sequenced molecule count for each sample, ranking the LOG2 molecule counts within each sample, calculating the arithmetic mean of the LOG2 molecule counts at each rank across all samples, and replacing the LOG2 values with these means to form a normalized, corrected molecule count matrix.
In one embodiment of the present application, the CMS typing module includes a classification model that calculates, based on the normalized result, a cosine distance to the expression template of each CMS typing signature gene set, and obtains the molecular typing result based on the cosine distance.
The result of molecular typing may be any suitable intestinal cancer typing result known to those skilled in the art. In one embodiment of the present application, the results of the molecular typing include: CMS1-inflammatory type, CMS2-enterocyte type, CMS2-transit-amplifying type, CMS3-goblet-like type, and CMS4-stem-like type.
In one embodiment of the present application, the intestinal cancer is colorectal cancer. In one embodiment of the present application, the intestinal cancer is selected from rectal cancer, left-sided colon cancer or right-sided colon cancer.
In one embodiment of the present application, the sample RNA is from a Chinese population.
In one embodiment of the present application, the sample RNA is from fresh tissue, frozen tissue, or paraffin-embedded tissue. In one embodiment of the present application, the sample RNA is from a formalin-fixed paraffin-embedded (FFPE) sample.
Compared with the prior art, the intestinal cancer molecular typing system achieves unbiased quantification of the specific gene expression profile of intestinal cancer patients, particularly Chinese intestinal cancer patients, greatly reduces the number of samples that cannot be typed, markedly improves typing accuracy, and can more accurately predict the postoperative recurrence-free survival rate of patients.
Drawings
The present application is described in more detail below with reference to the attached drawing figures, wherein:
FIG. 1 is a graph showing the results of CMS typing of clinical samples of intestinal cancer using the system of one embodiment of the present application.
FIG. 2 is a CMS heat map of clinical samples of intestinal cancer using the system of one embodiment of the present application.
FIG. 3 is a Kaplan-Meier survival plot of a retrospective cohort after CMS typing of clinical samples of intestinal cancer using the system of one embodiment of the present application.
FIG. 4 is a schematic diagram of the nearest template prediction (NTP) method used by the system of one embodiment of the present application.
Detailed Description
Definitions
In the present application, "CMS typing" refers to a typing method based on consensus molecular subtypes, which determines intestinal cancer heterogeneity at the gene expression level using an integrated network and types intestinal cancer accordingly.
In this application, "quantile normalization" refers to a statistical method that makes the distributions of two or more sets of data consistent without a reference.
As used herein, "CMS1-inflammatory (CMS1-Inflammatory) type" refers to a consensus molecular subtype characterized by relatively high expression of chemokine and interferon-related genes.
In the present application, "CMS2-transit-amplifying (CMS2-Transit amplifying) type" refers to a consensus molecular subtype characterized by a heterogeneous pool of samples with variable expression of stem cell and Wnt target genes.
As used herein, "CMS2-enterocyte (CMS2-Enterocyte) type" refers to a consensus molecular subtype characterized by high expression of genes specific to intestinal epithelial cells.
As used herein, "CMS3-goblet-like (CMS3-Goblet like) type" refers to a consensus molecular subtype characterized by high mRNA expression of the goblet cell-specific genes MUC2 and TFF3.
In this application, "CMS4-stem-like (CMS4-Stem like) type" refers to a consensus molecular subtype characterized by high expression of Wnt signaling targets together with stem cell, myoepithelial and mesenchymal genes, and low expression of differentiation markers.
The technical solutions of the present application will be described in detail and fully with reference to the accompanying drawings, and it should be understood that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any inventive step are within the scope of protection of the present application.
Example 1
Gene set expression detection
1. Instruments: PCR instrument (Veriti™ 96-Well Thermal Cycler), Qubit fluorometer (Qubit 3.0 Fluorometer), real-time fluorescence quantitative PCR instrument (QuantStudio 5 Flex), fully automated pre-sequencing processor (Ion Chef™ Instrument), and high-throughput sequencer (Ion S5™ XL System).
Extracting sample RNA:
a) One part of the study subjects in this example was 109 colorectal cancer FFPE samples stored at Fudan University Shanghai Cancer Center from December 2017 to September 2018. Sample RNA was extracted using the RNeasy FFPE Kit nucleic acid extraction kit (Qiagen) according to the kit instructions, quantified using a Qubit fluorometer, and frozen at -80 ℃.
b) The other part of the study subjects in this example was 78 cases of frozen colorectal cancer tissue stored in the tissue bank of Fudan University Shanghai Cancer Center from January 2013 to 2018. Sample RNA was extracted using the RNeasy Mini Kit nucleic acid extraction kit (Qiagen) according to the kit instructions, quantified using a Qubit fluorometer, and frozen at -80 ℃.
2. Constructing a sequencing library:
1) reverse transcription to synthesize cDNA
a) Reagents were prepared, and 100 ng of total RNA was diluted to 8 μL with nuclease-free water.
b) DNase I enzyme mixture was added to each sample to eliminate gDNA.
c) After mixing well, the samples were placed on the PCR instrument and the following program was run:
Cycles  Temperature  Time
1       42 ℃         5 min
1       4 ℃          1 min
1       4 ℃          Hold
d) The reverse transcription mixture was prepared using the QuantiTect Reverse Transcription Kit (Qiagen) according to the kit instructions:
Reagent                       Volume per reaction (μL)
Nuclease-free water           4
Reverse transcription buffer  4
Reverse transcriptase         2
Total volume                  10
e) The reagents were mixed well, and 10 μL of reverse transcription mixture was added to the gDNA-removed sample.
f) The samples were mixed well and centrifuged briefly.
g) The following reverse transcription program was set up and run:
Cycles  Temperature  Time
1       42 ℃         15 min
1       95 ℃         5 min
1       4 ℃          Hold
h) After the reaction, the next step was carried out immediately.
2) Adding molecular tags
a) Preparing the following molecular label mixed solution:
Figure BDA0002648949250000161
the gene specific molecular tag primer sequences are shown in table 3 below:
TABLE 3 Gene specific molecular tag primers
[Table 3 appears as images in the original publication: Figures BDA0002648949250000162 to BDA0002648949250000231]
Wherein N is any one of A, T, C, G four bases.
b) After reverse transcription was complete, the tube was centrifuged briefly, 4 μL of cDNA was transferred to a new tube, and the remaining cDNA was stored at -20 ℃.
c) The molecular tag mixture was added to each tube containing 4 μL of cDNA.
d) Mix well and centrifuge briefly. The following program was set up and run on a PCR instrument:
Cycles  Temperature  Time
1       95 ℃         15 min
1       55 ℃         15 min
1       65 ℃         15 min
1       72 ℃         7 min
1       4 ℃          Hold
e) After the reaction, the next step was carried out immediately.
3) Purification
a) Take the Agencourt™ AMPure™ XP Kit (Beckman) purification beads out 30 min in advance and let them stand at room temperature.
b) Add 40 μL of water to 10 μL of the reaction product.
c) Add 65 μL (1.3×) of magnetic beads, pipette up and down about 10 times, and mix well.
d) Incubate at room temperature for 5 min.
e) Place on a magnetic stand for about 2 min or until clear.
f) Carefully discard the supernatant, taking care not to aspirate the beads.
g) Add 26 μL of nuclease-free water, pipette to mix, and let stand at room temperature for 2 min.
h) Place on a magnetic stand (0.2 mL PCR Strip Magnetic Separator, Permagen Labware) for 2 min or until clear.
i) Transfer 25 μL of supernatant to a new tube.
j) Add 32.5 μL (1.3×) of magnetic beads, pipette up and down about 10 times, and mix well.
k) Incubate at room temperature for 5 min.
l) Place on a magnetic stand for about 2 min or until clear.
m) Carefully discard the supernatant, taking care not to aspirate the beads.
n) Add 200 μL of freshly prepared 80% ethanol, let stand for about 30 s, and aspirate and discard the supernatant.
o) Repeat step n.
p) Centrifuge the sample tube briefly, place it on a magnetic stand, and discard the residual ethanol.
q) Air dry at room temperature for 5-10 min or until the magnetic beads are dry.
r) Add 12 μL of nuclease-free water, pipette to mix, and let stand at room temperature for 2 min.
s) Place on a magnetic stand for 2 min or until the solution is clear.
t) Transfer 10 μL of supernatant to a new tube.
u) The supernatant can be taken immediately to the next step or stored at -20 ℃.
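The bead volumes in the purification steps above follow from the stated bead-to-sample ratio: 65 μL is 1.3× the 50 μL of diluted product (10 μL product plus 40 μL water), and 32.5 μL is 1.3× the 25 μL of transferred supernatant. The following is a minimal sketch of that arithmetic; the function name `bead_volume_ul` is hypothetical and is not part of the application.

```python
def bead_volume_ul(sample_volume_ul, ratio):
    """Return the magnetic bead volume (in μL) for a bead-to-sample ratio.

    `ratio` is the multiplier written as "1.3X" in the protocol.
    Hypothetical helper for illustration only.
    """
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    # Round to 2 decimals to hide floating-point noise in the product.
    return round(sample_volume_ul * ratio, 2)


# First bead addition: 10 μL reaction product diluted with 40 μL water.
first_round = bead_volume_ul(10 + 40, 1.3)
# Second bead addition: 25 μL of supernatant carried over.
second_round = bead_volume_ul(25, 1.3)
print(first_round, second_round)
```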
4) First round PCR reaction
a) The first-round PCR (PCR1) mixture was prepared according to the following table.
Reagent                                       Volume per reaction (μL)
Nuclease-free water                           2.5
HotStarTaq polymerase reaction buffer (5×)    5
Gene-specific reverse primers (100 nM each)   5
Primer A (10 μM)                              1.5
HotStarTaq DNA polymerase (Qiagen) (6 U/μL)   1
Total volume                                  15
The gene-specific reverse primer sequences are shown in Table 4 below:
TABLE 4 Gene-specific reverse primers
[Table 4 appears as images in the original publication: Figures BDA0002648949250000241 to BDA0002648949250000301]
The primer A sequence is as follows: GATGTACAGTCTACCGATTTG (SEQ ID NO: 3129)
b) To the purified product, 15 μL of PCR1 mixture was added per tube and mixed well.
c) Placed on a PCR instrument, the following reaction program was set up and run:
Figure BDA0002648949250000302
Figure BDA0002648949250000311
d) After the reaction was completed, the next step was carried out immediately.
5) PCR product purification
a) Take the magnetic beads out 30 min in advance and let them stand at room temperature.
b) Add magnetic beads (1.6×) to 25 μL of the PCR1 product, pipette up and down about 10 times, and mix well.
c) Incubate at room temperature for 5 min.
d) Place on a magnetic stand for about 2 min or until clear.
e) Carefully discard the supernatant, taking care not to aspirate the beads.
f) Add 200 μL of freshly prepared 80% ethanol, let stand for about 30 s, and aspirate and discard the supernatant.
g) Repeat step f.
h) Centrifuge the sample tube briefly, place it on a magnetic stand, and discard the residual ethanol.
i) Air dry at room temperature for 5-10 min or until the magnetic beads are dry.
j) Add 27 μL of nuclease-free water, pipette to mix, and let stand at room temperature for 2 min.
k) Place on a magnetic stand for 2 min or until clear.
l) Transfer 25 μL of supernatant to a new tube.
m) The supernatant can be taken immediately to the next step or stored at -20 ℃.
6) Second round linker sequence PCR reaction
a) The PCR2 mixture was prepared as shown in the following table, mixed well, and centrifuged briefly to collect the mixture at the bottom of the PCR tube.
Reagent                                       Volume per reaction (μL)
Nuclease-free water                           8
Primer B (4 μM)                               2.5
Primer C (4 μM)                               2.5
HotStarTaq polymerase reaction buffer (5×)    10
HotStarTaq DNA polymerase (Qiagen) (6 U/μL)   2
Total volume                                  25
The primer B sequence is: CCATCTCATCCCTGCGTGTCTCCGACTCAGGATGTACAGTCTACCGATTTG (SEQ ID NO: 3130)
The primer C sequence is: CCTCTCTATGGGCAGTCGGTGAT (SEQ ID NO: 3131)
b) Add 25 μL of PCR2 mixture to the purified first-round PCR product, mix, and centrifuge briefly; place the PCR tube in the PCR instrument and run the PCR amplification reaction using the following program:
Figure BDA0002648949250000321
c) After the reaction was completed, the next step was immediately carried out.
7) Sequencing library purification
a) Take the magnetic beads out 30 min in advance and let them stand at room temperature.
b) Add magnetic beads (1.1×) to 50 μL of the PCR2 product, pipette up and down about 10 times, and mix well.
c) Incubate at room temperature for 5 min.
d) Place on a magnetic stand for about 2 min or until clear.
e) Carefully discard the supernatant, taking care not to aspirate the beads.
f) Add 200 μL of freshly prepared 80% ethanol, let stand for about 30 s, and aspirate and discard the supernatant.
g) Repeat step f.
h) Centrifuge the sample tube briefly, place it on a magnetic stand, and discard the residual ethanol.
i) Air dry at room temperature for 5-10 min or until the magnetic beads are dry.
j) Add 27 μL of nuclease-free water, pipette to mix, and let stand at room temperature for 2 min.
k) Place on a magnetic stand for 2 min or until clear.
l) Transfer 25 μL of supernatant to a new tube.
m) The supernatant, which is the final sequencing library, was stored at -20 ℃.
8) Sequencing library quantification
The sequencing library was quantified by a qPCR-based method, using the QIAseq Library Quant Assay Kit according to the kit instructions.
3. On-machine sequencing: The library was diluted to 50 pM according to the qPCR quantification result. Pre-sequencing processing was performed using the fully automated pre-sequencing processor (Ion Chef™ Instrument) and Ion 540™ Kit-Chef reagents according to the instrument and kit instructions, and on-machine sequencing was performed using the high-throughput sequencer (Ion S5™ XL System) and Ion 540™ Chip Kit sequencing reagents according to the instrument and kit instructions.
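Diluting a library to 50 pM from its qPCR-measured concentration is a standard C1·V1 = C2·V2 calculation. The sketch below illustrates it under stated assumptions: the function name `dilution_volumes`, the 500 pM stock concentration, and the 20 μL final volume are hypothetical values chosen for illustration and do not come from the application.

```python
def dilution_volumes(stock_pm, target_pm=50.0, final_volume_ul=20.0):
    """Return (library_ul, diluent_ul) needed to reach `target_pm`.

    Uses C1 * V1 = C2 * V2. The 50 pM target is from the protocol;
    the default final volume is an assumed example value.
    """
    if stock_pm < target_pm:
        raise ValueError("stock is already below the target concentration")
    library_ul = target_pm * final_volume_ul / stock_pm
    diluent_ul = final_volume_ul - library_ul
    return library_ul, diluent_ul


# Example: a library quantified at 500 pM (assumed value) diluted to 50 pM.
library_ul, diluent_ul = dilution_volumes(500.0)
print(f"library: {library_ul} uL, diluent: {diluent_ul} uL")
```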
4. Bioinformatic analysis and molecular typing
1) Data normalization
The distributions of sequenced molecule counts differ markedly between fresh tissue and FFPE tissue, so data from samples of different tissue sources must be normalized. Quantile normalization is applied separately to the sequenced molecule counts of the fresh tissue samples and of the FFPE tissue samples. First, the LOG2 value of the raw sequenced molecule count is calculated for each sample (column). To avoid taking the logarithm of zero, a fractional value of 0.25 is added to zero molecule counts. The LOG2 molecule counts of each sample (column) are then sorted. For each rank (for example, rank 1), the arithmetic mean of the LOG2 molecule counts at that rank across all samples is calculated, including cases of tied ranks and missing values. The LOG2 molecule count of each gene (row) is then replaced by the mean for the rank that the gene holds in its column, forming a normalized, corrected molecule count matrix. After normalization, the distribution of values is the same in every column.
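The normalization steps described above can be sketched with NumPy as follows. This is a minimal illustration, not the application's implementation: the function name `quantile_normalize` is hypothetical, ties are broken by row order rather than averaged, and missing values are not handled, although the text says the original method accounts for both.

```python
import numpy as np


def quantile_normalize(counts, zero_offset=0.25):
    """Quantile-normalize a genes x samples matrix of molecule counts.

    Simplifying assumptions (not from the application): ties are broken
    by row order and the input contains no missing values.
    """
    counts = np.asarray(counts, dtype=float)
    # Add the offset only to zero counts so that log2 is defined.
    shifted = np.where(counts == 0, zero_offset, counts)
    log2_counts = np.log2(shifted)
    # Rank of each gene within its own sample (0 = smallest value).
    order = np.argsort(log2_counts, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Arithmetic mean across samples of the values sharing each rank.
    rank_means = np.sort(log2_counts, axis=0).mean(axis=1)
    # Replace every value by the mean that corresponds to its rank.
    return rank_means[ranks]


# Toy matrix: 3 genes (rows) x 2 samples (columns), made-up counts.
raw_counts = [[0, 4],
              [8, 2],
              [2, 16]]
normalized = quantile_normalize(raw_counts)
print(normalized)
```

After the call, every column contains the same set of values, which matches the statement that the distribution of each column is identical after normalization.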
2) Building CMS templates
The CMS typing model is based on characteristic gene sets that are related to gene pathway activity, signal transduction, and cellular biological processes and are up- or down-regulated in each CMS type; these are used to determine whether the characteristic pattern of a given CMS type is present in a test sample. The CMS typing model reported in the literature in 2013 contains 786 genes (Sadanandam, A., et al., A colorectal cancer classification system that associates cellular phenotype and responses to therapy, Nat Med. 2013 May; 19(5): 619-25), and the information for some of these genes has since been updated in the NCBI system. The CMS typing herein uses a 782-gene sequencing panel assembled from the updated gene information and from gene sets subsequently reported in the literature. This 782-gene panel was used to train and establish the typing model. The typing model algorithm is Nearest Template Prediction (NTP): the up- or down-regulated expression of the characteristic gene sets related to gene pathway activity, signal transduction, and cellular biological activity in each CMS type serves as the template, and the cosine distance is calculated between the normalized molecule counts of each sample's characteristic gene set and the characteristic gene set expression template of each CMS type. The cosine distance d is 1 minus the cosine of the angle between two vectors: the normalized molecule counts of the sample's characteristic gene set and the characteristic gene set expression template of a CMS type. Without loss of generality, let vector A denote the up- or down-regulated expression template of the characteristic gene set and vector B denote the normalized molecule counts of the characteristic gene set of the test sample.
Each element A_i (i = 1, 2, ..., n) of vector A is the up- or down-regulated expression of the i-th gene of the template characteristic gene set, and B_i (i = 1, 2, ..., n) is the normalized molecule count of the i-th gene of the characteristic gene set of the test sample.
$$d = 1 - \cos\theta = 1 - \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\,\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$
The cosine distance is used as the feature distance; it is the default feature distance of the nearest template prediction model. The template with the smallest d is the CMS characteristic gene set expression template closest to the sample, i.e. the most likely type.
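As an illustration of the distance and the nearest-template rule, a minimal sketch follows. The function names, the two toy templates, and the coding of up-regulation as +1 and down-regulation as -1 are assumptions made for this example; the text does not specify how the templates are encoded. The sketch also assumes that neither vector is all zeros.

```python
import math

def cosine_distance(a, b):
    """Return d = 1 - cos(theta) for two equal-length vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest_template(sample, templates):
    """Pick the template with the smallest cosine distance to the sample.

    sample: normalized values of the characteristic genes (vector B).
    templates: dict mapping a type name to its template (vector A),
               listed in the same gene order as `sample`.
    """
    distances = {name: cosine_distance(t, sample) for name, t in templates.items()}
    best = min(distances, key=distances.get)
    return best, distances

# Hypothetical templates: +1 = up-regulated, -1 = down-regulated (assumed coding).
templates = {"A": [1, 1, -1, -1], "B": [-1, -1, 1, 1]}
sample = [2.0, 1.0, -1.5, -0.5]  # made-up centered values for four genes
predicted, distances = nearest_template(sample, templates)
```

Here the sample is high in the first two genes and low in the last two, so template "A" has the smaller distance and becomes the predicted type.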
Statistical significance is assessed by a random permutation test that yields a P value for the feature distance. Gene sets are drawn at random (1000 times by default) to generate a random distribution of feature distances; the distance between the test sample and the typing template is compared with this randomly generated distribution, and the result is corrected for the false discovery rate (FDR). A smaller P value indicates greater statistical significance of the shortest cosine feature distance and hence a more reliable predicted CMS type (the usual significance threshold is P < 0.05).
Figure 4 illustrates the Nearest Template Prediction (NTP) method. Marker genes associated with CMS typing are predefined. Without loss of generality, assume that the group A genes (nA) are up-regulated in CMS type A but unchanged or down-regulated in CMS type B, and that the group B genes (nB) are likewise up-regulated in CMS type B but unchanged or down-regulated in CMS type A. The group A and group B genes together constitute the characteristic gene set templates A and B, which represent the gene expression patterns of CMS types A and B. The normalized values of the nA + nB characteristic genes are extracted from the N genes of the test sample and compared with templates A and B, and the cosine feature distance to each template is calculated. The type of the nearest template becomes the predicted CMS type. To calculate the statistical significance of the feature distance, nA + nB genes are drawn at random from the N genes 1000 times, producing a null distribution of the feature distance d. The corrected P value is calculated by comparing the feature distance of the test sample with this null distribution. Red and blue in the heatmap represent up- and down-regulation of gene expression, respectively.
All samples are assigned to one of the following five CMS types: CMS1-inflammatory, CMS2-enterocyte, CMS2-transit-amplifying, CMS3-goblet-like, and CMS4-stem-like.
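The permutation test can be sketched as below. Several details are assumptions, since the text does not fix them: the function names are hypothetical, the empirical P value uses the common (hits + 1) / (n_perm + 1) convention, and the FDR correction is shown as the Benjamini-Hochberg procedure, which is one widely used choice and is not named in the text.

```python
import math
import random

def cosine_distance(a, b):
    """Return d = 1 - cos(theta) for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def permutation_p_value(profile, feature_idx, template, n_perm=1000, seed=0):
    """Empirical P value of the sample-to-template distance.

    profile: normalized values of all N genes of one sample.
    feature_idx: positions of the nA + nB characteristic genes in `profile`.
    template: template values, in the same order as `feature_idx`.
    """
    rng = random.Random(seed)
    observed = cosine_distance(template, [profile[i] for i in feature_idx])
    k = len(feature_idx)
    hits = 0
    for _ in range(n_perm):
        # Draw k genes at random from the N genes to build the null distribution.
        drawn = rng.sample(range(len(profile)), k)
        if cosine_distance(template, [profile[i] for i in drawn]) <= observed:
            hits += 1
    # Assumed convention: add 1 so that the P value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (assumed FDR method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, `permutation_p_value` would be run for each sample against its nearest template, and the resulting P values would be passed together to `benjamini_hochberg` before applying the P < 0.05 threshold.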
Example 2
Retrospective examination of 187 clinical specimens
The detection method of the present invention was used to retrospectively test the 187 intestinal cancer samples described in Example 1, and intestinal cancer typing was performed using the intestinal cancer molecular typing gene set.
The results are shown in Figs. 1-2. CMS1 accounted for 12% and is characterized by strong immune system activation; the CMS2-enterocyte type accounted for 18% and is mainly characterized by epithelial WNT and MYC signaling pathway activation, with the APC gene as a possible driver gene; the CMS2-transit-amplifying type accounted for 31%; CMS3 accounted for 11% and mainly shows metabolic abnormalities, with KRAS as a possible driver gene; CMS4 accounted for 25% and shows marked transforming growth factor beta activation, stromal infiltration, and angiogenesis; only about 3% of the samples could not be assigned a CMS molecular type.
Fig. 3 shows Kaplan-Meier survival curves for the retrospective cohort study based on the typing results of Figs. 1-2. The retrospective cohort study shows that CMS typing can accurately type the Chinese CRC population, reduces the number of samples that cannot be typed, and is valuable for assessing patient relapse-free survival; relapse-free survival is poor for the CMS4 stem-like and CMS2 enterocyte types.
While the invention has been described with respect to a preferred embodiment, it will be understood by those skilled in the art that the foregoing and other changes, omissions and deviations in form and detail may be made without departing from the scope of the invention. Those skilled in the art can make various changes, modifications and equivalent arrangements without departing from the spirit and scope of the present invention; likewise, any changes, modifications and variations of the above-described embodiments that are equivalent in technical spirit fall within the scope of the technical solution of the present invention.

Claims (10)

1. A molecular typing system for intestinal cancer, comprising:
(1) a sequencing library construction module for constructing a sequencing library from the sample RNA;
(2) a quantitative sequencing module for quantifying and sequencing the sequencing library;
(3) a normalization module for normalizing the quantification and sequencing results; and
(4) a CMS typing module for performing molecular typing on the normalized results.
2. The molecular typing system according to claim 1, wherein the construction of the sequencing library from the sample RNA comprises reverse transcription of the sample RNA into cDNA and multiplex PCR-targeted amplification of the cDNA.
3. The molecular typing system according to claim 2, wherein the target genes targeted for the multiplex PCR amplification include one or more genes selected from the group consisting of:
[Target gene table: Figures FDA0002648949240000011, FDA0002648949240000021, FDA0002648949240000031]
4. The molecular typing system according to claim 3, wherein the primers for the multiplex PCR targeted amplification comprise primers for the target genes of the multiplex PCR targeted amplification.
5. The molecular typing system according to claim 1, wherein the quantification comprises qPCR.
6. The molecular typing system of claim 1, wherein the sequencing comprises high throughput sequencing.
7. The molecular typing system according to claim 1, wherein the normalization comprises quantile normalization.
8. The molecular typing system according to claim 1, wherein the CMS typing module comprises a classification model that calculates, based on the normalized results, cosine distances to the CMS typing characteristic gene set expression templates, and obtains molecular typing results based on the cosine distances.
9. The molecular typing system according to claim 1, wherein the results of the molecular typing include: CMS1-inflammatory, CMS2-enterocyte, CMS2-transit-amplifying, CMS3-goblet-like, and CMS4-stem-like.
10. The molecular typing system according to claim 1, wherein the sample RNA is from a Chinese population.
CN202010866899.0A 2020-08-25 2020-08-25 Molecular typing system for intestinal cancer Pending CN111793695A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010866899.0A CN111793695A (en) 2020-08-25 2020-08-25 Molecular typing system for intestinal cancer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010866899.0A CN111793695A (en) 2020-08-25 2020-08-25 Molecular typing system for intestinal cancer

Publications (1)

Publication Number Publication Date
CN111793695A true CN111793695A (en) 2020-10-20

Family

ID=72834119

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010866899.0A Pending CN111793695A (en) 2020-08-25 2020-08-25 Molecular typing system for intestinal cancer

Country Status (1)

Country Link
CN (1) CN111793695A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104582477A (en) * 2012-06-21 2015-04-29 社会福祉法人三星生命公益财团 Method for preparing patient-specific glioblastoma animal model, and use thereof
CN104781423A (en) * 2012-09-18 2015-07-15 新加坡科技研究局 Grouping for classifying gastric cancer
US20150354009A1 (en) * 2012-11-26 2015-12-10 Ecole Polytechnique Federale De Lausanne (Epfl) Colorectal cancer classification with differential prognosis and personalized therapeutic responses

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YUJIN HOSHIDA ET AL.: "Nearest Template Prediction: A Single-Sample-Based", 《PLOS ONE》 *
闫兆鹏 等: "结直肠癌分子亚型共识简介及各亚型特征分析", 《医学综述》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201020)