Disclosure of Invention
The invention provides a novel method for preparing an RNA library. The method constructs an RNA/DNA hybrid library by quantitatively adding a linker, is simple and rapid to perform, and yields a library that can be used directly for sequencing. Because it involves no second-strand cDNA synthesis, that is, no cDNA amplification and enrichment, it avoids the bias and errors introduced by PCR amplification and enrichment. In addition, the RNA sequencing method provided by the invention sequences the first-strand cDNA directly. Because the sequencing data are free of interference from second-strand cDNA reads, the method improves the genome coverage of the sequencing data, improves their effective utilization (low duplication rate), and reduces the data-analysis burden.
In one aspect, the present invention provides a method for preparing an RNA library, the method comprising the steps of:
performing reverse transcription with RNA as a template to generate first-strand cDNA, thereby obtaining an RNA/cDNA hybrid;
adding a first predetermined sequence to at least one end of the RNA/cDNA hybrid to form an RNA/DNA hybrid, wherein the RNA/DNA hybrid is the RNA library and the first predetermined sequence is located at at least one end of the cDNA strand. A predetermined sequence is a sequence whose base sequence is known.
In one example of the present invention, in the above method, the reverse transcription is performed in a first reaction system comprising a reverse transcriptase and a primer, and the reverse transcriptase has no RNA-degrading activity. Using a reverse transcriptase that has reverse transcription activity but no RNA-degrading activity ensures that an RNA/cDNA hybrid is formed after reverse transcription and is available for the subsequent reactions.
In one example of the present invention, in the above method, the reverse transcriptase in the first reaction system is at least one selected from the M-MuLV enzyme and engineered enzymes thereof. The M-MuLV enzyme is the wild-type M-MuLV enzyme; an engineered M-MuLV enzyme is an M-MuLV mutant that retains M-MuLV reverse transcription activity and adds C bases to the end of the cDNA reverse transcription product. When M-MuLV reverse transcribes an RNA template, it adds three extra C bases at the 3' end of the cDNA. The enzyme used for reverse transcription is not limited to the M-MuLV enzyme and its engineered enzymes: any reverse transcriptase that can add a tail to the 3' end of the cDNA can serve the purpose of adding a linker to a single end of the RNA/cDNA hybrid. Tailing means that the 3' end of the synthesized strand extends beyond the template as an unpaired single strand.
In one example of the present invention, the primer used for reverse transcription in the above method may be a random primer. A random primer is not selective for the template RNA, so reverse transcription with random primers ensures, as far as possible, that all of the template RNA is reverse transcribed. It is understood that different primers may be used depending on the purpose of the experiment; for mRNA, for example, either polyT primers or random primers may be used. Whether a polyT primer or a random primer is used, a second predetermined sequence may optionally be added to the 5' end of the primer. The second predetermined sequence can be used for bidirectional sequencing of the library and for adding a linker complementary to that sequence.
In one example of the present invention, the nucleotide at the 5' end of the primer is a blocked nucleotide, meaning that no nucleotide sequence can be added to the 5' end of the primer by a ligation reaction. The blocking may consist of a blocked phosphate group on the nucleotide, or of a terminal modification such as biotin, spacer C18 (spacer 18), spacer C9, spacer C3 or NH2-C12, which increases steric hindrance and prevents ligation between nucleotides. Blocking the 5' end of the primer prevents the first predetermined sequence from being ligated to the 5' end of the primer and thus ensures that the first predetermined sequence is added directionally, that is, to the 3' end of the cDNA.
In one example of the present invention, in the above method, the reaction that adds the first predetermined sequence and the reverse transcription are performed simultaneously in the first reaction system. Carrying out the reverse transcription of RNA and the addition of the first predetermined sequence in a single reaction system eliminates the purification step after reverse transcription.
It should be noted that a purification step may optionally be included between the reverse transcription of RNA and the reaction that adds the first predetermined sequence; whether or not this purification step is included does not affect linker addition. In one example, the reverse transcription and the addition of the first predetermined sequence are carried out by mixing the reagents required for both steps (for example, the ligase required for linker ligation and the reverse transcriptase required for reverse transcription) into a single reaction system. Those skilled in the art will appreciate that different enzymes may have different optimal buffers and that, because the two enzymes may interact, it is not certain in advance that both will function properly when mixed. Performing the reverse transcription and the addition of the first predetermined sequence simultaneously is a simplified procedure arrived at after several attempts. The linker was successfully added to an RNA library constructed in this way, as shown by the detection result for Th-2 in FIG. 3. Performing the two reactions simultaneously reduces the number of operating steps and simplifies the experiment without affecting the result.
In the above method, "the first predetermined sequence is located at at least one end of the cDNA strand" covers three cases. In the first case, the first predetermined sequence is located at the 3' end of the cDNA strand, and only the 3' end of the cDNA strand in the constructed RNA library contains the first predetermined sequence. In the second case, the first predetermined sequence is located at the 5' end of the cDNA strand, and only the 5' end of the cDNA strand in the constructed RNA library contains the first predetermined sequence. In the third case, the first predetermined sequence is located at both the 5' end and the 3' end of the cDNA strand, and both ends of the cDNA strand in the constructed RNA library contain the first predetermined sequence.
In one example of the present invention, a first predetermined sequence can be added to at least one end of the cDNA strand by either of two methods: a ligation reaction or a polymerization reaction. In the first method, the first predetermined sequence is added by a ligation reaction, and the first reaction system includes a ligase. Further, the first reaction system also comprises a linker, which is a double-stranded DNA consisting of a first strand and a second strand; the linker is ligated to the RNA/cDNA hybrid to obtain an RNA/DNA hybrid having the first predetermined sequence at at least one end of the cDNA strand. When the 5' end of the second strand is ligated to the 3' end of the cDNA, the second strand is preferably longer than the first strand, as in the sequences shown in SEQ ID NO 1 and SEQ ID NO 2. When a probe is then used to capture the cDNA strand, a probe capture temperature can be chosen, for example one higher than the annealing temperature of the linker, that reduces the interference of the first strand with capture of the second strand by the probe. In another preferred embodiment, the 5' end of the second strand is ligated to the 3' end of the cDNA, and the 3' ends of the first strand and the second strand either carry no hydroxyl group or carry a blocked hydroxyl group. As a result, the first strand cannot be ligated to the 5' end of the cDNA and the second strand cannot be ligated in reverse orientation to the 5' end of the cDNA, which ensures that the first predetermined sequence is added only to the 3' end of the cDNA.
The second method is as follows: the first predetermined sequence is added to at least one end of the cDNA strand by polymerization. In this case, the enzyme used for polymerization is the same as the enzyme used for reverse transcription. Because the same enzyme is used, the first-strand cDNA generated by reverse transcription does not need to be purified; the single-stranded linker is simply added and the extension reaction proceeds, which reduces the consumption of enzyme and purification reagents, simplifies the procedure and lowers the cost. Further, when the 3' end of the cDNA strand in the RNA/cDNA hybrid is a single-stranded end, the first reaction system comprises a single-stranded DNA linker that is complementary to the 3' end of the cDNA strand and longer than the complementary single-stranded end. Further, the linker is 10 bp to 60 bp longer than the single-stranded 3' end of the cDNA strand, preferably 20 bp to 50 bp longer, for example as shown in SEQ ID NO 3. The 3' end of the single-stranded DNA linker pairs with the 3' end of the first-strand cDNA, the first-strand cDNA is extended along the linker, and a known sequence is thereby added to the 3' end of the first-strand cDNA in the extension product.
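The polymerization route can be illustrated in silico. The following is a minimal sketch, not the method's actual implementation: the cDNA fragment is a made-up sequence chosen only so that it ends in the three C bases added by the reverse transcriptase, the helper names `reverse_complement` and `extend_on_linker` are hypothetical, and the linker is the D9-TSO sequence (SEQ ID NO 3) given in the examples.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_on_linker(cdna: str, linker: str, tail: str = "CCC") -> str:
    """Extend the cDNA 3' end using a single-stranded linker as template.

    The 3' end of the linker must pair with the 3' tail of the cDNA; the
    rest of the linker is then copied onto the 3' end of the cDNA.
    """
    if not cdna.endswith(tail):
        raise ValueError("cDNA has no 3' tail to pair with the linker")
    if not linker.endswith(reverse_complement(tail)):
        raise ValueError("linker 3' end is not complementary to the cDNA tail")
    template = linker[: len(linker) - len(tail)]  # unpaired part of the linker
    return cdna + reverse_complement(template)


# SEQ ID NO 3 (D9-TSO linker) as listed in the examples.
D9_TSO = "GGTCCTTGATACCTGCGACCATCCAGTTCCACTCAGATGTGTATAAGAGACAGTGGG"
# Hypothetical cDNA fragment; only the CCC tail matters for this sketch.
cdna = "ATGGCTAGCTAGGATCCGATCGATTACG" + "CCC"

extended = extend_on_linker(cdna, D9_TSO)
added = extended[len(cdna):]
print(len(added))  # 54 bases of known sequence added at the cDNA 3' end
```

Under these assumptions the linker is 54 bases longer than the three-base single-stranded end, which falls within the 10 bp to 60 bp range stated above.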
The RNA library preparation methods described above are simple and quick to perform because the reverse transcription reaction and the linker addition reaction are carried out simultaneously. Using the same enzyme for the reverse transcription reaction and the extension reaction reduces enzyme consumption and saves cost. The constructed library is an RNA/DNA hybrid that can be used directly for sequencing and, because the method involves no second-strand cDNA synthesis, that is, no cDNA amplification and enrichment, the bias and errors introduced by PCR amplification and enrichment are avoided. Linkers can be added directionally, for example by adding a known sequence only to the 3' end of the first-strand cDNA; when such a library is sequenced, the sequencing data are directional, which facilitates their analysis.
In a second aspect, the present invention provides an RNA library prepared by any of the methods described above. The RNA library can be used directly for sequencing.
In a third aspect, the invention provides an RNA sequencing method comprising the following steps:
capturing the DNA in the RNA library with a probe to form a probe/DNA hybrid;
sequencing the probe/DNA hybrid to obtain the DNA sequence information;
determining at least a portion of the RNA sequence information from the DNA sequence information;
wherein the probe is complementary to the first predetermined sequence.
It will be appreciated that the known sequence pairs with the probe for the purpose of probe capture, so complementary pairing does not require every base of the known sequence to pair with the probe. In the examples, the two bases at the 3' end of the known sequence do not pair with the probe, and the probe also contains several bases that have no partner in the known sequence. When the base at the 3' end of the known sequence does not pair with the probe, the captured first-strand cDNA is not extended using the probe as a template during sequencing, which reduces interference with the sequencing signal.
In one example of the sequencing method described above, sequencing is performed on a solid-phase substrate carrying the probes. Sequencing is carried out directly after the probe captures the DNA, with the probe serving as the sequencing primer; the first predetermined sequence is located at the 3' end of the cDNA and pairs with the probe.
In one example, the above method comprises a step of denaturing the RNA library; the probe captures the DNA after denaturation has produced single-stranded DNA. The RNA library contains RNA and cDNA and is double-stranded. Denaturation converts the double strand into single strands, so that the probe can hybridize with the known sequence at the 3' end of the first-strand cDNA to form a hybrid complex.
In one example, the captured DNA is sequenced bidirectionally by adding a sequencing primer whose sequence is identical to the second predetermined sequence.
The RNA sequencing methods described above enable direct sequencing of the first-strand cDNA, and the sequencing data are directional. As the sequencing results in the examples show, the method improves the genome coverage of the sequencing data; in addition, because the data are free of interference from second-strand cDNA reads, the effective utilization of the sequencing data is improved (low duplication rate) and the data-analysis burden is reduced.
In a fourth aspect, the invention provides an RNA sequencing kit for carrying out the above RNA library preparation method or the above RNA sequencing method. The kit comprises the probe, a reverse transcriptase and a linker, wherein the linker is single-stranded DNA or double-stranded DNA and the probe is complementary to the first predetermined sequence.
Further, the kit may also comprise a ligase for ligating the double-stranded DNA linker to the RNA/cDNA hybrid.
In one example, the reverse transcriptase in the kit is the M-MuLV enzyme or an engineered enzyme thereof, which is used to reverse transcribe RNA into first-strand cDNA.
In one example, the linker is a double-stranded DNA, such as the sequences shown in Seq ID No.1 and Seq ID No.2.
In one example, the linker is a single-stranded DNA whose sequence is shown in Seq ID No.3.
In one example, the probe has the sequence shown in Seq ID No.4.
The RNA sequencing kit can be used both to construct the RNA library and to sequence it.
Detailed Description
Unless otherwise defined, the terms used herein have the ordinary meanings as commonly understood in the art to which this invention belongs.
The present invention will now be described with reference to specific examples and figures, which are provided for illustrative purposes only and are not to be construed as limiting the invention. Where specific techniques or conditions are not indicated in the examples, routine laboratory practice or the manufacturer's instructions were followed. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.
In the examples, all RNA was obtained from the same DH5α strain (Bio-Rad: B528411-0001). Total RNA of the DH5α strain was extracted using the reagents and protocol of the Tiangen bacterial total RNA kit (DP430).
The specific sequences of the double-stranded linker, single-stranded linker and probe used in the examples are as follows:
The D9TS-ad linker is a double-stranded nucleic acid formed from two single-stranded nucleic acids (a first strand and a second strand):
First strand: 5'-AGATGTGTATAAGAGACAGTGGG-3' (Seq ID No.1)
Second strand: 5'-ACTGTCTCTTATACACATCTGAGTGGAACTGGATGGTCGCAGGTATCAAGGATT-3' (Seq ID No.2)
D9-TSO linker: single-stranded nucleic acid
5’-GGTCCTTGATACCTGCGACCATCCAGTTCCACTCAGATGTGTATAAGAGACAGTGGG-3’(Seq ID NO.3)
Probe:
5’-TTTTTTTTTTTCCTTGATACCTGCGACCATCCAGTTCCACTCAGATGTGTATAAGAGACAG-3’(Seq ID NO.4)
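The relationship between these sequences can be checked with a short script. The following is a sketch for illustration only: it uses Python's standard `difflib` module to find the longest stretch over which two strands can pair, and the helper names are hypothetical. It shows that the first strand pairs with the 5' portion of the second strand and leaves a three-base 3' overhang, and that the probe pairs with all of the second strand except a few terminal bases.

```python
from difflib import SequenceMatcher

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_match(a: str, b: str):
    """Longest common substring of a and b as a Match(a, b, size) tuple."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b))


FIRST = "AGATGTGTATAAGAGACAGTGGG"  # Seq ID No.1
SECOND = "ACTGTCTCTTATACACATCTGAGTGGAACTGGATGGTCGCAGGTATCAAGGATT"  # Seq ID No.2
PROBE = "TTTTTTTTTTTCCTTGATACCTGCGACCATCCAGTTCCACTCAGATGTGTATAAGAGACAG"  # Seq ID No.4

# Duplex formed by the two linker strands: compare the reverse complement
# of the first strand with the second strand.
duplex = longest_match(reverse_complement(FIRST), SECOND)
overhang = FIRST[len(FIRST) - duplex.a:]  # unpaired 3' bases of the first strand
print(duplex.size, overhang)  # 20 GGG

# Probe capture: compare the probe with the reverse complement of the
# second strand, which is the strand ligated to the cDNA 3' end.
capture = longest_match(PROBE, reverse_complement(SECOND))
unpaired_3prime = capture.b  # bases at the 3' end of the second strand left unpaired
print(capture.size, unpaired_3prime)  # 51 2
```

The three-base GGG overhang is what pairs with the C bases added to the cDNA by the reverse transcriptase, and the two unpaired bases at the 3' end of the second strand match the statement that the 3' end of the known sequence need not pair with the probe.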
Example 1
1. High-temperature fragmentation of RNA
100 ng of RNA was fragmented into RNA fragments of 100-300 bp by the following steps:
1) Thaw the 5× FS buffer (Thermo Fisher, Cat. No. 18064014), mix by inversion, and prepare the reaction system according to Table 1:
TABLE 1
| Component | Control A | Experiment B |
| --- | --- | --- |
| RNA (50 ng/μL) | 2 μL | 2 μL |
| 5× FS buffer | 4 μL | 4 μL |
| ddH2O | 2 μL | 2 μL |
| Total | 8 μL | 8 μL |
2) Carry out the high-temperature fragmentation reaction in a PCR instrument: hot lid at 105 °C, 94 °C for 7 min, then cool immediately on ice;
3) immediately after cooling, proceed to the reverse transcription reaction.
2. First-strand cDNA synthesis
1) In the reaction tube containing the fragmented RNA, prepare the reaction system according to Table 2:
TABLE 2
2) Reaction conditions: 25 °C for 10 min, then 42 °C for 15 min. The RNA/cDNA hybrid fragments are obtained when the reaction is complete.
3. Linker addition
2 μL of D9TS-ad linker (11 μM) and 22 μL of Blunt/TA Ligase Master Mix (NEB, Cat. No. M0367) were added to the first-strand cDNA synthesis tube of experiment B and incubated at room temperature for 15 min; the ligation product is library B.
4. Purification
The first-strand cDNA synthesis product of control A was purified once with 1.8× Ampure XP beads and eluted to give 10 μL of supernatant containing the RNA/cDNA hybrid fragment A, which was used as a fragment control for the experimental library B;
the linker-added product of experiment B was purified once with 0.8× Ampure XP beads and eluted to give 10 μL of supernatant 1, and supernatant 1 was then purified once with 1.2× Ampure XP beads and eluted to give 10 μL of supernatant 2. Supernatant 2 was analyzed by the LabChip method. As shown in FIG. 1, library B (experimental library B) was about 70 bp longer than hybrid fragment A (control fragment A), indicating that a single-ended linker was successfully added to library B.
5. Hybridization capture
The probe (SEQ ID NO: 4) was immobilized on a chip by the method disclosed in the specification of published patent application CN201510501968.7. The prepared libraries 1 and 2 were diluted with 3× SSC hybridization solution and hybridized with the probes immobilized on the chip. The amount of linker sequence hybridized to the probe was then determined from the Cy3 signal.
The procedure for hybridizing the library on the chip was as follows:
(1) Chip selection: the chip substrate was an epoxy-modified glass chip from SCHOTT. The probe shown in Seq ID No.4 was immobilized by reacting an amino group on the probe with the epoxy groups on the chip surface, for example by the method disclosed in published patent application No. CN201811191589.2. The probe density was about 18,000 dots/FOV, that is, about 18,000 bright spots in a 110 × 110 μm field of view.
(2) Preparation of the hybridization solution: as shown in Table 3, the hybridization solution was prepared from 20× SSC buffer (Sigma, #S6639-1L) to a final concentration of 3× SSC, with the library at a final concentration of 1 nM, in a total volume of 40 μL. The prepared hybridization solution was denatured at 95 °C for 2 min and cooled rapidly on ice.
TABLE 3
| Component | Amount |
| --- | --- |
| 20× SSC buffer | 6 μL |
| Library 1 or library 2 | Final concentration of 1 nM |
| Nuclease-free water | Make up to 40 μL |
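The volumes in Table 3 follow from the dilution relation C1 × V1 = C2 × V2. The following is a minimal sketch of that arithmetic; the library stock concentration of 10 nM is a hypothetical value chosen only for illustration, because the stock concentration is not stated here, and the function name is likewise hypothetical.

```python
def stock_volume_ul(final_conc: float, final_volume_ul: float, stock_conc: float) -> float:
    """Volume of stock needed so that C1 * V1 = C2 * V2 (same units for C1 and C2)."""
    if stock_conc <= 0 or stock_conc < final_conc:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_conc * final_volume_ul / stock_conc


TOTAL_UL = 40.0          # total volume of hybridization solution (Table 3)
LIBRARY_STOCK_NM = 10.0  # hypothetical library stock concentration (assumption)

ssc_ul = stock_volume_ul(3.0, TOTAL_UL, 20.0)                  # 20x SSC -> 3x SSC
library_ul = stock_volume_ul(1.0, TOTAL_UL, LIBRARY_STOCK_NM)  # library -> 1 nM
water_ul = TOTAL_UL - ssc_ul - library_ul                      # make up to 40 uL

print(ssc_ul, library_ul, water_ul)
```

With these inputs the 20× SSC volume comes out at 6 μL, in agreement with Table 3; the library and water volumes change with the actual stock concentration.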
(3) The denatured hybridization solution was loaded quickly onto the chip, and the chip was kept at 55 °C for 30 min to hybridize the library with the probes on the chip surface.
(4) The chip was washed sequentially with 3× SSC, 1× SSC and 0.1× SSC.
The hybridization-captured library was sequenced on the GenoCare third-generation sequencing platform; the sequencing results are shown in Table 15.
Example 2
1. High-temperature fragmentation of RNA
100 ng of RNA was fragmented into fragments of 100-300 bp by the following steps:
1) Thaw the reagents, mix by inversion, and prepare the reaction system according to Table 4:
TABLE 4
| Component | T1 (control) | T2 (experiment) |
| --- | --- | --- |
| RNA (50 ng/μL) | 2 μL | 2 μL |
| Frag/1st strand buffer (Tiangen NG308) | 5 μL | 5 μL |
| ddH2O | 3 μL | 3 μL |
| Total | 10 μL | 10 μL |
2) Carry out the high-temperature fragmentation reaction in a PCR instrument (conditions for the Tiangen buffer): hot lid at 105 °C, 94 °C for 10 min, then cool immediately on ice;
3) immediately after cooling, proceed to the reverse transcription reaction.
2. First-strand cDNA synthesis
1) In the reaction tube containing the fragmented RNA, prepare the reaction system according to Table 5:
TABLE 5
2) Reaction conditions: 25 °C for 10 min, then 42 °C for 15 min. The RNA/cDNA hybrid fragments are obtained when the reaction is complete.
3. Linker addition
2 μL of D9TS-ad linker (11 μM) and 22 μL of Blunt/TA Ligase Master Mix (NEB, Cat. No. M0367) were added to the reverse transcription product of T2 and incubated at room temperature for 15 min; the ligation product is library T2.
4. Purification
The first-strand cDNA synthesis product of control T1 was purified once with 1.8× (36 μL) Ampure XP beads and eluted to give 10 μL of supernatant containing the RNA/cDNA hybrid fragment T1, which was used as a fragment control for the experimental library T2;
the linker-added product of experiment T2 was purified once with 0.8× Ampure XP beads and eluted to give 10 μL of supernatant 1, and supernatant 1 was then purified once with 1.2× Ampure XP beads and eluted to give 10 μL of supernatant 2. Supernatant 2 was analyzed by the LabChip method. As shown in FIG. 2, library T2 (T2 experiment) was about 70 bp longer than hybrid fragment T1 (T1 control), indicating that a single-ended linker was successfully added to library T2.
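Bead ratios such as 1.8×, 0.8× and 1.2× are expressed relative to the sample volume, so the bead volume is the ratio multiplied by the sample volume. The sketch below illustrates this under stated assumptions: the 20 μL first-strand reaction volume is inferred from "1.8× (36 μL)" and is not stated directly in the text, and the 44 μL ligation volume assumes that the 2 μL of linker and 22 μL of ligase master mix were added to that same 20 μL reaction. The function name is hypothetical.

```python
def bead_volume_ul(ratio: float, sample_volume_ul: float) -> float:
    """Bead suspension volume for a given bead-to-sample volume ratio."""
    if ratio <= 0 or sample_volume_ul <= 0:
        raise ValueError("ratio and sample volume must be positive")
    return round(ratio * sample_volume_ul, 2)


REACTION_UL = 20.0  # assumption: inferred from "1.8x (36 uL)"

# Control: single 1.8x clean-up of the first-strand reaction.
control_beads = bead_volume_ul(1.8, REACTION_UL)

# Experiment: 0.8x clean-up of the ligation mix, then 1.2x clean-up of
# the 10 uL eluate (supernatant 1).
ligation_ul = REACTION_UL + 2.0 + 22.0  # assumption: linker + master mix added
first_cleanup_beads = bead_volume_ul(0.8, ligation_ul)
second_cleanup_beads = bead_volume_ul(1.2, 10.0)

print(control_beads, first_cleanup_beads, second_cleanup_beads)
```

Only the 36 μL figure is confirmed by the text; the other two volumes depend on the assumed reaction volume and should be recalculated from the actual volumes used.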
5. Hybridization capture
The hybridization capture step is the same as the "hybridization capture" protocol of step 5 of Example 1.
The hybridization-captured library was sequenced on the GenoCare third-generation sequencing platform; the sequencing results are shown in Table 15.
Example 3
1. High-temperature fragmentation of RNA
100 ng of RNA was fragmented into fragments of 100-300 bp by the following steps:
1) Thaw the 5× FS buffer (Thermo Fisher, Cat. No. 18064014), mix by inversion, and prepare the reaction system according to Table 6:
TABLE 6
2) Carry out the high-temperature fragmentation reaction in a PCR instrument: hot lid at 105 °C, 94 °C for 7 min, then cool immediately on ice;
3) immediately after cooling, proceed to the reverse transcription reaction.
2. First-strand cDNA synthesis and ligation reaction
1) In the reaction tube containing the fragmented RNA, prepare the reaction system according to Table 7:
TABLE 7
2) Reaction conditions: 25 °C for 10 min, 42 °C for 15 min, then room temperature for 15 min. When the reaction is complete, the RNA/cDNA hybrid fragment Th-1 (control) and the linker-added library Th-2 are obtained, respectively.
3. Purification
Control Th-1 was purified once with 1.8× (36 μL) Ampure XP beads and eluted to give 10 μL of supernatant containing the RNA/cDNA hybrid fragment Th-1, which was used as a fragment control for the experimental library Th-2;
experimental Th-2 was purified once with 0.8× Ampure XP beads and eluted to give 10 μL of supernatant 1, and supernatant 1 was then purified once with 1.2× Ampure XP beads and eluted to give 10 μL of supernatant 2. Supernatant 2 was analyzed by the LabChip method. As shown in FIG. 3, library Th-2 (experimental library Th-2) was about 70 bp longer than hybrid fragment Th-1 (control fragment Th-1), indicating that a single-ended linker was successfully added to library Th-2.
4. Hybridization capture
The hybridization capture step is the same as the "hybridization capture" protocol of step 5 of Example 1.
The hybridization-captured library was sequenced on the GenoCare third-generation sequencing platform; the sequencing results are shown in Table 15.
Example 4
1. High-temperature fragmentation of RNA
100 ng of RNA was fragmented into RNA fragments of 100-300 bp by the following steps:
1) Thaw the 5× FS buffer (Thermo Fisher, Cat. No. 18064014), mix by inversion, and prepare the reaction system according to Table 8:
TABLE 8
2) Carry out the high-temperature fragmentation reaction in a PCR instrument: hot lid at 105 °C, 94 °C for 7 min, then cool immediately on ice;
3) immediately after cooling, proceed to the reverse transcription reaction.
2. First-strand cDNA synthesis
1) In the reaction tube containing the fragmented RNA, prepare the reaction system according to Table 9:
TABLE 9
2) Reaction conditions: 25 °C for 10 min, then 42 °C for 15 min. The RNA/cDNA hybrid fragments are obtained when the reaction is complete.
3. Linker addition
2 μL of D9-TSO linker (10 μM) was added to the first-strand cDNA synthesis tube of experiment Y2 and incubated at 42 °C for 5 min; the linker-added product is library Y2.
4. Purification
The first-strand cDNA synthesis product of control Y1 was purified once with 1.8× Ampure XP beads and eluted to give 10 μL of supernatant containing the RNA/cDNA hybrid fragment Y1, which was used as a fragment control for the experimental library Y2;
the linker-added product of experiment Y2 was purified once with 0.8× Ampure XP beads and eluted to give 10 μL of supernatant 1, and supernatant 1 was then purified once with 1.2× Ampure XP beads and eluted to give 10 μL of supernatant 2. Supernatant 2 was analyzed by the LabChip method. As shown in FIG. 4, library Y2 (experimental library Y2) was about 70 bp longer than hybrid fragment Y1 (control fragment Y1), indicating that a single-ended linker was successfully added to library Y2.
5. Hybridization capture
The hybridization capture step is the same as the "hybridization capture" protocol of step 5 of Example 1.
The hybridization-captured library was sequenced on the GenoCare third-generation sequencing platform; the sequencing results are shown in Table 15.
Example 5
This example is an RNA library construction and sequencing procedure used in the prior art. Three experiments were performed in parallel, with samples designated C1, C2 and C3; 100 ng of RNA was taken from each sample for the subsequent steps. The Tiangen fast RNA library construction kit (Cat. No. NR102) was used for library construction.
1. High-temperature fragmentation of RNA
100 ng of RNA was fragmented into fragments of 100-300 bp by the following steps:
1) Thaw the Frag/1st strand buffer in the kit, mix by inversion, and prepare the reaction system according to Table 10:
TABLE 10
2) Carry out the high-temperature fragmentation reaction in a PCR instrument: hot lid at 105 °C, 94 °C for 7 min, then cool immediately on ice;
3) immediately after cooling, proceed to the reverse transcription reaction.
2. First-strand cDNA synthesis
1) In the reaction tube containing the fragmented RNA, prepare the reaction system according to Table 11:
TABLE 11
2) Reaction conditions: 25 °C for 10 min, then 42 °C for 15 min. The RNA/cDNA hybrid fragments are obtained when the reaction is complete, and second-strand cDNA synthesis is started immediately.
3. Second-strand cDNA synthesis
The second-strand cDNA synthesis reaction was prepared in the same reaction tube according to Table 12:
TABLE 12
| Component name | Volume (μL) |
| --- | --- |
| RNA fragments | 20 |
| 2nd Strand Buffer | 8.5 |
| 2nd Strand Enzyme Mix | 3.5 |
| Nuclease-Free ddH2O | 48 |
| Total | 80 |
Reaction conditions: 16 °C for 60 min, then hold at 4 °C. Purification was started when the reaction was complete.
4. Purification
The second-strand synthesis product was purified once with 1.8× Ampure XP beads and eluted to give 35 μL of supernatant, which is the double-stranded cDNA sample.
5. End repair/dA addition
The end repair/dA addition reaction system was prepared as shown in Table 13:
watch 13
Component name
|
Volume (μ l)
|
cDNA sample
|
35
|
10×ERA buffer
|
5
|
5×ERA Enzyme Mix
|
10
|
Total
|
50 |
Reaction conditions: the reaction was carried out in a PCR instrument pre-cooled to 4 °C, with the hot lid set to 70 °C: 20 °C for 30 min, 65 °C for 30 min, then hold at 4 °C. When the program finished, the end repair/dA addition product was obtained; it was placed on ice and taken immediately into the linker ligation step.
6. Linker ligation
The linker ligation reaction system was prepared as shown in Table 14:
TABLE 14
| Component name | Volume (μL) |
| --- | --- |
| End repair/dA addition product | 50 |
| D9TS-ad linker (11 μM) | 5 |
| 5× Ligase Buffer | 20 |
| TIANSeq DNA Ligase | 10 |
| Nuclease-Free ddH2O | 15 |
| Total | 100 |
Reaction conditions: the reaction was carried out in a PCR instrument with the hot lid set to 40 °C or lower: 20 °C for 15 min, then hold at 4 °C. When the program finished, the linker ligation product was obtained.
7. Purification
The linker ligation product was purified once with 0.8× Ampure XP beads and eluted to give 35 μL of supernatant 1, and supernatant 1 was then purified once with 1.2× Ampure XP beads and eluted to give 35 μL of supernatant 2, yielding libraries C1, C2 and C3. Supernatant 2 was analyzed by the LabChip method. As shown in FIG. 5, the sizes of libraries C1, C2 and C3 were consistent with the main-band peaks of the linker-added libraries in FIGS. 1-4, indicating that the linker was successfully added to libraries C1, C2 and C3.
8. Hybridization capture
The hybridization capture step is the same as the "hybridization capture" protocol of step 5 of Example 1.
The hybridization-captured library was sequenced on the GenoCare third-generation sequencing platform; the sequencing results are shown in Table 15.
TABLE 15
Notes: count_all: the number of reads remaining after quality-control filtering of the sequencing data; count_map: the number of quality-controlled reads that align to the genome; count_unmap: the number of quality-controlled reads that do not align to the genome; count_unique: the number of aligned reads that map to a unique position on the genome; meanlen (under each parameter): the average read length for that parameter; coverage: the total length of genome bases covered after the unique reads are assembled and duplicates removed; base: the total number of bases in the unique reads that align to the genome; cover_depth: the coverage depth within the covered region; Depth: the average sequencing depth, obtained by dividing base by the total number of bases in the genome; coverage%: the sequencing coverage, obtained by dividing coverage by the total number of bases in the genome.
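The derived metrics in these notes can be written out as simple ratios. The sketch below is illustrative only: the input numbers are made up and are not taken from Table 15, the 4.6 Mb genome length is an assumed round figure of roughly the size of an E. coli genome, and computing cover_depth as base divided by coverage is an interpretation of the definition above rather than a formula given in the text.

```python
from dataclasses import dataclass


@dataclass
class CoverageStats:
    depth: float         # average sequencing depth over the whole genome
    coverage_pct: float  # percentage of genome bases covered
    cover_depth: float   # depth within the covered region (interpretation)


def coverage_stats(base: int, coverage: int, genome_length: int) -> CoverageStats:
    """Derive Depth, coverage% and cover_depth from base, coverage and genome length."""
    if genome_length <= 0 or coverage <= 0:
        raise ValueError("genome_length and coverage must be positive")
    if coverage > genome_length:
        raise ValueError("coverage cannot exceed the genome length")
    return CoverageStats(
        depth=base / genome_length,
        coverage_pct=100.0 * coverage / genome_length,
        cover_depth=base / coverage,
    )


# Hypothetical inputs, for illustration only.
stats = coverage_stats(base=9_200_000, coverage=2_300_000, genome_length=4_600_000)
print(stats)
```

With these made-up inputs the genome is sequenced to an average depth of 2, half of it is covered, and the covered half has a depth of 4, which shows why a library with fewer duplicated reads can reach a higher coverage% at the same depth.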
Equal amounts of the RNA libraries constructed in Examples 1-5 were loaded for sequencing. The sequencing results are shown in Table 15, from which the following can be seen:
The amount of sequencing data (count_all) obtained from the RNA libraries (B, T2, Th-2, Y2) was about half that obtained from the comparison libraries (C1, C2, C3). The RNA libraries (B, T2, Th-2, Y2) carry a known sequence at one end only, so only the first-strand cDNA in the library is captured and sequenced, whereas the comparison libraries (C1, C2, C3) carry linkers at both ends, so both complementary strands are captured. For the same amount of library loaded, the amounts of sequencing data therefore differ by about a factor of two. Conversely, for the two kinds of library to yield the same amount of data, their loading amounts must differ by about a factor of two;
the coverage% of the sequencing data from the RNA libraries (B, T2, Th-2, Y2) is far higher than that of the comparison libraries (C1, C2, C3); that is, the genome coverage obtained with the RNA library construction and sequencing method of the invention is far higher than that of the comparison libraries. The duplication rate among the count_unique reads is about 40% for the RNA libraries and about 90% for the comparison libraries (data not shown). Construction of the comparison libraries involves second-strand cDNA synthesis, amplification and enrichment, which introduces bias: some fragments make up a smaller and smaller proportion as amplification proceeds, so that they are either filtered out as invalid data during preliminary filtering of the sequencing data or are not effectively sequenced at all. As a result, the genome coverage of the sequencing data from the comparison libraries is significantly lower than that from the RNA libraries (B, T2, Th-2, Y2) constructed by the method of the present invention.
It can also be concluded that, for the same amount of sequencing data (count_all), the coverage% of the RNA libraries (B, T2, Th-2, Y2) is much higher than that of the comparison libraries (C1, C2, C3).
The foregoing is a more detailed description of the present application in connection with specific embodiments thereof, and it is not intended that the present application be limited to the specific embodiments thereof. For those skilled in the art to which the present application pertains, several simple deductions or substitutions may be made without departing from the concept of the present application, and all should be considered as belonging to the protection scope of the present application.
Sequence listing
<110> Shenzhen Zhenzhiji Biotech Limited
<120> preparation method, sequencing method and kit of RNA library
<130> PI2019004
<160> 4
<170> SIPOSequenceListing 1.0
<210> 1
<211> 23
<212> DNA
<213> Artificial Sequence
<220>
<221> misc_feature
<222> (1)..(23)
<223> linker sequence
<400> 1
agatgtgtat aagagacagt ggg 23
<210> 2
<211> 54
<212> DNA
<213> Artificial Sequence
<220>
<221> misc_feature
<222> (1)..(54)
<223> linker sequence
<400> 2
actgtctctt atacacatct gagtggaact ggatggtcgc aggtatcaag gatt 54
<210> 3
<211> 57
<212> DNA
<213> Artificial Sequence
<220>
<221> misc_feature
<222> (1)..(57)
<223> linker sequence
<400> 3
ggtccttgat acctgcgacc atccagttcc actcagatgt gtataagaga cagtggg 57
<210> 4
<211> 61
<212> DNA
<213> Artificial Sequence
<220>
<221> misc_feature
<222> (1)..(61)
<223> Probe sequence
<400> 4
tttttttttt tccttgatac ctgcgaccat ccagttccac tcagatgtgt ataagagaca 60
g 61