CN110246543A - Method and computer system for detecting copy number variation by using single sample based on second-generation sequencing technology - Google Patents
- Publication number
- CN110246543A
- Authority
- CN
- China
- Prior art keywords
- copy number
- number variation
- second-generation sequencing
- single-sample detection
- sequencing technology
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/10—Ploidy or copy number detection
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
- G16B20/50—Mutagenesis
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
Abstract
The present invention discloses a method and computer system for detecting copy number variation using a single sample based on second-generation sequencing technology. Starting from raw sequencing data, the invention performs copy number variation (CNV) detection on a single sample without relying on factors that conventional methods need or require, such as alignment, sequencing depth, GC content correction, and paired samples. This not only simplifies the experimental and analysis steps and reduces cost; the analysis results are also highly consistent with conventional methods, and adding clinical results (such as FISH validation) effectively corrects the false positives and false negatives of conventional detection methods.
Description
Technical field
The present invention relates to genetic testing, and in particular to a method and computer system for detecting copy number variation using a single sample based on second-generation sequencing technology.
Background art
Copy number variation (CNV) is a type of structural variation caused by genomic rearrangement. By size it can be divided into the microscopic and submicroscopic levels. Microscopic structural variation mainly refers to chromosomal aberrations visible under a microscope, including euploidy or aneuploidy, insertion, deletion, inversion, and translocation; submicroscopic structural variation mainly refers to variations of DNA fragments 1 Kb or longer, including insertion, deletion, duplication, inversion, and translocation. CNV is one of the important pathogenic factors in human disease. Current research has found that CNV is associated with the pathogenesis of, or susceptibility to, many complex genetic diseases, including tumors, acquired immunodeficiency syndrome, systemic lupus erythematosus, and autoimmune inflammatory diseases. Clinical CNV detection is therefore necessary: it allows variation in large DNA fragments of the genome to be found early, providing a reference for disease diagnosis and treatment.
Many means and methods of CNV detection exist at present: methods based on polymerase chain reaction, including multiplex ligation-dependent probe amplification and multiplex amplifiable probe hybridization; methods based on hybridization, including in situ immunofluorescence and Giemsa banding; and methods based on chip technology, including single nucleotide polymorphism chips. These methods are complicated to operate and low in resolution, which makes it difficult to provide specific information on the variant segment; their analysis throughput is also low and their price high, so their cost-effectiveness is poor. With the rapid development of second-generation sequencing, sequencing cost has fallen substantially, analysis throughput has risen exponentially, and resolution has reached the Kb level, allowing deeper study of submicroscopic CNV. Current CNV detection algorithms were essentially all developed for whole-genome sequencing (WGS), such as CNVkit, CNVnator, and Control-FREEC. For the sake of accuracy they generally require paired samples; for single samples they generally correct for sequencing depth and GC content before identifying CNV. As demand for targeted sequencing grows, more specialized algorithms have emerged alongside those mentioned above, such as PatternCNV and Ioncopy. Without exception, however, these methods require alignment and rely heavily on sequencing depth and GC content. They are also constrained by the alignment parameters and the parameter settings of the analysis algorithm, the overall workflow is cumbersome and complicated, and both experimental and analysis costs are high.
Summary of the invention
In view of this, the present invention establishes a method and computer system for detecting copy number variation in a single sample based on second-generation sequencing technology. Starting from raw sequencing data, the invention performs copy number variation (CNV) detection on a single sample without relying on factors that conventional methods need or require, such as alignment, sequencing depth, GC content correction, and paired samples.
Specifically, the present invention includes the following.
A first aspect of the present invention provides a method for detecting copy number variation using a single sample based on second-generation sequencing technology, comprising the following steps:
(1) Establish a first gene sample database and a second gene sample database, wherein the first gene sample database contains A gene samples with copy number variation and the second gene sample database contains B gene samples in which no copy number variation occurs in the corresponding gene, A and B each being a natural number of 50 or more. The genes in the first gene sample database preferably contain a copy number variation, and the genes in the second gene sample database preferably have no copy number variation at the position (i.e. region) corresponding to the copy number variation of the genes in the first gene sample database. Note that the genes in the second gene sample database may differ from the first gene samples by mutations outside the copy number variation region, including copy number variation mutations. To ensure the reliability and accuracy of the method, A and B generally each need to be a natural number of 50 or more, preferably 100 or more, more preferably 200 or more, and further preferably 300 or more. There is no particular upper limit on A and B.
(2) Divide the j copy number variation regions of length L_j bp with a sliding window of m bp and a step of n bp, so that each copy number variation region yields i = L_j/n seed sequences, wherein if L_j/n divides exactly, i is the integer quotient, and if L_j/n does not divide exactly, i is the quotient rounded down plus 1. In total this gives a matrix of j*i seed sequences. In general, j is a natural number of 1 or more, preferably 10 or more, more preferably 30 or more. m is a natural number of 50 or more, more preferably 80 or more, and more preferably m is L or less. n is a natural number from 1 up to L, preferably from 5 up to L.
(3) Match the j*i seed sequences against the first gene sample database and the second gene sample database respectively, using complete-sequence matching with no mismatches allowed, to obtain a matrix of j*i exact-match seed sequence counts for each database.
(4) Normalize the matrix of exact-match seed sequence counts in each database, i.e. divide each exact-match seed sequence count by the mean of all exact-match seed sequence counts of its copy number variation region.
(5) Zero-pad the normalized matrix of exact-match seed sequence counts, i.e. take the copy number variation region with the most seed sequences as the reference and set the matrix entries that the other regions lack to 0.
(6) Perform mathematical modeling on the A+B zero-padded, normalized exact-match seed sequence count matrices, establish a statistical model from the positive and negative results, and finally obtain a mathematical model that judges whether a copy number variation is positive or negative.
(7) Repeat steps (2)-(5) on the sample to be judged, and use the mathematical model obtained in step (6) to predict and judge the copy number variation: if the predicted value is greater than 0.5, preferably greater than 0.6, more preferably greater than 0.8, the sample is judged positive; otherwise it is judged negative.
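Steps (2)-(5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the toy reference and reads, the choice to count reads that contain a seed, and the choice to let windows near the region end run into flanking sequence are all hypothetical. Reverse-complement matching is ignored.

```python
from math import ceil


def seed_count(region_len: int, step: int) -> int:
    """Stated rule of step (2): L_j/n if it divides exactly, else floor + 1."""
    return ceil(region_len / step)


def tile_seeds(reference: str, start: int, region_len: int,
               window: int, step: int) -> list[str]:
    """Cut seed_count() windows of `window` bp, one every `step` bp.

    Assumption: windows near the region end may extend into flanking sequence.
    """
    return [reference[start + k * step: start + k * step + window]
            for k in range(seed_count(region_len, step))]


def count_exact_matches(seeds: list[str], reads: list[str]) -> list[int]:
    """Step (3): number of reads containing each seed with no mismatch."""
    return [sum(seed in read for read in reads) for seed in seeds]


def normalize(counts: list[int]) -> list[float]:
    """Step (4): divide each count by the mean count of its region."""
    mean = sum(counts) / len(counts)
    return [c / mean if mean else 0.0 for c in counts]


def pad_rows(rows: list[list[float]]) -> list[list[float]]:
    """Step (5): zero-pad every region's row to the longest row."""
    width = max(len(r) for r in rows)
    return [list(r) + [0.0] * (width - len(r)) for r in rows]


# Toy data (hypothetical): one 8 bp region starting at offset 2.
reference = "AACCGGTTACGTTGCA"
reads = ["AACCGGTT", "CCGGTTAC", "GGTTACGT", "TTACGTTG", "ACGTTGCA"]

seeds = tile_seeds(reference, start=2, region_len=8, window=4, step=3)
counts = count_exact_matches(seeds, reads)
row = normalize(counts)
matrix = pad_rows([row, [1.0]])  # second row stands in for a shorter region
```

Because no alignment is involved, the only inputs are the seed sequences and the raw reads; the padded matrix is what step (6) would consume.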
Preferably, in the method of the invention for detecting copy number variation using a single sample based on second-generation sequencing technology, the gene sample data in the first gene sample database and the second gene sample database are derived from whole-genome sequencing and/or target region capture/amplification sequencing.
Preferably, in the method of the invention, the copy number variation includes gene copy number amplification and/or deletion.
Preferably, in the method of the invention, the copy number variation includes euploidy or aneuploidy, insertion, deletion, inversion, and translocation of chromosomes, and insertion, deletion, duplication, inversion, or translocation of DNA fragments.
Preferably, in the method of the invention, the length of the DNA fragment is 1 Kb or more, preferably 1.5 Kb or more. On the other hand, it is preferably 10 Kb or less, more preferably 8 Kb or less.
Preferably, in the method of the invention, the gene is ERBB2.
Preferably, in the method of the invention, the statistical model in step (6) is established by logistic regression or a deep learning algorithm.
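As an illustration of the logistic regression option for steps (6) and (7), the following sketch fits a small model by gradient descent on flattened matrices and applies the 0.5 threshold. The training values, learning rate, and epoch count are invented for illustration; the source does not specify them.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Plain batch gradient descent on the logistic loss."""
    n, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for k, v in enumerate(x):
                grad_w[k] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x) -> float:
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


# Hypothetical flattened, normalized, zero-padded matrices (2 entries each).
train_x = [[1.5, 0.5], [1.6, 0.4], [1.0, 1.0], [0.9, 1.1]]
train_y = [1, 1, 0, 0]  # 1 = CNV positive, 0 = CNV negative

weights, bias = fit_logistic(train_x, train_y)
calls = ["positive" if predict(weights, bias, x) > 0.5 else "negative"
         for x in train_x]
```

A new sample would be run through steps (2)-(5), flattened the same way, and passed to `predict`; raising the threshold to 0.6 or 0.8 corresponds to the stricter preferred cut-offs of step (7).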
A second aspect of the present invention provides a computer system comprising a processor and configured to execute the method described in the first aspect of the present invention.
The invention not only simplifies the experimental and analysis steps and reduces cost; its analysis results also have a high concordance rate with conventional methods, and adding clinical results (such as FISH validation) effectively corrects the false positives and false negatives of conventional detection methods.
Brief description of the drawings
Fig. 1 is an exemplary flowchart of the present invention.
Detailed description of embodiments
Various exemplary embodiments of the present invention are now described in detail. This detailed description should not be considered a limitation of the invention, but rather a more detailed description of certain aspects, features, and embodiments of the invention.
It should be understood that the terms used herein are only for describing particular embodiments and are not intended to limit the invention. In addition, a numerical range in the present invention is understood to specifically disclose the upper and lower limits of the range and every intermediate value between them. Every smaller range between any stated value or intermediate value in a stated range and any other stated value or intermediate value in that range is also included in the invention. The upper and lower limits of these smaller ranges may independently be included in or excluded from the range.
Unless otherwise stated, all technical and scientific terms used herein have the same meaning as commonly understood by a person of ordinary skill in the field of the invention. Although the present invention describes only preferred methods and materials, any methods and materials similar or equivalent to those described herein may also be used in the practice or testing of the invention. All documents mentioned in this specification are incorporated by reference to disclose and describe the methods and/or materials related to those documents. In case of conflict with any incorporated document, the content of this specification prevails. Unless otherwise stated, "%" or "amount" is a percentage based on weight.
Embodiment
Five whole-exome datasets known to be ERBB2 amplification positive and five known to be ERBB2 amplification negative were selected to test the analysis method of the invention. One ERBB2-positive example, Sample1, is described in detail (see the flowchart in Fig. 1); the remaining examples repeat steps 1-11, as follows:
1. Collect whole-exome sequencing data from 272 ERBB2 amplification-positive samples and from 1029 ERBB2 amplification-negative samples, and divide the data into a training set and a test set. The training set contains 223 positive samples and 817 negative samples; the test set contains 49 positive samples and 212 negative samples.
2. The ERBB2 gene contains 27 exons. Divide the 27 amplified or deleted regions of length Lj bp (0 < j < 28) with a sliding window of 50 bp and a step of 40 bp. Each amplified or deleted region yields i = Lj/40 seed sequences, wherein if Lj/40 divides exactly, i is the integer quotient, and if Lj/40 does not divide exactly, i is the quotient rounded down plus 1. This gives a matrix of 27*i seed sequences in total, where the Lj are 311bp, 152bp, 214bp, 135bp, 69bp, 116bp, 142bp, 120bp, 127bp, 74bp, 91bp, 200bp, 133bp, 91bp, 161bp, 48bp, 139bp, 123bp, 99bp, 186bp, 156bp, 76bp, 147bp, 98bp, 189bp, 253bp, and 974bp. The corresponding i are 8, 4, 6, 4, 2, 3, 4, 4, 4, 2, 3, 6, 4, 3, 5, 2, 4, 4, 3, 5, 4, 2, 4, 3, 5, 7, and 25.
3. Match the 27*i seed sequences against the data of each of the 1040 training-set samples, using complete-sequence matching with no mismatches allowed, to obtain a matrix of 27*i exact-match seed sequence counts for each sample.
4. Normalize the exact-match seed sequence count matrix of each sample, i.e. divide each exact-match seed sequence count by the mean of all exact-match seed sequence counts of its amplified or deleted region.
5. Zero-pad the normalized matrix of exact-match seed sequence counts, i.e. take the amplified or deleted region with the most seed sequences as the reference and set the matrix entries that the other regions lack to 0.
6. Perform mathematical modeling on the 1040 zero-padded, normalized 27*25 exact-match seed sequence count matrices. First carry out 10-fold cross-validation on the 1040 samples, then use a convolutional neural network (CNN) deep learning algorithm, together with the positive and negative results and the selected hyperparameters, to tune and optimize the model. The resulting optimal mathematical model has an AUC of 93.04% on the training set and 94.54% on the test set, and serves as the model for judging new samples of the same data type. The model parameters are shown in Table 1.
Table 1 - Model parameters
7. Repeat steps 2-5 on Sample1 and use the optimal mathematical model obtained in step 6 to predict and judge the copy number variation. The predicted value is 0.9916596, which is greater than 0.5, so the sample is considered positive.
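The seed counts in step 2 determine the 27*25 matrix shape used in step 6, and the arithmetic can be checked directly. The sketch below uses only the lengths and counts listed above; the variable names are illustrative. It also shows that the listed i values equal floor(Lj/40) + 1 for every exon, so they differ from the stated exact-division rule only for the two lengths that are exact multiples of 40 (120 bp and 200 bp).

```python
from math import ceil

STEP = 40  # step size in bp, from step 2

# Exon lengths Lj (bp) and the seed counts i listed in step 2.
EXON_LENGTHS = [311, 152, 214, 135, 69, 116, 142, 120, 127, 74, 91, 200, 133,
                91, 161, 48, 139, 123, 99, 186, 156, 76, 147, 98, 189, 253, 974]
LISTED_I = [8, 4, 6, 4, 2, 3, 4, 4, 4, 2, 3, 6, 4,
            3, 5, 2, 4, 4, 3, 5, 4, 2, 4, 3, 5, 7, 25]

# Stated rule: integer quotient if exact, otherwise round down and add 1.
stated_rule = [ceil(length / STEP) for length in EXON_LENGTHS]

# Alternative reading: always round down and add 1.
floor_plus_one = [length // STEP + 1 for length in EXON_LENGTHS]

# Exons where the stated rule disagrees with the listed value.
differs = [length for length, i_stated, i_listed
           in zip(EXON_LENGTHS, stated_rule, LISTED_I) if i_stated != i_listed]

# Rows = exons, columns = largest seed count (shorter rows are zero-padded).
matrix_shape = (len(EXON_LENGTHS), max(LISTED_I))
```

Either reading gives the same 27*25 shape, because the longest exon (974 bp) is not a multiple of 40.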
The predicted values and judgments for Sample2-Sample10 are shown in Table 2 below.
Table 2 - Summary of prediction results for each sample
Sample ID | Predicted value | Observation |
---|---|---|
Sample1 | 0.9916596 | Positive |
Sample2 | 0.9989957 | Positive |
Sample3 | 0.9999901 | Positive |
Sample4 | 0.9990958 | Positive |
Sample5 | 0.99751943 | Positive |
Sample6 | 0.012844639 | Negative |
Sample7 | 0.006111831 | Negative |
Sample8 | 0.003521628 | Negative |
Sample9 | 0.008016149 | Negative |
Sample10 | 0.002645513 | Negative |
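The decision rule of step 7 can be applied to the predicted values in Table 2 as a quick consistency check. The sketch below copies the values from the table; the function name `call` and the dictionary layout are illustrative.

```python
# Predicted values and observed status, copied from Table 2.
TABLE_2 = {
    "Sample1": (0.9916596, "positive"),
    "Sample2": (0.9989957, "positive"),
    "Sample3": (0.9999901, "positive"),
    "Sample4": (0.9990958, "positive"),
    "Sample5": (0.99751943, "positive"),
    "Sample6": (0.012844639, "negative"),
    "Sample7": (0.006111831, "negative"),
    "Sample8": (0.003521628, "negative"),
    "Sample9": (0.008016149, "negative"),
    "Sample10": (0.002645513, "negative"),
}


def call(predicted: float, threshold: float = 0.5) -> str:
    """Step 7 rule: positive if the predicted value exceeds the threshold."""
    return "positive" if predicted > threshold else "negative"


# True when every call at the default 0.5 threshold matches the observation.
concordant = all(call(p) == observed for p, observed in TABLE_2.values())

# The stricter preferred threshold of 0.8 gives the same calls on these samples.
concordant_strict = all(call(p, 0.8) == observed
                        for p, observed in TABLE_2.values())
```

All ten predicted values lie far from the threshold, so the calls are insensitive to the choice among 0.5, 0.6, and 0.8 for these samples.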
It will be apparent to those skilled in the art that various improvements and changes can be made to the specific embodiments described in this specification without departing from the scope or spirit of the invention. Other embodiments derived from this specification will be apparent to the skilled person. The specification and examples are merely exemplary.
Claims (9)
1. A method for detecting copy number variation using a single sample based on second-generation sequencing technology, characterized by comprising the following steps:
(1) establishing a first gene sample database and a second gene sample database, wherein the first gene sample database contains A gene samples with copy number variation and the second gene sample database contains B gene samples in which no copy number variation occurs at the corresponding position, A and B each being a natural number of 50 or more;
(2) dividing the j copy number variation regions of length L_j bp with a sliding window of m bp and a step of n bp, so that each copy number variation region yields i = L_j/n seed sequences, wherein if L_j/n divides exactly, i is the integer quotient, and if L_j/n does not divide exactly, i is the quotient rounded down plus 1, giving in total a matrix of j*i seed sequences;
(3) matching the j*i seed sequences against the first gene sample database and the second gene sample database respectively, using complete-sequence matching with no mismatches allowed, to obtain a matrix of j*i exact-match seed sequence counts for each database;
(4) normalizing the matrix of exact-match seed sequence counts in each database, i.e. dividing each exact-match seed sequence count by the mean of all exact-match seed sequence counts of its copy number variation region;
(5) zero-padding the normalized matrix of exact-match seed sequence counts, i.e. taking the copy number variation region with the most seed sequences as the reference and setting the matrix entries that the other regions lack to 0;
(6) performing mathematical modeling on the A+B zero-padded, normalized exact-match seed sequence count matrices, establishing a statistical model from the positive and negative results, and finally obtaining a mathematical model that judges whether a copy number variation is positive or negative;
(7) repeating steps (2)-(5) on the sample to be judged, and using the mathematical model obtained in step (6) to predict and judge the copy number variation, the sample being judged positive if the predicted value is greater than 0.5 and negative otherwise.
2. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 1, characterized in that j is a natural number of 1 or more.
3. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 2, characterized in that the gene sample data in the first gene sample database and the second gene sample database are derived from whole-genome sequencing and/or target region capture/amplification sequencing.
4. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 3, characterized in that the copy number variation includes gene copy number amplification and/or deletion.
5. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 4, characterized in that the copy number variation includes euploidy or aneuploidy, insertion, deletion, inversion, or translocation of chromosomes, and insertion, deletion, duplication, inversion, or translocation of DNA fragments.
6. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 5, characterized in that the length of the DNA fragment is 1 Kb or more.
7. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 1, characterized in that the gene is ERBB2.
8. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 1, characterized in that the statistical model in step (6) is established by logistic regression or a deep learning algorithm.
9. A computer system, characterized in that it comprises a processor and is configured to execute the method according to any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910541057.5A CN110246543B (en) | 2019-06-21 | 2019-06-21 | Method and computer system for detecting copy number variation by using single sample based on second-generation sequencing technology |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910541057.5A CN110246543B (en) | 2019-06-21 | 2019-06-21 | Method and computer system for detecting copy number variation by using single sample based on second-generation sequencing technology |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110246543A true CN110246543A (en) | 2019-09-17 |
CN110246543B CN110246543B (en) | 2021-02-26 |
Family
ID=67888607
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910541057.5A Active CN110246543B (en) | 2019-06-21 | 2019-06-21 | Method and computer system for detecting copy number variation by using single sample based on second-generation sequencing technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110246543B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111276189A (en) * | 2020-02-26 | 2020-06-12 | 广州市金域转化医学研究院有限公司 | Chromosome balance translocation detection and analysis system based on NGS and application thereof |
CN112634987A (en) * | 2020-12-25 | 2021-04-09 | 北京吉因加医学检验实验室有限公司 | Method and device for detecting copy number variation of single-sample tumor DNA |
CN113736865A (en) * | 2021-09-09 | 2021-12-03 | 元码基因科技(北京)股份有限公司 | Kit, reaction system and method for detecting gene copy number variation in sample |
CN114496300A (en) * | 2021-12-20 | 2022-05-13 | 北京优迅医学检验实验室有限公司 | Method and device for clinical annotation of copy number variation pathogenicity |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130184999A1 (en) * | 2012-01-05 | 2013-07-18 | Yan Ding | Systems and methods for cancer-specific drug targets and biomarkers discovery |
CN106372459A (en) * | 2016-08-30 | 2017-02-01 | 天津诺禾致源生物信息科技有限公司 | Method and device for detecting copy number variation based on amplicon next generation sequencing |
CN108073791A (en) * | 2017-12-12 | 2018-05-25 | 元码基因科技(北京)股份有限公司 | Method based on two generation sequencing datas detection target gene structure variation |
CN108256289A (en) * | 2018-01-17 | 2018-07-06 | 湖南大地同年生物科技有限公司 | A kind of method based on target area capture sequencing genomes copy number variation |
CN108304694A (en) * | 2018-01-30 | 2018-07-20 | 元码基因科技(北京)股份有限公司 | Method based on two generation sequencing data analyzing gene mutations |
CN108427864A (en) * | 2018-02-14 | 2018-08-21 | 南京世和基因生物技术有限公司 | A kind of detection method, device and the computer-readable medium of copy number variation |
CN108920899A (en) * | 2018-06-10 | 2018-11-30 | 杭州迈迪科生物科技有限公司 | A kind of single exon copy number variation prediction technique based on target area sequencing |
CN110808084A (en) * | 2019-09-19 | 2020-02-18 | 西安电子科技大学 | A copy number variation detection method based on single-sample next-generation sequencing data |
Non-Patent Citations (2)
Title |
---|
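Min Zhao et al.: "Computational tools for copy number variation", BMC Bioinformatics * |
Liu Yongzhuang (刘永壮): "Genomic variation detection based on high-throughput sequencing data", China Doctoral Dissertations Full-text Database (Electronic Journal) * |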
Also Published As
Publication number | Publication date |
---|---|
CN110246543B (en) | 2021-02-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110246543A (en) | Method and computer system for detecting copy number variation in a single sample based on second-generation sequencing technology | |
Oldham et al. | Network methods for describing sample relationships in genomic datasets: application to Huntington’s disease | |
JP4594622B2 (en) | Drug discovery method | |
US20020095260A1 (en) | Methods for efficiently mining broad data sets for biological markers | |
WO2022170909A1 (en) | Drug sensitivity prediction method, electronic device and computer-readable storage medium | |
US20210090686A1 (en) | Single cell rna-seq data processing | |
CN109310332A (en) | Method for analyzing numerical data | |
He et al. | Microarrays—the 21st century divining rod? | |
CN112289376B (en) | Method and device for detecting somatic cell mutation | |
Rabier et al. | On the inference of complex phylogenetic networks by Markov Chain Monte-Carlo | |
WO2021086595A1 (en) | Using machine learning-based trait predictions for genetic association discovery | |
CN113936737A (en) | Method, system and equipment for comparing RNA structures based on RNA motif vectors | |
Nanguneri et al. | Characterization of nanoscale organization of f-actin in morphologically distinct dendritic spines in vitro using supervised learning | |
CN113362894A (en) | Method for predicting syndromal cancer driver gene | |
Padmanaban et al. | Between-tumor and within-tumor heterogeneity in invasive potential | |
Gill et al. | Multi‐trait genomic selection improves the prediction accuracy of end‐use quality traits in hard winter wheat | |
CN110010195A (en) | Method and device for detecting single nucleotide mutations | |
Wei et al. | DMSC: a dynamic multi-seeds method for clustering 16S rRNA sequences into OTUs | |
JP2007504542A (en) | How to process biological data | |
Rao et al. | Partial correlation based variable selection approach for multivariate data classification methods | |
CN114317725B (en) | Crohn's disease biomarker, kit and screening method for biomarker | |
CN110164504A (en) | Processing method, device and electronic equipment for second-generation sequencing data | |
Kuchroo et al. | spARC recovers human glioma spatial signaling networks with graph filtering | |
Wang et al. | ELLA: Modeling Subcellular Spatial Variation of Gene Expression within Cells in High-Resolution Spatial Transcriptomics | |
WO2023277932A1 (en) | Detection of human leukocyte antigen loss of heterozygosity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||