CN110246543A - Method and computer system for detecting copy number variation by using single sample based on second-generation sequencing technology - Google Patents

Info

Publication number
CN110246543A
Authority
CN
China
Prior art keywords
copy number
number variation
generation sequencing
pattern detection
sequencing technologies
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910541057.5A
Other languages
Chinese (zh)
Other versions
CN110246543B (en)
Inventor
郎继东
王博
杨家亮
田埂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Meta Code Gene Technology (beijing) Ltd By Share Ltd
Original Assignee
Meta Code Gene Technology (beijing) Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Meta Code Gene Technology (beijing) Ltd By Share Ltd
Priority to CN201910541057.5A
Publication of CN110246543A
Application granted
Publication of CN110246543B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/10 Ploidy or copy number detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/50 Mutagenesis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Landscapes

  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • Epidemiology (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Bioethics (AREA)
  • Artificial Intelligence (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention discloses a method and computer system for detecting copy number variation using a single sample based on second-generation sequencing technology. Starting from raw sequencing data, the invention detects copy number variation (CNV) in a single sample without relying on the factors that conventional methods need or require, such as alignment, sequencing depth, GC content correction, and paired samples. This not only simplifies the experimental and analysis steps and reduces cost; the analysis results are also highly consistent with conventional methods, and adding clinical results (such as FISH validation) effectively corrects the false positives and false negatives of conventional detection methods.

Description

Method and computer system for detecting copy number variation using a single sample based on second-generation sequencing technology
Technical field
The present invention relates to genetic testing, and in particular to a method and computer system for detecting copy number variation using a single sample based on second-generation sequencing technology.
Background art
Copy number variation (CNV) is a type of structural variation caused by genomic rearrangement. By size it can be divided into the microscopic level and the submicroscopic level. Microscopic structural variation mainly refers to chromosome aberrations visible under a microscope, including euploidy or aneuploidy, insertion, deletion, inversion, and translocation. Submicroscopic structural variation mainly refers to variation in DNA fragments of 1 Kb or longer, including insertion, deletion, duplication, inversion, and translocation. Copy number variation is one of the important pathogenic factors in human disease. Current research has found that CNV is related to the pathogenesis of, or susceptibility to, many complex genetic diseases, including tumors, acquired immunodeficiency syndrome, systemic lupus erythematosus, and autoimmune inflammatory diseases. Clinical detection of copy number variation is therefore necessary: it can reveal large-fragment DNA sequence variation in the genome early, and so provides a reference for the diagnosis and treatment of disease.
Many means and methods of copy number variation detection exist at present: methods based on the polymerase chain reaction, including multiplex ligation-dependent probe amplification and multiplex amplifiable probe hybridization; methods based on hybridization, including in situ immunofluorescence and Giemsa banding; and methods based on chip technology, including single nucleotide polymorphism chips. These methods are complicated to operate and low in resolution, which makes it hard to obtain specific information about the variant segment; their analysis throughput is also low and their price high, so their cost-effectiveness is poor. With the rapid development of second-generation sequencing technology, sequencing cost has dropped substantially, analysis throughput has risen exponentially, and resolution has reached the Kb level, so that copy number variation at the submicroscopic level can be studied in more depth. Current CNV detection algorithms, such as CNVkit, CNVnator, and Control-FREEC, were essentially all developed for whole-genome sequencing (WGS) data, and for the sake of detection accuracy they generally require paired samples; for a single sample, CNV identification is generally corrected using sequencing depth and GC content. As demand for targeted sequencing grows, more targeted algorithms have emerged alongside those mentioned above, such as PatternCNV and Ioncopy. Without exception, however, these methods require alignment and depend heavily on sequencing depth and GC content. They are also constrained by the settings of the alignment and analysis parameters, the overall workflow is cumbersome and complex, and experimental and analysis costs are high.
Summary of the invention
In view of this, the present invention establishes a method and computer system for detecting copy number variation using a single sample based on second-generation sequencing technology. Starting from raw sequencing data, the invention detects copy number variation (CNV) in a single sample without relying on the factors that conventional methods need or require, such as alignment, sequencing depth, GC content correction, and paired samples.
Specifically, the present invention includes the following.
A first aspect of the present invention provides a method for detecting copy number variation using a single sample based on second-generation sequencing technology, comprising the following steps:
(1) Establish a first gene sample database and a second gene sample database, where the first gene sample database contains A samples of a gene with copy number variation and the second gene sample database contains B samples of the gene with no copy number variation at the corresponding position, A and B each being a natural number of 50 or more. The genes in the first gene sample database preferably carry a copy number variation, and the genes in the second gene sample database preferably have no copy number variation at the position (i.e. region) corresponding to the copy number variation in the first gene sample database. Note that the genes in the second gene sample database may still differ from those in the first by mutations, including copy number variations, that lie outside the copy number variation region. To ensure the reliability and accuracy of the method, A and B generally each need to be a natural number of 50 or more, preferably 100 or more, more preferably 200 or more, further preferably 300 or more. There is no particular upper limit on A or B.
(2) Divide each of the j copy number variation regions, of length Lj bp, with a sliding window of m bp and a step of n bp, giving i = Lj/n seed sequences in each copy number variation region, where i is taken as-is if Lj/n is an integer and is otherwise rounded down and incremented by 1. In total this yields a matrix of j*i seed sequences. In general, j is a natural number of 1 or more, preferably 10 or more, more preferably 30 or more. m is a natural number of 50 or more, more preferably 80 or more, and more preferably m is L or less. n is a natural number from 1 up to L, preferably from 5 up to L.
(3) Match the j*i seed sequences against the first gene sample database and the second gene sample database respectively, by exact full-length sequence matching with no mismatches tolerated, giving a j*i matrix of exact-match seed sequence counts in each database.
(4) Standardize the matrix of exact-match seed sequence counts in each database: divide each exact-match seed sequence count by the mean of all exact-match seed sequence counts in its copy number variation region.
(5) Zero-pad the standardized matrix of exact-match seed sequence counts: taking the copy number variation region with the most seed sequences as the reference, set the matrix entries that the shorter regions lack to 0.
(6) Build a mathematical model from the A+B zero-padded, standardized matrices of exact-match seed sequence counts: establish a statistical model from the positive and negative results, finally obtaining a mathematical model that judges whether a copy number variation is positive or negative.
(7) Repeat steps (2)-(5) for the sample to be judged, then predict and judge copy number variation with the mathematical model obtained in step (6): the sample is judged positive if the predicted value is greater than 0.5, preferably greater than 0.6, more preferably greater than 0.8, and negative otherwise.
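Steps (2) to (5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, reads are treated as plain strings, a seed counts as matched when it occurs verbatim inside a read, reverse-complement strands are ignored, and windows that run past the end of a region are simply truncated (the text does not say how such windows are handled).

```python
from math import ceil
from typing import List


def make_seeds(region: str, m: int, n: int) -> List[str]:
    """Step (2): cut one region into seeds with window m bp and step n bp.

    Produces ceil(len(region) / n) seeds. Assumption: a window that runs
    past the end of the region is truncated rather than extended.
    """
    count = ceil(len(region) / n)
    return [region[k * n : k * n + m] for k in range(count)]


def count_exact(seed: str, reads: List[str]) -> int:
    """Step (3): number of reads that contain the seed with no mismatch."""
    return sum(1 for read in reads if seed in read)


def normalize(counts: List[int]) -> List[float]:
    """Step (4): divide each count by the mean count of its region."""
    mean = sum(counts) / len(counts)
    # Assumption: a region with no matches at all is left as zeros.
    return [c / mean if mean else 0.0 for c in counts]


def pad_rows(rows: List[List[float]]) -> List[List[float]]:
    """Step (5): right-pad shorter rows with 0 up to the longest row."""
    width = max(len(row) for row in rows)
    return [row + [0.0] * (width - len(row)) for row in rows]


def sample_matrix(regions: List[str], reads: List[str], m: int, n: int) -> List[List[float]]:
    """Steps (2)-(5) for one sample: one padded, standardized row per region."""
    rows = []
    for region in regions:
        counts = [count_exact(seed, reads) for seed in make_seeds(region, m, n)]
        rows.append(normalize(counts))
    return pad_rows(rows)


# Toy data (hypothetical): two short regions and three reads.
toy_regions = ["ACGTAC", "GG"]
toy_reads = ["ACGTAC", "TTACGT", "GGGG"]
toy_matrix = sample_matrix(toy_regions, toy_reads, m=4, n=4)
```

One consequence of the truncation assumption is that the last seed of a region can be very short and therefore match many reads; a real pipeline would need a rule for such windows.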
Preferably, in the method of the present invention for detecting copy number variation using a single sample based on second-generation sequencing technology, the gene sample data in the first gene sample database and the second gene sample database come from whole-genome sequencing and/or target region capture/amplification sequencing.
Preferably, in the method of the present invention for detecting copy number variation using a single sample based on second-generation sequencing technology, the copy number variation includes gene copy number amplification and/or deletion.
Preferably, in the method of the present invention for detecting copy number variation using a single sample based on second-generation sequencing technology, the copy number variation includes chromosomal euploidy or aneuploidy, insertion, deletion, inversion and translocation, as well as insertion, deletion, duplication, inversion or translocation of DNA fragments.
Preferably, in the method of the present invention for detecting copy number variation using a single sample based on second-generation sequencing technology, the DNA fragment is 1 Kb or longer, preferably 1.5 Kb or longer. On the other hand, it is preferably 10 Kb or shorter, more preferably 8 Kb or shorter.
Preferably, in the method of the present invention for detecting copy number variation using a single sample based on second-generation sequencing technology, the gene is ERBB2.
Preferably, in the method of the present invention for detecting copy number variation using a single sample based on second-generation sequencing technology, the statistical model in step (6) is established by logistic regression or a deep learning algorithm.
A second aspect of the present invention provides a computer system comprising a processor and configured to execute the method of the first aspect of the present invention.
The invention not only simplifies the experimental and analysis steps and reduces cost; its analysis results also show a high concordance rate with conventional methods, and adding clinical results (such as FISH validation) effectively corrects the false positives and false negatives of conventional detection methods.
Brief description of the drawings
Fig. 1 is an exemplary flowchart of the present invention.
Detailed description of embodiments
Various exemplary embodiments of the present invention are now described in detail. This detailed description should not be taken as limiting the invention, but as a more detailed description of certain aspects, features and embodiments of the invention.
It should be understood that the terms used in the present invention serve only to describe particular embodiments and are not intended to limit the invention. In addition, a numerical range in the present invention is understood to specifically disclose the upper and lower limits of the range and every intermediate value between them. Every smaller range between any stated value or intermediate value within a stated range and any other stated value or intermediate value within that range is also included in the invention. The upper and lower limits of these smaller ranges may independently be included in or excluded from the range.
Unless otherwise stated, all technical and scientific terms used herein have the meanings commonly understood by a person of ordinary skill in the field of the invention. Although the present invention describes only preferred methods and materials, any methods and materials similar or equivalent to those described herein may also be used in practicing or testing the invention. All documents mentioned in this specification are incorporated by reference to disclose and describe the methods and/or materials related to those documents. In case of conflict with any incorporated document, the content of this specification prevails. Unless otherwise stated, "%" or "amount" is a percentage by weight.
Example
Whole-exome data from 5 samples known to be positive for ERBB2 amplification and 5 samples known to be negative for ERBB2 amplification were chosen to test the analysis method of the present invention. The method is illustrated in detail (see the flowchart in Fig. 1) with one ERBB2-positive sample, Sample1; the remaining samples repeat steps 1-11, as follows:
1. Collect whole-exome sequencing data from 272 samples positive for ERBB2 gene amplification and from 1029 samples negative for ERBB2 gene amplification, and split the data into a training set and a test set. The training set contains 223 positive samples and 817 negative samples; the test set contains 49 positive samples and 212 negative samples;
2. The ERBB2 gene contains 27 exons. Divide the 27 amplified or deleted regions, of length Lj bp (0 < j < 28), with a sliding window of 50 bp and a step of 40 bp. Each amplified or deleted region yields i = Lj/40 seed sequences in total, where i is taken as-is if Lj/40 is an integer and is otherwise rounded down and incremented by 1, giving a 27*i seed sequence matrix in total. The Lj are, in order, 311bp, 152bp, 214bp, 135bp, 69bp, 116bp, 142bp, 120bp, 127bp, 74bp, 91bp, 200bp, 133bp, 91bp, 161bp, 48bp, 139bp, 123bp, 99bp, 186bp, 156bp, 76bp, 147bp, 98bp, 189bp, 253bp, 974bp. The corresponding i are 8, 4, 6, 4, 2, 3, 4, 4, 4, 2, 3, 6, 4, 3, 5, 2, 4, 4, 3, 5, 4, 2, 4, 3, 5, 7, 25.
3. Match the 27*i seed sequences against the data of each of the 1040 training set samples by exact full-length sequence matching with no mismatches tolerated, giving for each sample a 27*i matrix of exact-match seed sequence counts.
4. Standardize the matrix of exact-match seed sequence counts of each sample: divide each exact-match seed sequence count by the mean of all exact-match seed sequence counts in its amplified or deleted region.
5. Zero-pad the standardized matrix of exact-match seed sequence counts: taking the amplified or deleted region with the most seed sequences as the reference, set the matrix entries that the shorter regions lack to 0.
6. Build a mathematical model from the 1040 zero-padded, standardized 27*25 matrices of exact-match seed sequence counts. First perform 10-fold cross-validation on the 1040 samples, and use a convolutional neural network (CNN) deep learning algorithm, together with the positive and negative results, to select, tune and optimize the model and its hyperparameters. This finally gives an optimal mathematical model with an AUC of 93.04% on the training set and 94.54% on the test set, which serves as the model for judging new samples of the same data type. The model parameters are shown in Table 1.
Table 1 - Model parameters
7. Repeat steps 2-5 for Sample1, then predict and judge copy number variation with the optimal mathematical model obtained in step 6. The predicted value, 0.9916596, is greater than 0.5, so the sample is considered positive.
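The bookkeeping in steps 1, 2 and 6 can be checked with a short script. The sketch below uses only numbers given in the steps above; the variable names are hypothetical. It shows that the listed seed counts equal floor(Lj/40) + 1 for all 27 exons. For the two lengths that divide evenly by 40 (120 bp and 200 bp) that is one more than the "taken as-is when divisible" wording would give, so the listed values appear to follow the floor-plus-one form throughout.

```python
from math import ceil

STEP = 40  # step length in bp, from step 2

# Exon lengths Lj (bp) and seed counts i, copied from step 2.
LENGTHS = [311, 152, 214, 135, 69, 116, 142, 120, 127, 74, 91, 200, 133, 91,
           161, 48, 139, 123, 99, 186, 156, 76, 147, 98, 189, 253, 974]
LISTED_I = [8, 4, 6, 4, 2, 3, 4, 4, 4, 2, 3, 6, 4, 3,
            5, 2, 4, 4, 3, 5, 4, 2, 4, 3, 5, 7, 25]

# Seed counts as floor(Lj / 40) + 1.
floor_plus_one = [length // STEP + 1 for length in LENGTHS]

# Lengths where ceil(Lj / 40) differs from the listed count.
differs_from_ceil = [length for length, i in zip(LENGTHS, LISTED_I)
                     if ceil(length / STEP) != i]

# Shape of the zero-padded matrix used in step 6.
rows, cols = len(LENGTHS), max(LISTED_I)
total_seeds = sum(LISTED_I)
padded_zeros = rows * cols - total_seeds

# Sample split from step 1.
train_size = 223 + 817
positives_total = 223 + 49
negatives_total = 817 + 212
```

The padded matrix has 27 x 25 = 675 entries, of which 130 hold seed counts and 545 are padding zeros, so most of each input matrix is padding contributed by the one long 974 bp exon.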
The predicted values and judgments for Sample2-Sample10 are shown in Table 2 below.
Table 2 - Summary of prediction results for each sample
Sample ID    Predicted value    Result
Sample1      0.9916596          Positive
Sample2      0.9989957          Positive
Sample3      0.9999901          Positive
Sample4      0.9990958          Positive
Sample5      0.99751943         Positive
Sample6      0.012844639        Negative
Sample7      0.006111831        Negative
Sample8      0.003521628        Negative
Sample9      0.008016149        Negative
Sample10     0.002645513        Negative
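The judgment rule of step 7 applied to the Table 2 values can be written as a small function. This is only a sketch of the thresholding step with a hypothetical function name; the predicted values are copied from Table 2, and the stricter 0.8 threshold is the "more preferably" value from the summary.

```python
# Predicted values copied from Table 2.
PREDICTED = {
    "Sample1": 0.9916596,
    "Sample2": 0.9989957,
    "Sample3": 0.9999901,
    "Sample4": 0.9990958,
    "Sample5": 0.99751943,
    "Sample6": 0.012844639,
    "Sample7": 0.006111831,
    "Sample8": 0.003521628,
    "Sample9": 0.008016149,
    "Sample10": 0.002645513,
}


def judge(predicted_value: float, threshold: float = 0.5) -> str:
    """Positive if the predicted value exceeds the threshold, else negative."""
    return "positive" if predicted_value > threshold else "negative"


results = {name: judge(value) for name, value in PREDICTED.items()}
strict_results = {name: judge(value, threshold=0.8) for name, value in PREDICTED.items()}
```

For these ten samples the judgments are the same at either threshold, because every predicted value lies above 0.99 or below 0.02.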
Various improvements and changes can be made to the specific embodiments described in this specification without departing from the scope or spirit of the invention, as will be apparent to those skilled in the art. Other embodiments derived from this specification will likewise be apparent to the skilled person. The specification and examples of this application are exemplary only.

Claims (9)

1. A method for detecting copy number variation using a single sample based on second-generation sequencing technology, characterized by comprising the following steps:
(1) establishing a first gene sample database and a second gene sample database, where the first gene sample database contains A samples of a gene with copy number variation and the second gene sample database contains B samples of the gene with no copy number variation at the corresponding position, A and B each being a natural number of 50 or more;
(2) dividing each of the j copy number variation regions, of length Lj bp, with a sliding window of m bp and a step of n bp, giving i = Lj/n seed sequences in each copy number variation region, where i is taken as-is if Lj/n is an integer and is otherwise rounded down and incremented by 1, yielding in total a matrix of j*i seed sequences;
(3) matching the j*i seed sequences against the first gene sample database and the second gene sample database respectively, by exact full-length sequence matching with no mismatches tolerated, giving a j*i matrix of exact-match seed sequence counts in each database;
(4) standardizing the matrix of exact-match seed sequence counts in each database, i.e. dividing each exact-match seed sequence count by the mean of all exact-match seed sequence counts in its copy number variation region;
(5) zero-padding the standardized matrix of exact-match seed sequence counts, i.e. taking the copy number variation region with the most seed sequences as the reference and setting the matrix entries that the shorter regions lack to 0;
(6) building a mathematical model from the A+B zero-padded, standardized matrices of exact-match seed sequence counts, establishing a statistical model from the positive and negative results, and finally obtaining a mathematical model that judges whether a copy number variation is positive or negative;
(7) repeating steps (2)-(5) for the sample to be judged, and predicting and judging copy number variation with the mathematical model obtained in step (6), the sample being judged positive if the predicted value is greater than 0.5 and negative otherwise.
2. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 1, characterized in that j is a natural number of 1 or more.
3. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 2, characterized in that the gene sample data in the first gene sample database and the second gene sample database come from whole-genome sequencing and/or target region capture/amplification sequencing.
4. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 3, characterized in that the copy number variation includes gene copy number amplification and/or deletion.
5. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 4, characterized in that the copy number variation includes chromosomal euploidy or aneuploidy, insertion, deletion, inversion or translocation, as well as insertion, deletion, duplication, inversion or translocation of DNA fragments.
6. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 5, characterized in that the DNA fragment is 1 Kb or longer.
7. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 1, characterized in that the gene is ERBB2.
8. The method for detecting copy number variation using a single sample based on second-generation sequencing technology according to claim 1, characterized in that the statistical model in step (6) is established by logistic regression or a deep learning algorithm.
9. A computer system, characterized in that it comprises a processor and is configured to execute the method according to any one of claims 1-8.
CN201910541057.5A 2019-06-21 2019-06-21 Method and computer system for detecting copy number variation by using single sample based on second-generation sequencing technology Active CN110246543B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910541057.5A CN110246543B (en) 2019-06-21 2019-06-21 Method and computer system for detecting copy number variation by using single sample based on second-generation sequencing technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910541057.5A CN110246543B (en) 2019-06-21 2019-06-21 Method and computer system for detecting copy number variation by using single sample based on second-generation sequencing technology

Publications (2)

Publication Number Publication Date
CN110246543A true CN110246543A (en) 2019-09-17
CN110246543B CN110246543B (en) 2021-02-26

Family

ID=67888607

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910541057.5A Active CN110246543B (en) 2019-06-21 2019-06-21 Method and computer system for detecting copy number variation by using single sample based on second-generation sequencing technology

Country Status (1)

Country Link
CN (1) CN110246543B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111276189A (en) * 2020-02-26 2020-06-12 广州市金域转化医学研究院有限公司 Chromosome balance translocation detection and analysis system based on NGS and application thereof
CN112634987A (en) * 2020-12-25 2021-04-09 北京吉因加医学检验实验室有限公司 Method and device for detecting copy number variation of single-sample tumor DNA
CN113736865A (en) * 2021-09-09 2021-12-03 元码基因科技(北京)股份有限公司 Kit, reaction system and method for detecting gene copy number variation in sample
CN114496300A (en) * 2021-12-20 2022-05-13 北京优迅医学检验实验室有限公司 Method and device for clinical annotation of copy number variation pathogenicity

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130184999A1 (en) * 2012-01-05 2013-07-18 Yan Ding Systems and methods for cancer-specific drug targets and biomarkers discovery
CN106372459A (en) * 2016-08-30 2017-02-01 天津诺禾致源生物信息科技有限公司 Method and device for detecting copy number variation based on amplicon next generation sequencing
CN108073791A (en) * 2017-12-12 2018-05-25 元码基因科技(北京)股份有限公司 Method based on two generation sequencing datas detection target gene structure variation
CN108256289A (en) * 2018-01-17 2018-07-06 湖南大地同年生物科技有限公司 A kind of method based on target area capture sequencing genomes copy number variation
CN108304694A (en) * 2018-01-30 2018-07-20 元码基因科技(北京)股份有限公司 Method based on two generation sequencing data analyzing gene mutations
CN108427864A (en) * 2018-02-14 2018-08-21 南京世和基因生物技术有限公司 A kind of detection method, device and the computer-readable medium of copy number variation
CN108920899A (en) * 2018-06-10 2018-11-30 杭州迈迪科生物科技有限公司 A kind of single exon copy number variation prediction technique based on target area sequencing
CN110808084A (en) * 2019-09-19 2020-02-18 西安电子科技大学 A copy number variation detection method based on single-sample next-generation sequencing data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MIN ZHAO.ET AL: ""Computational tools for copy number variation"", 《BMC BIOINFORMATICS》 *
刘永壮: ""基于高通量测序数据的基因组变异检测"", 《中国博士学位论文全文数据库(电子期刊)》 *

Also Published As

Publication number Publication date
CN110246543B (en) 2021-02-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant