CN114943735A - Tongue picture image-based tumor prediction system and method and application thereof - Google Patents
Tongue picture image-based tumor prediction system and method and application thereof
- Publication number
- CN114943735A (application CN202210861997.4A)
- Authority
- CN
- China
- Prior art keywords
- positive
- tongue
- probability
- image
- features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
Abstract
The invention relates to the technical field of tumor diagnosis, prediction and evaluation, and in particular to a tongue picture image-based tumor prediction system and method and applications thereof. The system comprises: a tongue image acquisition module configured to acquire a tongue image of a test sample; a data processing module configured to obtain the probability that the test sample is positive by predicting that probability from discriminative features on the tongue image obtained through automatic learning; and an output module. The system applies AI deep learning to diagnose and predict tumors from tongue images; it is simple to operate, low in cost, and painless and noninvasive, and validation on a large number of test cases shows that it provides a prospective, economical, noninvasive and effective screening system for tumors.
Description
Technical Field
The invention relates to the technical field of diagnosis, prediction and evaluation of oncology, in particular to a tongue picture based tumor prediction system, a tongue picture based tumor prediction method and application thereof.
Background
According to the latest data, gastric cancer (GC) is the third leading cause of cancer-related death worldwide, with 1.09 million new GC cases and 770,000 deaths in 2020; of these, 480,000 new cases and 370,000 deaths occurred in China, accounting for about half of the worldwide total. GC diagnosis and screening still rely on gastroscopy, but its use is greatly limited by its invasive nature, high cost and the need for specialized endoscopists. In addition, because early gastric cancer lacks specific symptoms and clinical disease markers have poor specificity and sensitivity, more than 60% of patients already have local or distant metastasis at the time of diagnosis. The 5-year survival rate of patients with localized early GC exceeds 60%, whereas the 5-year survival rates of patients with local and distant metastases drop significantly to 30% and 5%, respectively. Therefore, new GC diagnosis or screening methods are urgently needed to improve the early diagnosis rate and prognosis of this population.
Traditional Chinese medicine is a medical science and cultural heritage accumulated and preserved through thousands of years of practice by the Chinese people, and tongue diagnosis is one of its important bases for diagnosing disease. According to traditional Chinese medicine theory, changes in tongue manifestation (the color, size and shape of the tongue body; the color, thickness and moisture of the tongue coating) reflect the health of the human body and are particularly closely related to stomach diseases. However, no research has yet confirmed the correspondence between tongue changes and GC, or the value of tongue changes in GC diagnosis and screening.
Artificial intelligence (AI) is useful for screening, diagnosing and treating various diseases. Cheung CY et al. developed a deep learning system (see references) that can effectively predict cardiovascular disease risk by measuring retinal vessel calibre. Takenaka K et al. developed a deep neural network (see references) for evaluating endoscopic images of patients with ulcerative colitis, which identified endoscopic and histological remission with accuracies of 90.1% and 92.9%, respectively.
Patent CN110251084A of Fuzhou Data Technology Research Institute Co., Ltd. provides an artificial-intelligence-based tongue image detection and identification method for real-time detection, shooting, storage and uploading of the tongue body during tongue image acquisition, and for identifying the tongue color, tongue shape, tongue quality and coating color of the tongue image. The scheme mainly concerns tongue image acquisition and recognition technology, with recognition focused on extracting features such as tongue color, texture, coating area or coating thickness; however, that work does not establish a correspondence between tongue image information and a specific stomach disease such as gastric cancer.
Patent CN111710394A of Shenyang Zhi Lun Technology Co., Ltd. proposes an artificial-intelligence-assisted early gastric cancer screening system that automatically replaces manual analysis of gastroscopic slice images to reduce the heavy workload of determining gastric cancer positivity. However, this gastroscopic-image-based strategy still requires a large number of gastroscopic images acquired with professional instruments for model learning, and a decision must still be made from each tester's gastroscopic image at the testing stage; gastroscopic image acquisition remains time-consuming, materially expensive and demanding on the tested population, so nationwide screening is difficult to achieve.
Patent CN112133427A of Jiangsu Tianrui Accurate Medical Science and Technology Co., Ltd. provides an artificial-intelligence-based gastric cancer auxiliary diagnosis system that gives personalized diagnosis results according to collected patient data. The data used for diagnosis include the patient's basic information, lifestyle and diet, infection history, disease history, family history, clinical symptoms, examination items and so on; clinical symptoms and examination items are difficult to collect, and relying on basic information, lifestyle and diet, infection history, disease history, family history and the like can compromise the preliminary screening and diagnosis performance.
Reference documents:
Cheung CY, Xu D, Cheng CY, et al. A deep-learning system for the assessment of cardiovascular disease risk via the measurement of retinal-vessel calibre. Nature biomedical engineering 2021;5(6):498-508. doi: 10.1038/s41551-020-00626-4 [published Online First: 2020/10/14];
Takenaka K, Ohtsuka K, Fujii T, et al. Development and Validation of a Deep Neural Network for Accurate Evaluation of Endoscopic Images From Patients With Ulcerative Colitis. Gastroenterology 2020;158(8):2150-57. doi: 10.1053/j.gastro.2020.02.012 [published Online First: 2020/02/16].
the present invention seeks to address these and other needs in the art.
Disclosure of Invention
In order to solve at least one of the technical problems mentioned in the background art, the present invention aims to provide a tongue image-based tumor prediction system that applies an AI deep learning model to diagnose and predict tumors from tongue images. The system is simple to operate, low in cost, and painless and noninvasive, and validation on a large number of test cases shows that it is a prospective, economical, noninvasive and effective screening and diagnosis prediction system for tumors.
At present, the application of artificial intelligence to the tongue in traditional Chinese medicine image diagnosis mainly focuses on standardizing tongue feature extraction to eliminate differences caused by human interpretation. The present invention applies AI deep learning for the first time to establish a tongue-image-based GC diagnosis model and evaluates its value, thereby providing a scientific basis for the tongue diagnosis theory of traditional Chinese medicine.
The invention is directed to a tongue picture-based tumor prediction system, comprising:
a tongue image acquisition module configured to acquire a tongue image of a test specimen;
a data processing module configured to obtain a probability that a test sample is positive by:
and predicting the probability of the test sample being positive according to the distinguishable characteristics on the tongue picture image obtained by automatic learning.
In a specific embodiment, the tumor is at least one of gastric cancer, breast cancer, colorectal cancer, esophageal cancer, hepatobiliary pancreatic cancer, lung cancer, prostate cancer, thyroid cancer, ovarian cancer, neuroblastoma, trophoblastic tumor, or head and neck squamous carcinoma.
In a specific embodiment, the tumor is at least one of gastric cancer, breast cancer, colorectal cancer, esophageal cancer, hepatobiliary pancreatic cancer, or lung cancer.
In a particular embodiment, the system further includes an output module configured to output the prediction.
In one embodiment, the output module is configured to output the tongue image and the prediction result.
In one embodiment, the output module outputs in at least one mode of electronic display, sound broadcasting, printing and network transmission.
In one embodiment, the discriminative features come from the positive and negative categories on the tongue image. The aim is to obtain features that discriminate between the positive and negative categories by fully comparing, analyzing and learning the commonness and differences between and within positive and/or negative tongue images; by examining these discriminative features on the tongue image of a test sample, the probability that the sample is positive can be judged, so that tumor prediction for the test sample can be realized from its tongue image. The discriminative features may come from the commonness and differences between positive and negative tongue images, or from the commonness and differences between the positive and negative categories on a single tongue image; that is, discriminative features between the positive and negative categories obtained from the tongue picture can be used to predict whether a test sample belongs to the positive or the negative category.
The discriminative features may be derived from positive and negative tongue images input in pairs to an interactive deep learning model.
In a particular embodiment, the data processing module is specifically configured to predict the probability that a test sample is positive by:
and fully comparing the positive tongue picture image and the negative tongue picture image which are simultaneously input into the interactive deep learning model, automatically learning the commonness and difference between the positive category and the negative category on the tongue picture image, and predicting the probability of the test sample belonging to the positive according to the characteristic of the discriminability between the positive category and the negative category. The scheme of this section aims to obtain the characteristic of discriminability between the positive type and the negative type by sufficiently comparing, analyzing and learning the common property and difference between the positive tongue image and the negative tongue image, and the probability that the tongue image of the test sample input into the model belongs to the positive type can be predicted according to the characteristic of discriminability, so that the model capable of obtaining the characteristic of discriminability between the positive type and the negative type by comparing, analyzing and learning the common property and difference between the positive tongue image and the negative tongue image can be applied to the scheme of this section and also included in the protection scope of the scheme of this section, in particular, the application selects but is not limited to the example analysis by using the APINet model.
In a specific embodiment, the positive tongue image is acquired from a tumor-positive patient.
In a specific embodiment, the negative tongue image is acquired from a tumor-negative patient.
In a specific embodiment, the interactive deep learning model is an APINet model.
In a particular embodiment, the data processing module is particularly configured to obtain the probability that a test sample is positive by:
1) extracting positive features and negative features from a pre-obtained positive tongue image and a pre-obtained negative tongue image;
2) training a model by using positive characteristics and negative characteristics, and outputting the probability that the characteristics belong to each category;
3) inputting the tongue image of the test sample into the trained model and outputting the probability that the test sample is positive.
In a specific embodiment, the step of extracting positive features and negative features in step 1) includes:
the encoder extracts the feature vectors of the images and outputs a positive feature f_1 and a negative feature f_2;
f_1, f_2 and their spliced feature f_m are simultaneously input into the MLP of the feature-selection area, which correspondingly outputs two control vectors g_1 and g_2;
g_1 activates f_1 and f_2 separately, forming the selected features f_1^+ and f_2^-; g_2 activates f_1 and f_2 separately, forming the selected features f_1^- and f_2^+; this yields two positive features f_1^+ and f_1^- and two negative features f_2^+ and f_2^-.
In one embodiment, the MLP of the feature-selection area fully learns f_1 and f_m and outputs a control vector g_1; similarly, it learns f_2 and f_m and outputs a control vector g_2.
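Illustration only: a minimal PyTorch-style sketch of this pairwise interaction is given below. The sigmoid gating, the shared gate MLP and the projection of the spliced feature are assumptions; the patent states only that an MLP takes f_i together with f_m and outputs control vectors g_1 and g_2 that activate f_1 and f_2.

```python
import torch
import torch.nn as nn

class PairwiseInteraction(nn.Module):
    def __init__(self, d: int):
        super().__init__()
        self.mutual = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(inplace=True))  # spliced feature f_m
        self.gate = nn.Sequential(nn.Linear(2 * d, d), nn.Sigmoid())             # (f_i, f_m) -> g_i

    def forward(self, f1: torch.Tensor, f2: torch.Tensor):
        f_m = self.mutual(torch.cat([f1, f2], dim=-1))
        g1 = self.gate(torch.cat([f1, f_m], dim=-1))     # control vector for the positive feature
        g2 = self.gate(torch.cat([f2, f_m], dim=-1))     # control vector for the negative feature
        f1_plus, f2_minus = f1 * g1, f2 * g1             # g_1 activates f_1 and f_2
        f1_minus, f2_plus = f1 * g2, f2 * g2             # g_2 activates f_1 and f_2
        return f1_plus, f1_minus, f2_plus, f2_minus      # two positive and two negative features
```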
In a specific embodiment, training the model with the positive features and the negative features in step 2) means inputting the positive features and the negative features into a fully connected layer classifier and outputting the probability that each feature belongs to each class.
In a specific embodiment, when step 2) outputs the probability that each feature belongs to each category, a cross-entropy loss function is minimized according to the categories to which the four features belong:
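One plausible form of this loss, following the published APINet formulation and the definitions below, is:

$$\mathcal{L}_{ce} = -\sum_{i\in\{1,2\}}\;\sum_{k\in\{+,-\}} y_i^{\top}\,\log\!\big(\operatorname{softmax}(\phi_c(f_i^{k}))\big)$$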
where y is the true label corresponding to the feature, the function φ_c represents the final fully connected layer classifier, and f_i^k corresponds to the 4 input features.
Note that f_1^+ is activated by the control vector g_1 corresponding to the positive feature and therefore contains positive feature information, whereas f_1^- is activated by the control vector g_2 corresponding to the negative feature and therefore contains negative feature information; the same applies to f_2^+ and f_2^-.
In a specific embodiment, when step 2) outputs the probabilities that the features belong to the respective categories, the confidence the model assigns to feature f_i^+ should be higher than that assigned to feature f_i^-, which is encouraged by minimizing a ranking loss function:
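One plausible form of this ranking loss, following the score-ranking regularization of the APINet paper (with c_i denoting the true category of the i-th image of the pair), is:

$$\mathcal{L}_{rk} = \sum_{i\in\{1,2\}} \max\!\big(0,\; p_i^{-}(c_i) - p_i^{+}(c_i) + \epsilon\big)$$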
where p_i^- and p_i^+ are the probability distributions over the classes that the classifier outputs for features f_i^- and f_i^+, ε ∈ [0,1] is a specified hyper-parameter, and p(c) refers to the probability assigned to a specified category c.
In one embodiment, the step 3) of inputting the tongue image of the test sample into the trained model means inputting a single tongue image of the test sample.
In a specific embodiment, the outputting of the probability that the test sample belongs to the category in step 3) is to finally output a probability distribution of the corresponding test sample on each category, and the category corresponding to the highest probability is taken as the predicted category.
In one embodiment, only the circumscribed rectangle part of the tongue surface area in the tongue image is used for training and testing, so that the influence of the image background on the model can be effectively eliminated.
In a specific embodiment, in order to enrich the sample space of the training set during training, samples in the training set are randomly flipped with a certain probability, sub-images are cropped at random positions on the images, and linear interpolation is finally applied to form images of fixed size, which are standardized and then input into the interactive deep learning model.
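A plausible torchvision pipeline for this augmentation is sketched below; the flip probability, crop size and normalization statistics are illustrative assumptions, not values taken from the patent.

```python
from torchvision import transforms

train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),          # random flip with a certain probability
    transforms.RandomResizedCrop(224),                # cut a sub-image at a random position and
                                                      # interpolate it to a fixed size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # standardization (ImageNet statistics assumed)
                         std=[0.229, 0.224, 0.225]),
])
```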
In this scheme, a pair of samples (a positive tongue image and a negative tongue image) is fully compared to find their commonness and differences. The paired images are used as input to simulate a real scene: the encoder extracts the image feature vectors and outputs a positive feature and a negative feature; combined with the spliced feature, a pair of positive features and a pair of negative features are finally output; these are input to the fully connected layer classifier, which outputs the probability that each feature belongs to each category, while the cross-entropy loss function and the ranking loss function are minimized to train the model. At test time, the tongue image of the test sample is input into the system to obtain the probability of tumor positivity. By deeply analyzing the differences between positive and negative tongue images and learning the internal association between tumors and tongue image information through deep learning technology, the system automatically judges the probability of tumor positivity, addressing the low accuracy of early tumor screening and the high cost of existing diagnostic strategies, so as to screen out the population with a high incidence of tumors.
The discriminative features may come from a single positive tongue image or a single negative tongue image.
In one embodiment, the discriminative features are obtained by cutting the tongue image into n small blocks and then performing feature extraction on the n blocks to obtain deep features favorable for classification.
In a particular embodiment, the data processing module is particularly configured to obtain the probability that a test sample is positive by:
cutting the tongue picture image of the test sample into small blocks, forming an input vector through linear mapping, adding a position index, importing the input vector into a trained deep learning model for feature extraction and feature fusion, outputting the selected deep features which are favorable for classification, and obtaining the probability of each category.
In one embodiment, the deep learning model is trained by:
a) cutting a tongue surface image into n small blocks, arranging the n blocks in order into an input sequence of length n, forming input vectors through linear mapping, and adding position indexes 0, 1, 2, …, n-1 (a code sketch of this step follows after step b));
b) performing feature extraction and feature fusion with an encoder based on the TransFG model, outputting the selected deep features favorable for classification, and finally outputting the probability distribution of the deep features over the categories through a softmax classifier.
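A minimal sketch of step a) above, assuming a ViT-style patch embedding in PyTorch (patch size and embedding dimension are illustrative assumptions):

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    def __init__(self, img_size: int = 224, patch_size: int = 16, in_ch: int = 3, dim: int = 768):
        super().__init__()
        self.n = (img_size // patch_size) ** 2                 # number of patches
        # a strided convolution cuts the image into non-overlapping patches
        # and applies the linear mapping in one step
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch_size, stride=patch_size)
        self.pos = nn.Parameter(torch.zeros(1, self.n, dim))   # position embedding for indexes 0..n-1

    def forward(self, x: torch.Tensor) -> torch.Tensor:        # x: (B, 3, H, W)
        tokens = self.proj(x).flatten(2).transpose(1, 2)       # (B, n, dim)
        return tokens + self.pos
```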
In a specific embodiment, the cutting of the tongue surface image into n small blocks in the step a) means cutting the tongue surface image into n square areas which do not overlap with each other.
In one embodiment, the encoder in step b) performs feature extraction through L+1 Transformer layers, each of which contains a self-attention mechanism.
In an embodiment, when the encoder in step b) performs feature extraction and feature fusion, a feature-selection module containing a multi-head attention mechanism performs region selection before the deep features are input to the last layer, in order to remove redundant features; the module returns the indexes of the front-ranked features with the largest attention weights, and the selected front-ranked features are input into the last Transformer layer for feature fusion.
In a specific embodiment, the front-ranked features are the first k features, where k is one of 1, 2, 3, …, 20.
In a specific embodiment, k = 12.
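A sketch of this attention-based selection of the k front-ranked tokens follows; aggregating the multi-head weights via the class-token attention and a maximum over heads is an assumption, not a detail given in the patent text.

```python
import torch

def select_top_k_tokens(tokens: torch.Tensor, attn: torch.Tensor, k: int = 12) -> torch.Tensor:
    """tokens: (B, n+1, d) with the class token at index 0;
    attn: (B, heads, n+1, n+1) attention weights from the previous layer."""
    cls_attn = attn[:, :, 0, 1:]                        # class-token attention to the n patches
    weights, _ = cls_attn.max(dim=1)                    # strongest head per patch: (B, n)
    topk = weights.topk(k, dim=-1).indices              # indexes of the k front-ranked features
    batch = torch.arange(tokens.size(0)).unsqueeze(-1)
    selected = tokens[batch, topk + 1]                  # +1 skips the class token
    return torch.cat([tokens[:, :1], selected], dim=1)  # class token + k selected part tokens
```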
In one embodiment, when step b) outputs the probability distribution of the deep features over the respective categories, a cross-entropy loss function is minimized:
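One plausible form of this cross-entropy loss, writing p_i for the probability the model predicts for class i, is:

$$\mathcal{L}_{ce} = -\sum_{i} y_i \log p_i$$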
where y_i is an element of the true one-hot label corresponding to the test sample, and p_i is the probability the model predicts for class i. One-hot labels are 0/1 vectors; for example, with three categories 0, 1 and 2, the corresponding one-hot labels are (1,0,0), (0,1,0) and (0,0,1).
In one embodiment, when step b) outputs the probability distributions of the deep features over the respective categories, a contrastive loss function is also minimized:
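One plausible form of this contrastive loss, following the formulation of the TransFG paper (the margin hyper-parameter α is an assumption not stated in the surrounding text), is:

$$\mathcal{L}_{con} = \frac{1}{N^{2}}\left[\;\sum_{i,j:\,y_i=y_j}\big(1 - D(f_i, f_j)\big) \;+\; \sum_{i,j:\,y_i\neq y_j}\max\!\big(0,\; D(f_i, f_j) - \alpha\big)\right]$$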
where N represents the batch size during training and D(f_i, f_j) represents a similarity measure between features f_i and f_j. All negative and positive data pairs within a training batch are selected to minimize the contrastive loss, so that intra-class features become more aggregated and inter-class feature differences become larger, thereby improving prediction accuracy.
In this scheme, the tongue image is cut into small non-overlapping areas, which are arranged in order into a sequence; an input vector is formed through linear mapping and input into the TransFG model for feature extraction and feature fusion, generating deep features beneficial to classification; the probabilities of the deep features belonging to the respective categories are output through a softmax classifier, completing the prediction of the tongue image's category, so that the tumor-positive probability of a screening test sample is predicted automatically through the automatic learning of the deep learning model.
The discriminative features may also come from each pixel point of the tongue image.
In a particular embodiment, the data processing module is particularly configured to obtain the probability that a test sample is positive by:
inputting the tongue image of a test sample into the trained deep learning model, and outputting for each pixel point the probability that it belongs to the positive, negative and background categories, the category with the maximum probability being taken as the predicted category of that pixel point;
the probability that the test sample is positive is then: (number of pixels predicted positive) / (number of pixels predicted positive + number of pixels predicted negative).
In a specific embodiment, in the process of obtaining the probability that the test sample is positive, if the number of the pixels predicted to be positive is greater than the number of the pixels predicted to be negative, the test sample is finally predicted to be positive; otherwise, the prediction is negative.
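The decision rule above can be illustrated with the following sketch, assuming the segmentation model outputs a (3, H, W) softmax probability map whose channels are background = 0, negative = 1 and positive = 2 (the channel order is an assumption):

```python
import numpy as np

def sample_positive_probability(prob_map: np.ndarray) -> float:
    pred = prob_map.argmax(axis=0)         # per-pixel predicted category
    n_pos = int((pred == 2).sum())         # pixels predicted positive
    n_neg = int((pred == 1).sum())         # pixels predicted negative
    if n_pos + n_neg == 0:                 # no tongue-surface pixels found
        return 0.0
    return n_pos / (n_pos + n_neg)         # probability that the sample is positive
```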
In one embodiment, the deep learning model is trained by:
labeling the positive tongue images and the negative tongue images pixel by pixel; specifically, giving standard labels to the positive tongue-surface pixels, the negative tongue-surface pixels and the background pixels respectively;
the overall algorithm framework adopts an automatic encoding-decoding structure: an image encoder encodes the features of the whole image, and a feature decoder outputs a probability map of the whole image;
calculating the loss value of each pixel point from its real label and its predicted probability in the probability map, and updating the model parameters until training is complete.
In one embodiment, the pixel point of the positive tongue surface area is labeled as 2, the pixel point of the negative tongue surface area is labeled as 1, and the pixel point of the background area is labeled as 0.
In a specific embodiment, a DeepLabV3+ image segmentation network structure and/or a UNet-series network structure is adopted as the automatic encoding-decoding structure. Any automatic encoding-decoding framework capable of generating the probability map can be applied to the invention, so a wide choice of deep network structures is available; the DeepLabV3+ model is preferred in the invention, but other frameworks that generate a probability map by automatic encoding-decoding, such as the UNet-series network structures commonly used in medical image processing, can also achieve the aim of the invention.
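As one possible instantiation (a sketch, not the patented implementation), torchvision's DeepLabV3, a close relative of DeepLabV3+, can produce the required per-pixel probability map for the three categories; a recent torchvision (>= 0.13) is assumed.

```python
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

# three output channels: background, negative tongue surface, positive tongue surface
model = deeplabv3_resnet50(weights=None, weights_backbone=None, num_classes=3)
model.eval()

with torch.no_grad():
    logits = model(torch.randn(1, 3, 512, 512))["out"]  # (1, 3, 512, 512)
    prob_map = logits.softmax(dim=1)                     # per-pixel class probabilities
```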
In a specific embodiment, a category interpretation module is added after the output layer of the network in the automatic encoding-decoding structure, and the final test result is decided based on the probability map of the whole image.
In a specific embodiment, the category interpretation module adopts an interpretation strategy shown in the following formula according to the probability map:
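One plausible reconstruction of this strategy, consistent with the pixel-counting rule described earlier (with $\hat{c}_i$ the predicted category of pixel $i$ and an output of 1 meaning a positive prediction), is:

$$\hat{y} = \mathbb{I}\!\left(\frac{\sum_{i=1}^{m}\mathbb{I}(\hat{c}_i=\text{pos})}{\sum_{i=1}^{m}\mathbb{I}(\hat{c}_i=\text{pos}) + \sum_{i=1}^{m}\mathbb{I}(\hat{c}_i=\text{neg})} > t\right)$$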
where m is the total number of pixel points; I is an indicator function whose value is 1 when its condition is met and 0 otherwise; and t ∈ [0,1] is a pixel-class judgment threshold.
In a specific embodiment, t =0.5 in the above interpretation policy formula.
In one embodiment, in the deep learning model training process, a cross-entropy cost function predicted pixel by pixel is adopted:
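One plausible form of this pixel-wise cross-entropy cost, averaging the negative log-probability of each pixel's true category $c_i$ over the $m$ pixels, is:

$$\mathcal{L}_{seg} = -\frac{1}{m}\sum_{i=1}^{m}\log P_{c_i}(i)$$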
where the true label category of pixel point i is c (positive, negative or background), and P_c(i) represents the probability that pixel i is predicted as category c.
The collected tongue images of different cases are labeled as tumor-positive or tumor-negative according to confirmed clinical diagnoses, yielding enough labeled data to optimize the learning model and its prediction accuracy. Pixel-by-pixel labeling is adopted for the tongue images: an existing labeling tool is used to outline the tongue area, and labels are then assigned to each positive-area, negative-area and background pixel point. A probability map is output through the automatic encoding-decoding structure, the loss between each pixel's true label and its predicted probability in the probability map is calculated, and the model parameters are updated to complete training. The tongue image of a test sample is then input into the model to automatically judge the probability of tumor positivity and screen out the population with a high incidence of tumors. This overcomes drawbacks such as the high cost of data collection, the high cost of early tumor diagnosis, and the difficulty of realizing large-scale general surveys; the accuracy of the internal test reaches 86.6%, so the method has high clinical application value.
A method for predicting a tumor based on a tongue image, comprising:
obtaining a tongue image of the test sample;
inputting the tongue picture image of the test sample into the system to obtain the tumor positive probability of the test sample.
The application of the system and/or the method for predicting the tumor based on the tongue picture image comprises the following steps:
tumor prediction is performed on a test sample using the system and/or method.
The above-described preferred conditions may be combined with each other to obtain a specific embodiment, in accordance with common knowledge in the art.
The invention has the beneficial effects that:
the invention provides a plurality of tumor prediction systems based on tongue picture images, which take non-biological sample tongue picture images as direct implementation objects, can exert excellent diagnosis and prediction functions on a plurality of tumors by analyzing and learning the commonness and difference between positive characteristics and negative characteristics in the tongue picture images, and can test and predict the gastric cancer with the accuracy rate of about 80 percent, the sensitivity of 0.741-0.826 and the accuracy of 0.785-0.806 when testing internally through analysis and verification of a large batch of real patient samples; the sensitivity in external test reaches 0.841-0.862, the accuracy reaches 0.709-0.734; the sensitivity and the accuracy of the test are both obviously superior to those of a machine learning model applying the traditional blood tumor marker; in addition, the tongue picture image-based tumor prediction system can also show excellent diagnosis and prediction values for various malignant tumors including breast cancer, colorectal cancer, esophageal cancer, liver and gall pancreatic cancer, lung cancer and the like, is obviously superior to the combination of the traditional blood tumor markers, and provides a prospective, economical, noninvasive, effective screening, diagnosis and prediction system and method for tumors.
The invention adopts the technical scheme for achieving the purpose, makes up the defects of the prior art, and has reasonable design and convenient operation.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure.
FIG. 1 is a schematic illustration of a multicenter clinical study and its patient distribution;
FIG. 2 is a discriminative framework of the APINet model;
FIG. 3 is a visualization of the recognition basis of the tongue surface image by the APINet model;
FIG. 4 is the ROC and AUC of three models (SVM, DT, KNN) based on 8 blood tumor markers, in internal and external validation;
FIG. 5 is the ROC and AUC of the three tongue-image models (APINet, TransFG, DeepLabV3+), in internal and external validation;
FIG. 6 is the ROC and AUC of the APINet model for GC and other tumors;
FIG. 7 is a discriminative framework of the TransFG model;
FIG. 8 is a region selection module result visualization of the TransFG model;
FIG. 9 is the ROC and AUC of the TransFG model for GC and other tumors;
FIG. 10 is a schematic diagram of a Deeplab V3+ model training sample labeling process;
FIG. 11 is a discriminative framework for the Deeplab V3+ model;
FIG. 12 is a Deeplab V3+ model prediction results visualization;
FIG. 13 is a graph of the output of the Deeplab V3+ model;
FIG. 14 is ROC and AUC for the DeepLabV3+ model for GC and other tumors;
FIG. 15 is an external verification probability distribution of three tongue manifestation models;
fig. 16 is a representative tongue image with different probabilities.
Detailed Description
Those skilled in the art can appropriately substitute and/or modify the process parameters to implement the present disclosure, but it is specifically noted that all similar substitutes and/or modifications will be apparent to those skilled in the art and are deemed to be included in the present invention. While the invention has been described in terms of preferred embodiments, it will be apparent to those skilled in the art that the technology can be practiced and applied by modifying or appropriately combining the embodiments described herein without departing from the spirit and scope of the invention.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure herein. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is to be noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the technical aspects of the present application. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
APINet model: the Attentive Pairwise Interaction Network (APINet) model.
TransFG model: the TransFG model, i.e., a Transformer architecture for fine-grained recognition (TransFG).
The present invention is described in detail below.
< clinical specimens >
A national multi-center clinical study was carried out to eliminate the influence of regional, dietary and center differences on the research; it comprises 11 centers in 8 cities: Hangzhou, Wenzhou and Shanghai in the east, Fuzhou in the south, Chengdu in the west, Liaoning and Heilongjiang in the north, and Taiyuan in the center.
As shown in Fig. 1, 1111 gastric cancer (GC) patients were recruited from 8 centers and 1519 non-gastric-cancer (NGC) patients were recruited from 3 centers between January 2020 and October 2021, the NGC group comprising 169 healthy controls (HC), 648 superficial gastritis (SG) and 702 atrophic gastritis (AG) cases. The system was trained and validated on 865 GC patients and 1287 NGC patients, comprising 448 early GC (TNM I+II), 417 late GC (TNM III+IV), 141 HC, 547 SG and 599 AG cases; approximately 80% of the cases were used as the training data set and approximately 20% as the internal validation data set. In addition, 246 GC and 232 NGC patients from 3 centers were used as an independent external validation data set, including 162 early GC, 84 late GC, 28 HC, 101 SG and 103 AG cases. The GC patients were newly diagnosed and had not received any treatment for their disease: no surgery, chemotherapy, radiotherapy, targeted therapy or biological therapy had been performed. GC patients had only a single tumor; patients found to have two or more malignancies were excluded. HC, SG and AG were confirmed by gastroscopy.
Tongue images and clinical information were collected for all participants, including age, gender, height, weight, family history, smoking, alcohol consumption, TNM stage, blood tumor markers, etc. Pathological staging is based on the American Joint Committee on Cancer staging system, 8th edition. Tongue images were acquired on the morning of gastric surgery for all GC participants and on the morning of gastroscopy for NGC participants, with fasting times exceeding 8 hours, which excluded the effect of diet on the tongue images. General patient information such as age, gender, BMI, smoking and drinking for the GC and NGC groups is shown in Table 1 and is closely matched in the training, internal validation and independent external validation data sets.
Table 1 clinical information of GC and NGC participants
In addition, 104 esophageal cancer (EC) patients, 129 hepatobiliary pancreatic cancer (HBPC) patients, 116 colorectal cancer (CRC) patients, 260 lung cancer (LC) patients and 154 breast cancer (BC) patients were recruited from the Zhejiang Cancer Hospital. Table 2 shows the clinical information of these other cancer participants; general information between GC and the other cancers, such as age, sex, BMI, smoking and drinking, is well matched except for BC.
TABLE 2 clinical information of other cancer participants
< statistical analysis >
All statistical analyses were performed using SPSS 23.0 software (SPSS Inc., Chicago, IL, USA). Results are expressed as mean ± SD or mean ± SEM. Parametric or non-parametric tests were used depending on whether the data were normally distributed. Count data were analyzed using the chi-square test. P < 0.05 was considered statistically significant.
< ethical approval >
The research in this application was approved by a central ethics committee used by the 11 participating centers (IRB-2019-56), specifically the ethics committees for tumor research of Zhejiang Cancer Hospital, the First Affiliated Hospital of Wenzhou Medical University, Liaoning Cancer Hospital, Renji Hospital affiliated to Shanghai Jiao Tong University, Fujian Cancer Hospital, the Affiliated Tumor Hospital of Harbin Medical University, Sichuan Cancer Hospital, Shanxi Cancer Hospital, Tongde Hospital of Zhejiang Province, Zhejiang Hospital, and the People's Hospital of Yuhang.
< clinical verification >
Example 1:
the verification is carried out by an APINet model, in particular to a tumor diagnosis system based on tongue picture images, which comprises:
a tongue image acquisition module configured to acquire a tongue image of a test specimen;
a data processing module configured to obtain a probability that a test specimen is positive by:
and predicting the probability of the test sample being positive according to the distinguishable characteristics on the tongue picture image obtained by automatic learning.
An interactive deep learning model based on comparison is designed: a pair of simultaneously input tongue surface images is fully compared, the commonness and differences between the positive and negative categories on the tongue images are automatically learned, and the probability that the test sample belongs to the positive class is finally predicted from these discriminative features. The overall algorithm framework shown in Fig. 2 is divided into three modules: a feature fusion module, a feature selection module and a classification module.
A feature fusion module: a pair of tongue images belonging to the positive and negative categories respectively is input simultaneously. First, the encoder extracts the feature vectors of the images and outputs a positive feature f_1 and a negative feature f_2.
Feature selection module: f1, f2 and their spliced feature fm are input simultaneously into the MLP of the selection region, which outputs two control vectors g1 and g2 (f1 + fm → g1, f2 + fm → g2), corresponding to f1 and f2 respectively. The control vector g1 activates f1 and f2 separately, forming the selected features f1+ and f2-; g2 acts similarly on f1 and f2, forming the selected features f1- and f2+. Finally, two positive features f1+ and f1- and two negative features f2+ and f2- are output.
Classification module: the selected features are input into a classifier (a fully connected layer), which finally outputs the probability that each feature belongs to each class.
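The following minimal sketch (Python/PyTorch) illustrates the data flow through the feature selection and classification modules described above. The encoder itself is omitted, and the feature dimension, the exact wiring of the MLP that produces fm, g1 and g2, and the sigmoid form of the gates are illustrative assumptions rather than details disclosed in this text.

```python
import torch
import torch.nn as nn

class PairwiseSelectionHead(nn.Module):
    """Sketch of the feature selection + classification modules.

    Only the flow f1, f2 -> fm -> g1, g2 -> four gated features -> classifier
    follows the description above; layer sizes and gate form are assumptions."""

    def __init__(self, feat_dim: int = 512, num_classes: int = 2):
        super().__init__()
        # maps the spliced pair feature to a compact mutual feature fm
        self.mutual_mlp = nn.Sequential(nn.Linear(2 * feat_dim, feat_dim), nn.ReLU())
        # maps (fi, fm) to a control (gate) vector gi of the same size as fi
        self.gate_mlp = nn.Sequential(nn.Linear(2 * feat_dim, feat_dim), nn.Sigmoid())
        self.classifier = nn.Linear(feat_dim, num_classes)  # the phi_c of the text

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> dict:
        fm = self.mutual_mlp(torch.cat([f1, f2], dim=1))     # spliced feature
        g1 = self.gate_mlp(torch.cat([f1, fm], dim=1))       # f1 + fm -> g1
        g2 = self.gate_mlp(torch.cat([f2, fm], dim=1))       # f2 + fm -> g2
        feats = {
            "f1+": g1 * f1, "f2-": g1 * f2,                  # activated by g1
            "f1-": g2 * f1, "f2+": g2 * f2,                  # activated by g2
        }
        # classification module: probability of each feature belonging to each class
        return {name: self.classifier(f).softmax(dim=1) for name, f in feats.items()}

# toy usage with random encoder outputs for a batch of 4 positive/negative pairs
if __name__ == "__main__":
    head = PairwiseSelectionHead()
    probs = head(torch.randn(4, 512), torch.randn(4, 512))
    print({k: tuple(v.shape) for k, v in probs.items()})
```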
During training, a cross-entropy cost function is minimized according to the categories to which the four features belong:
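(The formula itself is not reproduced in this text; a representative form, consistent with the symbol definitions in the following paragraph, would be:)

$$ L_{ce} = -\sum_{i \in \{1,2\}} \sum_{k \in \{+,-\}} y_i^{\top} \log \phi_c\!\left(f_i^{\,k}\right) $$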
where y is the true label corresponding to the feature, the function φc denotes the final fully-connected-layer classifier, and fik corresponds to the four features input to the classification module. Note that f1+ is the positive feature activated by its corresponding control vector g1, whereas f1- is activated by g2 and therefore contains negative feature information; the same holds for f2+ and f2-.
It should be clear that a model with better generalization should output higher confidence for feature fi+ than for feature fi-, so a ranking cost function is minimized at the same time:
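(This formula is likewise not reproduced here; a representative hinge-style form, consistent with the symbol definitions below and with ci denoting the true class of the i-th input image, would be:)

$$ L_{rk} = \sum_{i \in \{1,2\}} \max\!\left(0,\; p_i^{-}(c_i) - p_i^{+}(c_i) + \epsilon\right) $$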
where pi- and pi+ are the probability distributions over the classes output by the classifier for the features fi- and fi+, ϵ ∈ [0,1] is a specified hyper-parameter, and p(c) denotes the probability of the specified class c. During model testing, only the feature fusion module and the classification module are retained, the paired positive/negative input used in training is replaced by a single test sample, the probability distribution of the test sample over the classes is output, and the class with the maximum probability is taken as the predicted class.
A total of 905 patients were tested: 427 patients from the same center as the training set were used for the internal test and 478 patients from different centers were used for the external test; the test results are shown in Tables 3 and 4 below.
TABLE 3 internal test results
TABLE 4 results of external tests
In Table 3, the number of actual negative cases is 162+52=214 and the number of actual positive cases is 37+176=213. Of the actual negatives, 162 were correctly predicted as negative and 52 were incorrectly predicted as positive; of the actual positives, 176 were correctly predicted as positive and 37 incorrectly predicted as negative. The prediction accuracy of the internal test is therefore (correctly predicted negatives + correctly predicted positives)/total test samples = (162+176)/(162+52+37+176) ≈ 79%. Similarly, the accuracy of the external test reaches 71%, as shown in Table 4. The internal and external test results show that the tumor diagnosis system has good prediction accuracy for gastric cancer.
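As a quick check of this arithmetic, the three evaluation indexes can be recomputed directly from the Table 3 counts (the sensitivity and specificity values printed below are derived from those counts rather than quoted from the tables):

```python
# Internal-test confusion counts from Table 3:
#   actual negatives: 162 predicted negative, 52 predicted positive
#   actual positives: 176 predicted positive, 37 predicted negative
tn, fp, fn, tp = 162, 52, 37, 176

accuracy = (tn + tp) / (tn + fp + fn + tp)   # (162+176)/427 ~ 0.79
sensitivity = tp / (tp + fn)                 # true-positive rate
specificity = tn / (tn + fp)                 # true-negative rate
print(f"accuracy={accuracy:.3f}, sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```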
Fig. 3 is a visualization of the basis for the model's classification. The three test samples in the first row on the left of the dotted line are positive tongue images, the second row shows the regions the model mainly relies on for identification, and to the right of the dotted line are negative samples and the corresponding visualizations of their identification basis. A darker color in the second-row images indicates that the model pays more attention to that area. The presented results show that the regions the identification process relies on are concentrated on the tongue surface and are independent of the black background, so the prediction is not affected by the background.
Because clinical symptoms are insidious and diagnosis and screening depend on gastrointestinal endoscopy, the early diagnosis rate of digestive tract tumors is low and the prognosis is poor, which imposes a heavy social and economic burden; a noninvasive and effective screening and diagnosis method is therefore urgently needed to improve the early diagnosis rate of digestive system tumors. Artificial intelligence, with its ever-increasing precision and computing power, is playing an increasingly important role in cancer screening and diagnosis. In our study, an observational, prospective, multicenter clinical study was conducted to evaluate the value of tongue images in screening for and diagnosing GC and other tumors.
To further evaluate the value of tongue images as a means of diagnosing and screening tumors, we compared tongue images with blood tumor markers already in clinical use. For the comparison, a combination of multiple classical blood tumor markers is selected to verify tumor prediction. The blood tumor markers may be at least one selected from alpha-fetoprotein (AFP), carcinoembryonic antigen (CEA), cancer antigen 125 (CA125), cancer antigen 15-3 (CA15-3), cancer antigen 199 (CA199), cancer antigen 72-4 (CA72-4), cancer antigen 242 (CA242), cancer antigen 50 (CA50), non-small cell lung cancer associated antigen (CYFRA21-1), small cell lung cancer associated antigen (neuron-specific enolase, NSE), squamous cell carcinoma antigen (SCC), total prostate specific antigen (TPSA), free prostate specific antigen (FPSA), alpha-L-fucosidase (AFU), Epstein-Barr virus antibody (EBV-VCA), tumor associated substance (TSGF), ferritin (Ferritin), beta 2-microglobulin (beta 2-MG), pancreatic oncofetal antigen (POA) or pro-gastrin-releasing peptide (ProGRP); in particular at least one selected from CEA, CA242, CA72-4, CA125, CA199, CA50, AFP or Ferritin; more particularly the combination of these eight blood tumor markers. The prediction method based on the blood tumor marker indexes comprises the following steps:
1) Data preprocessing: the serum indexes of the cases have missing values to different degrees, and the training data need to be complete. The data are therefore completed before model training using K-nearest-neighbor missing-value imputation; specifically, a missing serum index is filled with the average of the values of its 2 nearest neighbors;
2) Model training: three machine learning classification methods are adopted, namely a support vector machine (SVM), a decision tree (DT) and a K-nearest-neighbor classifier (KNN). Specifically, the 8 blood tumor marker indexes of a case (CEA, CA242, CA72-4, CA125, CA199, CA50, AFP and Ferritin) correspond to the sample features, the negative/positive diagnosis of the case corresponds to the sample label, and all completed samples are fed into the three classifiers for fitting;
3) Model evaluation: the models are evaluated by internal validation and external validation. Internal validation uses data from the same hospitals as the training data but from different cases, while external validation uses data from hospitals different from those of the training data. The models are evaluated with three indexes: sensitivity, specificity and accuracy (a minimal sketch of this pipeline is given after this list).
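The following is a minimal sketch of steps 1)-3); the toy data, the train/test split and the default classifier hyper-parameters are assumptions made only for illustration, while the 2-nearest-neighbor imputation and the three classifier types follow the description above:

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import recall_score, accuracy_score

rng = np.random.default_rng(0)
# 8 marker columns: CEA, CA242, CA72-4, CA125, CA199, CA50, AFP, Ferritin (toy values)
X = rng.lognormal(size=(200, 8))
X[rng.random(X.shape) < 0.1] = np.nan        # simulate missing serum indexes
y = rng.integers(0, 2, size=200)             # 0 = negative, 1 = positive (GC)

# 1) complete missing values with the mean of the 2 nearest neighbors
X = KNNImputer(n_neighbors=2).fit_transform(X)

# 2) fit the three classifiers
X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]
models = {"SVM": SVC(), "DT": DecisionTreeClassifier(), "KNN": KNeighborsClassifier()}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    # 3) sensitivity, specificity and accuracy
    sens = recall_score(y_test, pred)                 # recall on the positive class
    spec = recall_score(y_test, pred, pos_label=0)    # recall on the negative class
    acc = accuracy_score(y_test, pred)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```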
The clinical information on the blood tumor markers of the GC patients is shown in Table 5; the concentrations of CEA, CA242, CA72-4, CA125, CA199, CA50, AFP, Ferritin, etc. are significantly higher in GC patients than in NGC patients.
TABLE 5 clinical information of blood tumor markers for GC patients
The training, internal validation and external validation data sets of the models were consistent with those of the tongue image models (excluding cases with missing blood indexes). The sensitivity, specificity and accuracy of GC diagnosis by the blood tumor markers under the three machine learning classification methods are shown in Table 6, and the corresponding ROC curves and AUC for internal and external validation are shown in Fig. 4; the internal validation AUC values range from 0.682 to 0.715 and the external validation AUC values from 0.694 to 0.760. The specificity of the SVM algorithm exceeds 90% in both internal and external validation, indicating that this algorithm can provide valuable information for gastric cancer diagnosis. With DT and KNN the specificity decreases while the sensitivity and accuracy improve to different degrees, so they can provide complementary information for gastric cancer diagnosis.
TABLE 6 sensitivity, specificity, accuracy of blood tumor marker-based models for GC diagnostics
It should be understood that the above comparison scheme of the present application selects eight serum indexes, namely CEA, CA242, CA72-4, CA125, CA199, CA50, AFP and Ferritin, and that adding, removing or replacing several serum indexes can likewise predict whether a tumor, especially gastric cancer, is negative or positive. The comparison scheme adopts three machine learning classifiers, SVM, DT and KNN; other machine learning classifier methods, such as logistic regression and random forest, can also achieve the corresponding purpose.
Compared with the aforementioned SVM, DT and KNN, the APINet model of this example shows changes or improvements of different degrees in the sensitivity, specificity and accuracy of GC diagnosis, as shown in Table 7.
TABLE 7 sensitivity, specificity and accuracy of the APINet model for GC diagnostics
Table 7 shows the sensitivity, specificity and accuracy of the tongue-image-based APINet model for GC diagnosis. In both internal and external validation, the sensitivity and accuracy of the APINet model are significantly higher than those of the aforementioned SVM, DT and KNN models based on the eight blood tumor markers, providing a prospective, economical, noninvasive and effective screening and diagnosis prediction method for tumors.
The APINet model fully compares the input data through pairwise interaction to identify contrastive cues for classification. Fig. 5 shows the ROC (receiver operating characteristic) curves and AUC (area under the curve) of the APINet model. Compared with the SVM, DT and KNN models of Fig. 4, the ROC curves of the APINet model in both internal and external validation lie far from the (0,0)-(1,1) diagonal; the internal validation AUC reaches 0.875 and the external validation AUC reaches 0.792, both higher than the internal validation AUC values (0.682-0.715) and external validation AUC values (0.694-0.760) of the SVM, DT and KNN models based on the eight blood tumor markers, so the APINet model is the better-performing prediction model. The diagnostic value of the tongue-image-based AI diagnosis model for GC is clearly superior to that of the combination of the eight blood tumor markers alone.
The correlation between model accuracy and clinical information was analyzed. The correlation between APINet model accuracy and the clinical information of GC patients is shown in Table 8, and that for NGC patients in Table 9. In the discrimination of NGC, the accuracy of the APINet model is related to smoking, drinking and blood tumor marker indexes; in the discrimination of GC, the accuracy of the APINet model is related only to gender. That is, the ability of the APINet model to distinguish GC from NGC is little affected by clinical information.
TABLE 8 correlation of APINet model accuracy with GC patient clinical information
TABLE 9 correlation of APINet model accuracy with NGC patient clinical information
To examine the specificity and effectiveness of the tongue-image-based GC diagnosis model APINet, the 104 EC, 129 HBPC, 116 CRC, 260 LC and 154 BC patients were selected to evaluate its diagnostic value. The specificity of the APINet model for GC and other tumors is shown in Table 10; the tongue image model APINet is most useful for GC diagnosis and also has a certain effect on the diagnosis of EC, HBPC, CRC, LC and other tumors. The ROC curves and AUC of the APINet model for GC and other tumors are shown in Fig. 6; the APINet model has the best diagnostic effect for GC and diagnostic effects of varying degrees for EC, HBPC, CRC, LC and the like, so the APINet model is positive for the diagnostic prediction of the above tumors.
TABLE 10 specificity of the APINet model for GC and other tumors
Method | GC | EC | HBPC | CRC | LC | BC |
APINet model | 0.858 | 0.731 | 0.731 | 0.555 | 0.584 | 0.298 |
Example 2:
Verification is carried out with a TransFG model, specifically a tongue-image-based tumor diagnosis system comprising:
a tongue image acquisition module configured to acquire a tongue image of a test sample;
a data processing module configured to obtain the probability that the test sample is positive by:
predicting the probability that the test sample is positive according to discriminative features automatically learned from the tongue image.
A Transformer-based deep learning model is designed: the input tongue image is divided into non-overlapping small blocks, which are then composed in order into a sequence and input into a deep neural network. The probability that the test sample is positive is finally predicted from the extracted highly discriminative features.
The overall algorithm structure is shown in Fig. 7; the input to the whole model is the tongue image. First, the tongue image is cut into n small blocks, which are composed in order into an input sequence of length n. The small image blocks are linearly mapped to form input vectors, and position indexes 0, 1, 2, …, n-1 are added. Feature extraction is based on the encoder part of the Transformer model and comprises L (L=9) + 1 Transformer layers, each containing a self-attention mechanism. To remove redundant features, before the deep features are input into the last layer, region selection is performed by a feature selection module; this module comprises a multi-head attention mechanism and returns the indexes of the top k (k=12) block features with the largest attention weights. The selected k features are input into the last Transformer layer for feature fusion, the selected deep features favorable for classification are output, and finally the probability distribution over the classes is output by a softmax classifier.
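A minimal sketch of the two steps specific to this structure, patch-sequence construction with position indexes and attention-based selection of the top-k block features, is given below. The patch size, the embedding width and the way attention weights are aggregated over heads are illustrative assumptions, and the standard Transformer layers themselves are omitted:

```python
import torch
import torch.nn as nn

def patchify(img: torch.Tensor, patch: int = 16) -> torch.Tensor:
    """Cut (B, C, H, W) images into non-overlapping patches: (B, n, C*patch*patch)."""
    b, c, h, w = img.shape
    img = img.unfold(2, patch, patch).unfold(3, patch, patch)   # B, C, H/p, W/p, p, p
    return img.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * patch * patch)

def select_top_k(tokens: torch.Tensor, attn: torch.Tensor, k: int = 12) -> torch.Tensor:
    """tokens: (B, n, D) block features; attn: (B, heads, n) attention paid to each
    block. Keeps the k blocks with the largest (per-head maximum) attention weight."""
    weight, _ = attn.max(dim=1)                                 # (B, n)
    idx = weight.topk(k, dim=1).indices                         # indexes of top-k blocks
    idx = idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1))
    return torch.gather(tokens, 1, idx)                         # (B, k, D)

# toy usage
img = torch.randn(2, 3, 224, 224)
patches = patchify(img)                                         # (2, 196, 768)
n, dim = patches.size(1), 384
tokens = nn.Linear(patches.size(-1), dim)(patches)              # linear mapping
tokens = tokens + nn.Embedding(n, dim)(torch.arange(n))         # position indexes 0..n-1
attn = torch.rand(2, 12, n)                                     # stand-in attention weights
print(select_top_k(tokens, attn, k=12).shape)                   # torch.Size([2, 12, 384])
```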
When outputting the probability distribution of the deep features over the classes, a cross-entropy loss function is minimized:
together with a contrastive loss function:
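(The two formulas are not reproduced in this text; representative forms consistent with the TransFG design, where $p$ is the predicted class distribution, $y$ the one-hot label, $z_i$ the selected deep feature of sample $i$ in a batch of $N$, and $\alpha$ an assumed margin hyper-parameter, would be:)

$$ L_{ce} = -\sum_{c} y_c \log p_c, \qquad L_{con} = \frac{1}{N^2} \sum_{i=1}^{N} \Big[ \sum_{j:\,y_j = y_i} \big(1 - \cos(z_i, z_j)\big) + \sum_{j:\,y_j \neq y_i} \max\big(\cos(z_i, z_j) - \alpha,\, 0\big) \Big] $$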
so that features within a class are more aggregated and features between classes are more separated, thereby improving the prediction accuracy.
In total, 905 cases were tested: 427 cases from the same center as the training set were used for the internal test and 478 cases from different centers were used for the external test. The test results are shown in Tables 11 and 12 below; the accuracies of the internal and external tests reach 81% and 73%, respectively. The internal and external test results show that the tumor diagnosis system has good prediction accuracy for gastric cancer.
TABLE 11 internal test results
TABLE 12 results of external tests
Fig. 8 is a visualization of the basis for the model's classification. The three test samples in the first row on the left of the dotted line are positive tongue images; the yellow patches in the second-row images are the regions in the original image corresponding to the feature indexes returned by the region selection module; the negative samples and their region selection results are on the right of the dotted line. The presented results show that the regions the identification process relies on are mainly concentrated in the area with heavier tongue coating in the upper half of the tongue surface and have low correlation with the black background and the lower half of the tongue surface.
To further evaluate the value of tongue images as a means of diagnosing and screening tumors, tongue images were compared with blood tumor markers already in clinical use; specifically, the tongue-image-based TransFG model was compared with the SVM, DT and KNN models based on blood tumor markers. The results show that the TransFG model of this example exhibits changes or improvements of different degrees in the sensitivity, specificity and accuracy of GC diagnosis, as shown in Table 13.
TABLE 13 sensitivity, specificity and accuracy of the TransFG model for GC diagnostics
Table 13 shows the sensitivity, specificity and accuracy of the tongue-image-based TransFG model for GC diagnosis. In both internal and external validation, the sensitivity and accuracy of the TransFG model for GC diagnosis are significantly higher than those of the aforementioned SVM, DT and KNN models based on the eight blood tumor markers (CEA, CA242, CA72-4, CA125, CA199, CA50, AFP and Ferritin), providing a prospective, economical, noninvasive and effective screening and diagnosis prediction method for tumors.
The TransFG model automatically selects regions favorable for classification in a data-driven manner. Fig. 5 shows the ROC curves and AUC of the internal and external validation of the TransFG model; the internal validation AUC = 0.859 and the external validation AUC = 0.815 are significantly higher than the internal validation AUC values (0.682-0.715) and external validation AUC values (0.694-0.760) of the SVM, DT and KNN models based on the eight blood tumor markers, indicating that the TransFG model is the better-performing prediction model. The diagnostic value of the tongue-image-based AI diagnosis model for GC is clearly superior to that of the combination of the eight blood tumor markers.
The correlation between model accuracy and clinical information was analyzed; specifically, the correlation between TransFG model accuracy and the clinical information of GC patients is shown in Table 14, and that for NGC patients in Table 15.
TABLE 14 correlation of TransFG model accuracy with GC patient clinical information
TABLE 15 correlation of TransFG model accuracy with clinical information of NGC patients
The aforementioned EC, HBPC, CRC, LC and BC patients were selected to evaluate the diagnostic value in order to examine the specificity and effectiveness of the tongue-image-based TransFG diagnosis model. The specificity of the TransFG model for GC and other tumors is shown in Table 16; similar to the APINet model, the TransFG model is also most useful for GC diagnosis and has effects of different degrees on the diagnosis of EC, HBPC, CRC, LC and other tumors.
TABLE 16 specificity of the TransFG model for GC and other tumors
Method | GC | EC | HBPC | CRC | LC | BC |
TransFG model | 0.862 | 0.692 | 0.636 | 0.664 | 0.512 | 0.303 |
The ROC curves and AUC of the TransFG model for GC and other tumors are shown in Fig. 9. The TransFG model has the best diagnostic effect for GC, with AUC = 0.815, and its AUC for the diagnosis of EC, HBPC, CRC, LC and other tumors exceeds 0.5, showing a certain diagnostic effect; the TransFG model is therefore positive for the diagnostic prediction of various tumors including GC.
Example 3:
Verification is carried out with a DeeplabV3+ model, specifically a tongue-image-based tumor diagnosis system comprising:
a tongue image acquisition module configured to acquire a tongue image of a test sample;
a data processing module configured to obtain the probability that the test sample is positive by:
predicting the probability that the test sample is positive according to discriminative features automatically learned from the tongue image.
A bottom-up, computer-assisted deep learning decision framework is designed, which automatically judges whether different test subjects are tumor-positive or tumor-negative from tongue images acquired in advance.
To learn a good deep learning model for this task, sufficient annotation data are obtained first. As shown in Fig. 10, a pixel-by-pixel labeling method is used for the tongue images. The tongue surface area is outlined with an existing labeling tool and each pixel is then given a label: if the sample is tumor-positive, the pixels of the tongue surface area are labeled 2; if the sample is tumor-negative, the pixels of the tongue surface area are labeled 1; and all pixels of the background area are labeled 0. It should be clear that labeling the tongue surface pixels 2, 1 and 0 is only exemplary; any labels that distinguish positive-sample pixels, negative-sample pixels and background pixels are acceptable, for example A, B and C.
Based on this tongue image labeling mode, a bottom-up tongue image deep learning model is designed that automatically learns the tongue surface features of the tongue image and finally outputs the tumor-positive probability of the corresponding sample from the tongue image information. The overall algorithm framework adopts an encoder-decoder structure, as shown in Fig. 11: the Encoder is an image encoder used to encode the features of the whole image, and the Decoder is a feature decoder that outputs a probability map of the whole image, representing the probability that each pixel is classified into a specified category; the total number of categories in this task therefore equals the number of channels of the output probability map. Specifically, the DeeplabV3+ image segmentation network structure is adopted as the encoder-decoder structure. To effectively judge the positive probability of a test sample, a category interpretation module is added after the DeeplabV3+ network output layer, i.e., the final test result is decided based on the probability map of the whole image.
Assume the set of input image pixels is M = {i | i = 1, 2, …, m} and the category set is C = {c | c = 0, 1, 2}, where 0, 1 and 2 represent background, negative and positive, respectively, and Pc(i) represents the probability that pixel i belongs to category c. The category interpretation module adopts the interpretation strategy shown in the following formula according to the intermediate probability map:
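(The formula is not reproduced in this text; a reconstruction consistent with the indicator function, threshold and proportion described in the next paragraph would be:)

$$ r = \frac{\sum_{i \in M} I\big(P_2(i) > t\big)}{\sum_{i \in M} I\big(P_1(i) > t\big) + \sum_{i \in M} I\big(P_2(i) > t\big)} $$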
where the function I is an indicator function whose value is 1 when the condition is satisfied and 0 otherwise, and t ∈ [0,1] is the pixel category judgment threshold, set to 0.5 in the invention. Thus r represents the proportion of the model-predicted positive region to the model-predicted tongue surface region in the tongue image, and this proportion r is taken by the overall framework as the predicted probability that the sample is gastric cancer positive.
In the model training process, a pixel-wise cross-entropy cost function is adopted; for an input tongue image, the corresponding cost function L is:
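(The formula is not reproduced in this text; a standard pixel-wise cross-entropy consistent with the symbol explanation below, with averaging over the m pixels as an assumption, would be:)

$$ L = -\frac{1}{m} \sum_{i=1}^{m} \log P_{c_i}(i) $$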
where the true label category of pixel i is c, and Pc(i) represents the probability that pixel i is predicted as category c.
The detailed steps of applying the DeeplabV3+ model to predict the positive probability of a tongue image are as follows:
1) During prediction, the model outputs the probability that each pixel belongs to each of the three categories (background, negative and positive), corresponding to 0, 1 and 2 in the category interpretation module. The category with the maximum probability is selected as the predicted category by comparing the probabilities of each pixel; for example, if the background, negative and positive probabilities of a pixel are 0.3, 0.5 and 0.2 respectively, the pixel is predicted as the negative category;
2) The numbers of pixels predicted as positive and negative in the input image are counted; if the number of pixels predicted as positive is larger than the number predicted as negative, i.e. if positive pixels/(positive pixels + negative pixels) > 0.5, the input image is finally predicted as positive, and positive pixels/(positive pixels + negative pixels) is the probability that the input image is positive (a sketch of these two steps is given after this list).
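A minimal sketch of these two steps is given below; the use of NumPy and the toy probability map are illustrative assumptions, while the channel order (0 background, 1 negative, 2 positive) and the positive-pixel ratio follow the description above:

```python
import numpy as np

def positive_probability(prob_map: np.ndarray) -> float:
    """prob_map: (3, H, W) per-pixel probabilities for background/negative/positive."""
    pred = prob_map.argmax(axis=0)          # step 1: per-pixel predicted category
    n_pos = int((pred == 2).sum())          # pixels predicted positive
    n_neg = int((pred == 1).sum())          # pixels predicted negative
    return n_pos / (n_pos + n_neg)          # step 2: ratio taken as positive probability

# toy usage: a random probability map normalised over the category axis
probs = np.random.rand(3, 64, 64)
probs /= probs.sum(axis=0, keepdims=True)
r = positive_probability(probs)
print("positive" if r > 0.5 else "negative", round(r, 3))
```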
To exclude the effect of the image background on the experiment, only the circumscribed rectangle of the tongue surface area in each image was used for training and testing. In the training process, to enrich the sample space of the training set, the samples in the training set were randomly flipped and a specified proportion of the flipped tongue images were Gaussian blurred, this proportion being randomly drawn from the range 0 to 0.5 in each training epoch. In total 678 tongue images were collected, of which 544 were used for training and 134 for testing. The test results are shown in Table 17: 10 positive samples were judged negative and 8 negative samples were judged positive, giving a model prediction accuracy of 87%; both negative and positive samples therefore have high prediction accuracy.
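A minimal sketch of this augmentation is given below; the flip axes, the blur kernel size and the torchvision-based implementation are assumptions, while drawing the blur proportion from 0-0.5 anew each training epoch follows the description:

```python
import random
import torchvision.transforms as T

def build_epoch_transform() -> T.Compose:
    """Augmentation pipeline rebuilt at the start of each epoch."""
    blur_fraction = random.uniform(0.0, 0.5)   # proportion of samples to blur this epoch
    return T.Compose([
        T.RandomHorizontalFlip(p=0.5),         # random flipping of training samples
        T.RandomVerticalFlip(p=0.5),
        # RandomApply approximates blurring that proportion of the flipped samples
        T.RandomApply([T.GaussianBlur(kernel_size=5)], p=blur_fraction),
        T.ToTensor(),
    ])
```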
TABLE 17 test results
Fig. 12 shows the visualized test results of the model. The four test samples are taken from negative, positive and negative tongue images respectively, and the final tongue image category judgment is the same as the direct labeling result. The first column is the input tongue image to be predicted. The second column is the prediction result for each pixel of the input image, where pixels in the green area (region G in the figure) are predicted as negative, pixels in the yellow area (region Y) are predicted as positive, and the purple area (region P) is the background area predicted by the model; the negative and positive judgment areas are thus both within the tongue surface area, and the prediction of the image category is not affected by the background area. The third column indicates the areas corresponding to the prediction results in the original image. According to the above formula, the proportion of the yellow area to the whole tongue surface area (the sum of the model-predicted positive and negative areas) is regarded as the model-predicted probability that the sample is gastric cancer positive; when this probability is larger than 0.5, the test sample is gastric cancer positive.
Fig. 13 shows the semantic segmentation results of some samples of the tongue-image-based DeeplabV3+ model. The three test samples in the first row on the left of the dotted line are positive tongue images; in the second-row images the whole tongue surface area is the region in the original image corresponding to the returned feature indexes, and the pixels marked yellow are predicted as positive. Similarly, the three test samples in the first row on the right of the dotted line are negative tongue images, and in the second-row images the pixels marked green are predicted as negative. The positive and negative judgment regions are therefore both within the tongue surface region, and the prediction of the image category is not affected by the background region.
To further evaluate the value of tongue images as a means of diagnosing and screening tumors, tongue images were compared with blood tumor markers already in clinical use; specifically, the tongue-image-based DeeplabV3+ model was compared with the SVM, DT and KNN models based on blood tumor markers. The results show that the DeeplabV3+ model of this example exhibits changes or improvements of different degrees in the sensitivity, specificity and accuracy of GC diagnosis, as shown in Table 18.
TABLE 18 Sensitivity, specificity and accuracy of the DeeplabV3+ model for GC diagnosis
Table 18 shows that in both internal and external validation, the sensitivity and accuracy of the tongue-image-based DeeplabV3+ model for GC diagnosis are superior to the corresponding sensitivity (0.283-0.566, 0.362-0.539) and accuracy (0.603-0.622, 0.645-0.662) of the aforementioned SVM, DT and KNN models based on the eight blood tumor markers (CEA, CA242, CA72-4, CA125, CA199, CA50, AFP and Ferritin), enriching the prospective, economical, noninvasive and effective tumor screening and diagnosis prediction methods.
Fig. 5 shows the ROC curves and AUC of the internal and external validation of the DeeplabV3+ model. The internal validation AUC = 0.836 and the external validation AUC = 0.801 are significantly higher than the internal validation AUC values (0.682-0.715) and external validation AUC values (0.694-0.760) of the SVM, DT and KNN models based on the eight blood tumor markers, showing that the DeeplabV3+ model is the better-performing prediction model. The diagnostic value of the tongue-image-based AI diagnosis model for GC is clearly superior to that of the combination of the eight blood tumor markers.
The correlation between model accuracy and clinical information was analyzed. The correlation between DeeplabV3+ model accuracy and the clinical information of GC patients is shown in Table 19, and that for NGC patients in Table 20. In the discrimination of NGC, the accuracy of the DeeplabV3+ model is related only to gender; in the discrimination of GC, it is related to BMI and tumor location but not to other clinical information. Therefore, the ability of the DeeplabV3+ model to distinguish GC from NGC is little affected by clinical information.
TABLE 19 correlation of Deeplab V3+ model accuracy with GC patient clinical information
TABLE 20 correlation of Deeplab V3+ model accuracy with NGC patient clinical information
The aforementioned EC, HBPC, CRC, LC and BC patients were selected to evaluate the diagnostic value in order to examine the specificity and effectiveness of the tongue-image-based GC diagnosis model DeeplabV3+. The specificity of the DeeplabV3+ model for GC and other tumors is shown in Table 21; similar to the APINet and TransFG models, the DeeplabV3+ model is most useful for GC diagnosis and has effects of different degrees on EC, HBPC, CRC, LC and other tumors.
TABLE 21 specificity of the Deeplab V3+ model for GC and other tumors
Method | GC | EC | HBPC | CRC | LC | BC |
Deeplab V3+ model | 0.841 | 0.644 | 0.687 | 0.672 | 0.527 | 0.299 |
Fig. 14 shows the ROC curves and AUC of the DeeplabV3+ model for GC and other tumors. The DeeplabV3+ model has the best diagnostic effect for GC, with AUC = 0.801, and its AUC for the diagnosis of EC, HBPC, CRC, LC and other tumors exceeds 0.5, showing a certain diagnostic effect; the DeeplabV3+ model is therefore positive for the diagnostic prediction of various tumors including GC.
A comprehensive analysis was performed on the APINet, TransFG and DeeplabV3+ models provided in Examples 1-3. Fig. 15 shows the external validation probability distributions (gastric cancer) of the three tongue image models; most cases are distributed on the two sides, i.e. the three models give fairly definite judgments for positive and negative cases, and there are few cases of diagnostic ambiguity between 0.41 and 0.60, indicating that the models are reliable for the diagnostic prediction of tumors. Fig. 16 shows representative tongue images with different probabilities (gastric cancer, intersection of the three models); without an automatic learning model, the positive probability is difficult to judge from the tongue image by visual observation alone. The present application therefore provides several tongue-image-based tumor diagnosis methods that have excellent diagnostic and predictive value for various tumors including gastric cancer, and provides a scientific basis for the tongue diagnosis theory of traditional Chinese medicine.
Conventional techniques in the above embodiments are known to those skilled in the art, and therefore, will not be described in detail herein.
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.
While the invention has been described in detail and with reference to specific embodiments thereof, it will be apparent to one skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope thereof.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
The invention is not the best known technology.
Claims (37)
1. A tongue image-based tumor prediction system, comprising:
a tongue image acquisition module configured to acquire a tongue image of a test sample;
a data processing module configured to obtain the probability that the test sample is positive by:
predicting the probability that the test sample is positive according to discriminative features automatically learned from the tongue image.
2. The system of claim 1, wherein: the tumor is at least one of gastric cancer, breast cancer, colorectal cancer, esophageal cancer, hepatobiliary pancreatic cancer, lung cancer, prostate cancer, thyroid cancer, ovarian cancer, neuroblastoma, trophoblastic tumor or head and neck squamous cell carcinoma.
3. The system according to claim 1 or 2, characterized in that: the tumor is at least one of gastric cancer, breast cancer, colorectal cancer, esophageal cancer, hepatobiliary pancreatic cancer or lung cancer.
4. The system of claim 1, wherein: the system also includes an output module configured to output the prediction.
5. The system of claim 4, wherein: the output module is configured to output the tongue picture image and the prediction result.
6. The system according to claim 4 or 5, characterized in that: the output module outputs in at least one mode of electronic display, sound broadcasting, printing and network transmission.
7. The system of claim 1, wherein: the discriminative features come from the positive and negative categories on the tongue image.
8. The system according to claim 1 or 7, characterized in that: the discriminative features are derived from a positive tongue image and a negative tongue image input in pairs into an interactive deep learning model.
9. The system of claim 8, wherein: the data processing module is specifically configured to predict the probability that a test specimen is positive by:
fully comparing the positive tongue image and the negative tongue image simultaneously input into the interactive deep learning model, automatically learning the commonalities and differences between the positive and negative categories on the tongue images, and predicting the probability that the test sample belongs to the positive category according to the discriminative features between the positive and negative categories.
10. The system of claim 9, wherein:
the positive tongue picture is collected from a tumor positive patient;
the negative tongue image was taken from a tumor-negative patient.
11. The system of claim 9, wherein: the interactive deep learning model is an APINet model.
12. The system according to claim 9 or 11, characterized in that: the data processing module is specifically configured to obtain a probability that a test sample is positive by:
1) extracting positive features and negative features from pre-obtained positive tongue images and negative tongue images;
2) training a model with the positive features and negative features and outputting the probability that each feature belongs to each category;
3) inputting the tongue image of the test sample into the trained model and outputting the probability that the test sample is positive.
13. The system of claim 12, wherein: the extraction of positive features and negative features in step 1) comprises the following steps:
the encoder extracts the feature vectors of the images and outputs a positive feature f1 and a negative feature f2;
f1, f2 and their spliced feature fm are input simultaneously into the MLP of the feature selection region, which correspondingly outputs two control vectors g1 and g2;
g1 activates f1 and f2 separately, forming the selected features f1+ and f2-; g2 activates f1 and f2 separately, forming the selected features f1- and f2+, thereby obtaining two positive features f1+ and f1- and two negative features f2+ and f2-.
14. The system of claim 13, wherein:
the MLP of the feature selection region fully learns f1 and fm and outputs the control vector g1, and
fully learns f2 and fm and outputs the control vector g2.
15. The system of claim 12, wherein: training the model with the positive features and negative features in step 2) specifically comprises inputting the positive features and negative features into a fully connected layer classifier and outputting the probability that each feature belongs to each class.
16. The system according to claim 12 or 15, characterized in that: when outputting the probability that each feature belongs to each category in step 2), a cross-entropy loss function is minimized according to the categories of the four features:
where y is the true label corresponding to the feature, the function φc denotes the final fully-connected-layer classifier, and fik corresponds to the four input features.
17. The system according to claim 12 or 15, characterized in that: when outputting the probability that each feature belongs to each category in step 2), considering that the confidence the model outputs for feature fi+ should be higher than that for feature fi-, a ranking loss function is minimized:
where pi- and pi+ are the probability distributions over the classes output by the classifier for the features fi- and fi+, ϵ ∈ [0,1] is a specified hyper-parameter, and p(c) denotes the probability of the specified class c.
18. The system of claim 12, wherein: inputting the tongue image of the test sample into the trained model in step 3) means inputting the tongue image of a single test sample.
19. The system according to claim 12 or 18, characterized in that: outputting the probability that the test sample belongs to the category in step 3) means finally outputting the probability distribution of the test sample over the categories and taking the category with the maximum probability as the predicted category.
20. The system according to claim 1 or 7, characterized in that: the discriminative features come from a single positive tongue image or negative tongue image.
21. The system of claim 20, wherein: the discriminative features are obtained by cutting the tongue image into n small blocks and composing them into an input sequence for feature extraction, so as to obtain deep features beneficial to classification.
22. The system of claim 20, wherein: the data processing module is specifically configured to obtain a probability that a test sample is positive by:
cutting the tongue image of the test sample into small blocks, forming input vectors by linear mapping and adding position indexes, feeding them into the trained deep learning model for feature extraction and feature fusion, outputting the selected deep features favorable for classification, and obtaining the probability of belonging to each category.
23. The system according to claim 22, wherein: the deep learning model is trained through the following steps:
a) cutting a tongue surface image into n small blocks, then sequentially forming an input sequence by the cut n small blocks to form an input sequence with the length of n, forming an input vector through linear mapping, and adding position indexes of 0,1,2, … and n-1;
b) performing feature extraction and feature fusion by using a coder based on a TransFG model, outputting selected deep features favorable for classification, and finally outputting probability distribution of the deep features belonging to each category through a softmax classifier; and finishing the training of the model.
24. The system of claim 23, wherein: the encoder in step b) performs feature extraction and comprises L+1 Transformer layers, each containing a self-attention mechanism.
25. The system of claim 23, wherein: when the encoder performs feature extraction and feature fusion, before the deep features are input to the last layer, region selection is performed by a feature selection module comprising a multi-head attention mechanism; the feature selection module returns the indexes of the front-row features with the largest attention weights, and the selected front-row features are input into the last Transformer layer for feature fusion.
26. The system of claim 25, wherein: the front-row features are the first k features, and k is one of 1, 2, 3, …, 20.
27. The system of claim 23, wherein: a cross-entropy loss function is minimized when outputting the probability distribution of the deep features over the categories:
29. The system according to claim 1 or 7, wherein: the discriminative features come from each pixel of the tongue image.
30. The system of claim 29, wherein: the data processing module is specifically configured to obtain a probability that a test sample is positive by:
inputting the tongue image of the test sample into the trained deep learning model, outputting the probability that each pixel belongs to positive, negative and background, and taking the category with the maximum probability as the predicted category of the pixel;
the number of pixels predicted as positive in the test sample/(number of pixels predicted as positive + number of pixels predicted as negative) is the probability that the test sample is positive.
31. The system of claim 30, wherein:
the deep learning model is trained through the following steps:
labeling the positive tongue image or the negative tongue image pixel by pixel, specifically labeling the positive tongue surface region pixels, the negative tongue surface region pixels and the background region pixels respectively;
the whole algorithm framework adopts an automatic coding-decoding structure, an image coder is used for coding the characteristics of the whole image, and a characteristic decoder outputs a probability map of the whole image;
and calculating the loss value of each pixel point through the real label of each pixel point and the prediction probability in the probability graph, and updating the model parameters until the training is completed.
32. The system of claim 31, wherein: the automatic coding-decoding structure adopts a Deeplab V3+ image segmentation network structure and/or a Unet series network structure.
33. The system according to claim 31 or 32, wherein: and adding a category interpretation module after an output layer of a network structure in the automatic coding-decoding structure, and deciding a final test result based on the probability graph of the whole image.
34. The system of claim 33, wherein: the category interpretation module adopts the interpretation strategy shown in the following formula according to the probability map:
where m is the total number of pixels; the function I is an indicator function whose value is 1 when the condition is satisfied and 0 otherwise; and t ∈ [0,1] is the pixel category judgment threshold.
35. The system of claim 31 or 32, wherein: in the deep learning model training process, a cross entropy cost function predicted pixel by pixel is adopted:
where the true label category of pixel i is c (positive, negative or background), and Pc(i) represents the probability that pixel i is predicted as category c.
36. The tumor prediction method based on the tongue picture image is characterized by comprising the following steps:
obtaining a tongue image of the test sample;
inputting a tongue image of a test sample into the system of any one of claims 1-35 to obtain a tumor positive probability for the test sample.
37. Use of the system of any one of claims 1 to 35 and/or the method of claim 36, comprising:
tumor prediction is performed on a test sample using the system and/or method.
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310150147.8A CN117173084A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on tongue image and application thereof |
CN202310762869.9A CN116883330A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on DeeplabV3+ model and application of tumor prediction system and method |
CN202310762871.6A CN116977284A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on interactive deep learning model and application thereof |
CN202210861997.4A CN114943735A (en) | 2022-07-22 | 2022-07-22 | Tongue picture image-based tumor prediction system and method and application thereof |
CN202310762870.1A CN116912177A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on TransFG model and application thereof |
PCT/CN2023/103841 WO2024016992A1 (en) | 2022-07-22 | 2023-06-29 | Tumor prediction system and method based on tongue image, and application thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210861997.4A CN114943735A (en) | 2022-07-22 | 2022-07-22 | Tongue picture image-based tumor prediction system and method and application thereof |
Related Child Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310762870.1A Division CN116912177A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on TransFG model and application thereof |
CN202310762869.9A Division CN116883330A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on DeeplabV3+ model and application of tumor prediction system and method |
CN202310150147.8A Division CN117173084A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on tongue image and application thereof |
CN202310762871.6A Division CN116977284A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on interactive deep learning model and application thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114943735A true CN114943735A (en) | 2022-08-26 |
Family
ID=82911052
Family Applications (5)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310150147.8A Pending CN117173084A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on tongue image and application thereof |
CN202310762869.9A Pending CN116883330A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on DeeplabV3+ model and application of tumor prediction system and method |
CN202310762871.6A Pending CN116977284A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on interactive deep learning model and application thereof |
CN202210861997.4A Pending CN114943735A (en) | 2022-07-22 | 2022-07-22 | Tongue picture image-based tumor prediction system and method and application thereof |
CN202310762870.1A Pending CN116912177A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on TransFG model and application thereof |
Family Applications Before (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310150147.8A Pending CN117173084A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on tongue image and application thereof |
CN202310762869.9A Pending CN116883330A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on DeeplabV3+ model and application of tumor prediction system and method |
CN202310762871.6A Pending CN116977284A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on interactive deep learning model and application thereof |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310762870.1A Pending CN116912177A (en) | 2022-07-22 | 2022-07-22 | Tumor prediction system and method based on TransFG model and application thereof |
Country Status (2)
Country | Link |
---|---|
CN (5) | CN117173084A (en) |
WO (1) | WO2024016992A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118570556B (en) * | 2024-07-31 | 2024-10-18 | 西安邮电大学 | Mammary tumor classification method based on multi-sequence MRI feature enhancement |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113160966B (en) * | 2021-02-25 | 2023-07-07 | 西安理工大学 | Tongue picture diagnosis method and tongue picture diagnosis system based on multitask learning |
CN113888518A (en) * | 2021-10-14 | 2022-01-04 | 重庆南鹏人工智能科技研究院有限公司 | Laryngopharynx endoscope tumor detection and benign and malignant classification method based on deep learning segmentation and classification multitask |
CN117173084A (en) * | 2022-07-22 | 2023-12-05 | 浙江省肿瘤医院 | Tumor prediction system and method based on tongue image and application thereof |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106295139A (en) * | 2016-07-29 | 2017-01-04 | 姹ゅ钩 | A kind of tongue body autodiagnosis health cloud service system based on degree of depth convolutional neural networks |
CN106529446A (en) * | 2016-10-27 | 2017-03-22 | 桂林电子科技大学 | Vehicle type identification method and system based on multi-block deep convolutional neural network |
CN108553081A (en) * | 2018-01-03 | 2018-09-21 | 京东方科技集团股份有限公司 | A kind of diagnostic system based on tongue fur image |
CN109700433A (en) * | 2018-12-28 | 2019-05-03 | 深圳铁盒子文化科技发展有限公司 | A kind of tongue picture diagnostic system and lingual diagnosis mobile terminal |
CN110033858A (en) * | 2018-12-28 | 2019-07-19 | 深圳铁盒子文化科技发展有限公司 | A kind of tongue picture analysis method and its storage medium |
WO2020215697A1 (en) * | 2019-08-09 | 2020-10-29 | 平安科技(深圳)有限公司 | Tongue image extraction method and device, and a computer readable storage medium |
CN110532907A (en) * | 2019-08-14 | 2019-12-03 | 中国科学院自动化研究所 | Based on face as the Chinese medicine human body constitution classification method with tongue picture bimodal feature extraction |
WO2021247905A1 (en) * | 2020-06-04 | 2021-12-09 | Xcures, Inc. | Methods and systems for precision oncology using a multilevel bayesian model |
CN114463346A (en) * | 2021-12-22 | 2022-05-10 | 成都中医药大学 | Complex environment rapid tongue segmentation device based on mobile terminal |
Non-Patent Citations (4)
Title |
---|
JU HE等: "TransFG: A Transformer Architecture for Fine-Grained Recognition", 《ARXIV》 * |
PEIQIN ZHUANG等: "Learning Attentive Pairwise Interaction for Fine-Grained Classification", 《ARXIV》 * |
王秋月: "基于手机平台的舌象采集分析的研究", 《中国优秀硕士学位论文全文数据库 医药卫生科技辑》 * |
辛飞祥: "基于视觉感知融合的中医诊断分析模型研究", 《中国优秀硕士学位论文全文数据库 医药卫生科技辑》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024016992A1 (en) * | 2022-07-22 | 2024-01-25 | 浙江省肿瘤医院 | Tumor prediction system and method based on tongue image, and application thereof |
CN117649413A (en) * | 2024-01-30 | 2024-03-05 | 数据空间研究院 | Intestinal cancer deep learning auxiliary diagnosis method based on tongue picture of traditional Chinese medicine |
Also Published As
Publication number | Publication date |
---|---|
CN116883330A (en) | 2023-10-13 |
WO2024016992A1 (en) | 2024-01-25 |
CN117173084A (en) | 2023-12-05 |
CN116977284A (en) | 2023-10-31 |
CN116912177A (en) | 2023-10-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115082437B (en) | Tumor prediction system and method based on tongue picture image and tumor marker and application | |
CN114943735A (en) | Tongue picture image-based tumor prediction system and method and application thereof | |
Du et al. | Identification of COPD from multi-view snapshots of 3D lung airway tree via deep CNN | |
US20220180518A1 (en) | Improved histopathology classification through machine self-learning of "tissue fingerprints" | |
CN109523535A (en) | A kind of preprocess method of lesion image | |
WO2024016990A1 (en) | Tumor prediction system and method based on tongue coating microorganisms, and application thereof | |
Zhu et al. | An accurate prediction of the origin for bone metastatic cancer using deep learning on digital pathological images | |
yahia Ibrahim et al. | An enhancement technique to diagnose colon and lung cancer by using double CLAHE and deep learning | |
Zhou et al. | Integrative deep learning analysis improves colon adenocarcinoma patient stratification at risk for mortality | |
Ba et al. | Histopathological diagnosis system for gastritis using deep learning algorithm | |
Deng et al. | The investigation of construction and clinical application of image recognition technology assisted bronchoscopy diagnostic model of lung cancer | |
Tang et al. | CLELNet: A continual learning network for esophageal lesion analysis on endoscopic images | |
Wang et al. | Towards reliable and explainable AI model for pulmonary nodule diagnosis | |
Lee et al. | A robust model training strategy using hard negative mining in a weakly labeled dataset for lymphatic invasion in gastric cancer | |
Yuan et al. | MA19. 11 Predicting Future Lung Cancer Risk with Low-Dose Screening CT Using an Artificial Intelligence Model | |
CN114004821A (en) | Intestinal ganglion cell auxiliary identification method based on cascade rcnn | |
Feng et al. | Differentiation between COVID‐19 and bacterial pneumonia using radiomics of chest computed tomography and clinical features | |
Paayas et al. | OCEAN-Ovarian Cancer subtypE clAssification and outlier detectioN using DenseNet121 | |
Lim et al. | Lung Cancer Detection on CT Scan Images with Deep Learning Methods: Sugeno Fuzzy Integral-Based CNN Ensemble Method | |
Ahmadvand et al. | A Deep Learning Approach for the Identification of the Molecular Subtypes of Pancreatic Ductal Adenocarcinoma Based on Whole Slide Pathology Images | |
Cai et al. | An Integrated Clinical and Computerized Tomography-Based Radiomic Feature Model to Separate Benign from Malignant Pleural Effusion | |
Li | Image Classification of Brain Tumor Based on Enhanced VGG 19 Convolutional Neural Network | |
Severeyn et al. | Early Diagnosis of Pancreatic Cancer using Urinary Biomarkers and Machine Learning | |
Zhu et al. | Two-step artificial intelligence system for endoscopic gastric biopsy improves the diagnostic accuracy of pathologists | |
Liu et al. | Enhancing Diagnostic Precision in Gastric Bleeding through Automated Lesion Segmentation: A Deep DuS-KFCM Approach |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |