CN112241945A

CN112241945A - Digital pathological image intelligent analysis method with deep learning algorithm and hardware integral optimization

Info

Publication number: CN112241945A
Application number: CN201910643108.5A
Authority: CN
Inventors: 张佩珩; 周丕轩; 刘旭华
Original assignee: Guangzhou Xinrui Medical Technology Co ltd
Current assignee: Guangzhou Xinrui Medical Technology Co ltd
Priority date: 2019-07-17
Filing date: 2019-07-17
Publication date: 2021-01-19

Abstract

Through the joint optimization of the algorithm and the processor hardware that runs it, a deep learning model suited to pathological image recognition is proposed. High-performance computing techniques are used to accelerate model training and inference, and the accuracy of the AI-assisted diagnosis system is further improved through various image processing methods and joint training of multiple models.

Description

Digital pathological image intelligent analysis method with deep learning algorithm and hardware integral optimization

The technical route is as follows:

The work mainly covers constructing a deep learning model for pathological images and improving the accuracy and speed of pathological recognition. After discussing the advantages and disadvantages of different deep learning models for pathological image recognition, a high-performance image semantic segmentation model with fewer parameters, EU-Net, is proposed. At the same time, each stage of the pathological image recognition pipeline is optimized on the TensorFlow platform, which improves the accuracy and inference speed of the recognition system. Finally, the results are evaluated, and conclusions and recommendations are drawn.

The research method comprises the following steps:

Research an automatic annotation-correction algorithm for pathological images. Because cells of different classes are intermixed and very difficult to separate with well-defined boundaries, annotations tend to contain errors in the vicinity of the annotation lines. In view of this, the project will study a method that reduces the loss contributed by this boundary region during neural network training so as to eliminate the boundary error.
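One way to picture "reducing the loss of the area" near annotation lines is a per-pixel weight map that down-weights a narrow band around every label boundary. The sketch below is only an illustration under that assumption; the band width, the zero weight, the binary cross-entropy loss, and the function names `boundary_band` and `weighted_bce` are all hypothetical and are not taken from the patent.

```python
import numpy as np

def boundary_band(mask: np.ndarray, width: int = 2) -> np.ndarray:
    """Boolean map of pixels lying within `width` pixels of a label change."""
    mask = mask.astype(bool)
    band = np.zeros_like(mask, dtype=bool)
    # A pixel is on the boundary if a 4-neighbour carries a different label.
    band[:-1, :] |= mask[:-1, :] != mask[1:, :]
    band[1:, :] |= mask[1:, :] != mask[:-1, :]
    band[:, :-1] |= mask[:, :-1] != mask[:, 1:]
    band[:, 1:] |= mask[:, 1:] != mask[:, :-1]
    # Grow the band one pixel at a time by OR-ing it with shifted copies.
    for _ in range(width - 1):
        grown = band.copy()
        grown[:-1, :] |= band[1:, :]
        grown[1:, :] |= band[:-1, :]
        grown[:, :-1] |= band[:, 1:]
        grown[:, 1:] |= band[:, :-1]
        band = grown
    return band

def weighted_bce(prob, mask, width=2, boundary_weight=0.0, eps=1e-7):
    """Binary cross-entropy in which boundary-band pixels get a reduced weight."""
    prob = np.clip(prob, eps, 1 - eps)
    target = mask.astype(np.float64)
    loss = -(target * np.log(prob) + (1 - target) * np.log(1 - prob))
    weights = np.where(boundary_band(mask, width), boundary_weight, 1.0)
    return float((loss * weights).sum() / max(weights.sum(), eps))
```

With `boundary_weight=0.0` the uncertain band is ignored entirely; a value between 0 and 1 would merely soften its influence.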

A thread-pool-based Otsu automatic threshold segmentation algorithm for ultrahigh-resolution images is adopted so that pathological images can be thresholded quickly. The conventional Otsu algorithm reads the whole image into memory before processing, but a high-resolution image occupies a large amount of memory. This causes three problems: an ordinary computer does not have the required memory, allocating such a large block of memory takes a long time, and the algorithm itself runs slowly. A thread-pool-based implementation of the Otsu algorithm is therefore designed. It is not limited by memory, makes full use of the CPU's computing capacity, and greatly reduces the computation time.
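Otsu's threshold depends only on the grey-level histogram, so the histogram can be accumulated tile by tile and the whole image never needs to be resident at once. The following is a minimal sketch of that idea, not the patented implementation: it assumes an 8-bit greyscale image exposed as a sliceable array (for example a memory-mapped file), and the tile size, worker count, and function names are illustrative choices.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def tile_histogram(image, y0, x0, tile):
    """256-bin histogram of a single tile; only this tile is materialized."""
    block = np.asarray(image[y0:y0 + tile, x0:x0 + tile])
    return np.bincount(block.ravel(), minlength=256)

def otsu_from_histogram(hist):
    """Grey level that maximizes the between-class variance."""
    hist = hist.astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)                 # pixel count at or below each level
    w1 = hist.sum() - w0                 # pixel count above each level
    sum0 = np.cumsum(hist * levels)
    mu0 = sum0 / np.maximum(w0, 1)       # mean of the lower class
    mu1 = (sum0[-1] - sum0) / np.maximum(w1, 1)  # mean of the upper class
    return int(np.argmax(w0 * w1 * (mu0 - mu1) ** 2))

def threaded_otsu(image, tile=1024, workers=4):
    """Accumulate per-tile histograms in a thread pool, then apply Otsu."""
    h, w = image.shape
    origins = [(y, x) for y in range(0, h, tile) for x in range(0, w, tile)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hists = pool.map(lambda o: tile_histogram(image, o[0], o[1], tile), origins)
        total = sum(hists, np.zeros(256, dtype=np.int64))
    return otsu_from_histogram(total)
```

Because the per-tile histograms simply add up, the result is identical to running Otsu on the full-image histogram, while peak memory is bounded by the tile size and the number of workers.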

To address the low accuracy and low speed of existing deep learning methods in the intelligent interpretation of cancer pathology images, the project studies EU-Net, an image semantic segmentation model with fewer parameters and higher efficiency. The model is based on the U-Net semantic segmentation model. Extensive use of depthwise separable convolutions reduces the parameter count to 1/10 of the original model, and, to improve accuracy, the pooling layers are replaced by dilated (atrous) convolution layers. Together, the two types of convolution greatly improve the interpretation speed for cancer pathology images.
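The parameter saving from depthwise separable convolutions can be illustrated with simple arithmetic. The sketch below counts weights (biases ignored) for a single layer and also computes the extent covered by a dilated kernel; the layer sizes are arbitrary examples, not EU-Net's actual configuration, which the text does not give.

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out

def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    return k * k * c_in + c_in * c_out

def dilated_kernel_extent(k: int, dilation: int) -> int:
    """Side length of the input window spanned by a dilated k x k kernel."""
    return k + (k - 1) * (dilation - 1)

standard = conv_params(3, 256, 256)             # 589824
separable = separable_conv_params(3, 256, 256)  # 67840
ratio = separable / standard                    # about 0.115
```

For this example layer the separable form needs roughly 11.5% of the weights, which is of the same order as the 1/10 reduction stated above. A dilated 3 × 3 kernel with dilation 2 spans a 5 × 5 window without adding weights, which is how a dilated layer can enlarge the receptive field without the resolution loss of pooling.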

For the existing intelligent interpretation pipeline for cancer pathology images, three stages are optimized. In the data input stage, data reading is parallelized across multiple threads and preprocessed batches are stored in a queue, which greatly improves sample throughput. In the training and inference stage, a multi-GPU synchronous training algorithm is built to take advantage of the small model size, which greatly improves the training and inference speed of the semantic segmentation model. In the post-processing stage, the predictions are scaled down proportionally before being assembled into the whole image, which accelerates feature extraction. Together these measures maximize the data throughput of the whole interpretation pipeline and greatly improve interpretation speed.
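The data input stage described above follows a producer-consumer pattern: several reader threads fill a bounded queue while the training loop drains it. The sketch below shows the pattern with the Python standard library only; `load_and_preprocess` is a hypothetical stand-in for reading and preprocessing one tile, and the batch size, reader count, and queue capacity are illustrative.

```python
import queue
import threading

def load_and_preprocess(index):
    """Hypothetical placeholder for reading and preprocessing one sample."""
    return index * index

def reader(indices, out_queue):
    """Producer: push preprocessed samples; blocks while the queue is full."""
    for i in indices:
        out_queue.put(load_and_preprocess(i))

def batches(num_samples, batch_size=4, num_readers=3, capacity=16):
    """Consumer: yield batches assembled from samples produced in parallel."""
    q = queue.Queue(maxsize=capacity)
    threads = []
    for r in range(num_readers):
        # Each reader handles an interleaved share of the sample indices.
        shard = range(r, num_samples, num_readers)
        t = threading.Thread(target=reader, args=(shard, q), daemon=True)
        t.start()
        threads.append(t)
    batch = []
    for _ in range(num_samples):
        batch.append(q.get())
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller batch
    for t in threads:
        t.join()
```

The bounded queue keeps memory use predictable: readers pause when the consumer falls behind, and the consumer waits only when no preprocessed sample is ready. Sample order within a batch is not deterministic.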

A processing system segments pathological images using a fully convolutional network. Unlike the images used in traditional object recognition and image segmentation, pathological images typically have ultrahigh resolution, with a single image generally exceeding 100000 × 200000 pixels, so traditional image recognition and segmentation pipelines cannot be applied to them directly. The project studies a series of operations for segmenting pathological images, including image tiling, image processing, image segmentation, heat map generation, feature mining, and learning, and uses parallel techniques and joint training of multiple models to improve recognition performance and greatly reduce the system's processing time.
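Because an image of this size cannot be processed in one piece, the first step is to enumerate tile positions. The sketch below generates top-left tile origins that cover the full image, adding a final tile flush with the edge when the dimensions are not a multiple of the stride. The tile size, stride, and function names are illustrative assumptions.

```python
def axis_starts(length: int, tile: int, stride: int) -> list:
    """Start offsets along one axis so that tiles cover [0, length)."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] + tile < length:
        starts.append(length - tile)  # extra tile aligned to the far edge
    return starts

def tile_origins(width: int, height: int, tile: int = 512, stride: int = 512) -> list:
    """(x, y) top-left corners of all tiles, in row-major order."""
    return [(x, y)
            for y in axis_starts(height, tile, stride)
            for x in axis_starts(width, tile, stride)]
```

Choosing a stride smaller than the tile size produces overlapping tiles, which is a common way to reduce artefacts at tile borders when the predictions are stitched back together.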

The overall deep learning workflow for pathological images is divided into the steps shown in the figure. The model shown is the current mainstream deep learning framework for pathological image recognition. Its core component is the convolutional neural network classification model within the deep learning model (Deep Model), which remains the mainstream model in pathological image recognition.

Drawings

Fig. 1 shows the mainstream model in pathological image recognition.

Detailed Description

According to the 2018 Global Cancer Statistics report (GLOBOCAN), global cancer incidence and mortality remain high. Owing to the limitations of diagnostic technology, accurate diagnosis of cancer remains challenging, and because the accuracy of diagnosis directly determines the subsequent treatment plan, it has important clinical value. The project aims to automate quantitative analysis at the cell and tissue levels. Through a big-data-driven learning approach, it realizes 1) automatic identification of lesion areas, 2) extraction of morphological features of pathological tissue, and 3) construction of a diagnosis decision support system, and it can be extended to multiple disease types. Combined with the server, the support system reaches an intelligent analysis speed of 38 seconds per slide and provides physicians with fast, accurate, highly repeatable, and standardized auxiliary pathology diagnosis results.

The project takes high-resolution whole-slide pathological images as its object and, driven by a large amount of high-quality data, focuses on extending to multiple disease types, thereby empowering pathologists to make efficient and accurate diagnoses and decisions. The proposed research has important clinical significance. It also explores new approaches to the analysis of high-resolution whole-slide pathological images and to pathomics research based on sub-visual features, and it can provide new techniques and solutions for the analysis and processing of other similar images.

The highlights of the invention are:

1) An automatic annotation-correction algorithm for pathological images is provided to reduce the errors that arise near annotation lines because cells of different classes are intermixed.

2) A thread-pool-based Otsu automatic threshold segmentation algorithm for ultrahigh-resolution images is provided so that pathological images can be thresholded quickly.

3) EU-Net, an image semantic segmentation model with fewer parameters and higher efficiency, is provided, and methods for implementing depthwise separable convolutions and dilated convolution layers are studied.

4) Data reading is parallelized across multiple threads in the data input stage, and preprocessed batches are stored in a queue, which speeds up data I/O.

5) GPU parallelism is used to greatly reduce the system's processing time, and joint training of multiple models further improves recognition performance.

The enrichment and refinement of pathological image data sets and the development of deep learning have given rise to a large number of intelligent pathological image recognition systems based on deep learning. However, the convolutional neural network classification models used by most pathology recognition systems do not identify small cancer regions well and are too slow when predicting on a whole image.

Based on these requirements, the project uses deep learning to improve the accuracy and speed of pathological image recognition while taking into account the extremely large pixel dimensions of pathological images and the high accuracy that pathological image recognition demands. Two key problems are discussed: how to construct a deep learning model that meets the speed and accuracy requirements of pathological image recognition, and how to accelerate model training and inference on the TensorFlow platform. The key technical problems to be solved and the specific research and development contents are as follows:

First, a deep learning model suited to pathological image recognition is constructed. With the development of deep learning in recent years, mature frameworks exist for image classification, object detection, and semantic segmentation, but few of them target pathological images with extremely large pixel dimensions. A pathological image of 200,000 × 100,000 pixels cannot be fed into a deep learning model whole for training. Moreover, different models trained in the same way differ in accuracy, model size, and inference speed. Constructing a deep learning model suited to pathological image recognition is therefore important for improving the accuracy and inference speed of cancer recognition in pathological images.

Second, model training is accelerated and inference speed is improved. An efficient model reaches its best training performance and inference speed only when combined with a high-performance training framework. Both the data input pipeline and the way multi-GPU training is scheduled affect model speed: if data is not supplied to the GPUs fast enough, GPU utilization is extremely low. Optimizing the data input pipeline and adapting the GPU training scheduling algorithm to the model is therefore important for improving the speed and accuracy of pathological image recognition.
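Synchronous multi-GPU training is commonly organized as data parallelism: every replica computes a gradient on its own shard of the batch, the gradients are averaged, and all replicas apply the same update. The text does not describe its scheduling algorithm, so the following is only a CPU simulation of that general scheme in NumPy, using a linear model with mean squared error as a stand-in; it does not touch any GPU and all names are hypothetical.

```python
import numpy as np

def replica_gradient(w, x, y):
    """Gradient of the mean squared error of the linear model x @ w on one shard."""
    err = x @ w - y
    return 2.0 * x.T @ err / len(y)

def synchronous_step(w, shards, lr=0.1):
    """One synchronous update: per-replica gradients, averaged, applied once."""
    grads = [replica_gradient(w, x, y) for x, y in shards]  # one per simulated device
    return w - lr * np.mean(grads, axis=0)                  # average, then shared update

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
Y = rng.normal(size=8)
w0 = np.zeros(3)
shards = [(X[:4], Y[:4]), (X[4:], Y[4:])]  # two equal-sized shards
w1 = synchronous_step(w0, shards)
```

With equal-sized shards the averaged gradient equals the full-batch gradient, so the synchronous update matches single-device training on the whole batch while the per-device work is divided by the number of replicas.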

In summary, the research objective of the project is to greatly improve the accuracy and the prediction speed of the pathological image recognition system through the construction of the deep learning model and the optimization of the pathological recognition process based on the pathological image data.

Digitization changes traditional practice in pathology, raises the level of data storage and analysis, lowers the barrier to adoption for medical institutions, and promotes the spread of AI technology in the field of health and medical big data. In Chinese hospitals of Grade II and above, a pathology center or pathology department must manage hundreds of thousands to millions of pathology records over the long term. These hospitals urgently need digital pathology big data storage and AI-assisted analysis but are often constrained by budget and technical capability. Providing the market with an AI big data product that is low in cost, highly reliable, easy to use, and easy to manage helps pathologists in primary hospitals improve their diagnosis and data management, and offers substantial social and economic benefits.

Claims

1. Extract the region of interest from the pathological slide image; that is, remove irrelevant parts and keep only the pathological tissue regions that may contain cancer.

2. Construct a training data set: cut samples of size 256 × 256 or 512 × 512 from the region of interest, label the small samples that contain cancer regions as positive and all others as negative, and perform data augmentation and class balancing at the same time.

3. Train a GoogLeNet neural network on the above data set.

4. Construct a heat map (cancer probability map): use the trained GoogLeNet model to predict an output for each input small sample image, and combine the outputs to form the heat map.

5. Extract features from the heat map constructed for the whole slide, train an XGBoost classifier, and determine whether the digital pathological slide shows cancer.
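Steps 2, 4, and 5 can be sketched in a few lines of NumPy. The example below labels a tile from its annotation mask, assembles per-tile probabilities into a heat map, and summarizes the heat map as a small feature vector that a slide-level classifier such as XGBoost could consume. The labeling rule, the one-cell-per-tile heat map layout, and the four features are assumptions chosen for illustration; the claims do not specify them, and the GoogLeNet and XGBoost models themselves are not included.

```python
import numpy as np

def label_tile(mask_tile: np.ndarray, min_cancer_fraction: float = 0.0) -> int:
    """1 (positive) if the tile's cancer fraction exceeds the limit, else 0."""
    return int(mask_tile.mean() > min_cancer_fraction)

def build_heatmap(probabilities, origins, tile, slide_shape):
    """Place each tile's predicted cancer probability into a grid, one cell per tile."""
    rows, cols = slide_shape[0] // tile, slide_shape[1] // tile
    heat = np.zeros((rows, cols))
    for p, (x, y) in zip(probabilities, origins):
        heat[y // tile, x // tile] = p
    return heat

def heatmap_features(heat: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Slide-level summary: max, mean, fraction of hot cells, count of hot cells."""
    hot = heat >= threshold
    return np.array([heat.max(), heat.mean(), hot.mean(), float(hot.sum())])

# Toy slide of 512 x 512 pixels cut into four non-overlapping 256 x 256 tiles.
origins = [(0, 0), (256, 0), (0, 256), (256, 256)]
probs = [0.1, 0.9, 0.2, 0.8]  # stand-in for per-tile model outputs
heat = build_heatmap(probs, origins, tile=256, slide_shape=(512, 512))
features = heatmap_features(heat)
```

One feature vector per slide, paired with the slide-level diagnosis, would form the training set for the final classifier.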