
CN110992377B - Image segmentation method, device, computer-readable storage medium and equipment - Google Patents


Info

Publication number
CN110992377B
CN110992377B (application CN201911214444.4A; prior publication CN110992377A)
Authority
CN
China
Prior art keywords
image
target
segmentation
images
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911214444.4A
Other languages
Chinese (zh)
Other versions
CN110992377A
Inventor
刘恩佑
尹思源
袁勇
陈宽
王少康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Infervision Medical Technology Co Ltd
Original Assignee
Infervision Medical Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Infervision Medical Technology Co Ltd
Priority to CN201911214444.4A
Publication of CN110992377A
Application granted
Publication of CN110992377B
Legal status: Active


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/187Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30061Lung

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application provides an image segmentation method and device. The method comprises the following steps: acquiring an image sequence comprising a plurality of images, wherein each image in the plurality of images comprises an image of a segmentation target; extracting a plurality of pixel points in the images whose pixel values are smaller than a set pixel threshold; connecting the plurality of pixel points in the plurality of images into a three-dimensional entity; classifying the three-dimensional entity through a machine learning model to obtain a target three-dimensional entity comprising the segmentation target; and segmenting the segmentation target from the images according to the target three-dimensional entity. The image segmentation method is highly robust, occupies few computing resources, and can readily be used in combination with other functions.

Description

Image segmentation method, device, computer-readable storage medium and equipment
Technical Field
The invention relates to the technical field of image processing, and in particular to an image segmentation method, an image segmentation apparatus, a computer-readable storage medium, and an image segmentation device.
Background
Segmenting a desired region from an image, for example the lung parenchyma region from a chest CT (Computed Tomography) image, is valuable because it enables a computer to compute on and process, in a targeted manner, the local region of the image that represents a particular entity. However, image segmentation, in particular medical image segmentation, is challenging. Because medical images have high resolution, prior-art image segmentation methods are either not robust or occupy a large amount of computing resources, which makes the segmentation function difficult to use together with other functions.
Disclosure of Invention
In view of the above, embodiments of the present invention are directed to providing an image segmentation method, apparatus, computer-readable storage medium and device, which can solve the above problems.
In one aspect, the present application provides an image segmentation method, including: acquiring an image sequence comprising a plurality of images, wherein each image in the plurality of images comprises an image of a segmentation target; extracting a plurality of pixel points of which the pixel values are smaller than a set pixel threshold value in the image; connecting a plurality of pixel points in a plurality of images into a three-dimensional entity; classifying the three-dimensional entity through a machine learning model to obtain a target three-dimensional entity comprising a segmentation target; and segmenting the segmentation target from the image according to the target three-dimensional entity.
In a particular embodiment of the present application, the set pixel threshold is set according to an image characteristic of the segmentation target.
In a particular embodiment of the present application, the set pixel threshold is determined based on a maximum pixel value, a minimum pixel value, or an average pixel value of the segmentation target in the image.
In a particular embodiment of the application, the segmentation target comprises a lung and the image comprises a CT image.
In a particular embodiment of the present application, the set pixel threshold is expressed as a Hu (Hounsfield units) value in the range of -1000 to -800.
In a specific embodiment of the present application, connecting a plurality of pixel points in a plurality of images into a three-dimensional entity includes: and connecting a plurality of pixel points in the plurality of images into a three-dimensional entity through a connected domain algorithm.
In a particular embodiment of the present application, the machine learning model comprises an SVM (support vector machine) model.
In a particular embodiment of the present application, classifying the three-dimensional entity through the machine learning model to obtain a target three-dimensional entity including a segmentation target includes: calculating characteristic parameters of the three-dimensional entity; inputting the characteristic parameters into a machine learning model for classification; and obtaining a target three-dimensional entity according to the output of the machine learning model.
In a particular embodiment of the present application, the characteristic parameters comprise one or more of the following: physical volume, Hu value of the center point, coordinates of the center point, length-width ratio of the circumscribed cuboid, length-width ratio of the inscribed cuboid, physical width, physical height, physical length, and average Hu density.
In a particular embodiment of the present application, the image segmentation method further comprises: labeling the training samples to label the type represented by each training sample in the training samples; and inputting the marked training samples into a machine learning model so as to train the machine learning model.
In a particular embodiment of the present application, segmenting the segmentation target from the image based on the target three-dimensional entity comprises: extracting a background area of the image, wherein the background area does not comprise a target three-dimensional entity; filling the same pixel values in other areas except the background area in the image; and obtaining a segmentation target according to the filled image.
In another aspect, the present application provides an image segmentation apparatus comprising: the system comprises an acquisition module, a display module and a processing module, wherein the acquisition module is used for acquiring an image sequence comprising a plurality of images, and each image in the plurality of images comprises an image of a segmentation target; the extraction module is used for extracting a plurality of pixel points of which the pixel values are smaller than a set pixel threshold value in the image; the connecting module is used for connecting a plurality of pixel points in a plurality of images into a three-dimensional entity; the classification module is used for classifying the three-dimensional entity through the machine learning model to obtain a target three-dimensional entity comprising a segmentation target; and the segmentation module is used for segmenting the segmentation target from the image according to the target three-dimensional entity.
In another aspect, the present application provides a computer-readable storage medium storing a computer program for executing the above-described image segmentation method.
In another aspect, the present application provides an image segmentation apparatus comprising: a processor; a memory for storing the processor-executable instructions; the processor is used for executing the image segmentation method.
According to the image segmentation method, apparatus, computer-readable storage medium and device of the present application, a low-dimensional image (such as a two-dimensional image) is raised to a high-dimensional image (such as a three-dimensional image), and the region to be segmented in the low-dimensional image is segmented by a classification operation on the high-dimensional image. The segmentation problem for a two-dimensional image is thus converted into a classification problem for a three-dimensional entity, which simplifies computation. Such a method is particularly suitable for scenarios that produce image sequences, such as CT image sequences, so that a plurality of images can be combined into one image of a higher dimension on which high-dimensional operations are performed. Classification by a machine learning model does not occupy excessive computing resources, and at the same time the machine learning model improves robustness and compensates for shortcomings of traditional image processing techniques, such as dependence on the choice of a starting point and a tendency to produce holes.
Drawings
FIG. 1 shows a flow diagram of an image segmentation method according to an embodiment of the present application;
FIG. 2 shows a flow diagram of an image segmentation method according to an embodiment of the present application;
FIG. 3 illustrates an image obtained during an intermediate stage of an image segmentation method according to an embodiment of the present application;
FIG. 4 illustrates an image obtained during an intermediate stage of an image segmentation method according to an embodiment of the present application;
FIG. 5 illustrates an image prior to segmentation using an image segmentation method according to an embodiment of the present application;
FIG. 6 illustrates an image after segmentation using an image segmentation method according to an embodiment of the present application;
FIG. 7 shows a schematic block diagram of an image segmentation apparatus according to an embodiment of the present application;
FIG. 8 shows a schematic block diagram of an image segmentation device according to an embodiment of the present application.
Detailed Description
The present invention is described in detail below with reference to specific embodiments in order to make the concept and idea of the present invention more clearly understood by those skilled in the art. It is to be understood that the embodiments presented herein are only a few of all embodiments that the present invention may have. Those skilled in the art who review this disclosure will readily appreciate that many modifications, variations, or alterations to the described embodiments, either in whole or in part, are possible and within the scope of the invention as claimed.
As used herein, the terms "first," "second," and the like are not intended to imply any order, quantity, or importance, but rather are used to distinguish one element from another. As used herein, the terms "a," "an," and the like are not intended to mean that there is only one of the described items, but rather that the description is directed to only one of the described items, which may have one or more. As used herein, the terms "comprises," "comprising," and other similar words are intended to refer to logical interrelationships, and are not to be construed as referring to spatial structural relationships. For example, "a includes B" is intended to mean that logically B belongs to a, and not that spatially B is located inside a. Furthermore, the terms "comprising," "including," and other similar words are to be construed as open-ended, rather than closed-ended. For example, "a includes B" is intended to mean that B belongs to a, but B does not necessarily constitute all of a, and a may also include C, D, E and other elements.
The terms "embodiment," "present embodiment," "an embodiment," "one embodiment," and "one embodiment" herein do not mean that the pertinent description applies to only one particular embodiment, but rather that the description may apply to yet another embodiment or embodiments. Those of skill in the art will understand that any of the descriptions given herein for one embodiment can be combined with, substituted for, or combined with the descriptions of one or more other embodiments to produce new embodiments, which are readily apparent to those of skill in the art and are intended to be within the scope of the present invention.
In some embodiments of the present application, an image may refer to the impression or recognition formed in the human brain by the distribution of light reflected or transmitted by an object. By form, images include still images, such as photographs and pictures, as well as moving images, such as videos and animated pictures (e.g., GIF (Graphics Interchange Format) pictures). By the manner of generation, images include images captured with a camera or other imaging device, images drawn or synthesized manually, and images drawn or synthesized automatically by a machine. By format, digital images in particular include JPEG (Joint Photographic Experts Group) images, PNG (Portable Network Graphics) images, BMP (bitmap) images, TIFF (Tagged Image File Format) images and DICOM (Digital Imaging and Communications in Medicine) images, where DICOM is the format typically employed for CT images, X-ray images and magnetic resonance images. By purpose, images include everyday images (life photographs, self-portraits, travel photographs, etc.), medical images (CT images, X-ray images, magnetic resonance images, B-mode ultrasound images, etc.), geological images (geological radar images, rock thin-section images, etc.), industrial inspection images (e.g., flaw detection images), surveillance images (visible-light and infrared video surveillance images), astronomical observation images, electron microscope images, and the like.
In some embodiments of the present application, image segmentation may refer to extracting a specific region from an image, or delimiting a specific boundary, such that one or more portions of the image can be clearly distinguished from one or more other portions for targeted computation or processing. Approaches to image segmentation include edge detection, threshold segmentation, region growing, deep learning, and the like. In edge detection, for a region to be segmented whose edge shows a discontinuity in grayscale values, i.e., a grayscale edge, the boundary points of the region are detected by a derivative-based or other algorithm and connected to form the edge of the region, thereby segmenting the region. Threshold segmentation may refer to classifying each pixel according to whether it shares the grayscale characteristics of the target region, for example by obtaining a grayscale threshold from a grayscale histogram and then classifying every pixel against that threshold. The region growing method segments a target region by assembling it from combinations of pixel points with similar characteristics. In the deep learning method, an artificial neural network model is trained on a manually labeled training set, and the trained model is then used to identify the target region for further segmentation.
In general, image segmentation techniques fall into two categories: traditional image processing approaches and deep learning approaches. Traditional image processing methods, i.e., methods other than artificial intelligence, such as edge detection and threshold segmentation, suffer from low robustness. For example, region growing and edge detection are easily disturbed by a few abnormal pixel points in the image, leaving holes in the segmented result. Artificial intelligence approaches such as deep learning have the disadvantages that they require a large amount of manually labeled training data, occupy considerable computing resources when used, and are therefore difficult to combine with other functions.
Fig. 1 shows a flow chart of an image segmentation method according to an embodiment of the present application. The image segmentation method of fig. 1 may be performed by a computing device (e.g., a server).
As shown in fig. 1, the image segmentation method according to the present embodiment includes:
s110, acquiring an image sequence comprising a plurality of images, wherein each image in the plurality of images comprises an image of a segmentation target;
s120, extracting a plurality of pixel points of which the pixel values are smaller than a set pixel threshold value in the image;
s130, connecting a plurality of pixel points in a plurality of images into a three-dimensional entity;
s140, classifying the three-dimensional entity through a machine learning model to obtain a target three-dimensional entity comprising a segmentation target;
s150, segmenting the segmentation target from the image according to the target three-dimensional entity.
According to the image segmentation method of this embodiment, the two-dimensional image sequence is combined into a three-dimensional image, so that the segmentation target in the two-dimensional images becomes a three-dimensional entity and the edge contour of the two-dimensional segmentation target becomes the outer surface of that entity. Pattern recognition can then be performed on the appearance characteristics of the three-dimensional entity, and an edge contour meeting the requirements can be identified effectively. By combining machine learning with traditional image processing algorithms, the method improves robustness, reduces computational complexity, and allows the segmentation method to be combined with other functions.
In one embodiment, the sequence of images may refer to a series of images sequentially acquired at different times and different orientations of the target. The image sequence may refer to a sequence in which images are spatially arranged, such as CT images sequentially taken at different positions of a human body, or a plurality of flaw detection images sequentially taken at different positions with respect to one workpiece. The image sequence may be a time-series sequence of images, such as images continuously shot (high-speed shooting, delayed shooting), video frames, or the like.
In one embodiment, acquiring the image sequence may refer to receiving the image sequence transmitted from an external device or manually input, or generating the image sequence through the device's own function (scanning, shooting, etc.) and transmitting the image sequence to a module or part for image segmentation.
In one embodiment, the segmentation target may refer to an entity represented by a region desired to be segmented from the image. The division target may be point-like (zero-dimensional), linear (one-dimensional), planar (two-dimensional), and bulk (three-dimensional). The punctual segmentation target may refer to a small region represented by one or more pixels segmented from a one-dimensional, two-dimensional, or three-dimensional image, such as celestial bodies or stars in an astronomical observation image. The linear segmentation target may refer to a linear region segmented from a two-dimensional or three-dimensional image, such as a peripheral contour of an object (e.g., a human body contour), a feature line of the object (e.g., a grain trend of a rock), and the like. The planar division target may be a two-dimensional block region divided from a two-dimensional or three-dimensional image, for example, a plane logo (license plate, banner, etc.), a written text region, or the like. The volumetric segmentation object may be a three-dimensional volume segmented from a three-dimensional image or model, and may be a solid body, for example, a human organ in a three-dimensional image obtained by combining a plurality of continuous CT images, or a space, for example, an indoor space in a digitally reconstructed building.
The segmentation target may be the whole object, such as a chair, from a classroom scene image, or a part of an object, such as a license plate of a car from a road scene image showing the car, or an object containing an attached structure. In the last case, the segmentation objective may be: an object including a peripheral space, for example, when an animal is segmented from an image, a hair on the body of the animal and a space around the hair are segmented together; objects containing peripheral structures, such as human figures, when segmented from images, the human body is segmented together with the clothing it is wearing; objects comprising an interior space, for example when segmenting rocks from an image, segmenting rocks comprising a number of cavities, for example when segmenting lungs from an X-ray image, segmenting lungs with airways; when an object containing internal auxiliary structures, for example the heart, is segmented from a CT image, the heart with blood vessels is segmented.
In one embodiment, the image of the segmentation target may refer to an image or a visualization for displaying morphological features of the segmentation target. Segmenting the image of the target may include: external images such as visible light photographs taken from the outside; internal images, such as images taken by transmission of X-rays or other means that can show internal structures; the sectional image may be an image taken along a plane (normal or oblique plane) or a curved surface, such as a CT image, a magnetic resonance image, or the like.
In one embodiment, the statement that an image includes an image of the segmentation target may mean that the image of the segmentation target constitutes the whole image, that it constitutes a part of the image, or that it overlaps with the image; it may also refer to an image obtained by transforming or processing the image of the segmentation target, which represents some features of the original image.
In an embodiment, a pixel point may refer to an inseparable minimum unit or element that forms an image, and is usually in a small square block shape or a point shape, and positions and pixel values of all pixels together form a final appearance of an image.
In one embodiment, the pixel value may be a value that determines the appearance characteristics of the pixel, such as brightness and color, and is, for example, between 0 and 255. In an embodiment, the setting of the pixel threshold may refer to an artificially set pixel value threshold, which is used to divide all pixel values in all images of the image sequence, and extract all pixel points higher than the threshold and/or all pixel points lower than the threshold.
In an embodiment, extracting the pixel points may refer to finding out all pixel points that meet a specific condition, or determining their respective position coordinates. In an embodiment, extracting the pixel points whose pixel values are smaller than the set pixel threshold may refer to finding out all the pixel points whose pixel values are smaller than the set pixel threshold, so that the pixel points can be processed or calculated in a targeted manner. The specific way of extraction may be to reserve all the pixel points whose pixel values are smaller than the set pixel threshold, and perform normalization processing on all other pixel points, for example, to assign a fixed pixel value (e.g., 0); the method may also be a binarization method, that is, all pixels with pixel values smaller than the set pixel threshold are assigned to a fixed pixel value (e.g., 0), and all pixels with pixel values larger than the set pixel threshold are assigned to a different fixed pixel value (e.g., 255).
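As a hedged illustration of the extraction step, assuming the image is a NumPy array, binarization against the set pixel threshold might look like the sketch below; here the extracted (below-threshold) points are marked with 255, though the assignment can equally be inverted as described above, and the threshold of 128 in the usage comment is an arbitrary example value:

    import numpy as np

    def binarize(image, pixel_threshold):
        # Pixels below the set pixel threshold are retained as foreground (255);
        # all other pixels are normalised to a fixed background value (0).
        out = np.zeros(image.shape, dtype=np.uint8)
        out[image < pixel_threshold] = 255
        return out

    # Example: extract dark pixel points from an 8-bit grayscale image
    # binary = binarize(gray_image, 128)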
In one embodiment, a three-dimensional entity may refer to a region of a three-dimensional image or a higher dimensional image having dimensions or features of three dimensions, length, width, and height, that a computer or other machine can identify, calculate, and process.
In an embodiment, connecting the plurality of pixel points in the plurality of images into the three-dimensional entity may mean extracting pixel points for each image in the image sequence, and connecting all the pixel points extracted from all the images together to form the three-dimensional entity.
In an embodiment, the method for connecting the plurality of pixel points in the plurality of images into the three-dimensional entity may be as follows: a connected domain algorithm or a region growing algorithm.
Specifically, a connected domain refers to an image region of adjacent pixels having the same pixel value. A connected domain algorithm marks each pixel point together with the pixels in its neighborhood that have the same pixel value, forming one or more connected domains. A pixel in a two-dimensional planar image has 4-neighborhoods and 8-neighborhoods, while a pixel of a three-dimensional image (also called a voxel) has 6-neighborhoods, 18-neighborhoods and 26-neighborhoods. The region growing algorithm comprises: determining a seed pixel as a growing start point, merging pixel points with similar characteristics in the area near the seed pixel into the target region, and repeating the previous step with each newly added pixel as a new seed pixel until no pixel point satisfies the condition for inclusion, finally yielding the three-dimensional entity. Embodiments of the present invention are not limited thereto, and those skilled in the art can conceive of other algorithms for connecting a plurality of pixel points into a three-dimensional entity on the basis of the above.
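A minimal sketch of the connected-domain step, assuming the extracted pixel points are stored as a boolean 3D array `mask` with slices stacked along the first axis; SciPy's ndimage.label is one readily available implementation, not necessarily the one used in the patent:

    import numpy as np
    from scipy import ndimage

    # 26-neighbourhood connectivity: a voxel is connected to all 26 voxels around it
    structure = np.ones((3, 3, 3), dtype=bool)
    labels, num_entities = ndimage.label(mask, structure=structure)

    # Each label 1..num_entities identifies one candidate three-dimensional
    # entity; its voxel count gives a first size estimate.
    sizes = ndimage.sum(mask, labels, index=range(1, num_entities + 1))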
In one embodiment, machine learning may refer to the process by which a computer or other machine can simulate or implement human learning behavior to acquire new knowledge or skills, reorganize existing knowledge structures to continually improve their performance. Types of machine learning may include: supervised learning, such as Support Vector Machines (SVMs), decision trees, and the like; unsupervised learning, such as clustering; deep learning, such as convolutional neural networks. In one embodiment, a machine learning model may refer to a model built into a computer or other machine that is capable of implementing machine learning functionality.
In one embodiment, classifying three-dimensional entities through a machine learning model may refer to: firstly, training a machine learning model by manually marking a sample, so that the machine learning model can distinguish a target three-dimensional entity meeting the requirement from other three-dimensional entities not meeting the requirement; and then, inputting the three-dimensional entity obtained in the last step into a trained machine learning model, classifying the three-dimensional entity, and judging whether the three-dimensional entity is a target three-dimensional entity containing a segmentation target. In one embodiment, the target three-dimensional entity may refer to a three-dimensional entity that can embody a boundary or a contour of a segmentation target, and thus can be used to segment the segmentation target from an image.
In an embodiment, the three-dimensional entity comprising the segmentation target may mean (if the segmentation target is three-dimensional), that the segmentation target constitutes a part of the target three-dimensional entity, that the segmentation target constitutes the whole of the target three-dimensional entity, or that the segmentation target coincides with (most of) the target three-dimensional entity. In another embodiment, the three-dimensional entity including the segmentation target may also mean (if the segmentation target is 0-dimensional, one-dimensional or two-dimensional), that the target three-dimensional entity is formed by extending a point-like (0-dimensional), a line-like (one-dimensional) or a planar (two-dimensional) segmentation target in a new dimension (e.g., a spatial dimension, a temporal dimension, etc.). For example, a plurality of images are taken in succession from the side of a pedestrian as an image sequence, a three-dimensional entity is formed by the image sequence in which the side image of the human body extends in the time dimension, and the human body contour in the images is segmented by means of the three-dimensional entity.
In an embodiment, classifying the three-dimensional entities through the machine learning model to obtain a target three-dimensional entity including the segmentation target may mean that, according to the classification result, a target three-dimensional entity meeting the requirements exists among the one or more three-dimensional entities obtained in the previous step; in that case, further processing may be performed. In another embodiment, it may instead turn out that, according to the classification result, no target three-dimensional entity meeting the requirements exists among the three-dimensional entities obtained in the previous step. In that case, the method returns to the second step and re-extracts the plurality of pixel points whose pixel values are smaller than the set pixel threshold, for example after adjusting the set pixel threshold or other parameters (such as the window width and window level of a CT image), so as to obtain a target three-dimensional entity meeting the requirements; alternatively, this segmentation may be abandoned.
In one embodiment, segmenting the segmentation target from the image may refer to a computer or other machine that can recognize the contour of the segmentation target, distinguish the segmentation target from other parts of the image, and perform calculations and processing specific to the segmentation target. In an embodiment, the segmentation target is segmented from the image, and the specific method may be that the position coordinates of each pixel point on the contour of the segmentation target are obtained, so as to connect into a complete contour; it is also possible to assign the same pixel values (for example, black or white pixel values) to all regions in the image except for the division target so that these regions do not affect the next processing or recognition. In an embodiment, the segmenting target is segmented from the image according to the target three-dimensional entity, which may refer to that the positions of the pixel points forming the target three-dimensional entity in each image are judged according to the coordinates of all the pixel points forming the target three-dimensional entity, and the pixel points are found out, so that the segmenting target is obtained.
The following describes another embodiment of the present invention, which is a specific example of the embodiment of fig. 1 and may include one or more features of one or more of all of the embodiments described above.
In the present embodiment, the set pixel threshold is set in accordance with the image characteristics of the segmentation target.
According to the embodiment, the set pixel threshold is determined according to the image characteristics of the segmentation target, so that the extracted pixel points can accurately reflect the morphological characteristics of the segmentation target, and particularly can reflect the overall morphological outline of the segmentation target.
In one embodiment, the image characteristics of the segmentation target may refer to explicit characteristics of the segmentation target in the image that can be computed, observed, and identified. The image characteristics of the segmentation target may be, for example: dividing the size and the distribution rule of pixel values of an image area where a target is located; the change rule of the pixel values of the edge of the image area where the segmentation target is located; and the change rule of the pixel value of the image area where the segmentation target is located, such as texture features. In one embodiment, the setting of the pixel threshold value according to the image characteristics of the segmentation target may be based on the image characteristics of the segmentation target, and may be performed by setting one pixel threshold value such that the pixels extracted by the pixel threshold value can sufficiently exhibit the morphological features of the segmentation target, for example, the pixel threshold value can cover all pixels for displaying the segmentation target as much as possible.
The following describes another embodiment of the present invention, which is a specific example of the embodiment of fig. 1 and may include one or more features of one or more of all of the embodiments described above.
In the present embodiment, the set pixel threshold is determined based on the maximum pixel value, the minimum pixel value, or the average pixel value of the segmentation target in the image.
According to the embodiment, the pixel threshold is determined according to the maximum pixel value, the minimum pixel value or the average pixel value of the segmentation target, so that the calculation of the threshold can be simplified, and the pixel threshold meeting the requirement can be obtained quickly.
In an embodiment, the maximum pixel value of the segmentation target in the image may refer to the pixel value of the pixel with the maximum pixel value among all pixels in the range in which the segmentation target is displayed in the image. In an embodiment, the minimum pixel value of the segmentation target in the image may refer to a pixel value of a pixel having a minimum pixel value among all pixels in a range in which the segmentation target is displayed in the image. In an embodiment, the average pixel value of the segmentation target in the image may refer to an average value of pixel values of all pixels in a range in which the segmentation target is displayed in the image.
In an embodiment, determining the set pixel threshold according to a given pixel value may mean using that pixel value directly as the set pixel threshold; it may also mean computing the set pixel threshold from that pixel value, for example by adding a fixed offset; or it may mean estimating the set pixel threshold from that pixel value, for example by estimating a range of pixel values centered on it and taking the maximum, minimum, or any intermediate value of that range as the set pixel threshold.
In an embodiment, determining the set pixel threshold according to the maximum pixel value of the segmentation target in the image may mean using the maximum pixel value directly as the set pixel threshold; computing the set pixel threshold from the maximum pixel value, for example by subtracting or adding a fixed offset or substituting it into a function; or estimating the set pixel threshold from the maximum pixel value, for example by estimating a range of pixel values with the maximum pixel value as a reference and taking the minimum value of that range as the set pixel threshold.
In an embodiment, determining the set pixel threshold according to the minimum pixel value of the segmentation target in the image may mean using the minimum pixel value directly as the set pixel threshold; computing the set pixel threshold from the minimum pixel value, for example by subtracting or adding a fixed offset or substituting it into a function; or estimating the set pixel threshold from the minimum pixel value, for example by estimating a range of pixel values with the minimum pixel value as a reference and taking the maximum value of that range as the set pixel threshold.
In an embodiment, determining the set pixel threshold according to the average pixel value of the segmentation target in the image may mean using the average pixel value directly as the set pixel threshold; computing the set pixel threshold from the average pixel value, for example by subtracting or adding a fixed offset or substituting it into a function; or estimating the set pixel threshold from the average pixel value, for example by estimating a range of pixel values with the average pixel value as a reference and taking any intermediate value of that range as the set pixel threshold.
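Purely as an illustration, assuming `target_pixels` holds the pixel values of a representative annotated target region, the three variants might be computed as follows (the fixed offset of 50 is an arbitrary example):

    import numpy as np

    t_from_max = target_pixels.max()          # maximum pixel value as threshold
    t_from_min = target_pixels.min()          # minimum pixel value as threshold
    t_from_avg = target_pixels.mean() + 50.0  # average pixel value plus a fixed offset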
The following describes another embodiment of the present invention, which is a specific example of the embodiment of fig. 1 and may include one or more features of one or more of all of the embodiments described above.
In this embodiment, the segmentation target includes a lung and the image includes a CT image.
According to this embodiment, it is particularly advantageous to apply the segmentation method of the present application to the segmentation of CT images of the lungs. The lungs are relatively homogeneous, their pixel values show no marked fluctuation, and their contours are stable, which makes them particularly suitable for segmentation from high-resolution CT images.
In one embodiment, the lung refers to a respiratory organ of a human or mammal. In one embodiment, a CT image may be an image obtained by scanning cross-sections of a part of the human body one by one with a highly sensitive detector, using a precisely collimated X-ray beam, gamma rays, ultrasonic waves, or the like. CT images feature short scanning times, clear imaging and high resolution, and can display organs composed of soft tissue well. CT includes X-ray CT, gamma-ray CT, and the like. In a lung CT image, the lungs contain a large volume of air, so their average Hu value is close to that of air (the Hu value of air is -1000). Segmenting lung regions from CT images can facilitate automated lesion detection and analysis for lung-related diseases; it allows indices of the lung region, such as volume density, to be calculated more accurately; it can assist lesion marking, for example in conjunction with pulmonary nodule detection, serving as a pre-processing step that removes false positives outside the lungs; and it can support the decision-making of doctors or caregivers.
The following describes another embodiment of the present invention, which is a specific example of the embodiment of fig. 1 and may include one or more features of one or more of all of the embodiments described above.
In the present embodiment, the set pixel threshold is expressed as a Hu value in the range of -1000 to -800.
According to the present embodiment, setting the set pixel threshold within this range has proven to make extraction of the three-dimensional entity of the lung particularly easy, enabling efficient segmentation of the lung region.
In one embodiment, the Hu value, or CT value, is expressed in Hounsfield units (Hounsfield Unit). In one embodiment, representing pixel values by Hu values may mean that, for a CT image, the pixel value of each pixel point has a corresponding Hu value, which represents the attenuation of an X-ray or other ray after passing through and being absorbed by tissue. Hu values for human tissue range from -1000 Hu for air to +1000 Hu for cortical bone, a total of 2000 CT levels. In one embodiment, the Hu value of the set pixel threshold lies in the range of -1000 to -800; that is, the set pixel threshold of the CT image corresponds to a Hu value, and that Hu value lies between -1000 and -800. This is because the Hu value of air is -1000, while the Hu value of the lungs is slightly greater than that of air. In a particular embodiment, the Hu value of the set pixel threshold is in the range of -950 to -850. In a particular embodiment, it is in the range of -925 to -875. In a particular embodiment, it is in the range of -905 to -895. In a particular embodiment, the Hu value of the set pixel threshold is -900.
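A hedged sketch of applying such a threshold to one CT slice, assuming standard DICOM rescale tags are present; the file name is hypothetical:

    import numpy as np
    import pydicom

    ds = pydicom.dcmread("slice_0001.dcm")  # hypothetical file name
    # Convert stored pixel values to Hounsfield units via the rescale tags
    hu = ds.pixel_array.astype(np.float32) * float(ds.RescaleSlope) \
         + float(ds.RescaleIntercept)
    # Extract pixel points below a set threshold of -900 Hu
    lung_candidates = hu < -900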
The following describes another embodiment of the present invention, which is a specific example of the embodiment of fig. 1 and may include one or more features of one or more of all of the embodiments described above.
In this embodiment, S130 in the embodiment of fig. 1 specifically includes: and connecting a plurality of pixel points in the plurality of images into a three-dimensional entity through a connected domain algorithm.
According to this embodiment, the connected domain algorithm is frequently used in traditional image processing; the technique is mature, simple to use, and avoids excessive consumption of the device's computing resources.
In an embodiment, the connected component algorithm may be an algorithm that marks out pixel points having the same pixel values in a neighborhood around a pixel point in an image to form one or more marked regions. In an embodiment, the plurality of pixel points in the plurality of images are connected to form a three-dimensional entity through a connected domain algorithm, which may mean that, for the plurality of images subjected to pixel point extraction, all the pixel points extracted from all the images may be assigned to the same pixel value. Then, a connected component algorithm is applied to all images, and all adjacent (e.g., 26-neighborhood) pixel points to which the pixel value is assigned are found and marked as a three-dimensional region.
The following describes another embodiment of the present invention, which is a specific example of the embodiment of fig. 1 and may include one or more features of one or more of all of the embodiments described above.
In the present embodiment, the Machine learning model includes an SVM (Support Vector Machine) model.
According to this embodiment, the SVM model is a comparatively simple and easy-to-use machine learning model; it can be trained without many samples and consumes few computing resources, making it well suited as the machine learning model for an image pre-processing stage.
In one embodiment, the SVM model may refer to a classifier that performs binary classification on data by finding a hyperplane that separates the samples with a maximized margin. The SVM model is simple in structure, has excellent performance and generalization ability, is suitable for processing high-dimensional data, and learns quickly. SVM models include linear SVM models, non-linear SVM models, multi-class SVM models, least-squares SVM models, structured SVM models, and the like.
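For illustration, a binary SVM classifier of the kind described could be instantiated with scikit-learn as follows; feature standardization is included because the characteristic parameters discussed below have very different numeric ranges. This is a sketch under stated assumptions, not the patented implementation:

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # A maximum-margin binary classifier; the RBF kernel covers the
    # non-linear case mentioned above.
    svm_model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))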
The following describes another embodiment of the present invention, which is a specific example of the embodiment of fig. 1 and may include one or more features of one or more of all of the embodiments described above.
In this embodiment, S140 in the embodiment of fig. 1 specifically includes: calculating characteristic parameters of the three-dimensional entity; inputting the characteristic parameters into the machine learning model for classification; and obtaining a target three-dimensional entity according to the output of the machine learning model.
According to this embodiment, classifying the three-dimensional entity by its characteristic parameters keeps the feature variables of the three-dimensional entity within a certain range, avoiding the construction of an overly complex machine learning model with too many variables, which would markedly increase the amount of computation. In addition, selecting different key characteristic parameters according to the different attributes of the three-dimensional entity can improve classification efficiency.
In one embodiment, the characteristic parameter of the three-dimensional entity may refer to a parameter capable of characterizing the three-dimensional entity. By way of example, the characteristic parameters of the three-dimensional entity may include: dimensions, such as length, width, volume, and the like; location, such as center point coordinates; pixel value distributions such as average pixel value, maximum pixel value, minimum pixel value, and the like; physical properties such as density, weight, etc.
In an embodiment, calculating the characteristic parameters of the three-dimensional entity may refer to calculating the three-dimensional entity in the three-dimensional image formed by the image sequence by a pre-written algorithm to obtain corresponding characteristic parameters; or manually calculating or estimating characteristic parameters and inputting the characteristic parameters into a machine.
In an embodiment, the step of inputting the feature parameters into the machine learning model for classification may be to input the calculated feature parameters into a machine learning model trained in advance for automatic classification.
In an embodiment, obtaining the target three-dimensional entity according to the output of the machine learning model may mean that the machine learning model classifies the input three-dimensional entities one by one, determining whether each belongs to the three-dimensional entities that include the segmentation target, and the next step is performed once a three-dimensional entity classified as a target three-dimensional entity including the segmentation target is output. Alternatively, if after all the obtained three-dimensional entities have been input into the machine learning model no target three-dimensional entity meeting the requirements is output, the method returns to an earlier step, adjusts some parameter (for example, the set pixel threshold) or other aspect, obtains new three-dimensional entities, and inputs them into the machine learning model again until a target three-dimensional entity is obtained. It may also mean that no target three-dimensional entity meeting the requirements is output and segmentation of the segmentation target in the current image sequence is abandoned.
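Under the same assumptions as the earlier sketches, this classification step might look like the following, where `entities` is a list of boolean entity masks, `hu_volume` and `spacing` are the Hu array and voxel spacing assumed earlier, and `compute_features` is the hypothetical feature extractor sketched in a later embodiment:

    # Classify every candidate entity with the trained model (1 = target)
    feature_vectors = [compute_features(e, hu_volume, spacing) for e in entities]
    predictions = svm_model.predict(feature_vectors)
    target_entities = [e for e, p in zip(entities, predictions) if p == 1]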
The following describes another embodiment of the present invention, which is a specific example of the embodiment of fig. 1 and may include one or more features of one or more of all of the embodiments described above.
In this embodiment, the characteristic parameters include one or more of the following: physical volume, Hu value of the center point, coordinates of the center point, length-width ratio of the circumscribed cuboid, length-width ratio of the inscribed cuboid, physical width, physical height, physical length, and average Hu density.
According to this embodiment, the selected characteristic parameters have proven to represent the characteristics of the segmentation target well, and the target three-dimensional entity can be screened out effectively by means of these parameters, so that the segmentation target is segmented from the image.
In an embodiment, the physical volume may refer to a real volume of a three-dimensional entity in the objective physical world, e.g. a real physical volume occupied by a lung organ in a human body.
In one embodiment, the Hu value of the central point may refer to a Hu value of a pixel at a geometric central point or a gravity center point of a three-dimensional entity in a three-dimensional image formed by a CT image sequence, for example, a Hu value at a geometric central point of a lung organ.
In one embodiment, the coordinates of the central point may refer to coordinate values of a geometric central point or a gravity point of the three-dimensional entity in a cartesian coordinate system, a polar coordinate system or a cylindrical coordinate system in the three-dimensional image, for example, the coordinates of the geometric central point of the lung organ.
In an embodiment, the length-width ratio of the circumscribed cuboid may refer to the ratio of the length to the width of the cuboid circumscribing the three-dimensional entity, for example the length-width ratio of the circumscribed cuboid of a lung organ.
In an embodiment, the length-width ratio of the inscribed cuboid may refer to the ratio of the length to the width of the cuboid inscribed in the three-dimensional entity, for example the length-width ratio of the inscribed cuboid of a lung organ.
In one embodiment, the physical width may refer to a real width of the three-dimensional entity in the objective physical world, such as a length of a lung organ in a left-right direction of a human body.
In an embodiment, the physical height may refer to a real height of the three-dimensional entity in the objective physical world, such as a length of a lung organ in a top-bottom direction of a human body.
In an embodiment, the physical length may refer to a real length of a three-dimensional entity in an objective physical world, such as a length of a lung organ in an anterior-posterior direction of a human body.
In one embodiment, the average Hu density may refer to the sum of Hu values of pixel points for displaying all parts of a three-dimensional entity in a three-dimensional image formed by a CT image sequence divided by the number of pixel points, for example, the sum of Hu values of pixel points of a lung organ in the CT three-dimensional image divided by the number of pixel points.
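A sketch of a hypothetical `compute_features` helper covering a subset of the listed parameters (the inscribed-cuboid ratio is omitted for brevity); `spacing` is assumed to be the (z, y, x) voxel size in millimetres taken from the DICOM headers:

    import numpy as np
    from scipy import ndimage

    def compute_features(entity_mask, hu_volume, spacing):
        dz, dy, dx = spacing
        physical_volume = entity_mask.sum() * dz * dy * dx      # physical volume (mm^3)
        cz, cy, cx = ndimage.center_of_mass(entity_mask)        # centre-point coordinates
        center_hu = hu_volume[int(cz), int(cy), int(cx)]        # Hu value of the centre point
        zs, ys, xs = np.nonzero(entity_mask)
        length = (zs.max() - zs.min() + 1) * dz                 # physical length
        height = (ys.max() - ys.min() + 1) * dy                 # physical height
        width = (xs.max() - xs.min() + 1) * dx                  # physical width
        box_ratio = length / width                              # circumscribed-cuboid ratio
        mean_hu = hu_volume[entity_mask].mean()                 # average Hu density
        return [physical_volume, center_hu, cz, cy, cx,
                box_ratio, width, height, length, mean_hu]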
Fig. 2 shows a flow chart of an image segmentation method according to an embodiment of the invention. The present embodiment is described below with reference to fig. 2. This embodiment is a specific example of the embodiment of fig. 1, and may include one or more features of one or more of all of the embodiments described above.
In this embodiment, the image segmentation method further includes: s111, labeling the training samples to label the type represented by each training sample in the training samples; and S112, inputting the marked training samples into the machine learning model to train the machine learning model.
According to this embodiment, training the machine learning model brings its output close to the desired result, making the classification of the characteristic parameters more accurate and reliable.
In an embodiment, the training samples may refer to samples used to train a machine learning model, wherein desired classification results have been labeled in advance. A training sample is, for example, a set of characteristic parameters of a three-dimensional entity. In an embodiment, the step of labeling the training samples may be to label, in a manual manner, a result that is desired to be obtained through the machine learning model in the training samples, so as to control output of the machine learning model during training, so that the output of the machine learning model can be continuously close to the labeled result. In an embodiment, the type indicated by the training sample is marked by manually determining the type of the training sample in advance, and marking the type on the training sample, for example, manually determining the type of a three-dimensional entity with a set of feature parameters in advance (for example, whether the three-dimensional entity belongs to a target three-dimensional entity), and then marking the set of feature parameters to indicate the feature parameters which belong to or do not belong to the target three-dimensional entity. In an embodiment, the labeled training samples are input into the machine learning model to train the machine learning model, which may mean that a plurality of groups of characteristic parameters labeled with the types of the training samples are input into the machine learning model, so that the machine learning model continuously operates, and the output types of the machine learning model gradually approach or reach the classification results labeled in advance.
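Continuing the sketch, training on manually labelled samples might look like this; `labelled_feature_vectors` and `manual_labels` are hypothetical annotated data:

    import numpy as np

    # One feature vector per manually labelled three-dimensional entity;
    # y[i] = 1 if annotators marked entity i as a target entity, else 0.
    X = np.array(labelled_feature_vectors)
    y = np.array(manual_labels)
    svm_model.fit(X, y)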
Another embodiment according to the present invention is described below with reference to figs. 3 and 4. This embodiment is a specific example of the embodiment of fig. 1 and may include one or more features of any of the embodiments described above.
In this embodiment, S150 in the embodiment of fig. 1 specifically includes: extracting a background region of the image, wherein the background region does not include the target three-dimensional entity; filling the regions of the image other than the background region with the same pixel value; and obtaining the segmentation target from the filled image.
According to this embodiment, filling every part except the background region with the same value removes the gaps inside the target three-dimensional entity, so the segmentation target can be segmented more completely.
In an embodiment, the background regions may refer to the regions that form the background of the target three-dimensional entity and are not used in subsequent processing, not focused on, or not of interest; see the black parts in fig. 3. In an embodiment, extracting the background region of the image may mean finding the background region so that the remaining regions of the image are either occupied by, or surrounded by, the segmentation target. In an embodiment, the background region may be extracted by flood fill, that is, starting from a pixel and filling the same color into the pixels in its neighborhood (4-neighborhood, 8-neighborhood, etc.) that have the same pixel value, up to the boundary of the figure; other methods that extract the background region as a connected domain are also possible, such as a region growing algorithm or an edge detection algorithm.
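As a sketch of the flood-fill variant, assuming OpenCV and that the seed pixel lies in the background (the names and the choice of seed are illustrative assumptions):

```python
import cv2
import numpy as np

def extract_background(binary):
    """Flood-fill from the top-left corner and return a boolean background mask.

    binary -- uint8 slice in which the candidate foreground is 255 and the
              rest is 0; the corner pixel is assumed to lie in the background.
    """
    h, w = binary.shape
    flood = binary.copy()
    # cv2.floodFill requires a mask two pixels larger than the image.
    flood_mask = np.zeros((h + 2, w + 2), np.uint8)
    # Pixels connected to the seed and sharing its value are repainted to 128.
    cv2.floodFill(flood, flood_mask, seedPoint=(0, 0), newVal=128)
    return flood == 128
```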
In an embodiment, the regions other than the background region may refer to the region where only the target three-dimensional entity exists, for example the region of the image occupied by the target three-dimensional entity; they may also refer to a combination of the target three-dimensional entity and other regions, such as a region not occupied by the target three-dimensional entity but surrounded by the portion it occupies. Specifically, the surrounding background region of the image encloses a central region that is mainly the segmentation target, but the segmentation target contains some gaps belonging to other parts; see the small black blocks inside the middle white part in fig. 3. In an embodiment, filling the regions other than the background region with the same pixel value may mean that, after the background region is extracted, an inverse selection is applied to the image, and the selected non-background regions are filled so that they share the same pixel value.
In an embodiment, obtaining the segmentation target from the filled image may mean that the filled image is a binary image with only two pixel values (see fig. 4), and a computation on this binary image yields a result in which only the segmentation target is displayed while the rest of the image is not displayed or shares one pixel value. In an embodiment, a specific way of obtaining the segmentation target from the filled image is multiplication: the filled binary image is taken as a template and multiplied with the original image, yielding an image that displays only the segmentation target while the remaining pixel values are 0.
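Continuing the assumptions of the sketch above, the inverse selection, filling, and multiplication could be expressed as follows (names are illustrative):

```python
import numpy as np

def fill_and_multiply(original, background):
    """Fill all non-background pixels with one value and mask the original.

    original   -- the original slice (e.g. Hu values)
    background -- boolean mask from extract_background above
    """
    # Inverse selection: 1 everywhere except the background, 0 in the background.
    template = (~background).astype(original.dtype)
    # Multiplication keeps the segmentation target and zeroes everything else.
    return original * template
```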
A specific example of an image segmentation method according to an embodiment of the present invention is described below with reference to fig. 5 and 6.
A sequence of CT images containing cross-sectional images of the lung region is loaded; see fig. 5. Using the fact that the Hu value of air is about -1000, all three-dimensional connected regions with a large proportion of air are extracted; for example, pixel points below a threshold of -900 are extracted from all images by a binarization operation to form three-dimensional connected domains. Characteristic parameters of the extracted three-dimensional connected domains, such as the average Hu density and the physical volume, are then calculated. The three-dimensional connected domains are classified by an SVM classifier trained in advance, and if a three-dimensional connected domain classified as the lung region exists, the next step is carried out. The two-dimensional regions in the images corresponding to the three-dimensional connected domain classified as the lung region are found; the obtained two-dimensional regions usually contain gaps. The two-dimensional regions are filled by flood fill, removing the gaps and producing a binarization template. Finally, the binarization template is multiplied with the original CT images, and the lung region is thereby segmented from the original images; see fig. 6.
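Putting the steps of this example together, a minimal end-to-end sketch is given below. It assumes NumPy and SciPy, a `model` trained as in the earlier sketch, and the `characteristic_parameters` helper above; SciPy's `binary_fill_holes` is used here as an equivalent in effect of the flood-fill-and-invert gap removal, and all names are illustrative rather than from the patent.

```python
import numpy as np
from scipy import ndimage

def segment_lung(volume, spacing, model):
    """Sketch of the example pipeline: threshold, label, classify, fill, mask."""
    binary = volume < -900                    # air-like voxels (air is about -1000 Hu)
    labels, n = ndimage.label(binary)         # 3D connected domains
    for i in range(1, n + 1):
        mask = labels == i
        feats = characteristic_parameters(volume, mask, spacing)
        x = [[feats["physical_volume"], feats["average_hu_density"],
              feats["physical_height"], feats["physical_width"]]]
        if model.predict(x)[0] == 1:          # classified as the lung region
            # Remove gaps slice by slice, then multiply with the original images.
            filled = np.stack([ndimage.binary_fill_holes(s) for s in mask])
            return volume * filled
    return None                               # no lung region found
```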
Fig. 7 is a block diagram illustrating a schematic structure of an image segmentation apparatus 700 according to an embodiment of the present invention.
According to the present embodiment, the image segmentation apparatus 700 includes:
an obtaining module 710, configured to obtain an image sequence including a plurality of images, where each image of the plurality of images includes an image of a segmentation target;
an extracting module 720, configured to extract a plurality of pixel points in the image whose pixel values are smaller than a set pixel threshold;
a connecting module 730, configured to connect the plurality of pixel points in the plurality of images into a three-dimensional entity;
a classification module 740, configured to classify the three-dimensional entity through a machine learning model to obtain a target three-dimensional entity including the segmentation target;
a segmentation module 750 configured to segment the segmentation target from the image according to the target three-dimensional entity.
For the description of the modules of this embodiment, refer to the description above regarding the embodiment of fig. 1.
In an embodiment, the connection module 730 is configured to:
and connecting a plurality of pixel points in the plurality of images into a three-dimensional entity through a connected domain algorithm.
In an embodiment, the classification module 740 further comprises:
the calculating unit is used for calculating characteristic parameters of the three-dimensional entity;
the classification unit is used for inputting the characteristic parameters into the machine learning model for classification;
and the acquisition unit is used for acquiring a target three-dimensional entity according to the output of the machine learning model.
In an embodiment, the image segmentation apparatus 700 further includes:
a labeling module 760, configured to label a plurality of training samples to mark the type represented by each of the training samples;
a training module 770, configured to input the labeled training samples into a machine learning model to train the machine learning model.
For specific details of each module of the present embodiment, refer to the detailed description above regarding the embodiment of fig. 2.
In one embodiment, the segmentation module 750 further comprises:
the extraction unit is used for extracting a background area of the image, and the background area does not comprise a target three-dimensional entity;
a filling unit for filling the same pixel value to other regions except the background region in the image;
and the acquisition unit is used for acquiring the segmentation target according to the filled image.
Fig. 8 shows a block diagram of an image segmentation apparatus according to an embodiment of the present application, which is described below with reference to fig. 8.
As shown in fig. 8, the image segmentation apparatus 800 includes one or more processors 810 and memory 820.
The processor 810 may be a Central Processing Unit (CPU) or another form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the image segmentation apparatus 800 to perform desired functions.
Memory 820 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 810 to implement the image segmentation methods of the various embodiments of the present application described above and/or other desired functions.
In one example, the image segmentation apparatus 800 may further include: an input device 830 and an output device 840, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
For example, the input device 830 may be a microphone or an array of microphones for capturing a speech input signal; may be a communications network connector for receiving the collected input signals from a cloud or other device; but may also include, for example, a keyboard, mouse, etc.
The output device 840 may output various information including the determined distance information, direction information, and the like to the outside. The output devices 840 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, for the sake of simplicity, only some of the components related to the present application in the image segmentation apparatus 800 are shown in fig. 8, and components such as buses, input/output interfaces, and the like are omitted. In addition, the image segmentation apparatus 800 may include any other suitable components, depending on the particular application.
In addition to the above-described methods and apparatus, an embodiment of the present application may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps of the image segmentation method according to the various embodiments of the present application described above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The concepts and principles of the invention have been described above in detail in connection with specific embodiments (including examples and illustrations). It will be appreciated by persons skilled in the art that embodiments of the invention are not limited to the specific forms disclosed above, and that many modifications, alterations, and equivalents of the steps, methods, apparatus, and components described in the above embodiments may be made after reading this specification; such modifications, alterations, and equivalents are to be considered as falling within the scope of the invention. The scope of the invention is limited only by the claims.

Claims (14)

1. An image segmentation method, comprising:
acquiring an image sequence comprising a plurality of images, wherein each image in the plurality of images comprises an image of a segmentation target;
extracting a plurality of pixel points of which the pixel values are smaller than a set pixel threshold value in the image;
connecting the plurality of pixel points in the plurality of images into a three-dimensional entity;
classifying the three-dimensional entity through a machine learning model to obtain a target three-dimensional entity comprising the segmentation target;
and obtaining the positions of the segmentation targets in the plurality of images according to the positions of the pixel points forming the target three-dimensional entity in the plurality of images, so as to segment the segmentation targets from the plurality of images.
2. The image segmentation method according to claim 1, wherein the set pixel threshold is set according to an image characteristic of the segmentation target.
3. The image segmentation method according to claim 1 or 2, wherein the set pixel threshold is determined according to a maximum pixel value, a minimum pixel value, or an average pixel value of the segmentation target in the image.
4. The image segmentation method according to claim 1 or 2, wherein the segmentation target includes a lung, and the image includes a computed tomography (CT) image.
5. The image segmentation method according to claim 4, wherein the set pixel threshold is represented by a Hu value in Hounsfield units, the Hu value of the set pixel threshold being in a range of -1000 to -800.
6. The image segmentation method according to claim 1 or 2, wherein the connecting the plurality of pixel points in the plurality of images into a three-dimensional entity comprises:
and connecting the plurality of pixel points in the plurality of images into a three-dimensional entity through a connected domain algorithm.
7. The image segmentation method according to claim 1 or 2, wherein the machine learning model comprises a Support Vector Machine (SVM) model.
8. The image segmentation method according to claim 5, wherein the classifying the three-dimensional entity through a machine learning model to obtain a target three-dimensional entity including the segmentation target comprises:
calculating characteristic parameters of the three-dimensional entity;
inputting the characteristic parameters into the machine learning model for classification;
and obtaining a target three-dimensional entity according to the output of the machine learning model.
9. The image segmentation method according to claim 8, wherein the feature parameters include one or more of: physical volume, Hu value of the center point, coordinates of the center point, length-width ratio of the circumscribed cuboid, length-width ratio of the inscribed cuboid, physical width, physical height, physical length, and average Hu density.
10. The image segmentation method according to claim 8 or 9, further comprising:
labeling a plurality of training samples to label the type represented by each training sample in the plurality of training samples;
inputting the labeled training samples into the machine learning model to train the machine learning model.
11. The image segmentation method according to claim 1 or 2, wherein the segmenting the segmentation target from the image according to the target three-dimensional entity comprises:
extracting a background area of the image, wherein the background area does not comprise the target three-dimensional entity;
filling other areas except the background area in the image with the same pixel value;
and obtaining the segmentation target according to the filled image.
12. An image segmentation apparatus, comprising:
an acquisition module, used for acquiring an image sequence comprising a plurality of images, wherein each image in the plurality of images comprises an image of a segmentation target;
the extraction module is used for extracting a plurality of pixel points of which the pixel values are smaller than a set pixel threshold value in the image;
the connecting module is used for connecting the pixel points in the images into a three-dimensional entity;
the classification module is used for classifying the three-dimensional entity through a machine learning model to obtain a target three-dimensional entity comprising the segmentation target;
and the segmentation module is used for obtaining the positions of the segmentation targets in the plurality of images according to the positions of the pixel points forming the target three-dimensional entity in the plurality of images, so that the segmentation targets are segmented from the plurality of images.
13. A computer-readable storage medium, characterized in that the storage medium stores a computer program for executing the image segmentation method according to any one of claims 1 to 11.
14. An image segmentation apparatus characterized by comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor configured to perform the image segmentation method according to any one of claims 1 to 11.
CN201911214444.4A 2019-12-02 2019-12-02 Image segmentation method, device, computer-readable storage medium and equipment Active CN110992377B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911214444.4A CN110992377B (en) 2019-12-02 2019-12-02 Image segmentation method, device, computer-readable storage medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911214444.4A CN110992377B (en) 2019-12-02 2019-12-02 Image segmentation method, device, computer-readable storage medium and equipment

Publications (2)

Publication Number Publication Date
CN110992377A CN110992377A (en) 2020-04-10
CN110992377B true CN110992377B (en) 2020-12-22

Family

ID=70089213

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911214444.4A Active CN110992377B (en) 2019-12-02 2019-12-02 Image segmentation method, device, computer-readable storage medium and equipment

Country Status (1)

Country Link
CN (1) CN110992377B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111899237B (en) * 2020-07-27 2024-06-04 深圳蓝湘智影科技有限公司 Scale precision measuring method, apparatus, computer device and storage medium
CN112135048B (en) * 2020-09-23 2022-02-15 创新奇智(西安)科技有限公司 Automatic focusing method and device for target object
CN112288718B (en) * 2020-10-29 2021-11-02 推想医疗科技股份有限公司 Image processing method and apparatus, electronic device, and computer-readable storage medium
CN112348818B (en) * 2021-01-08 2021-08-06 杭州晟视科技有限公司 Image segmentation method, device, equipment and storage medium
CN114638843B (en) * 2022-03-18 2022-09-06 北京安德医智科技有限公司 Method and device for identifying high-density characteristic image of middle cerebral artery
CN115711836A (en) * 2022-11-17 2023-02-24 上海勘测设计研究院有限公司 Scanning particle size grading method and system
CN116805351A (en) * 2023-06-14 2023-09-26 壹品慧数字科技(上海)有限公司 Intelligent building management system and method based on Internet of things
CN117409200A (en) * 2023-10-19 2024-01-16 重庆科技学院 Rock mineral component and pore automatic segmentation method based on cluster analysis

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9984311B2 (en) * 2015-04-11 2018-05-29 Peter Yim Method and system for image segmentation using a directed graph
WO2017054775A1 (en) * 2015-09-30 2017-04-06 Shanghai United Imaging Healthcare Co., Ltd. System and method for determining a breast region in a medical image
CN108446730B (en) * 2018-03-16 2021-05-28 推想医疗科技股份有限公司 CT pulmonary nodule detection device based on deep learning
CN109903292A (en) * 2019-01-24 2019-06-18 西安交通大学 A kind of three-dimensional image segmentation method and system based on full convolutional neural networks
CN109919961A (en) * 2019-02-22 2019-06-21 北京深睿博联科技有限责任公司 A kind of processing method and processing device for aneurysm region in encephalic CTA image

Also Published As

Publication number Publication date
CN110992377A (en) 2020-04-10

Similar Documents

Publication Publication Date Title
CN110992377B (en) Image segmentation method, device, computer-readable storage medium and equipment
EP3770850B1 (en) Medical image identifying method, model training method, and computer device
CN110599448B (en) Migratory learning lung lesion tissue detection system based on MaskScoring R-CNN network
EP3553742B1 (en) Method and device for identifying pathological picture
US11562491B2 (en) Automatic pancreas CT segmentation method based on a saliency-aware densely connected dilated convolutional neural network
US20200074634A1 (en) Recist assessment of tumour progression
US20230005140A1 (en) Automated detection of tumors based on image processing
US8331669B2 (en) Method and system for interactive segmentation using texture and intensity cues
CN112150428A (en) Medical image segmentation method based on deep learning
CN111798462A (en) Automatic delineation method for nasopharyngeal carcinoma radiotherapy target area based on CT image
US20090123049A1 (en) Nodule Detection
Sinha et al. Medical image processing
CN110675464A (en) Medical image processing method and device, server and storage medium
WO2006024974A1 (en) Feature weighted medical object contouring using distance coordinates
GB2554641A (en) Image processing
CN114092450B (en) Real-time image segmentation method, system and device based on gastroscopy video
Lakshmi et al. Tooth decay prediction and classification from X-ray images using deep CNN
Liu et al. Automatic segmentation algorithm of ultrasound heart image based on convolutional neural network and image saliency
Rad et al. Dental x-ray image segmentation and multiple feature extraction
JP7019104B2 (en) Threshold learning method
CN112862785B (en) CTA image data identification method, device and storage medium
Nasim et al. Review on multimodality of different medical image fusion techniques
CN112862786A (en) CTA image data processing method, device and storage medium
Mbatha et al. Image Segmentation Techniques–Review
Prakash Medical image processing methodology for liver tumour diagnosis

Legal Events

Code Title Description

PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: Room B401, floor 4, building 1, No. 12, Shangdi Information Road, Haidian District, Beijing 100085
Applicant after: Tuxiang Medical Technology Co., Ltd
Address before: Room B401, floor 4, building 1, No. 12, Shangdi Information Road, Haidian District, Beijing 100085
Applicant before: Beijing Tuoxiang Technology Co.,Ltd.
GR01 Patent grant