CN101103924A

CN101103924A - Breast cancer computer-aided diagnosis method and system based on mammography

Info

Publication number: CN101103924A
Application number: CNA2007100527471A
Authority: CN
Inventors: 宋恩民; 姜娈; 金人超; 刘宏; 许向阳
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2007-07-13
Filing date: 2007-07-13
Publication date: 2008-01-16
Anticipated expiration: 2027-07-13
Also published as: CN100484474C

Abstract

The invention discloses a mammogram-based computer-aided diagnosis method and system for breast cancer. A mammogram to be diagnosed is first input into the system, and a series of feature values for each suspicious mass region is obtained through the region-of-interest extraction module, the suspicious mass segmentation module, and the suspicious mass region feature extraction module. The feature values are then input into a trained classifier, which classifies and identifies the suspicious mass regions. Finally, the segmentation result of each final suspicious mass region automatically detected by the computer is located on the input mammogram, and the computed region-related feature values are displayed to the radiologist as needed, prompting the radiologist to focus on that region and its key parameters. The invention can improve, to a certain extent, the accuracy and efficiency with which radiologists diagnose breast cancer, and it assists radiologists more objectively and effectively in forming diagnostic opinions and treatment plans.

Description

Mammary gland X-ray radiography-based breast cancer computer-aided diagnosis method and system

Technical Field

The invention belongs to the field of computer analysis of medical images, and in particular relates to a mammography-based computer-aided diagnosis method and system for breast cancer.

Background

Breast cancer is one of the most common malignant tumors in middle-aged and elderly women. It ranks first among the malignant tumors of women living in China, and its incidence is rising year by year. At present there is no good strategy for preventing breast cancer, so early diagnosis is the most effective way to reduce morbidity and mortality and to improve the cure rate. Studies have shown that mammography (molybdenum-target breast X-ray imaging) is an effective method for screening early, clinically asymptomatic breast cancer, and it is a general screening method for breast disease approved by the Food and Drug Administration (FDA).

Because of the imaging principle, high-density normal breast tissue in a mammogram can appear as bright as abnormal tissue. Lesions in young Asian women, whose breast tissue is dense, are therefore hard to find, and missed diagnoses and misdiagnoses occur easily. Early diagnosis of breast cancer depends to a great extent on the radiologist's experience, professional ability, and degree of fatigue, so subjective factors have a large influence. Diagnostic efficiency and accuracy are especially difficult to guarantee in mass screening for breast disease.

Computer-aided diagnosis (CAD) methods and systems for breast cancer apply the computing power of a computer to the automatic analysis of mammograms, provide radiologists with a valuable 'second opinion', and have clear advantages in reducing diagnostic workload and improving diagnostic efficiency and objectivity. The study by T. W. Freer et al. shows that the clinical use of breast cancer CAD systems significantly improves the detection rate of early malignant tumors (see T. W. Freer, et al., "Screening Mammography with Computer-aided Detection: Prospective Study of 12,860 Patients in a Community Breast Center," Radiology, 220(3): 781-786 (2001)). As a result, computer-aided diagnosis methods and systems for breast cancer are increasingly accepted by medical institutions worldwide and applied in clinical practice.

Generally, computer-aided diagnosis methods and systems for breast cancer cover the aided diagnosis of breast masses and the aided diagnosis of microcalcifications (see in particular David Gur, Bin Zheng, Yuan-Hsiang Chang, "Computerized detection of masses and microcalcifications in digital mammograms," United States Patent No. 5,627,907). Aided diagnosis means that a computer automatically analyzes and processes the breast image, identifies the regions of suspicious masses or suspicious microcalcifications in the image, computes a series of feature parameters for the marked regions, displays these parameters to the radiologist as needed, and thereby provides reference opinions for diagnosis. Currently, computer-aided diagnosis methods and systems achieve high sensitivity and a low false positive rate in detecting suspicious microcalcification regions, and have been accepted and adopted by radiologists (see in particular R. F. Brem, J. W. Hoffmeister, G. Zisman, et al., "A computer-aided detection system for the evaluation of breast cancer by mammographic appearance and lesion size," AJR Am. J. Roentgenol., 184 (2005)). Their detection performance for breast masses, however, is poor, and radiologists do not rely on them for that task. The focus of research has therefore become how to design a computer-aided diagnosis method and system that accurately determines suspicious breast cancer mass regions in a mammogram and extracts a series of effective region-related feature parameters, so as to assist the doctor accurately and effectively in diagnosing breast cancer.

Disclosure of Invention

The invention aims to provide a breast cancer computer-aided diagnosis method based on a breast X-ray radiography, which provides reference opinions for diagnosis of radiologists and is beneficial to improving the efficiency and the accuracy of positioning and diagnosing breast cancer masses; the invention also provides a system for realizing the method.

The invention provides a breast cancer computer-aided diagnosis method based on a breast X-ray radiography, which comprises the following steps:

step (1) inputting a mammogram to be diagnosed;

step (2) extracting a region of interest from the input mammography x-ray photograph to obtain the position of an initial suspicious lump region;

step (3) segmenting suspicious masses in the region of interest and determining the boundaries of the suspicious masses;

step (4) calculating the relevant characteristic value of the segmented suspicious lump area;

step (5) inputting the calculated feature values into a classifier, analyzing the initial suspicious mass regions, and determining the final suspicious mass regions;

step (6) locating the segmentation result of each final suspicious mass region on the mammogram to be diagnosed that was input in step (1), and displaying the feature values of the mass region calculated in step (4) to the user as needed.

The invention provides a breast cancer computer-aided diagnosis system based on a breast x-ray radiography, which comprises an input module, a region-of-interest extraction module, a suspicious mass segmentation module, a suspicious mass region characteristic extraction module, a classification diagnosis module and an output module, wherein the input module is used for inputting a breast x-ray radiography image;

the input module is used for receiving a mammary gland x-ray radiography to be diagnosed input by a user and transmitting the mammary gland x-ray radiography to the interest region extraction module;

the interesting region extraction module is used for extracting an interesting region in an input mammary gland X-ray photograph to obtain initial suspicious tumor region position information and transmitting the position information to the suspicious tumor segmentation module;

the suspicious tumor segmentation module is used for segmenting the suspicious tumor in the region of interest extracted by the region of interest extraction module to obtain boundary information of the suspicious tumor and transmitting the boundary information to the suspicious tumor region characteristic extraction module;

the suspicious tumor region feature extraction module calculates to obtain a related feature value according to the received boundary information of the suspicious tumor and transmits the related feature value to the classification diagnosis module;

the classification diagnosis module is used for inputting the calculated characteristic value of each suspicious lump region into the classifier, automatically classifying and identifying the initial suspicious lump region by a computer, determining the final suspicious lump region, transmitting the result to the output module, positioning the segmentation result of the detected final lump region on the input X-ray mammary gland photograph to be diagnosed by the output module, and displaying the calculated region-related characteristic value to a user according to the requirement.

In the invention, a mammogram to be diagnosed is first input into the system, and a series of feature values of each suspicious mass is obtained through region-of-interest extraction, suspicious mass segmentation, and feature extraction on the suspicious mass region (the region containing the suspicious mass). The feature values are then input into a classifier, which classifies and diagnoses the initial suspicious mass regions. Finally, the segmentation result of each final suspicious mass region automatically detected by the computer is located on the input mammogram, and the computed feature values are displayed to the user as needed. In summary, the method and system automatically detect suspicious breast masses, provide their position and shape, and supply a series of related feature parameters as required, thereby prompting the radiologist to focus on the regions needing attention and their key parameters, and improving the accuracy and efficiency of breast cancer diagnosis to a certain extent.

Drawings

FIG. 1 is a flow chart of a breast cancer computer-aided diagnosis method based on mammography according to the present invention;

FIG. 2 is a schematic structural diagram of a breast cancer computer-aided diagnosis system based on mammography according to the present invention;

FIG. 3 is a schematic diagram illustrating a process of obtaining candidate boundary points according to an embodiment of the present invention;

FIG. 4 is a list of features extracted from a suspicious mass region according to an embodiment of the present invention;

FIG. 5 is a diagram illustrating a classifier training process according to an embodiment of the present invention.

Detailed Description

The present invention is described in further detail below with reference to the attached drawings and examples.

As shown in fig. 1, the method of the present invention comprises the steps of:

(1) Inputting a mammary gland X-ray photograph to be diagnosed.

(2) Extracting regions of interest from the input mammogram to obtain the positions of the initial suspicious mass regions.

Analyzing the whole image involves a large amount of redundant information and easily introduces errors. To improve processing speed and accuracy, the object of processing is reduced from the whole image to a number of small regions, namely the regions of interest. The positions of the regions of interest are the positions of the initial suspicious mass regions used in subsequent processing.

For region-of-interest extraction, a method that compares the left and right breast images has been studied (see in particular F. F. Yin, M. L. Giger, K. Doi, et al., "Computerized detection of masses in digital mammograms: analysis of bilateral subtraction images," Medical Physics, 18.

Because a mass is characterized by high brightness, an approximately circular shape, and contrast with the surrounding tissue due to the gray-level difference, the method of the invention uses template matching to locate the regions of interest, i.e., the initial suspicious mass regions. The specific implementation is as follows:

(2.1) obtaining a template T by utilizing a two-dimensional hyperbolic secant (sech) function, and calculating the correlation between the input mammary gland x-ray radiography and the template T to obtain a correlation image;

A two-dimensional hyperbolic secant (sech) function is used to generate a template T of size (2L + 1) × (2L + 1).

The center of the template is taken as the origin; x and y are the horizontal and vertical coordinates within the template, each ranging over [-L, L], where L is a positive integer. α = (maximum gray level of the input mammogram) - 1, and β = ln(2 × α)/L × L.
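The template construction can be illustrated with a short sketch. Formula (1) itself is not reproduced in the text above, so the code assumes the form T(x, y) = α · sech(β · sqrt(x² + y²)) with β = ln(2α)/L, which is only one possible reading of the definitions; the function name `sech_template` is hypothetical.

```python
import math


def sech_template(L, max_gray):
    """Build a (2L+1) x (2L+1) template from a 2-D hyperbolic secant.

    Assumed form (not confirmed by the source text):
        T(x, y) = alpha * sech(beta * sqrt(x^2 + y^2))
    with alpha = max_gray - 1 and beta = ln(2 * alpha) / L.
    """
    alpha = max_gray - 1
    beta = math.log(2 * alpha) / L  # assumption: beta scales the radial distance
    template = []
    for y in range(-L, L + 1):
        row = []
        for x in range(-L, L + 1):
            r = math.hypot(x, y)
            row.append(alpha / math.cosh(beta * r))  # sech(z) = 1 / cosh(z)
        template.append(row)
    return template
```

Under this assumed form the template equals α at its center and falls to roughly 1 at a distance L from the center, which gives a bright, approximately circular blob resembling a mass.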

The template T is moved pixel by pixel over the input mammogram, and the correlation cor(T, S) between the template T and the sub-image S that it covers is calculated using formula (2):

cor(T, S) = (μ_st - μ_t × μ_s) / (σ_t × σ_s)    (2)

where μ_st is the mean of the products of the gray levels of corresponding pixels in the sub-image S and the template T, μ_t and μ_s are the mean pixel gray levels within the template T and the sub-image S, and σ_t and σ_s are the standard deviations of the pixel gray levels within the template T and the sub-image S. In this way, a correlation value with respect to the template T is obtained for every pixel of the input mammogram, with values in the range [-1, 1]. All correlation values are then calibrated to obtain a correlation image: a correlation below 0 is set to 0; otherwise the correlation value is multiplied by the maximum gray level of the input mammogram.
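The following minimal sketch shows the correlation and calibration for one template position. It assumes that formula (2) is the usual normalized cross-correlation, (μ_st − μ_t·μ_s)/(σ_t·σ_s), with σ taken as a standard deviation so that the result stays in [-1, 1]. The function names and the handling of flat patches are illustrative choices, not taken from the source.

```python
import math


def correlation(template, sub):
    """Normalized correlation between a template and an equally sized sub-image."""
    t = [v for row in template for v in row]
    s = [v for row in sub for v in row]
    n = len(t)
    mu_t = sum(t) / n
    mu_s = sum(s) / n
    mu_st = sum(a * b for a, b in zip(t, s)) / n
    sigma_t = math.sqrt(sum((a - mu_t) ** 2 for a in t) / n)
    sigma_s = math.sqrt(sum((b - mu_s) ** 2 for b in s) / n)
    if sigma_t == 0 or sigma_s == 0:
        return 0.0  # assumption: a flat patch is treated as uncorrelated
    return (mu_st - mu_t * mu_s) / (sigma_t * sigma_s)


def calibrate(cor, max_gray):
    """Map a correlation in [-1, 1] to a gray level; negative values become 0."""
    return 0.0 if cor < 0 else cor * max_gray
```

Applying `correlation` at every pixel position and passing each result through `calibrate` would yield the correlation image used in the next step.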

(2.2) carrying out binarization processing on the correlation degree image by using a selection threshold, extracting all connected regions from the binarization correlation degree image and determining the center of each connected region;

A suitable threshold T_low is selected to binarize the correlation image obtained in step (2.1): if the correlation value of a pixel in the correlation image is less than T_low, the pixel is assigned 0; otherwise it is assigned 1. Connected regions are then extracted from the binarized correlation image. For each connected region, the point with the maximum correlation within the corresponding region of the correlation image from step (2.1) is taken as the center of that region.
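A sketch of this step, assuming 4-connectivity (the text does not say which connectivity is used) and a correlation image stored as a list of rows. The function name `connected_regions` is hypothetical.

```python
from collections import deque


def connected_regions(cor_img, t_low):
    """Binarize with t_low and return [(pixels, center), ...] per connected region.

    The center is the pixel with the highest correlation inside the region.
    """
    h, w = len(cor_img), len(cor_img[0])
    binary = [[1 if cor_img[y][x] >= t_low else 0 for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first flood fill over 4-connected neighbors (assumption).
            queue, pixels = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            center = max(pixels, key=lambda p: cor_img[p[0]][p[1]])
            regions.append((pixels, center))
    return regions
```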

(2.3) redefining a template with a size different from that of the template in the step (2.1) by using the template defining method in the step (2.1), and performing multi-scale analysis on each connected region extracted in the step (2.2);

Following the template definition method of step (2.1), three templates are redefined at 200%, 33%, and 66% of the original template size 2L + 1. For each connected region extracted in step (2.2), the center of each scaled template is moved to the position of the region's center on the original input mammogram, and the correlation between each scaled template and the corresponding sub-image of the original input mammogram is calculated. The maximum of these three correlations and the correlation obtained in step (2.1) is taken as the final correlation of the connected region. A suitable threshold T_high (T_high greater than T_low) is selected, and connected regions whose final correlation is less than T_high are excluded as false positive regions (regions the computer considers to be non-mass).
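The selection logic of the multi-scale analysis can be sketched as follows. The code assumes the per-scale correlations have already been computed and stored with each region; the dictionary keys `base` and `scaled` are hypothetical names used only for illustration.

```python
def final_correlation(base_cor, scaled_cors):
    """Final correlation: maximum over the step (2.1) value and the scaled-template values."""
    return max([base_cor, *scaled_cors])


def exclude_false_positives(regions, t_high):
    """Keep only regions whose final correlation reaches t_high.

    Each region is a dict with 'base' (correlation from step 2.1) and
    'scaled' (correlations for the three rescaled templates).
    """
    return [r for r in regions if final_correlation(r["base"], r["scaled"]) >= t_high]
```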

(2.4) screening the residual connected regions processed in the step (2.3) by using area and shape rules;

The connected regions not excluded by the multi-scale analysis of step (2.3) are further screened using simple area and shape rules. In general, a connected region with a small area (usually 5-30 pixels) corresponds to a breast calcification, while an elongated, strip-shaped region usually corresponds to normal glandular tissue; connected regions satisfying these conditions are excluded as false positive regions.
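A minimal sketch of such a screening rule is given below. The text gives only the typical small-area range; the bounding-box aspect-ratio test for "strip-shaped" regions and its threshold `max_aspect` are assumptions made for illustration, as is treating either condition alone as sufficient for exclusion.

```python
def passes_area_shape_rules(pixels, max_small_area=30, max_aspect=3.0):
    """Return True if a region survives the area and shape screening.

    pixels: list of (row, col) coordinates belonging to the region.
    max_small_area: regions with at most this many pixels are treated as calcifications.
    max_aspect: hypothetical bounding-box aspect ratio above which a region
                is treated as elongated glandular tissue.
    """
    if len(pixels) <= max_small_area:
        return False
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    aspect = max(height, width) / min(height, width)
    return aspect <= max_aspect
```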

(2.5) extracting a region of interest;

after the region screening in step (2.4), a square (usually 125 × 125 pixels) is cut out from the original input breast image for the remaining connected region by taking its geometric center as the center, and the extracted square region is the region of interest or referred to as the initial suspicious mass region.
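Cutting out the square region can be sketched as below. The geometric center is taken here as the rounded centroid of the region's pixels, and the window is shifted inward when it would cross the image border; both choices are assumptions, since the text does not describe border handling.

```python
def crop_roi(image, pixels, size=125):
    """Cut a size x size square centered on the region's geometric center.

    image: list of rows of gray levels; pixels: (row, col) coordinates of the region.
    """
    h, w = len(image), len(image[0])
    cy = round(sum(p[0] for p in pixels) / len(pixels))
    cx = round(sum(p[1] for p in pixels) / len(pixels))
    half = size // 2
    # Clamp the window so it stays inside the image (assumed border policy).
    top = min(max(cy - half, 0), max(h - size, 0))
    left = min(max(cx - half, 0), max(w - size, 0))
    return [row[left:left + size] for row in image[top:top + size]]
```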

(3) Segmenting the suspicious masses in the regions of interest extracted in step (2), and determining the boundaries of the suspicious masses.

After the region of interest is extracted, suspicious masses contained in the region of interest need to be segmented, and the boundaries of the suspicious masses need to be accurately determined.

For suspicious mass segmentation, the methods that have been studied include multi-layer topographic region growing (see in particular B. Zheng, Y. H. Chang, D. Gur, "Computerized Detection of Masses in Digitized Mammograms Using Single-Image Segmentation and a Multilayer Topographic Feature Analysis," Acad. Radiol., 2: 259-266 (2001)), segmentation based on multi-resolution analysis (see in particular S. Liu, C. Babbs, and E. Delp, "Multiresolution Detection of Spiculated Lesions in Digital Mammograms," IEEE Transactions on Image Processing, 10(6): 874-884 (2001)), and threshold segmentation based on fuzzy entropy (see in particular Amr R. Abdel-Dayem, Mahmoud R. El-Sakka, "Fuzzy Entropy Based Detection of Suspicious Masses in Digital Mammogram Images," Proceedings of the 2005 IEEE Engineering in Medicine and Biology Annual Conference, 4017-4022 (September 2005)), among others.

The method adopts a segmentation method based on image gradient and dynamic programming method. The concrete implementation is as follows:

(3.1) determining candidate boundary points on the boundary of the suspicious lump by using the gradient correlation characteristics of the image of the region of interest;

As shown in the left diagram of FIG. 3, the region of interest obtained in step (2) can be binarized with a given gray threshold Thres_1, and a boundary tracking method then yields the corresponding contour line 1 in the binarized region of interest. Repeating the process with threshold Thres_2 yields contour line 2; with threshold Thres_i, contour line i; and with threshold Thres_n, contour line N. Selecting multiple thresholds thus yields multiple contour lines (a contour line group). The density of the contour lines is related to the image gradient: the gradient is larger where the contour lines are dense and smaller where they are sparse.

With the center of the region of interest as the endpoint, R rays are drawn outward at equal angular intervals, counterclockwise from the zero-degree angle, and the intersections of each ray with the contour line group are found. If the Euclidean distance between two adjacent intersections on the same ray is less than D_min, the two intersections are said to be connected. A set of more than S_min connected points is represented by its center, which is taken as a candidate boundary point. A ray may have several candidate boundary points or none. The right image in FIG. 3 shows the candidate boundary points marked in the original region of interest.
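The grouping of intersections on a single ray can be sketched as follows. Because all intersections lie on the same ray, each one is represented here by its distance from the center of the region of interest, and the Euclidean distance between two of them is the difference of those distances. Taking the "center" of a connected set as the mean distance is an assumption, and the function name is hypothetical.

```python
def candidate_points_on_ray(radii, d_min=3.0, s_min=10):
    """Group ray/contour intersections into connected sets and return candidates.

    radii: distances from the ROI center to each intersection on one ray.
    Returns a list of (candidate_radius, number_of_connected_points).
    """
    clusters, current = [], []
    for r in sorted(radii):
        # Adjacent intersections are connected when closer than d_min.
        if current and r - current[-1] >= d_min:
            clusters.append(current)
            current = []
        current.append(r)
    if current:
        clusters.append(current)
    # Keep only sets with more than s_min connected points.
    return [(sum(c) / len(c), len(c)) for c in clusters if len(c) > s_min]
```

The number of connected points is returned alongside each candidate because the external cost in step (3.2) depends on it.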

(3.2) obtaining a plurality of candidate boundary lines from the candidate boundary points on each ray obtained in step (3.1), and selecting the optimal candidate boundary line by dynamic programming, thereby determining the boundary line of the suspicious mass;

Ideally, if each ray in step (3.1) has exactly one candidate boundary point, connecting the candidate boundary points on the rays in sequence forms a unique candidate boundary line, which is the boundary line of the suspicious mass. In practice, however, a ray may carry several candidate boundary points. Selecting one candidate boundary point on each ray and connecting the selected points in sequence forms one candidate boundary line, so many candidate boundary lines are possible. Because the actual boundary line passes through positions where the image gray value changes sharply (that is, where the image gradient is large) and has a certain smoothness, the method of the invention defines a cost function to determine the cost of each candidate boundary line and uses dynamic programming to select, from all candidate boundary lines, the one with the optimal cost as the final boundary line of the suspicious mass. The specific implementation is as follows:

Let a candidate boundary line be S: {n_1, n_2, …, n_R}, where the variable n_i denotes the candidate boundary point selected on the i-th ray. If the i-th ray has no candidate boundary point, one is interpolated from the distances between the center of the region of interest and the candidate boundary points on the (i-1)-th and (i+1)-th rays. The cost function C of the candidate boundary line S is the sum of the local costs of all candidate boundary points on it, that is:

C(S) = Σ_{i=1}^{R} C(n_i)    (3)

The local cost C(n_i) consists of an internal cost C_int(n_i) and an external cost C_ext(n_i):

C(n_i) = αC_int(n_i) + C_ext(n_i)    (4)

where α is a constant used to adjust the smoothness of the boundary line.

The internal cost C_int(n_i) is defined as the normalized distance between candidate boundary points n_i and n_{i-1}:

where dist(n_i, n_{i-1}), dist(O, n_i), and dist(O, n_{i-1}) denote the distances between candidate boundary points n_i and n_{i-1}, between the center O of the region of interest and n_i, and between O and n_{i-1}, respectively. The smaller the normalized distance, the smoother the candidate boundary line and the smaller the cost.

The external cost C_ext(n_i) is defined as the negative of the number of connected points represented by candidate boundary point n_i. The more connected points a candidate point represents, the larger the gradient at that point and the more likely it is to be a real boundary point.

C_ext(n_i) = -(number of connected points of n_i)    (6)

The cost of each candidate boundary line is obtained from the cost function, and dynamic programming is used to determine the optimal candidate boundary line among all candidates as the final boundary line of the suspicious mass.
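The dynamic programming search can be sketched as follows. Several points are assumptions rather than details from the text: the internal cost is normalized by the sum dist(O, n_i) + dist(O, n_{i-1}), since the exact normalization of formula (5) is not shown; the contour is treated as an open chain from the first ray to the last; and every ray is assumed to already have at least one candidate (interpolated if necessary). Function names are hypothetical.

```python
import math


def internal_cost(r_prev, r_cur, dtheta):
    """Assumed normalized distance between two candidates on adjacent rays."""
    # Law of cosines: distance between polar points (r_prev, 0) and (r_cur, dtheta).
    d = math.sqrt(r_prev ** 2 + r_cur ** 2 - 2 * r_prev * r_cur * math.cos(dtheta))
    return d / (r_prev + r_cur)


def best_boundary(candidates, alpha=110.0):
    """Pick one candidate per ray minimizing sum(alpha * C_int + C_ext).

    candidates[i]: list of (radius, n_connected) on ray i.
    Returns the chosen radius on each ray.
    """
    num_rays = len(candidates)
    dtheta = 2 * math.pi / num_rays
    cost = [-n for _, n in candidates[0]]  # external cost only for the first ray
    back = []
    for i in range(1, num_rays):
        new_cost, pointers = [], []
        for r_cur, n_cur in candidates[i]:
            options = [cost[k] + alpha * internal_cost(r_prev, r_cur, dtheta)
                       for k, (r_prev, _) in enumerate(candidates[i - 1])]
            k_best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[k_best] - n_cur)  # C_ext = -(connected points)
            pointers.append(k_best)
        cost = new_cost
        back.append(pointers)
    # Trace the cheapest path back from the last ray to the first.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j][0] for i, j in enumerate(path)]
```

Because each ray only interacts with its predecessor, the search costs on the order of R times the square of the number of candidates per ray, instead of enumerating every combination.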

(4) Calculating a series of feature values of the segmented suspicious mass. The selected features can generally be divided into geometric, morphological, gray-level, and texture features, among others.

The selected features should have the following properties:

(1) discriminability: the feature values of objects of different classes differ markedly;

(2) reliability: objects of the same class have similar feature values;

(3) independence: the features should not be strongly correlated with one another.

Following these rules, 25 features of the suspicious mass region are extracted; they are described in detail in the table of FIG. 4.

(5) Inputting the calculated feature values into a classifier, which automatically analyzes the initial suspicious mass regions and determines the final suspicious mass regions.

Suspicious mass classification is the last stage of automatic detection of breast masses. After extracting the characteristic values reflecting the characteristics of the tumor from the suspicious tumor with the boundary, the classifier is used to determine whether the suspicious tumor is positive (the true tumor area considered by the computer) or false positive. The selection and design of the classifier largely determines the accuracy of the tumor detection. Classification is an important component of pattern recognition theory, and can generally use methods such as linear classification, heuristic rules, statistical classification, fuzzy classification, artificial neural networks, and the like to classify features. The method of the invention adopts an improved k nearest neighbor method to classify the extracted features in the step (4).

The basic rule of the k-nearest neighbor method is as follows: among all samples (excluding the test sample), find the k samples whose feature vectors are closest (or most similar) to that of the test sample, let these samples vote, and assign the test sample to the class that receives the most votes. The k-nearest neighbor classifier used in the method first defines a feature vector similarity function and a decision function (DI for short).

(1) Definition of similarity function

Let the feature vector of the test sample Y_Q be denoted V(Y_Q), and the feature vector of a sample X (other than the test sample) be denoted V(X). The similarity function is defined as the reciprocal of the squared Euclidean distance between the two feature vectors, i.e.

Sim(Y_Q, X) = 1 / ||V(Y_Q) - V(X)||^2    (7)
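The similarity definition translates directly into code. Returning infinity for identical vectors is an implementation choice made here to avoid division by zero; the text does not address that case.

```python
def similarity(v_q, v_x):
    """Reciprocal of the squared Euclidean distance between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(v_q, v_x))
    return float("inf") if d2 == 0 else 1.0 / d2
```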

(2) Definition of decision function

When making a decision for the test sample Y_Q, samples whose feature vectors are closer to that of the test sample should in principle have more influence on the decision. In the original k-nearest neighbor classifier, simple voting can hardly reflect differences in vector distance, so the decision function is redefined. The following takes the decision between the mass class and the normal class as an example.

As shown in formula (8), the decision value of the test sample Y_Q is calculated from the first k samples when the similarities are arranged from largest to smallest. Among the k samples nearest to the test sample Y_Q, the mass class is denoted Mass and the normal class Norm, with M and N samples respectively; Sim(Y_Q, X) is the similarity between the test sample Y_Q and a sample X as defined in (1); Rnk(X) is the rank of sample X in the similarity ordering; X_j^Mass denotes the j-th mass-class sample, with j in [1, M]; and X_l^Norm denotes the l-th normal-class sample, with l in [1, N].

The decision value calculation method considers the classes of samples adjacent to the test sample like a simple voting method, also considers the sequence of the similarity of the samples and the test sample, and experiments prove that the decision function is superior to the calculation method of the original k neighbor decision function.
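Since formula (8) is not reproduced in the text, the sketch below only illustrates the general idea of a decision value that accounts for both class membership and similarity rank. The weighting 1/Rnk(X) and the normalization to [0, 1] are hypothetical choices, not the patent's decision function.

```python
def decision_index(test_vec, samples, k=5):
    """Rank-weighted k-nearest-neighbor decision value in [0, 1].

    samples: list of (feature_vector, label) with label "Mass" or "Norm".
    Values above 0.5 favor the mass class under this assumed weighting.
    """
    def sim(a, b):
        d2 = sum((p - q) ** 2 for p, q in zip(a, b))
        return float("inf") if d2 == 0 else 1.0 / d2

    # The k most similar samples, ordered from most to least similar.
    ranked = sorted(samples, key=lambda s: sim(test_vec, s[0]), reverse=True)[:k]
    mass = norm = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        weight = 1.0 / rank  # hypothetical weight: closer ranks count more
        if label == "Mass":
            mass += weight
        else:
            norm += weight
    return mass / (mass + norm)
```

With this weighting, a single very similar mass sample can outweigh two less similar normal samples, which simple voting would not allow.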

For the application of the classifier, training data is firstly used to train the classifier to obtain classifier parameters suitable for specific problems, and the classifier training process includes two steps of collecting classifier training data and obtaining classifier parameters, as shown in fig. 5:

(5.1) collecting classifier training data;

firstly, inputting a group of mammary X-ray radiographs with known diagnosis results, and applying the steps of the region-of-interest extraction, the suspicious tumor region segmentation and the feature extraction of the suspicious tumor region to obtain the segmentation results of the suspicious tumor and a series of related feature values, thereby completing the collection of classifier training data.

(5.2) obtaining classifier parameters;

training the designed classifier by using the calculated related characteristic value of the suspicious mass and the actual diagnosis result of the suspicious mass to obtain classification parameters, and writing the classification parameters into a classifier parameter file until the training process of the classifier is finished.

(6) Locating the segmentation result of each final suspicious mass region on the mammogram to be diagnosed that was input in step (1), and displaying the feature values of the mass region calculated in step (4) to the user as needed.

As shown in fig. 2, the diagnosis assistance system of the present invention includes an input module 100, a region of interest extraction module 200, a suspicious mass segmentation module 300, a suspicious mass region feature extraction module 400, a classification diagnosis module 500, and an output module 600.

The input module 100 is used for receiving the x-ray radiograph of the breast to be diagnosed input by the user and transmitting the x-ray radiograph to the region of interest extraction module 200.

The region of interest extraction module 200 extracts the region of interest in the input mammography, obtains the initial suspicious mass region position information according to the steps described in the step (2), and transmits the position information to the suspicious mass segmentation module 300.

The suspicious tumor segmentation module 300 segments the suspicious tumor in the region of interest extracted by the region of interest extraction module 200 according to the process described in the step (3), so as to obtain boundary information of the suspicious tumor, and transmit the boundary information to the suspicious tumor region feature extraction module 400.

The suspicious mass region feature extraction module 400 calculates a series of region-related feature values, such as geometric features, morphological features, gray-scale features, texture features, etc., according to the received boundary information of the suspicious mass, and transmits the result to the classification diagnosis module 500.

The classification diagnosis module 500 inputs the calculated feature value of each suspicious mass region into a classifier, performs computer automatic classification and identification on the initial suspicious mass region, determines the final suspicious mass region, and transmits the result to the output module 600, the output module 600 positions the segmentation result of the final suspicious mass region automatically detected by the computer on the input x-ray mammogram of the breast to be diagnosed, and displays the calculated region-related feature value to the user as required.

Example:

The mammography-based computer-aided diagnosis method and system of the invention involve a number of parameters, which are tuned together according to the characteristics of the data being processed so that the whole system performs well. The parameters set for the data set processed in the invention are listed below:

Step (2.1): the size-related parameter of the initial template obtained with the two-dimensional hyperbolic secant (sech) function is L = 25;

Step (2.2): the threshold selected for binarizing the correlation image is T_low = 0.5 × the maximum gray level of the input mammogram;

Step (2.3): the threshold selected in the multi-scale analysis is T_high = 0.6 × the maximum gray level of the input mammogram;

Step (3.1): taking the center of the region of interest as the endpoint, R = 64 rays are drawn outward in the counterclockwise direction at equal angular intervals, starting from the zero-degree angle, and the intersection points of each ray with the contour line group are obtained. If the Euclidean distance between two adjacent intersection points on the same ray is less than D_min = 3, the two intersection points are said to be connected. The condition applied to connected point sets is that they contain more than S_min = 10 points;

Step (3.2): the local cost C(n_i) consists of an internal cost C_int(n_i) and an external cost C_ext(n_i):

C(n_i) = αC_int(n_i) + C_ext(n_i)

where the constant parameter α = 110 adjusts the smoothness of the boundary line.
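As an illustration of how the parameter values listed above enter the calculations, the following Python sketch collects them as constants and evaluates the expressions stated in the text: the two gray-level thresholds, the equally spaced ray angles, the connectivity test on intersection points, and the local cost. The constant and function names are hypothetical, and the internal and external costs are taken as given numbers because their definitions are not part of this passage.

```python
import math

# Parameter values as listed for the processed data set.
L_TEMPLATE = 25      # step (2.1): size-related parameter of the initial sech template
T_LOW_FACTOR = 0.5   # step (2.2): factor applied to the maximum gray level
T_HIGH_FACTOR = 0.6  # step (2.3): factor applied to the maximum gray level
R_RAYS = 64          # step (3.1): number of rays
D_MIN = 3            # step (3.1): distance below which two points are connected
S_MIN = 10           # step (3.1): size condition on connected point sets
ALPHA = 110          # step (3.2): weight of the internal cost


def thresholds(max_gray):
    """Return (T_low, T_high) for an image whose maximum gray level is max_gray."""
    return T_LOW_FACTOR * max_gray, T_HIGH_FACTOR * max_gray


def ray_angles(r=R_RAYS):
    """Angles in radians, counterclockwise from zero, at equal angular intervals."""
    return [2.0 * math.pi * i / r for i in range(r)]


def are_connected(p, q, d_min=D_MIN):
    """True if the Euclidean distance between points p and q is less than d_min."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) < d_min


def local_cost(c_int, c_ext, alpha=ALPHA):
    """C(n_i) = alpha * C_int(n_i) + C_ext(n_i)."""
    return alpha * c_int + c_ext
```

With R = 64 the angular interval is 360/64 = 5.625 degrees. Because α multiplies only the internal cost, a larger α gives the internal term more influence on the boundary line chosen.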

Using the mammography-based breast cancer computer-aided diagnosis system, the method automatically analyzes suspicious breast mass regions in a mammogram, provides the location and shape of each mass, and supplies a series of region-related feature parameters as required. This prompts the radiologist to focus on the regions and region-related parameters that need attention, and improves the accuracy and efficiency of the radiologist's breast cancer diagnosis to a certain extent. The implementation of the present invention is not limited to the scope disclosed in the above examples; the technical solutions described above may also be implemented in ways that differ from these examples.

Claims

1. A method for computer-aided diagnosis of breast cancer based on mammography, comprising the following steps:

Step (1): inputting a mammogram to be diagnosed;

Step (2): extracting regions of interest from the input mammogram to obtain the positions of the initial suspicious mass regions;

Step (3): segmenting the suspicious mass in each region of interest and determining the boundary of the suspicious mass;

Step (4): calculating the relevant feature values of the segmented suspicious mass region;

Step (5): inputting the calculated feature values into a classifier, analyzing the initial suspicious mass regions, and determining the final suspicious mass regions;

Step (6): locating the segmentation result of each final suspicious mass region on the mammogram to be diagnosed that was input in step (1), and displaying the feature values of the mass region calculated in step (4) to the user.

2. The method according to claim 1, characterized in that step (2) comprises the following process:

Step (2.1): using the two-dimensional hyperbolic secant function to obtain a template, and calculating the correlation between the input mammogram and the template to obtain a correlation image;

Step (2.2): if the correlation value of a pixel in the correlation image is less than the selected threshold, assigning that pixel the value 0, and otherwise the value 1; extracting all connected regions in the binarized correlation image, and taking as the center of each connected region the point with the maximum correlation within the corresponding region of the correlation image obtained in step (2.1);

Step (2.3): using the two-dimensional hyperbolic secant function to obtain templates whose sizes differ from that of the template in step (2.1); for each connected region extracted in step (2.2), moving the center of each scale template to the position of the center of the connected region in the original input mammogram, calculating the correlation between each scale template and the corresponding sub-image of the original input mammogram, and taking the maximum of all the calculated correlations as the final correlation of the connected region; selecting a threshold and excluding, as false positive regions, the connected regions whose final correlation is less than the threshold;

Step (2.4): among the connected regions not excluded in step (2.3), excluding the connected regions of 5-30 pixel size and the strip-shaped connected regions;

Step (2.5): for each connected region retained after the screening in step (2.4), cropping a square from the original input mammogram centered on the geometric center of the region; the extracted square region is the region of interest, also called the initial suspicious mass region.

3. The method according to claim 1 or 2, characterized in that step (3) segments the suspicious mass in the region of interest according to the following steps:

Step (3.1): obtaining the contour line group related to the gradient characteristics of the image of the region of interest; then, taking the center of the region of interest as the endpoint and starting from the zero-degree angle, drawing several rays outward in the counterclockwise direction at equal angular intervals, and obtaining the intersection points of each ray with the contour line group, thereby obtaining the candidate boundary points on each ray;

Step (3.2): obtaining multiple candidate boundary lines from the candidate boundary points on each ray obtained in step (3.1), and then selecting the best candidate boundary line as the boundary line of the suspicious mass.

4. The method according to claim 1, 2 or 3, characterized in that the classifier in step (5) redefines the feature vector similarity function and the decision function of the original k-nearest neighbor method as follows:

① The feature vector of the test sample Y_Q is denoted V(Y_Q), and the feature vector of a sample X other than the test sample is denoted V(X); the similarity function is defined as the reciprocal of the squared Euclidean distance between the two feature vectors, that is,

$$\mathrm{Sim}(Y_Q, X) = \frac{1}{\left\| V(Y_Q) - V(X) \right\|^{2}}$$

② Taking the first K samples in descending order of similarity, the decision value of the test sample Y_Q is calculated with the formula below. Among the K samples closest to the test sample Y_Q, the mass class is denoted Mass and contains M samples, and the normal class is denoted Norm and contains N samples. Sim(Y_Q, X) is the similarity between the test sample Y_Q and a sample X as defined in ①, and Rnk(X) is the rank of sample X in the descending order of similarity. X_j^Mass denotes the j-th mass sample, with j in the range [1, M], and X_l^Norm denotes the l-th normal sample, with l in the range [1, N]. Then

$$\mathrm{DI}(Y_Q) = \frac{\sum_{j=1}^{M} \mathrm{Sim}(Y_Q, X_j^{\mathrm{Mass}}) \times \left(K + 1 - \mathrm{Rnk}(X_j^{\mathrm{Mass}})\right)}{\sum_{j=1}^{M} \mathrm{Sim}(Y_Q, X_j^{\mathrm{Mass}}) \times \left(K + 1 - \mathrm{Rnk}(X_j^{\mathrm{Mass}})\right) + \sum_{l=1}^{N} \mathrm{Sim}(Y_Q, X_l^{\mathrm{Norm}}) \times \left(K + 1 - \mathrm{Rnk}(X_l^{\mathrm{Norm}})\right)}.$$

5. A breast cancer computer-aided diagnosis system based on mammography, comprising an input module (100), a region of interest extraction module (200), a suspicious mass segmentation module (300), a suspicious mass region feature extraction module (400), a classification diagnosis module (500) and an output module (600);

The input module (100) is used to receive the mammogram to be diagnosed that is input by the user, and to send it to the region of interest extraction module (200);

The region of interest extraction module (200) is used to extract the regions of interest in the input mammogram to obtain the position information of the initial suspicious mass regions, and to send it to the suspicious mass segmentation module (300);

The suspicious mass segmentation module (300) is used to segment the suspicious mass in the region of interest extracted by the region of interest extraction module (200), to obtain the boundary information of the suspicious mass, and to send it to the suspicious mass region feature extraction module (400);

The suspicious mass region feature extraction module (400) calculates the relevant feature values from the received boundary information of the suspicious mass, and sends them to the classification diagnosis module (500);

The classification diagnosis module (500) is used to input the calculated feature values of each suspicious mass region into the classifier, to automatically classify and identify the initial suspicious mass regions, to determine the final suspicious mass regions, and to send the result to the output module (600); the output module (600) locates the segmentation result of each detected final mass region on the input mammogram to be diagnosed, and displays the calculated region-related feature values to the user as required.
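The similarity function and decision value defined in claim 4 can be illustrated with a short Python sketch. It follows the two formulas as stated: similarity is the reciprocal of the squared Euclidean distance, and each of the K most similar samples contributes its similarity multiplied by (K + 1 − rank). The function names, the "Mass"/"Norm" string labels, and the small `eps` guard against a zero distance are assumptions made for this example and are not part of the claim.

```python
def similarity(v_q, v_x, eps=1e-12):
    """Reciprocal of the squared Euclidean distance between two feature vectors.

    eps is an assumption of this sketch: it avoids division by zero when the
    two vectors are identical, a case the claim text does not address.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(v_q, v_x))
    return 1.0 / max(d2, eps)


def decision_index(v_q, samples, k):
    """Decision value DI(Y_Q) for a test feature vector v_q.

    samples is a list of (feature_vector, label) pairs, where label is
    "Mass" or "Norm" (hypothetical labels for the two classes).
    """
    # Sort by similarity in descending order and keep the first k samples.
    scored = sorted(
        ((similarity(v_q, v), label) for v, label in samples),
        key=lambda item: item[0],
        reverse=True,
    )[:k]

    mass_sum = 0.0
    norm_sum = 0.0
    # rank 1 is the most similar sample, so its weight factor is k.
    for rank, (sim, label) in enumerate(scored, start=1):
        weight = sim * (k + 1 - rank)
        if label == "Mass":
            mass_sum += weight
        else:
            norm_sum += weight
    return mass_sum / (mass_sum + norm_sum)
```

The value lies between 0 and 1: it is 1 when all K neighbors are mass samples and 0 when all are normal samples. How the decision value is turned into a final mass or non-mass decision is not specified in the claim, so the sketch stops at the value itself.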