CN109711411B - Image segmentation and identification method based on capsule neurons - Google Patents
- Publication number
- CN109711411B (application CN201811505408.9A)
- Authority
- CN
- China
- Prior art keywords
- target
- capsule
- shape
- network
- segmentation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a collaborative segmentation and recognition method based on capsule neurons. The method uses a network built from capsule neurons to model and learn the shape knowledge of a target, and builds a collaborative segmentation and recognition model on top of this network. Compared with classical scalar neurons, capsule neurons can analyze and capture the geometric relations from low-level local instances of the target to high-level ones, layer by layer, up to the whole target. They can therefore disentangle the target's features from background interference, and the extracted features can further be used to reconstruct and generate the target. Based on these properties of capsule neurons, the invention builds an encoder-decoder network topology that can effectively learn and exploit prior knowledge of the target and apply it in the collaborative segmentation and recognition model. The method is highly extensible: the encoder and decoder networks can be replaced by other suitable neural networks to meet different requirements.
Description
Technical field
The invention belongs to the field of image segmentation, automatic recognition and target representation, and particularly relates to an image segmentation and recognition method based on capsule neurons. The model makes efficient use of the properties specific to capsule neurons.
Background
In models and technical methods where target segmentation and target recognition cooperate with each other, effective representation of the target is a key problem. A suitable model and representation method, together with a way to generate a reference target from prior knowledge, play an important role in establishing the cooperative process. In addition, the extensibility of the model must be considered in practical applications: in some cases the model needs to be scaled up or down for different applications to meet different resource and performance requirements.
In recent years, deep learning and deep neural networks have played a major role in many computer vision and image processing tasks. The convolutional neural network is currently the most commonly used deep neural network, favored by the research community and industry for its strong extensibility and excellent learning and representation ability. The capsule neuron is a neural unit recently proposed by Prof. Hinton, mainly aimed at solving the problem that convolutional neural networks lose feature position information during inference. Capsule neurons focus on capturing the geometric relations from target parts to the target whole, trying to preserve such relations and propagate the associated information. A capsule neuron can therefore resolve the target and its features from numerous interferences, filtering most of them out.
These characteristics of capsule neurons are very helpful for the task of collaborative target segmentation and recognition. On one hand, the real target can be analyzed out of the segmentation result and its features extracted, filtering out most of the interference introduced during segmentation; these features can then be used to reconstruct or generate the real target. On the other hand, a capsule-based deep neural network also has good extensibility.
The invention builds a network with an encoder-decoder architecture based on capsule neural units and introduces it into a model for collaborative target segmentation and recognition, realizing the learning, representation and generation of target shape knowledge, and thereby the mutual cooperation of the segmentation task and the recognition task.
Disclosure of Invention
The invention aims to provide an image segmentation and recognition method based on capsule neurons. The method uses a capsule-neuron-based deep neural network to learn, model and represent the shape of the target. The deep neural network comprises two basic modules: an encoder and a decoder. The encoder uses capsule neural units to extract and recognize target features in the current segmentation result; the decoder generates, from the extracted target features and the recognition result, a target shape to be referenced by the segmentation model. The two modules let the two tasks exchange information and work in concert to achieve better performance, and make the segmentation and recognition processes more interpretable.
The invention adopts the following technical scheme: an image segmentation and recognition method based on capsule neurons, comprising the following steps:
Step 1: given pair data {(target shape m_i, target class label y_i)} containing L different classes, where i = 1, …, N indexes the samples, m_i ∈ {0, 1}^(H×W), and H and W are the height and width of image m_i: use capsule neurons to build and train an encoder network Enc that learns and extracts each target shape m_i's feature V_i ∈ ℝ^(L×D), where D is the dimension of the top-layer capsule neurons of the encoder network; at the same time, based on the extracted features V_i, train a decoder network Dec for generating the target shape;
Step 2: for the image to be segmented and recognized, I ∈ ℝ^(H×W×C), in which there is one and only one target, where C is the number of channels of image I: use an energy function E_data(I, q) based on the image data to perform an initial segmentation of I; according to the principle of optimal energy, the segmentation yields an initial result q ∈ [0, 1]^(H×W), where the value q(x) at pixel position x characterizes the probability that the pixel belongs to the target;
Step 3: use the encoder network Enc to analyze and recognize the initial result q, obtaining the target shape feature V; the recognized target class label is t = argmax_l ||v_l||, where v_l is the l-th row of the target feature V and ||v_l|| is its norm. The properties of capsule neurons guarantee ||v_l|| ∈ [0, 1], so ||v_l|| also represents the probability that the target belongs to class l;
Step 4: based on V and the recognition result t, generate a reference shape m̂ of the target using the decoder network Dec, and update the energy function of step 2 as follows:

E(q, t) = α × E_data(I, q) + (1 - α) × E_shape(q, t)

where E_shape(q, t) is a loss function between the reference shape m̂ and the current segmentation q, and α is a weight; using the updated energy function, obtain the updated segmentation result q according to the principle of optimal energy.
Step 5: repeat steps 2, 3 and 4 until q converges or the maximum number of iterations is reached, then output the segmentation result q and the recognized target class label t.
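As an illustration only, the loop of steps 2 to 5 can be sketched in Python as follows. This is a sketch under assumptions, not the patented implementation itself: tensors are assumed to be torch tensors, segment_by_energy stands for the energy minimizer of steps 2 and 4 (not specified here), and mask_all_but is the capsule-masking helper sketched in the embodiment below.

```python
import torch

def co_segment_recognize(I, enc, dec, segment_by_energy,
                         alpha=0.5, max_iters=80, tol=1e-4):
    # Step 2: initial segmentation from the data term alone (no shape prior).
    q = segment_by_energy(I, alpha=1.0, ref_shape=None)
    t = None
    for _ in range(max_iters):
        # Step 3: capsule features of the current segmentation; the row
        # with the largest norm gives the recognized class label t.
        V = enc(q[None, None])[0]            # (L, D)
        t = int(V.norm(dim=-1).argmax())
        # Step 4: decode a reference shape from the masked capsule vector
        # and re-segment with the shape-augmented energy.
        ref = dec(mask_all_but(V, t)[None])[0, 0]
        q_new = segment_by_energy(I, alpha=alpha, ref_shape=ref)
        # Step 5: stop when q converges (or after max_iters iterations).
        if (q_new - q).abs().mean() < tol:
            return q_new, t
        q = q_new
    return q, t
```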
The invention has the beneficial effects that:
(1) a network built from capsule neural units analyzes the target segmentation result, capturing the geometric relations from the target's parts to the target as a whole and filtering out redundant interference information during the cooperative task;
(2) the features extracted by the capsule network carry strong semantic information, and each feature dimension can represent one attribute of the target, which brings interpretability to the recognition process;
(3) the encoder-decoder network in the collaborative model has good extensibility and can be replaced by other suitable neural network modules, which widens the application range of the collaborative model.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is an image to be segmented and identified;
FIGS. 3 to 7 show the segmentation and recognition results obtained at iterations 1, 20, 40, 60 and 80, with L = 30;
FIGS. 8 to 12 show the reference shapes generated at iterations 1, 20, 40, 60 and 80.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
On the contrary, the invention is intended to cover alternatives, modifications and equivalents that may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of the present invention, certain specific details are set forth in order to provide a better understanding of the invention. It will be apparent to one skilled in the art that the invention may be practiced without these specific details.
Referring to fig. 1, a flowchart of steps of a capsule neuron-based collaborative segmentation and recognition model according to an embodiment of the present invention is shown.
Given a training dataset {(target shape m_i, target class label y_i)} and a test target image I_test, the method comprises the following steps:
1. Training the shape expression model and the appearance expression model
(1.1) Based on the dataset D_0 = {(target shape m_i, target class label y_i)}, augment the target shapes appropriately (i.e., expand the dataset): apply displacement, deformation, rotation and perspective transformations of different degrees to part of the training shapes to generate more shapes for training. The augmented shapes together with their labels are defined as dataset D_1. All target shapes are normalized to 80 × 80.
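For illustration, such shape augmentation might be sketched with torchvision; the transform ranges below are assumptions, since the patent does not state the degrees of displacement, deformation, rotation or perspective used.

```python
from torchvision import transforms

# Illustrative ranges only; the patent does not specify them.
augment = transforms.Compose([
    transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), shear=5),
    transforms.RandomPerspective(distortion_scale=0.2, p=0.5),
    transforms.Resize((80, 80)),  # normalize every shape to 80 x 80
])
```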
(1.2) Input the sample pairs (m_i, y_i) of D_1 into the encoder-decoder network for shape learning, establishing the shape recognition model Enc and the shape generation model Dec;
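A minimal training-step sketch, under assumptions the patent does not fix: a CapsNet-style margin loss on the capsule norms plus a reconstruction loss for Dec (both following Sabour et al.), with enc and dec modules such as those sketched under (1.3).

```python
import torch
import torch.nn.functional as F

def margin_loss(v_norms, labels, m_pos=0.9, m_neg=0.1, lam=0.5):
    # v_norms: (B, L) capsule norms in [0, 1]; labels: (B,) class indices.
    onehot = F.one_hot(labels, v_norms.size(1)).float()
    present = onehot * F.relu(m_pos - v_norms) ** 2
    absent = lam * (1 - onehot) * F.relu(v_norms - m_neg) ** 2
    return (present + absent).sum(dim=1).mean()

def train_step(enc, dec, shapes, labels, opt, recon_weight=5e-4):
    V = enc(shapes)                                  # (B, L, D)
    # Mask every capsule row except the ground-truth class before decoding.
    mask = F.one_hot(labels, V.size(1)).float().unsqueeze(-1)
    recon = dec((V * mask).flatten(1))               # (B, 1, 80, 80)
    loss = margin_loss(V.norm(dim=-1), labels) \
         + recon_weight * F.mse_loss(recon, shapes)
    opt.zero_grad(); loss.backward(); opt.step()
    return float(loss)
```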
(1.3) The encoder-decoder network structure is as follows:
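The exact layer table appears only in the original filing; as a stand-in, the following is a minimal PyTorch sketch of a CapsNet-like encoder-decoder consistent with the described method. The layer sizes and channel counts are assumptions, and routing-by-agreement is simplified to a single linear vote for brevity.

```python
import torch
import torch.nn as nn

def squash(s, dim=-1, eps=1e-8):
    # Capsule nonlinearity: output norm = ||s||^2 / (1 + ||s||^2), in [0, 1).
    sq = (s ** 2).sum(dim=dim, keepdim=True)
    return sq / (1 + sq) * s / torch.sqrt(sq + eps)

class CapsEncoder(nn.Module):
    """Conv stem -> 8-D primary capsules -> L class capsules (Enc)."""
    def __init__(self, L=30, D=16, in_ch=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, 256, kernel_size=9, stride=2)   # 80 -> 36
        self.primary = nn.Conv2d(256, 256, kernel_size=9, stride=2)  # 36 -> 14
        self.vote = nn.Linear(256 * 14 * 14, L * D)  # stand-in for routing
        self.L, self.D = L, D

    def forward(self, x):                            # x: (B, 1, 80, 80)
        h = torch.relu(self.conv(x))
        u = squash(self.primary(h).reshape(x.size(0), -1, 8))
        v = self.vote(u.flatten(1)).view(-1, self.L, self.D)
        return squash(v)                             # (B, L, D), norms in [0, 1)

class CapsDecoder(nn.Module):
    """MLP that reconstructs an 80 x 80 shape from the masked capsules (Dec)."""
    def __init__(self, L=30, D=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(L * D, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, 80 * 80), nn.Sigmoid(),
        )

    def forward(self, v_masked):                     # v_masked: (B, L*D)
        return self.net(v_masked).view(-1, 1, 80, 80)
```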
2. For a test image I_test (for example, FIG. 2):
(2.1) In this embodiment the image data energy term is constructed as follows: f(x) = -log p(I(x) | q(x) ≥ τ) and g(x) = -log p(I(x) | q(x) < τ), where τ is a foreground probability confidence threshold and I(x) is the image data (e.g., the gray value) at pixel x; p(I(x) | q(x) ≥ τ) represents the pixel color distribution of the foreground region and p(I(x) | q(x) < τ) that of the background region. The data term is then E_data(I, q) = Σ_x [q(x) f(x) + (1 - q(x)) g(x)]. With the energy function E(q, t) = E_data(I, q), segment according to the principle of energy optimization to obtain the initial result q_0;
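For illustration, the data term can be computed as follows, assuming gray values normalized to [0, 1]; estimating the two distributions with histograms is our assumption, since the embodiment specifies only the distributions themselves.

```python
import numpy as np

def data_energy(I, q, tau=0.5, bins=32, eps=1e-8):
    # Foreground/background gray-value distributions from the current q.
    fg, bg = I[q >= tau], I[q < tau]
    hf, edges = np.histogram(fg, bins=bins, range=(0, 1), density=True)
    hb, _ = np.histogram(bg, bins=bins, range=(0, 1), density=True)
    idx = np.clip(np.digitize(I, edges) - 1, 0, bins - 1)
    f = -np.log(hf[idx] + eps)   # f(x) = -log p(I(x) | q(x) >= tau)
    g = -np.log(hb[idx] + eps)   # g(x) = -log p(I(x) | q(x) <  tau)
    return np.sum(q * f + (1 - q) * g), f, g
```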
(2.2) Use the encoder network Enc to analyze and recognize the target shape q, obtaining the target shape feature V; the recognized target class label is t = argmax_l ||v_l||, where v_l is the l-th row of the target feature V and ||v_l|| is its norm. The properties of capsule neurons guarantee ||v_l|| ∈ [0, 1], so ||v_l|| also represents the probability that the target belongs to class l;
(2.3) Based on V and the recognition result t, generate a reference shape m̂ of the target using the decoder network Dec, and update the energy function of (2.1) as follows:

E(q, t) = α × E_data(I, q) + (1 - α) × E_shape(q, t)

where E_shape(q, t) is a loss function between the reference shape m̂ and q, and α is the weight; using the updated energy function, obtain the updated segmentation result q according to the principle of optimal energy.
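A sketch of the updated total energy, assuming a squared-difference form for E_shape (the embodiment does not fix its form) and reusing the data_energy helper sketched under (2.1).

```python
import numpy as np

def shape_energy(q, ref_shape):
    # E_shape(q, t): discrepancy between q and the generated reference shape.
    return np.sum((q - ref_shape) ** 2)

def total_energy(I, q, ref_shape, alpha=0.5):
    e_data, _, _ = data_energy(I, q)   # from the sketch under (2.1)
    return alpha * e_data + (1 - alpha) * shape_energy(q, ref_shape)
```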
(2.4) Repeat steps (2.1)-(2.3) until q converges or the maximum number of iterations is reached, and output the segmented target q and the recognized target class label t. The iterative process is as follows:
(a) in the k-th optimization iteration, use Enc to extract and recognize the shape from the (k-1)-th segmentation result q_{k-1}, obtaining the target feature V_k and the recognition result t_k;
(b) based on the target feature V_k and the recognition result t_k: only the row of V_k attaining max_l ||v_l|| (the t_k-th row) carries the target's features, while the remaining rows carry interference information; therefore set all rows except the t_k-th to 0, then flatten V_k into a vector and feed it to the generative model Dec to generate the reference shape m̂_k;
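This masking step admits a direct sketch; V is a torch tensor of shape (L, D), and the helper name mask_all_but is ours.

```python
import torch

def mask_all_but(V, t):
    # Zero every capsule row except the recognized class t, then flatten,
    # so only the target's capsule vector drives the decoder Dec.
    masked = torch.zeros_like(V)
    masked[t] = V[t]
    return masked.flatten()

# One iteration, with enc/dec as trained above:
# V_k = enc(q_prev[None, None])[0]            # (L, D)
# t_k = int(V_k.norm(dim=-1).argmax())
# ref_shape = dec(mask_all_but(V_k, t_k)[None])[0, 0]
```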
(d) weight the two energy terms to obtain the final energy:

E(q, t) = α × E_data(I, q) + (1 - α) × E_shape(q, t)
An edge constraint term is added, and the total energy can then be converted, based on the Split Bregman method, into a form that is minimized efficiently.
FIGS. 8 to 12 are the reference shapes generated at iterations 1, 20, 40, 60 and 80. It can be seen that the generated reference shape of the target is at first slightly rough, because the confidence of segmentation and recognition in the initial iterations is not very high; nevertheless the capsule network still resolves a rough target, filtering out part of the interference information and retaining a coarse outline of the target. As the iterations proceed, the generated reference shape becomes finer and more specific, and conforms more and more to the target region in the actual test image.
Meanwhile, the features extracted during recognition are highly interpretable: on one hand, they can be used to reconstruct the target shape; on the other hand, owing to the nature of capsule neurons, each feature dimension represents some deformation property of the target.
Since the capsule neuron is just a special neural unit within a deep neural network, the encoder and decoder modules are naturally extensible: the scale and the number of layers of the network can be reduced or increased, so the encoder-decoder modules in the method can be replaced by other suitable network modules to meet different resource constraints and application requirements.
The above description covers only preferred embodiments of the present invention and is not to be construed as limiting it; any modifications, equivalent replacements and improvements made within the spirit and principles of the present invention are intended to fall within its scope of protection.
Claims (1)
1. An image segmentation and recognition method based on capsule neurons, characterized by comprising the following steps:
step 1: given pair data {(target shape m_i, target class label y_i)} containing L different classes, where i = 1, …, N indexes the samples, m_i ∈ {0, 1}^(H×W), and H and W are the height and width of image m_i: use capsule neurons to build and train an encoder network Enc that learns and extracts each target shape m_i's feature V_i ∈ ℝ^(L×D), where D is the dimension of the top-layer capsule neurons of the encoder network; at the same time, based on the extracted features V_i, train a decoder network Dec for generating the target shape;
step 2: for the image to be segmented and recognized, I ∈ ℝ^(H×W×C), in which there is one and only one target, where C is the number of channels of image I: use an energy function E_data(I, q) based on the image data to perform a preliminary segmentation of I; according to the principle of optimal energy, the segmentation yields an initial segmentation result q_0 ∈ [0, 1]^(H×W), where the value q(x) at pixel position x characterizes the probability that the pixel belongs to the target;
step 3: use the encoder network Enc to analyze and recognize the segmentation result q_0, obtaining the target shape feature V; the recognized target class label is t = argmax_l ||v_l||, where v_l is the l-th row of the target feature V and ||v_l|| is its norm; the properties of capsule neurons guarantee ||v_l|| ∈ [0, 1], so ||v_l|| also represents the probability that the target belongs to class l;
step 4: based on V and the recognition result t, generate a reference shape m̂ of the target using the decoder network Dec, and update the energy function as follows:

E(q_0, t) = α × E_data(I, q_0) + (1 - α) × E_shape(q_0, t)

where E_shape(q_0, t) is a loss function between the reference shape m̂ and the initial segmentation result q_0, and α is the weight; using the updated energy function, obtain the updated segmentation result according to the principle of optimal energy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811505408.9A CN109711411B (en) | 2018-12-10 | 2018-12-10 | Image segmentation and identification method based on capsule neurons |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811505408.9A CN109711411B (en) | 2018-12-10 | 2018-12-10 | Image segmentation and identification method based on capsule neurons |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109711411A CN109711411A (en) | 2019-05-03 |
CN109711411B true CN109711411B (en) | 2020-10-30 |
Family
ID=66255596
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811505408.9A Active CN109711411B (en) | 2018-12-10 | 2018-12-10 | Image segmentation and identification method based on capsule neurons |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109711411B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110298844B (en) * | 2019-06-17 | 2021-06-29 | 艾瑞迈迪科技石家庄有限公司 | X-ray radiography image blood vessel segmentation and identification method and device |
CN110570394B (en) * | 2019-08-01 | 2023-04-28 | 深圳先进技术研究院 | Medical image segmentation method, device, equipment and storage medium |
CN111161280B (en) * | 2019-12-18 | 2022-10-04 | 浙江大学 | Contour evolution segmentation method based on neural network |
CN113065394B (en) * | 2021-02-26 | 2022-12-06 | 青岛海尔科技有限公司 | Method for image recognition of article, electronic device and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108062753A (en) * | 2017-12-29 | 2018-05-22 | 重庆理工大学 | The adaptive brain tumor semantic segmentation method in unsupervised domain based on depth confrontation study |
CN108921227A (en) * | 2018-07-11 | 2018-11-30 | 广东技术师范学院 | A kind of glaucoma medical image classification method based on capsule theory |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11074495B2 (en) * | 2013-02-28 | 2021-07-27 | Z Advanced Computing, Inc. (Zac) | System and method for extremely efficient image and pattern recognition and artificial intelligence platform |
CN104537676B (en) * | 2015-01-12 | 2017-03-22 | 南京大学 | Gradual image segmentation method based on online learning |
JP2020510463A (en) * | 2017-01-27 | 2020-04-09 | アーテリーズ インコーポレイテッド | Automated segmentation using full-layer convolutional networks |
WO2018156778A1 (en) * | 2017-02-22 | 2018-08-30 | The United States Of America, As Represented By The Secretary, Department Of Health And Human Services | Detection of prostate cancer in multi-parametric mri using random forest with instance weighting & mr prostate segmentation by deep learning with holistically-nested networks |
EP3629898A4 (en) * | 2017-05-30 | 2021-01-20 | Arterys Inc. | Automated lesion detection, segmentation, and longitudinal identification |
CN108846384A (en) * | 2018-07-09 | 2018-11-20 | 北京邮电大学 | Merge the multitask coordinated recognition methods and system of video-aware |
- 2018-12-10: CN application CN201811505408.9A (patent CN109711411B), status Active
Also Published As
Publication number | Publication date |
---|---|
CN109711411A (en) | 2019-05-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109409222B (en) | Multi-view facial expression recognition method based on mobile terminal | |
CN108491880B (en) | Object classification and pose estimation method based on neural network | |
CN109711411B (en) | Image segmentation and identification method based on capsule neurons | |
CN108304357B (en) | Chinese character library automatic generation method based on font manifold | |
CN110046671A (en) | A kind of file classification method based on capsule network | |
CN103425996B (en) | A kind of large-scale image recognition methods of parallel distributed | |
CN113989890A (en) | Face expression recognition method based on multi-channel fusion and lightweight neural network | |
CN104268593A (en) | Multiple-sparse-representation face recognition method for solving small sample size problem | |
Mittelman et al. | Weakly supervised learning of mid-level features with Beta-Bernoulli process restricted Boltzmann machines | |
CN106503661B (en) | Face gender identification method based on fireworks deepness belief network | |
CN112307714A (en) | Character style migration method based on double-stage deep network | |
CN109325513B (en) | Image classification network training method based on massive single-class images | |
CN112069900A (en) | Bill character recognition method and system based on convolutional neural network | |
Xu et al. | Face expression recognition based on convolutional neural network | |
CN110263855B (en) | Method for classifying images by utilizing common-basis capsule projection | |
CN107358172B (en) | Human face feature point initialization method based on human face orientation classification | |
Al-Zubaidi et al. | Two-dimensional optical character recognition of mouse drawn in Turkish capital letters using multi-layer perceptron classification | |
CN110598022A (en) | Image retrieval system and method based on robust deep hash network | |
CN114743133A (en) | Lightweight small sample video classification and identification method and system | |
CN105809200A (en) | Biologically-inspired image meaning information autonomous extraction method and device | |
CN112163605A (en) | Multi-domain image translation method based on attention network generation | |
CN113269235B (en) | Assembly body change detection method and device based on unsupervised learning | |
CN112488238B (en) | Hybrid anomaly detection method based on countermeasure self-encoder | |
CN113128624B (en) | Graph network face recovery method based on multi-scale dictionary | |
Dembani et al. | UNSUPERVISED FACIAL EXPRESSION DETECTION USING GENETIC ALGORITHM. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |