
CN109711411B - Image segmentation and identification method based on capsule neurons - Google Patents

Image segmentation and identification method based on capsule neurons

Info

Publication number
CN109711411B
CN109711411B
Authority
CN
China
Prior art keywords
target
capsule
shape
network
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811505408.9A
Other languages
Chinese (zh)
Other versions
CN109711411A (en)
Inventor
于慧敏
黄伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201811505408.9A priority Critical patent/CN109711411B/en
Publication of CN109711411A publication Critical patent/CN109711411A/en
Application granted granted Critical
Publication of CN109711411B publication Critical patent/CN109711411B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a capsule neuron-based collaborative segmentation and recognition method. The method uses a network built from capsule neurons to model and learn the shape knowledge of a target, and builds a collaborative segmentation and recognition model on top of that network. Compared with a classical scalar neuron, a capsule neuron can analyze and capture, layer by layer, the geometric relations from low-level local instances of the target up to the target as a whole. It can therefore disentangle the target's features from background interference, and those features can further be used to reconstruct and generate the target. Based on these properties of capsule neurons, the invention builds an encoder-decoder network topology that can effectively learn and exploit prior knowledge of the target and apply it in the collaborative segmentation and recognition model. The method is highly extensible: the encoder and decoder networks can be replaced with other suitable neural networks to meet different requirements.

Description

Image segmentation and identification method based on capsule neurons
Technical field
The invention belongs to the fields of image segmentation, automatic recognition and target representation, and particularly relates to an image segmentation and recognition method based on capsule neurons. The model makes efficient use of properties specific to capsule neurons.
Background
In models and technical methods where target segmentation and target recognition cooperate, effective representation of the target is a key problem. A suitable model and representation method, together with a way to generate a reference target object from prior knowledge, play an important role in establishing the cooperative process. In addition, the extensibility of the model must be considered in practical applications: in some cases the model needs to be scaled up or down to different degrees for different applications, to meet different resource and performance requirements.
In recent years, deep learning and deep neural networks have played a major role in many computer vision and image processing tasks. The convolutional neural network is currently the most commonly used deep neural network, favored by the research community and industry for its strong extensibility and excellent learning and representation abilities. The capsule neuron is a neural unit recently proposed by Professor Hinton, mainly to address the problem that a convolutional neural network loses feature position information during inference. Capsule neurons focus on capturing the geometric relations between target parts and the target whole, trying to preserve such relations and propagate the associated information. A capsule neuron can therefore resolve the target and its features from among many kinds of interference and filter most of that interference out.
These characteristics of capsule neurons are very helpful for the task of collaborative target segmentation and recognition. On one hand, the real target can be analyzed out of the segmentation result and its features extracted, with most of the interference arising in the segmentation process filtered out; those features can then be used to reconstruct or generate the real target. On the other hand, a capsule-based deep neural network also has good extensibility.
In the invention, a network with an encoder-decoder architecture is built from capsule neural units and introduced into a model for collaborative target segmentation and recognition, realizing the learning, representation and generation of target shape knowledge, and thereby the mutual cooperation of the segmentation task and the recognition task.
Disclosure of Invention
The invention aims to provide an image segmentation and recognition method based on capsule neurons. The method uses a capsule neuron-based deep neural network to learn, model and represent the shape of the target. The deep neural network comprises two basic modules: an encoder and a decoder. The encoder uses capsule neural units to extract and recognize target features in the current segmentation result; the decoder generates, from the extracted target features and the recognition result, a target shape to be referenced by the segmentation model. The two modules let the two tasks exchange information and work in concert to achieve better performance, and make the segmentation and recognition processes more interpretable.
The invention adopts the following technical scheme. A capsule neuron-based image segmentation and recognition method comprises the following steps:
Step 1: Based on two-tuple data {target shape m_i, target class label y_i} containing L different classes, where i = 1, …, N is the sample index, m_i ∈ {0,1}^{H×W}, and H and W are the height and width of image m_i, use capsule neurons to build and train an encoder network Enc for learning and extracting the feature V_i ∈ R^{L×D} of each target shape m_i, where D is the dimension of the top-level capsule neurons of the encoder network; at the same time, based on the extracted features V_i, train a decoder network Dec for generating the target shape;
Step 2: For an image I ∈ R^{H×W×C} to be segmented and recognized, in which there is one and only one target, where C is the number of channels of the image I: using an energy function E_data(I, q) based on the image data, perform an initial segmentation of I, obtaining by energy minimization an initial result q ∈ [0,1]^{H×W}, where the value q(x) at pixel position x characterizes the probability that the pixel belongs to the target;
Step 3: Analyze and recognize the initial result q with the encoder network Enc to obtain the target shape feature V; the recognized target class label is t = argmax_l ||v_l||, where v_l is the l-th row of the target feature V and ||v_l|| is its norm. The properties of capsule neurons guarantee ||v_l|| ∈ [0,1], so ||v_l|| also represents the probability that the target belongs to class l;
Step 4: Generate a reference shape m̂ of the target with the decoder network Dec, based on V and the recognition result t, and update the energy function of Step 2 as follows:

E(q, t) = α × E_data(I, q) + (1 − α) × E_shape(q, t)

where E_shape(q, t) is a loss function between the reference shape m̂ and the preceding result q, and α is a weight; obtain an updated segmentation result q from the updated energy function according to the principle of energy minimization.
Step 5: Repeat Steps 2, 3 and 4 until q converges or the maximum number of iterations is reached, and output the segmentation result q and the recognized target class label t. A schematic sketch of this loop is given below.
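For concreteness, the following is a minimal Python sketch of the loop in Steps 2-5. The callables enc, dec, segment_by_energy, data_energy and shape_energy are hypothetical placeholders for the patent's components, and the convergence test is an illustrative choice rather than the patent's criterion.

```python
import numpy as np

def cosegment_and_recognize(image, enc, dec, segment_by_energy,
                            data_energy, shape_energy,
                            alpha=0.5, max_iters=80, tol=1e-4):
    """Sketch of Steps 2-5: alternate segmentation and capsule-based
    recognition until the soft mask q converges.

    enc(q) -> V: L x D matrix of top-level capsule vectors.
    dec(V, t) -> reference shape m_hat in [0,1]^{H x W}.
    segment_by_energy(energy_fn) -> q minimizing the given energy.
    """
    # Step 2: initial segmentation from the image-data energy alone.
    q = segment_by_energy(lambda q_: data_energy(image, q_))
    t = None
    for _ in range(max_iters):
        # Step 3: the capsule encoder extracts shape features; the
        # class label is the row of V with the largest vector norm.
        V = enc(q)                                  # shape (L, D)
        norms = np.linalg.norm(V, axis=1)           # ||v_l|| in [0, 1]
        t = int(np.argmax(norms))
        # Step 4: the decoder generates a reference shape for class t,
        # and the energy is re-weighted with the shape term.
        m_hat = dec(V, t)
        energy = lambda q_: (alpha * data_energy(image, q_)
                             + (1 - alpha) * shape_energy(q_, m_hat))
        q_new = segment_by_energy(energy)
        # Step 5: stop once q has converged.
        if np.abs(q_new - q).mean() < tol:
            q = q_new
            break
        q = q_new
    return q, t
```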
The invention has the beneficial effects that:
(1) a network built from capsule neural units analyzes the target segmentation result, captures the geometric relations from target parts to the target whole, and filters out redundant interference information while performing the cooperative task;
(2) the features extracted by the capsule network carry strong semantic information, and each feature dimension can represent one attribute of the target, which brings interpretability to the recognition process;
(3) the encoder-decoder network in the collaborative model has good extensibility and can be replaced with other suitable neural network modules, thereby widening the application range of the collaborative model.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is an image to be segmented and identified;
FIGS. 3 to 7 show the segmentation and recognition results obtained at iterations 1, 20, 40, 60 and 80, where L = 30;
FIGS. 8 to 12 are the reference shapes generated at iterations 1, 20, 40, 60 and 80.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
On the contrary, the invention is intended to cover alternatives, modifications and equivalents which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of the present invention, certain specific details are set forth in order to provide a better understanding of the present invention. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details.
Referring to fig. 1, a flowchart of steps of a capsule neuron-based collaborative segmentation and recognition model according to an embodiment of the present invention is shown.
Given a training data set {target shape m_i, target class label y_i} and a test target image I_test, the method comprises the following steps:
1. Training the shape representation and appearance representation models
(1.1) Based on the data set D_0 = {target shape m_i, target class label y_i}, appropriately augment the target shapes (i.e., expand the data set) by applying displacement, deformation, rotation and perspective transformations of different degrees to part of the training shapes, generating more shapes for training. The augmented shapes and their labels are defined as a data set D_1. All target shapes in D_1 are normalized to 80 × 80 size. A sketch of such an augmentation step follows.
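As an illustration of the augmentation in (1.1), a minimal sketch using OpenCV warps is given below; the transformation ranges and the helper name augment_shape are assumptions, since the patent does not specify the parameters.

```python
import numpy as np
import cv2

def augment_shape(mask, rng):
    """Apply a random shift, rotation, scaling, and perspective warp
    to a binary target shape, then renormalize it to 80 x 80.
    Parameter ranges below are illustrative, not from the patent."""
    h, w = mask.shape
    # Random rotation + scale, then a small random shift.
    angle = rng.uniform(-20, 20)
    scale = rng.uniform(0.9, 1.1)
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    M[:, 2] += rng.uniform(-0.05, 0.05, size=2) * (w, h)
    warped = cv2.warpAffine(mask.astype(np.float32), M, (w, h))
    # Mild random perspective transform of the four corners.
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    jitter = rng.uniform(-0.03, 0.03, size=(4, 2)) * (w, h)
    dst = (src + jitter).astype(np.float32)
    P = cv2.getPerspectiveTransform(src, dst)
    warped = cv2.warpPerspective(warped, P, (w, h))
    # Normalize to the 80 x 80 size used for training.
    out = cv2.resize(warped, (80, 80))
    return (out > 0.5).astype(np.uint8)

# Usage: rng = np.random.default_rng(0); m_aug = augment_shape(m, rng)
```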
(1.2) Input the sample pairs (m_i, y_i) of D_1 into the encoder-decoder network for shape learning, establishing the shape recognition model Enc and the shape generation model Dec;
(1.3) The encoder-decoder network structure follows the layer tables given in the original specification (which survive only as figures), and the network is trained by minimizing the corresponding loss function; a hedged architectural sketch follows.
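Because the layer tables survive only as figures, the following PyTorch sketch is modeled on the standard CapsNet encoder-decoder (Sabour et al., 2017) rather than the patent's exact architecture. The capsule dimension D = 16, the layer sizes and the routing-by-agreement details are assumptions; only the L class capsules and the 80 × 80 output are fixed by the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    """Capsule nonlinearity: preserves direction, maps norm into [0, 1)."""
    n2 = (s * s).sum(dim=dim, keepdim=True)
    return (n2 / (1.0 + n2)) * s / torch.sqrt(n2 + eps)

class CapsEncoder(nn.Module):
    """Conv layer -> primary capsules -> L class capsules of dim D,
    coupled by dynamic routing (CapsNet-style; architecture assumed)."""
    def __init__(self, L=30, D=16, in_ch=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, 256, 9, stride=2)       # 80 -> 36
        self.primary = nn.Conv2d(256, 32 * 8, 9, stride=2)   # 36 -> 14
        self.n_prim = 32 * 14 * 14
        # Transformation matrices from each primary to each class capsule.
        self.W = nn.Parameter(0.01 * torch.randn(1, self.n_prim, L, D, 8))
        self.L, self.D = L, D

    def forward(self, x, n_routing=3):
        B = x.size(0)
        h = F.relu(self.conv(x))
        u = self.primary(h).view(B, 32, 8, -1)                # (B, 32, 8, 196)
        u = squash(u.permute(0, 1, 3, 2).reshape(B, -1, 8))   # (B, n_prim, 8)
        u_hat = (self.W @ u[:, :, None, :, None]).squeeze(-1) # (B, n_prim, L, D)
        b = torch.zeros(B, self.n_prim, self.L, 1, device=x.device)
        for _ in range(n_routing):                            # routing by agreement
            c = b.softmax(dim=2)
            v = squash((c * u_hat).sum(dim=1))                # (B, L, D)
            b = b + (u_hat * v[:, None]).sum(-1, keepdim=True)
        return v                                 # ||v_l|| = class-l probability

class CapsDecoder(nn.Module):
    """Fully connected decoder: masked class capsules -> 80x80 shape."""
    def __init__(self, L=30, D=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(L * D, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, 80 * 80), nn.Sigmoid())

    def forward(self, v, t):
        # t: LongTensor of class indices, shape (B,).
        # Zero every capsule row except the recognized class t.
        mask = F.one_hot(t, v.size(1)).unsqueeze(-1).float()
        return self.fc((v * mask).flatten(1)).view(-1, 1, 80, 80)
```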
2. For the test image I_test (see, e.g., FIG. 2):
(2.1) In this embodiment, the image data energy term is built as follows: f(x) = −log p(I(x) | q(x) ≥ τ) and g(x) = −log p(I(x) | q(x) < τ), where τ is a foreground probability confidence threshold and I(x) is the image data (e.g., the gray value) of pixel x. p(I(x) | q(x) ≥ τ) represents the pixel color distribution of the foreground region, and p(I(x) | q(x) < τ) that of the background region. The data term is thus E_data(I, q) = Σ_x [q(x) f(x) + (1 − q(x)) g(x)]; with the energy function E(q, t) = E_data(I, q), segmentation by energy minimization yields the initial result q_0. A sketch of this data term follows.
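The following is a minimal NumPy sketch of the data term in (2.1), assuming grayscale histograms as the estimates of p(I(x) | q(x) ≥ τ) and p(I(x) | q(x) < τ); the bin count and normalization are illustrative choices, not specified in the patent.

```python
import numpy as np

def data_energy(image, q, tau=0.5, bins=64, eps=1e-8):
    """E_data(I, q) = sum_x [q(x) f(x) + (1 - q(x)) g(x)], with
    f = -log p(I(x) | foreground) and g = -log p(I(x) | background)
    estimated from grayscale histograms of the current soft mask q."""
    gray = image if image.ndim == 2 else image.mean(axis=2)
    idx = np.clip((gray / (gray.max() + eps) * (bins - 1)).astype(int),
                  0, bins - 1)
    fg, bg = q >= tau, q < tau
    # Histogram-based color models of the foreground / background regions.
    p_fg = np.bincount(idx[fg].ravel(), minlength=bins) + eps
    p_bg = np.bincount(idx[bg].ravel(), minlength=bins) + eps
    p_fg, p_bg = p_fg / p_fg.sum(), p_bg / p_bg.sum()
    f, g = -np.log(p_fg[idx]), -np.log(p_bg[idx])
    # Return the energy plus the pointwise responses f and g,
    # which are reused later as r_data(x) = f(x) - g(x).
    return (q * f + (1 - q) * g).sum(), f, g
```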
(2.2) Analyze and recognize the initial result q_0 with the encoder network Enc to obtain the target shape feature V, and recognize the target class label t = argmax_l ||v_l||, where v_l is the l-th row of the target feature V and ||v_l|| is its norm. The properties of capsule neurons guarantee ||v_l|| ∈ [0,1], so ||v_l|| also represents the probability that the target belongs to class l;
(2.3) Generate a reference shape m̂ of the target with the decoder network Dec, based on V and the recognition result t;
update the energy function of (2.1) as follows:

E(q, t) = α × E_data(I, q) + (1 − α) × E_shape(q, t)

where E_shape(q, t) is a loss function between the reference shape m̂ and q, and α is a weight; obtain an updated segmentation result q from the updated energy function according to the principle of energy minimization.
(2.4) Repeat steps (2.1)-(2.3) until q converges or the maximum number of iterations is reached, and output the segmented target q and the recognized target class label t. The iterative process is as follows:
(a) In the k-th optimization iteration, use Enc to extract and recognize the shape from the (k−1)-th segmentation result q_{k−1}, obtaining the target feature V_k and the recognition result t_k;
(b) Based on the target feature V_k and the recognition result t_k: except for the row of V_k satisfying max_l ||v_l|| (the t-th row), the remaining rows are features of interference information, so all rows other than the t-th row are set to 0; V_k is then flattened into a vector and fed to the generative model Dec to generate the reference shape m̂_k, as in the sketch below;
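Step (b) reduces to a few lines; the sketch below zeroes every row of V_k except the winning row t and flattens the result for the decoder (variable and function names are illustrative):

```python
import numpy as np

def mask_and_flatten(V):
    """Keep only the winning capsule row of V (an L x D matrix); the
    other rows carry interference information and are zeroed before
    the flattened vector is fed to the generative model Dec."""
    t = int(np.argmax(np.linalg.norm(V, axis=1)))  # row with max ||v_l||
    V_masked = np.zeros_like(V)
    V_masked[t] = V[t]                             # interference rows -> 0
    return V_masked.reshape(-1), t
```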
(c) Based on the reference shape m̂_k, define the shape loss function E_shape(q, t) between m̂_k and q;
(d) Weight the two energy terms to obtain the final energy

E(q, t) = α × E_data(I, q) + (1 − α) × E_shape(q, t)

and add an edge constraint term on q. Based on the split Bregman method, the total energy can be converted into a pointwise-response form with data response r_data(x) = f(x) − g(x) and a corresponding shape response (the full converted expression and the edge term appear in the original specification only as figures); a simplified numerical sketch follows.
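Since the converted split Bregman expression survives only as figures, the sketch below is a simplified stand-in: it combines the data response r_data(x) = f(x) − g(x) with an assumed quadratic shape loss and updates q by projected gradient descent, a simplification of, not a substitute for, the patent's split Bregman solver.

```python
import numpy as np

def update_q(q, f, g, m_hat, alpha, step=0.1, n_steps=50):
    """Projected gradient descent on
    E(q) = alpha * sum_x [q f + (1 - q) g] + (1 - alpha) * sum_x (q - m_hat)^2,
    a simplified stand-in for the split Bregman minimization; the
    quadratic shape loss is an assumption (the patent's E_shape
    formula is given only as an image)."""
    r_data = f - g                               # pointwise data response
    for _ in range(n_steps):
        grad = alpha * r_data + (1 - alpha) * 2.0 * (q - m_hat)
        q = np.clip(q - step * grad, 0.0, 1.0)   # keep q in [0, 1]^{HxW}
    return q
```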
FIGS. 8 to 12 show the reference shapes generated at iterations 1, 20, 40, 60 and 80. The generated reference shape is initially somewhat rough, because the confidence of segmentation and recognition in the early iterations is not high; nevertheless the capsule network still extracts a coarse target, filters out part of the interference information, and retains a rough outline of the target. As the iterations progress, the generated reference shape becomes finer and more specific, and conforms more and more closely to the target region in the actual test image.
Meanwhile, the features extracted in the recognition process are highly interpretable: on one hand, they can be used to reconstruct the target shape; on the other hand, owing to the nature of capsule neurons themselves, each feature dimension represents some deformation property of the target.
Moreover, because the capsule neuron is simply a special neural unit within a deep neural network, the encoder and decoder network modules are naturally extensible: the scale and the number of layers of the network can be reduced or increased, so the encoder-decoder modules in the method can be replaced with other suitable network modules to meet different resource constraints and application requirements.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (1)

1. A capsule neuron-based image segmentation and recognition method, characterized by comprising the following steps:
Step 1: Based on two-tuple data {target shape m_i, target class label y_i} containing L different classes, where i = 1, …, N is the sample index, m_i ∈ {0,1}^{H×W}, and H and W are the height and width of image m_i, use capsule neurons to build and train an encoder network Enc for learning and extracting the feature V_i ∈ R^{L×D} of each target shape m_i, where D is the dimension of the top-level capsule neurons of the encoder network; at the same time, based on the extracted features V_i, train a decoder network Dec for generating the target shape;
Step 2: For an image I ∈ R^{H×W×C} to be segmented and recognized, in which there is one and only one target, where C is the number of channels of the image I: using an energy function E_data(I, q) based on the image data, perform a preliminary segmentation of I, obtaining by energy minimization an initial segmentation result q_0 ∈ [0,1]^{H×W}, where the value q(x) at pixel position x characterizes the probability that the pixel belongs to the target;
Step 3: Analyze and recognize the segmentation result q_0 with the encoder network Enc to obtain the target shape feature V; the recognized target class label is t = argmax_l ||v_l||, where v_l is the l-th row of the target feature V and ||v_l|| is its norm; the properties of capsule neurons guarantee ||v_l|| ∈ [0,1], so ||v_l|| also represents the probability that the target belongs to class l;
Step 4: Generate a reference shape m̂ of the target with the decoder network Dec, based on V and the recognition result t, and update the energy function as follows:

E(q_0, t) = α × E_data(I, q_0) + (1 − α) × E_shape(q_0, t)

where E_shape(q_0, t) is a loss function between the reference shape m̂ and the initial segmentation result q_0, and α is the weight; using the updated energy function, obtain an updated segmentation result q′ by energy minimization;
Step 5: Following Steps 3-4, iteratively optimize the updated segmentation result q′ until q′ converges or the maximum number of iterations is reached, and output the final segmentation result q′ and the recognized target class label t.
CN201811505408.9A 2018-12-10 2018-12-10 Image segmentation and identification method based on capsule neurons Active CN109711411B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811505408.9A CN109711411B (en) 2018-12-10 2018-12-10 Image segmentation and identification method based on capsule neurons

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811505408.9A CN109711411B (en) 2018-12-10 2018-12-10 Image segmentation and identification method based on capsule neurons

Publications (2)

Publication Number Publication Date
CN109711411A CN109711411A (en) 2019-05-03
CN109711411B 2020-10-30

Family

ID=66255596

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811505408.9A Active CN109711411B (en) 2018-12-10 2018-12-10 Image segmentation and identification method based on capsule neurons

Country Status (1)

Country Link
CN (1) CN109711411B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110298844B (en) * 2019-06-17 2021-06-29 艾瑞迈迪科技石家庄有限公司 X-ray radiography image blood vessel segmentation and identification method and device
CN110570394B (en) * 2019-08-01 2023-04-28 深圳先进技术研究院 Medical image segmentation method, device, equipment and storage medium
CN111161280B (en) * 2019-12-18 2022-10-04 浙江大学 Contour evolution segmentation method based on neural network
CN113065394B (en) * 2021-02-26 2022-12-06 青岛海尔科技有限公司 Method for image recognition of article, electronic device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062753A (en) * 2017-12-29 2018-05-22 重庆理工大学 The adaptive brain tumor semantic segmentation method in unsupervised domain based on depth confrontation study
CN108921227A (en) * 2018-07-11 2018-11-30 广东技术师范学院 A kind of glaucoma medical image classification method based on capsule theory

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11074495B2 (en) * 2013-02-28 2021-07-27 Z Advanced Computing, Inc. (Zac) System and method for extremely efficient image and pattern recognition and artificial intelligence platform
CN104537676B (en) * 2015-01-12 2017-03-22 南京大学 Gradual image segmentation method based on online learning
JP2020510463A (en) * 2017-01-27 2020-04-09 アーテリーズ インコーポレイテッド Automated segmentation using full-layer convolutional networks
WO2018156778A1 (en) * 2017-02-22 2018-08-30 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Detection of prostate cancer in multi-parametric mri using random forest with instance weighting & mr prostate segmentation by deep learning with holistically-nested networks
EP3629898A4 (en) * 2017-05-30 2021-01-20 Arterys Inc. Automated lesion detection, segmentation, and longitudinal identification
CN108846384A (en) * 2018-07-09 2018-11-20 北京邮电大学 Merge the multitask coordinated recognition methods and system of video-aware

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062753A (en) * 2017-12-29 2018-05-22 重庆理工大学 The adaptive brain tumor semantic segmentation method in unsupervised domain based on depth confrontation study
CN108921227A (en) * 2018-07-11 2018-11-30 广东技术师范学院 A kind of glaucoma medical image classification method based on capsule theory

Also Published As

Publication number Publication date
CN109711411A (en) 2019-05-03

Similar Documents

Publication Publication Date Title
CN109409222B (en) Multi-view facial expression recognition method based on mobile terminal
CN108491880B (en) Object classification and pose estimation method based on neural network
CN109711411B (en) Image segmentation and identification method based on capsule neurons
CN108304357B (en) Chinese character library automatic generation method based on font manifold
CN110046671A (en) A kind of file classification method based on capsule network
CN103425996B (en) A kind of large-scale image recognition methods of parallel distributed
CN113989890A (en) Face expression recognition method based on multi-channel fusion and lightweight neural network
CN104268593A (en) Multiple-sparse-representation face recognition method for solving small sample size problem
Mittelman et al. Weakly supervised learning of mid-level features with Beta-Bernoulli process restricted Boltzmann machines
CN106503661B (en) Face gender identification method based on fireworks deepness belief network
CN112307714A (en) Character style migration method based on double-stage deep network
CN109325513B (en) Image classification network training method based on massive single-class images
CN112069900A (en) Bill character recognition method and system based on convolutional neural network
Xu et al. Face expression recognition based on convolutional neural network
CN110263855B (en) Method for classifying images by utilizing common-basis capsule projection
CN107358172B (en) Human face feature point initialization method based on human face orientation classification
Al-Zubaidi et al. Two-dimensional optical character recognition of mouse drawn in Turkish capital letters using multi-layer perceptron classification
CN110598022A (en) Image retrieval system and method based on robust deep hash network
CN114743133A (en) Lightweight small sample video classification and identification method and system
CN105809200A (en) Biologically-inspired image meaning information autonomous extraction method and device
CN112163605A (en) Multi-domain image translation method based on attention network generation
CN113269235B (en) Assembly body change detection method and device based on unsupervised learning
CN112488238B (en) Hybrid anomaly detection method based on countermeasure self-encoder
CN113128624B (en) Graph network face recovery method based on multi-scale dictionary
Dembani et al. UNSUPERVISED FACIAL EXPRESSION DETECTION USING GENETIC ALGORITHM.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant