WO2018010434A1 - Image classification method and apparatus - Google Patents
Image classification method and apparatus
- Publication number
- WO2018010434A1 (PCT/CN2017/074427; CN2017074427W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- neural network
- convolutional neural
- network model
- layer
- max criterion
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
Definitions
- the present invention relates to the field of computer vision image classification technology, and in particular, to an image classification method and apparatus.
- typically, convolutional neural networks for image classification are trained simply with the back-propagation (BP) algorithm based on stochastic gradient descent (SGD). Because this training method imposes no constraints on the features learned by the convolutional neural network, the classification accuracy of the trained convolutional neural network image classification system is not good enough, which manifests as poor intra-class compactness and inter-class separability of the learned features.
- the present application provides an image classification method and apparatus for improving image classification accuracy. To solve the above technical problems, the present application discloses the following technical solutions:
- an image classification method, comprising: acquiring a training set of images to be classified; selecting a multi-layer convolutional neural network model; applying a regularization constraint based on the Min-Max criterion to a selected layer to form a second convolutional neural network model, where the selected layer is a layer of the convolutional neural network model; training the second convolutional neural network model with the training set to generate a third convolutional neural network model; and classifying the test set of the images to be classified with the third convolutional neural network model.
- the present application is based on the invariance property of object recognition: when an object undergoes an identity-preserving transformation (e.g., translation, illumination change, shape change, viewpoint change), its corresponding feature vector in feature space changes accordingly. When the feature vectors are projected into a high-dimensional feature space whose dimensionality equals that of the feature vectors, the feature vectors corresponding to all identity-preserving transformations form a low-dimensional manifold; good invariant features are obtained when the target manifolds belonging to the same class become compact and the manifolds of different classes are well separated.
- based on this observation, the present invention provides an image classification method for deep convolutional neural networks improved with the Min-Max criterion: by constraining the features of a selected layer of the convolutional neural network with the Min-Max criterion, the learned features are explicitly forced to satisfy that target manifolds belonging to the same class have good intra-class compactness and target manifolds belonging to different classes have large inter-class margins, which significantly improves image classification accuracy.
- in addition, the Min-Max regularization operation is applied starting from the selected layer of the chosen convolutional neural network model, which simplifies computation when training large-scale networks, avoids the heavy computation and low efficiency caused by increasing the network size and the training data size, and also avoids the time and labor required to construct large-scale labeled training datasets.
- the selecting a convolutional neural network model comprises:
- a mini-batch of training samples is acquired, denoted {(Xi, ci)}, i = 1, 2, …, n, where n is the size of the mini-batch, Xi is the original input data, ci ∈ {1, 2, …, C} is the category label of Xi, and C is the total number of categories of the training set; the convolutional neural network model is determined according to the training samples and the objective function min_W Σi=1…n l(W, Xi, ci), where W = (W(1), …, W(M); b(1), …, b(M)) denotes all parameters of the convolutional neural network model, l(W, Xi, ci) denotes the loss function of a training sample, M denotes the total number of layers of the convolutional neural network model, W(m) denotes the weight parameters of the m-th layer, b(m) denotes the bias parameters of the m-th layer, and m is any value in {1, 2, …, M}.
- the method further includes dividing the convolutional neural network model into layers, where the features of each layer of the layered model are represented recursively as Xi(m) = f(W(m) * Xi(m−1) + b(m)), in which Xi(m) denotes the features of the m-th layer of the convolutional neural network model, * denotes the convolution operation, and f(·) denotes a nonlinear activation function.
- before the regularization operation based on the Min-Max criterion is performed on the selected layer, the method further includes: acquiring the Min-Max criterion;
- acquiring the Min-Max criterion includes: acquiring an intrinsic graph and a penalty graph of Min-Max, respectively, the intrinsic graph characterizing the internal compactness of the target manifolds and the penalty graph characterizing the margins between the target manifolds;
- the Min-Max criterion of the k-th-layer features, the k-th layer being the selected layer, is computed from the intrinsic graph and the penalty graph and is expressed as L(X(k), c) = L1(X(k), c) − L2(X(k), c),
- where L1(X(k), c) denotes the intrinsic graph term,
- L2(X(k), c) denotes the penalty graph term,
- and X(k) denotes the set of features at the k-th layer of a mini-batch of training samples, c denotes the set of category labels corresponding to the mini-batch, i ∈ {1, 2, …, n}.
- the second convolutional neural network model is represented by the objective function min_W Σi=1…n l(W, Xi, ci) + λ·L(X(k), c),
- where L(X(k), c) is the Min-Max criterion of the k-th-layer features and λ > 0 is a weight coefficient.
- training the second convolutional neural network model using the training set comprises: obtaining, from the objective function of the second convolutional neural network model, the sensitivity of the second convolutional neural network model with respect to the k-th-layer features; and training the second convolutional neural network model with the training set according to the sensitivity of the k-th-layer features and the mini-batch stochastic gradient descent method;
- the sensitivity of the k-th-layer features is calculated as ∂L(X(k), c)/∂xi = [2H(Ψ(1) − Ψ(2))](:,i), where H is the matrix formed by the k-th-layer features, Ψ = D − G is the graph Laplacian with D = diag(d11, d22, …, dnn), and the subscript (:,i) denotes the i-th column of the matrix.
- the Min-Max criterion may be a kernel version of the Min-Max criterion, in which the generating criterion is defined by a Gaussian kernel function.
- if the Min-Max criterion is the kernel version, the regularization operation based on the Min-Max criterion for the selected layer includes: obtaining the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features, and performing the constraint operation based on the kernel-version Min-Max criterion on the k-th layer according to that sensitivity;
- the sensitivity of the kernel-version Min-Max criterion with respect to the features of the k-th layer is expressed as:
- classifying the test set of the images to be classified using the third convolutional neural network model comprises: classifying the test set of the images to be classified using the model parameters in the third convolutional neural network model.
- the selected layer is no more than two layers away from the output layer of the convolutional neural network model.
- an image classification apparatus comprising means for performing the method steps of the first aspect and the implementations of the first aspect.
- an image classification device comprising: a processor and a memory
- the processor is configured to: acquire a training set of the images to be classified; select a multi-layer convolutional neural network model; apply a regularization constraint based on the Min-Max criterion to the selected layer and form a second convolutional neural network model; train the second convolutional neural network model using the training set and generate a third convolutional neural network model; and classify the test set of the images to be classified using the third convolutional neural network model, where the selected layer is a layer in the convolutional neural network model;
- the memory is configured to store the training set of the images to be classified, the multi-layer convolutional neural network model, the Min-Max criterion, and the classified images.
- a computer storage medium may store a program that, when executed, performs some or all of the steps in the various implementations of the image classification method and apparatus.
- FIG. 1 is a schematic flowchart diagram of an image classification method according to an embodiment of the present application
- FIG. 2 is a schematic diagram of a process of forming an object manifold by a human brain vision system according to an embodiment of the present application
- FIG. 3 is a schematic diagram of a target feature invariance obtained by transform according to an embodiment of the present application.
- FIG. 4 is a schematic structural diagram of a multi-layer convolutional neural network model according to an embodiment of the present application.
- FIG. 5 is a schematic structural diagram of an intrinsic graph and a penalty graph according to an embodiment of the present disclosure;
- FIG. 6 is a structural block diagram of an image classification apparatus according to an embodiment of the present application.
- FIG. 7 is a schematic diagram of an image classification device according to an embodiment of the present application.
- An image classification method and apparatus provided by the present application are used to improve the accuracy of image classification. Specifically, the method draws on the manifold-untangling property of object recognition in the visual pathway of the human brain and combines it with a convolutional neural network, proposing an image classification method and apparatus for deep convolutional neural networks improved with the Min-Max criterion.
- first, the manifold-untangling property of object recognition in the ventral pathway of the human visual system is introduced.
- the key to object recognition is invariance: the ability to accurately identify a particular object under a wide variety of visual conditions.
- for a given visual stimulus, the activation responses of the neurons in a brain region of the ventral pathway can be regarded as a response vector, and the dimensionality of the vector space is the number of neurons in that region.
- when the target object undergoes identity-preserving transformations (e.g., changes in position, scale, or pose), the resulting response vectors form a low-dimensional object manifold in the high-dimensional vector space, as shown in FIG. 2, where r1, r2, …, rN denote the individual neurons.
- Each target manifold in the lower brain region is highly curved, and the manifolds of different target objects are intertwined with each other.
- the ventral pathway progressively flattens and separates the manifolds of different targets through stage-by-stage nonlinear transformations.
- in the final stage, the different target manifolds become linearly separable, as shown in FIG. 3.
- when an object undergoes an identity-preserving transformation, its feature vector changes accordingly; projecting the feature vectors into a high-dimensional feature space (whose dimensionality equals that of the feature vectors), the feature vectors corresponding to all identity-preserving transformations form a low-dimensional manifold.
- good invariant features are obtained when the target manifolds belonging to the same class become compact and the manifolds of different classes of target objects are separated by large margins.
- An image classification system comprising: an image set, a convolutional neural network model, and a Min-Max criterion.
- the image set refers to the images to be classified; before the image set is classified, it is divided in advance into a training set, a validation set, and a test set.
- the convolutional neural network model can in principle be any convolutional neural network model, such as Quick-CNN, NIN, AlexNet, and the like.
- FIG. 1 is a schematic flowchart diagram of an image classification method according to an embodiment of the present disclosure, where the method includes the following steps:
- Step 101 Acquire a training set of the images to be classified, where the images to be classified are divided in advance into a training set, a validation set, and a test set.
- Step 102 Select a multi-layer convolutional neural network model.
- the convolutional neural network model includes at least two levels.
- Step 103 Perform a regularization operation based on the Min-Max criterion on the selected layer and form a second convolutional neural network model, wherein the selected layer is a layer in the convolutional neural network model; for example, the selected layer is the k-th layer of the convolutional neural network model.
- the Min-Max criterion is constructed from an intrinsic graph and a penalty graph of the target manifolds: the intrinsic graph characterizes the internal compactness of the target manifolds, and the penalty graph characterizes the margins between the target manifolds.
- Step 104 Train the second convolutional neural network model using the training set, and generate a third convolutional neural network model.
- Step 105 Classify the test set of the classified image by using the third convolutional neural network model to complete the classification test of the image to be classified.
- the image classification method provided by this embodiment is based on the observation of the invariance property of object recognition: by constraining the features of the selected layer of the convolutional neural network with the Min-Max criterion, the features learned during training are explicitly forced to satisfy that target manifolds belonging to the same class have good intra-class compactness and target manifolds belonging to different classes have large inter-class margins (i.e., the margins between different target manifolds are as large as possible), which in turn significantly improves the accuracy of image classification.
- the process of selecting a multi-layer convolutional neural network model includes:
- the mini-batch of training samples is denoted {(Xi, ci)}, i = 1, 2, …, n, where n represents the size of the mini-batch,
- Xi represents the original input data, i.e., Xi is the i-th image of the training set,
- ci represents the category label corresponding to the image, ci ∈ {1, 2, …, C}, i.e., ci is the category label of Xi,
- C represents the total number of categories of the training set images,
- and the category label of each image is one particular value from {1, 2, …, C}.
- the objective function of the selected convolutional neural network model is expressed as min_W Σi=1…n l(W, Xi, ci),
- where W = (W(1), …, W(M); b(1), …, b(M)), W represents all parameters of the selected convolutional neural network model, l(W, Xi, ci) represents the loss function of a training sample, M represents the total number of layers of the convolutional neural network model, W(m) represents the weight parameters of the m-th layer of the convolutional neural network model, b(m) represents the bias parameters of the m-th layer, and m is any value in {1, 2, …, M}.
- after a convolutional neural network model has been selected, the method further includes dividing it into layers, where the features of each layer are represented recursively as Xi(m) = f(W(m) * Xi(m−1) + b(m)), in which Xi(m) represents the features of the m-th layer of the convolutional neural network model, * denotes the convolution operation, and f(·) denotes a nonlinear activation function.
- after the selected convolutional neural network model has been divided into layers, one layer is chosen and improved based on the Min-Max criterion; the selected layer is set to the k-th layer.
- preferably, the selected layer is a layer close to the output in the convolutional neural network model (i.e., an upper layer of the model); for example, the selected layer is no more than two layers away from the output layer of the convolutional neural network model, as shown in FIG. 4.
- applying the Min-Max criterion to the upper layers of convolutional neural network models (such as CNN models) yields a better optimization effect, because the CNN model is optimized by the error back-propagation (BP) algorithm, and the derivative of the Min-Max criterion with respect to the features can influence, through the BP process, the learning of the features of every layer of the CNN model from top to bottom.
- in addition, starting the Min-Max regularization operation from the upper (selected) layer of the chosen convolutional neural network model simplifies computation when training large-scale networks, avoids the heavy computation and low efficiency that result from increasing the network size and the training data size, and also avoids the time and labor required to construct large-scale labeled training datasets.
- the method further includes: acquiring the Min-Max criterion.
- the obtaining the Min-Max criterion includes:
- the k-th-layer feature of sample Xi is Xi(k); for convenience of description, Xi(k) is straightened into a column vector and abbreviated as xi, as shown in FIG. 5.
- the Min-Max criterion of the k-th-layer features is expressed as L(X(k), c) = L1(X(k), c) − L2(X(k), c),
- where L1(X(k), c) represents the intrinsic graph term,
- L2(X(k), c) represents the penalty graph term,
- and X(k) represents the set of features at the k-th layer of a mini-batch of training samples, c represents the set of category labels corresponding to the mini-batch, i ∈ {1, 2, …, n}.
- the intrinsic graph is constructed by taking {x1, x2, …, xn} as its vertices and connecting each vertex by undirected edges to its k1 nearest-neighbor vertices that have the same label.
- the penalty graph is constructed by taking {x1, x2, …, xn} as its vertices and connecting marginal vertex pairs from manifolds of different classes by undirected edges.
- the marginal vertex pairs of the class-c manifold are defined as the k2 nearest vertex pairs between the class-c manifold and the manifolds of all other classes.
- according to the construction of the intrinsic graph, the compactness inside the manifolds can be expressed as L1(X(k), c) = Σi,j G(1)ij‖xi − xj‖².
- according to the construction of the penalty graph, the margin between manifolds can be expressed as L2(X(k), c) = Σi,j G(2)ij‖xi − xj‖², where G(1)ij and G(2)ij are the edge weights of the intrinsic graph and the penalty graph, respectively, and ‖·‖ denotes the l2 norm.
- in step 103, the regularization operation based on the Min-Max criterion is performed on the selected layer and a second convolutional neural network model is formed; the second convolutional neural network model is expressed by the objective function min_W Σi=1…n l(W, Xi, ci) + λ·L(X(k), c),
- where L(X(k), c) is the Min-Max criterion of the k-th-layer features
- and λ is a weight coefficient greater than 0.
- in practice, the value of λ needs to be tuned for different datasets; after the value of λ has been tuned, it is kept constant throughout the whole training process.
- training the second convolutional neural network model by using the training set includes:
- the training set is used to train the second convolutional neural network model, and the pre-divided validation set of the images to be classified is used to tune the learning rate and other parameters.
- with the back-propagation (BP) algorithm, the derivatives of the objective function with respect to the model parameters must be computed. Since directly computing these derivatives is difficult, the sensitivity of the objective function with respect to the features of each layer, that is, the derivative or gradient of the loss function with respect to the corresponding layer's features, is computed first, and the derivatives with respect to the corresponding parameters can then be obtained from the sensitivities.
- the sensitivity of the classification loss function with respect to the features of the kth layer can be calculated in accordance with the back propagation algorithm of the conventional neural network.
- the method provided by the present application only needs to calculate the gradient of the Min-Max criterion about the feature of the kth layer, and does not need to calculate the sensitivity of the objective function with respect to each layer feature.
- the specific calculation process is as follows:
- the second convolutional neural network model is trained using the training set according to the sensitivity of the k-th-layer features and the mini-batch stochastic gradient descent method; the sensitivity of the Min-Max criterion with respect to the k-th-layer features is calculated as ∂L(X(k), c)/∂xi = [2H(Ψ(1) − Ψ(2))](:,i), with H the matrix formed by the k-th-layer features and Ψ = D − G the graph Laplacian.
- the sensitivity of the features of the k-th layer is the gradient of the classification loss function of the second convolutional neural network model with respect to the k-th-layer features plus the gradient of the Min-Max criterion with respect to the k-th-layer features; the error sensitivities are then back-propagated toward the input according to the standard back-propagation algorithm.
- by training the model with the objective function to which the Min-Max criterion has been added, the trained model can be made to satisfy that image features belonging to the same class are separated by small distances while image features belonging to different classes are separated by large distances, which facilitates image classification.
- when a Gaussian kernel function is used to define the corresponding criterion terms, the resulting Min-Max criterion is called the kernel version of the Min-Max criterion.
- if the Min-Max criterion is the kernel version, the regularization operation based on the Min-Max criterion for the selected layer includes obtaining the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features and performing the corresponding constraint operation on the k-th layer;
- the sensitivity of the kernel-version Min-Max criterion with respect to the features of the k-th layer is expressed as:
- classifying the test set of the images to be classified using the third convolutional neural network model includes: classifying the test set of the images to be classified by using the model parameters in the third convolutional neural network model.
- the model parameters are W;
- the validation set of the images to be classified is used to tune parameters such as the learning rate, which is a parameter used during the training process (not a model parameter) and can be tuned with the validation set.
- the present application is based on the observation of the invariance property of object recognition.
- by constraining the upper-layer features of the convolutional neural network with the Min-Max criterion, the learned features are explicitly forced to satisfy that the target manifolds belonging to the same class have good intra-class compactness and the target manifolds belonging to different classes have large inter-class margins.
- the features are directly and explicitly constrained by the Min-Max criterion, so that the Min-Max criterion can technically ensure that the convolutional neural network learns invariant features that are as good as possible.
- with the Min-Max constraint, the image classification accuracy of the improved model is significantly better than that of a model trained with the traditional BP method, so that a convolutional network model of lower complexity can reach the image classification accuracy of a deeper and more complex convolutional neural network model.
- the selected convolutional neural network model is experimentally verified.
- the feature maps learned by the improved convolutional network model exhibit better intra-class compactness and inter-class separability, that is, the distances between the features of images belonging to the same class are small and the distances between the features of images belonging to different classes are large; compared with the baseline model, this property of the feature maps is very pronounced.
- the present application provides a method that explicitly applies Min-Max-criterion regularization to the features learned by the convolutional neural network, in contrast to previous regularization of models, which constrains only the model parameters.
- the Min-Max criterion can be used for many types of convolutional neural networks, and the resulting additional computational cost is negligible relative to the training of the entire network.
- the present application further provides an image classification device corresponding to the foregoing embodiment of the image classification method.
- the apparatus 600 includes: an acquisition unit 601, a selection unit 602, a processing unit 603, a training unit 604, and a classification unit 605.
- An obtaining unit 601, configured to acquire a training set of an image to be classified
- the selecting unit 602 is configured to select a multi-layer convolutional neural network model
- the processing unit 603 is configured to perform a regular constraint operation based on the Min-Max criterion on the selected layer, and form a second convolutional neural network model, wherein the selected layer is a layer in the convolutional neural network model;
- the selected layer is a layer close to the output in the convolutional neural network model, that is, the selected layer is no more than two layers from the output layer in the convolutional neural network model.
- the training unit 604 is configured to train the second convolutional neural network model by using the training set, and generate a third convolutional neural network model;
- the classification unit 605 is configured to classify the test set of the classified image by using the third convolutional neural network model.
- the selecting unit 602 is further configured to: acquire a mini-batch training sample; and determine the convolutional neural network model according to the training sample and the objective function.
- the training samples are denoted {(Xi, ci)}, i = 1, 2, …, n, where n represents the size of the mini-batch,
- Xi represents the original input data,
- ci represents the category label of Xi,
- ci ∈ {1, 2, …, C}, and C represents the total number of categories of the training set;
- the objective function is expressed as min_W Σi=1…n l(W, Xi, ci),
- where W = (W(1), …, W(M); b(1), …, b(M)), W represents all parameters of the convolutional neural network model, l(W, Xi, ci) represents the loss function of a training sample, M represents the total number of layers of the convolutional neural network model, W(m) represents the weight parameters of the m-th layer, b(m) represents the bias parameters of the m-th layer, and m is any value in {1, 2, …, M}.
- the device further includes: a layering unit 606,
- the layering unit 606 is configured to divide the convolutional neural network model into layers according to the feature recursion method, where the features of each layer are represented recursively as Xi(m) = f(W(m) * Xi(m−1) + b(m)), in which Xi(m) represents the features of the m-th layer of the convolutional neural network model, * denotes the convolution operation, and f(·) denotes a nonlinear activation function.
- the acquiring unit 601 is further configured to acquire the Min-Max criterion
- the acquisition unit 601 is specifically configured to acquire the intrinsic graph and the penalty graph of Min-Max, respectively, where the intrinsic graph characterizes the internal compactness of the target manifolds and the penalty graph characterizes the margins between the target manifolds;
- the intrinsic graph and the penalty graph are used to compute the Min-Max criterion of the k-th-layer features.
- the Min-Max criterion of the k-th-layer features is expressed as L(X(k), c) = L1(X(k), c) − L2(X(k), c),
- where L1(X(k), c) represents the intrinsic graph term,
- L2(X(k), c) represents the penalty graph term,
- X(k) represents the set of features at the k-th layer of a mini-batch of training samples,
- the k-th layer is the selected layer, c represents the set of category labels corresponding to the mini-batch, and i ∈ {1, 2, …, n}.
- the second convolutional neural network model is represented by the objective function min_W Σi=1…n l(W, Xi, ci) + λ·L(X(k), c),
- where L(X(k), c) is the Min-Max criterion of the k-th-layer features and λ > 0 is a weight coefficient.
- the training unit 604 is specifically configured to: obtain, from the objective function of the second convolutional neural network model, the sensitivity of the second convolutional neural network model with respect to the k-th-layer features;
- and train the second convolutional neural network model using the training set according to the sensitivity of the k-th-layer features and the mini-batch stochastic gradient descent method.
- the sensitivity of the k-th-layer features is calculated as ∂L(X(k), c)/∂xi = [2H(Ψ(1) − Ψ(2))](:,i), with H the matrix formed by the k-th-layer features and Ψ = D − G the graph Laplacian.
- the Min-Max criterion may be a kernel version of the Min-Max criterion,
- in which the kernel version of the Min-Max criterion is generated by defining the Min-Max criterion through a Gaussian kernel function.
- if the Min-Max criterion is the kernel version of the Min-Max criterion, the processing unit 603 is further configured to obtain the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features,
- and to perform, on the k-th layer, the constraint operation based on the kernel-version Min-Max criterion according to the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features.
- the sensitivity of the kernel-version Min-Max criterion with respect to the features of the k-th layer is expressed as:
- the classifying unit is specifically configured to classify the test set of the classified image by using the model parameter in the third convolutional neural network model.
- the present application proposes a deep convolutional neural network image classification apparatus improved with the Min-Max criterion.
- by constraining the upper-layer features of the convolutional neural network with the Min-Max criterion, the features learned during training are explicitly forced to satisfy that target manifolds belonging to the same class have good intra-class compactness and target manifolds belonging to different classes have large inter-class margins.
- to further improve the effectiveness of the proposed method, the embodiments of the present application also propose a kernel version of the Min-Max criterion, which is verified in the experiments.
- the image classification system trained by the method provided by the present application can significantly improve the image classification accuracy.
- the image classification accuracy of the improved model is significantly improved, and the feature map learned by the improved model will show better intra-class compactness and inter-class separation, ie The distance between features of images belonging to the same class is small, and the distance between features belonging to different classes of images is large.
- the embodiment further provides an image classification device.
- the device 700 includes a processor 701 and a memory 702.
- the processor 701 is configured to acquire a training set of an image to be classified; select a multi-layer convolutional neural network model; perform a regular constraint based on the Min-Max criterion on the selected layer, and form a second convolutional neural network model, Training the second convolutional neural network model using the training set, and generating a third convolutional neural network model; using the third convolutional neural network model to classify the test set of the classified image, wherein
- the selection layer is a layer in the convolutional neural network model;
- the memory 702 is configured to store a training set of the image to be classified, the multi-layer convolutional neural network model, the Min-Max criterion, and the classified image.
- processor 701 in the image classification device is further configured to perform various steps of the foregoing image classification method embodiment, and details are not described herein again.
- the processor 701 includes a graphics processing unit (GPU), and may also be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
- the processor 701 may further include a hardware chip.
- the hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
- the above PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), and a general array logic (GAL). Or any combination thereof.
- the memory 702 can be a volatile memory, a non-volatile memory, or a combination thereof.
- the volatile memory may be a random-access memory (RAM);
- the non-volatile memory may be a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).
Abstract
The present invention discloses an image classification method and apparatus. The method includes: acquiring a training set of images to be classified; selecting a multi-layer convolutional neural network model; applying a regularization constraint based on the minimum-maximum (Min-Max) criterion to a selected layer to form a second convolutional neural network model, where the selected layer is a layer of the convolutional neural network model; training the second convolutional neural network model with the training set to generate a third convolutional neural network model; and classifying the test set of the images to be classified with the third convolutional neural network model. Based on the invariance property of object recognition, constraining the features of the selected layer with the Min-Max criterion explicitly forces the learned features to satisfy that target manifolds belonging to the same class have good intra-class compactness and target manifolds belonging to different classes have large inter-class margins, which significantly improves image classification accuracy.
Description
The present invention relates to the technical field of computer vision image classification, and in particular to an image classification method and apparatus.
Recently, convolutional neural networks have achieved great success in computer vision and in many areas of pattern recognition, obtaining good results in object recognition, object detection, semantic segmentation, object tracking, image retrieval, and other tasks. These successes are mainly attributable to two factors: on the one hand, the rapid development of modern computing technology, represented by general-purpose graphics processing units (GPGPU) and CPU clusters, allows researchers to train larger and more complex neural networks; on the other hand, the emergence of large-scale datasets with millions of labeled images can, to a certain extent, reduce the overfitting incurred when training larger convolutional neural networks, making the training of large-scale networks possible.
Typically, convolutional neural networks for image classification are trained simply with the back-propagation (BP) algorithm based on stochastic gradient descent (SGD). Because this training method imposes no constraints on the features learned by the convolutional neural network, the classification accuracy of the resulting convolutional neural network image classification system is not good enough, which manifests as insufficient intra-class compactness and inter-class separability of the learned features.
Summary of the Invention
The present application provides an image classification method and apparatus to improve image classification accuracy. To solve the above technical problem, the present application discloses the following technical solutions:
In a first aspect, an image classification method is provided, the method comprising:
acquiring a training set of images to be classified; selecting a multi-layer convolutional neural network model; applying a regularization constraint based on the Min-Max criterion to a selected layer to form a second convolutional neural network model, where the selected layer is a layer of the convolutional neural network model; training the second convolutional neural network model with the training set to generate a third convolutional neural network model; and classifying the test set of the images to be classified with the third convolutional neural network model.
The present application is based on the invariance property of object recognition. The invariance property means that when an object undergoes an identity-preserving transformation (for example, translation, illumination change, shape change, or viewpoint change), its corresponding feature vector in feature space changes accordingly; projecting the feature vectors into a high-dimensional feature space whose dimensionality equals that of the feature vectors, the feature vectors corresponding to all identity-preserving transformations form a low-dimensional manifold in the high-dimensional feature space, and good invariant features are obtained when the target manifolds belonging to the same class become compact and the manifolds of different classes of target objects are separated by large margins.
Based on the observation of this invariance property of object recognition, the present application provides an image classification method for deep convolutional neural networks improved with the Min-Max criterion: by constraining the features of a selected layer of the convolutional neural network with the Min-Max criterion, the learned features are explicitly forced to satisfy that target manifolds belonging to the same class have good intra-class compactness and target manifolds belonging to different classes have large inter-class margins, which significantly improves image classification accuracy.
In addition, the Min-Max regularization operation is applied starting from the selected layer of the chosen convolutional neural network model, which simplifies computation when training large-scale networks, avoids the heavy computation and low efficiency caused by increasing the network size and the training data size, and also avoids the large amounts of time, labor, and money that would otherwise be spent building large-scale labeled training datasets.
With reference to the first aspect, in a first implementation of the first aspect, selecting a convolutional neural network model comprises:
acquiring a mini-batch of training samples; and determining the convolutional neural network model according to the training samples and an objective function, where the training samples are denoted {(Xi, ci)}, i = 1, 2, …, n, n is the size of the mini-batch, Xi is the original input data, ci is the category label of Xi, ci ∈ {1,2,…,C}, and C is the total number of categories of the training set; the objective function is expressed as min_W Σi=1…n l(W, Xi, ci),
where W = (W(1),…,W(M); b(1),…,b(M)), W denotes all parameters of the convolutional neural network model, l(W, Xi, ci) denotes the loss function of a training sample, M denotes the total number of layers of the convolutional neural network model, W(m) denotes the weight parameters of the m-th layer of the convolutional neural network model, b(m) denotes the bias parameters of the m-th layer of the convolutional neural network model, and m is any value in {1,2,…,M}.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect, after selecting a convolutional neural network model the method further comprises: dividing the convolutional neural network model into layers, where the features of each layer of the layered convolutional neural network model are represented recursively as
Xi(m) = f(W(m) * Xi(m−1) + b(m)),
where Xi(m) denotes the features of the m-th layer of the convolutional neural network model, * denotes the convolution operation, and f(·) denotes a nonlinear activation function.
With reference to the first aspect, in a third implementation of the first aspect, before the regularization constraint based on the Min-Max criterion is applied to the selected layer, the method further comprises: acquiring the Min-Max criterion;
acquiring the Min-Max criterion comprises: acquiring an intrinsic graph and a penalty graph of Min-Max, respectively, the intrinsic graph characterizing the internal compactness of target manifolds and the penalty graph characterizing the margins between target manifolds; and computing the Min-Max criterion of the k-th-layer features from the intrinsic graph and the penalty graph, the k-th layer being the selected layer; wherein the Min-Max criterion of the k-th-layer features is expressed as
L(X(k),c)=L1(X(k),c)-L2(X(k),c)
where L1(X(k),c) denotes the intrinsic graph term and L2(X(k),c) denotes the penalty graph term; X(k) denotes the set of features of a mini-batch of training samples at the k-th layer, c denotes the set of category labels corresponding to the mini-batch, and i ∈ {1,2,…,n}.
With reference to the third implementation of the first aspect, in a fourth implementation of the first aspect, the second convolutional neural network model is expressed by the objective function min_W Σi=1…n l(W, Xi, ci) + λ·L(X(k), c), where λ > 0 is a weight coefficient.
With reference to the fourth implementation of the first aspect, in a fifth implementation of the first aspect, training the second convolutional neural network model with the training set comprises: obtaining, from the objective function of the second convolutional neural network model, the sensitivity of the second convolutional neural network model with respect to the k-th-layer features; and training the second convolutional neural network model with the training set according to the sensitivity of the k-th-layer features and the mini-batch stochastic gradient descent method;
wherein the sensitivity of the k-th-layer features is computed as ∂L(X(k),c)/∂xi = [2H(Ψ(1) − Ψ(2))](:,i),
where H denotes the matrix formed by the k-th-layer features, Ψ = D − G, D = diag(d11, d22, …, dnn),
G(1)ij denotes the weight of the edge connecting vertices xi and xj in the intrinsic graph, G(2)ij denotes the weight of the edge connecting vertices xi and xj in the penalty graph, i = 1, 2, …, n, Ψ denotes the Laplacian matrix of the matrix G = (Gij)n×n, and the subscript (:,i) denotes the i-th column of the matrix.
With reference to the fifth implementation of the first aspect, in a sixth implementation of the first aspect, the Min-Max criterion is a kernel version of the Min-Max criterion, the kernel version of the Min-Max criterion being a criterion generated by defining the Min-Max criterion through a Gaussian kernel function.
With reference to the sixth implementation of the first aspect, in a seventh implementation of the first aspect, if the Min-Max criterion is the kernel version of the Min-Max criterion, applying the regularization constraint based on the Min-Max criterion to the selected layer comprises: obtaining the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features; and performing, on the k-th layer, the constraint operation based on the kernel-version Min-Max criterion according to the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features;
wherein the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features is expressed as:
With reference to the seventh implementation of the first aspect, in an eighth implementation of the first aspect, classifying the test set of the images to be classified with the third convolutional neural network model comprises: classifying the test set of the images to be classified using the model parameters in the third convolutional neural network model.
With reference to the first aspect or any one of the first to eighth implementations of the first aspect, in a further implementation of the first aspect, the selected layer is no more than two layers away from the output layer of the convolutional neural network model.
In a second aspect, an image classification apparatus is also provided, the apparatus comprising units for performing the method steps of the first aspect and its implementations.
In a third aspect, an image classification device is also provided, the device comprising a processor and a memory,
where the processor is configured to: acquire a training set of images to be classified; select a multi-layer convolutional neural network model; apply a regularization constraint based on the Min-Max criterion to a selected layer to form a second convolutional neural network model; train the second convolutional neural network model with the training set to generate a third convolutional neural network model; and classify the test set of the images to be classified with the third convolutional neural network model, where the selected layer is a layer of the convolutional neural network model;
and the memory is configured to store the training set of the images to be classified, the multi-layer convolutional neural network model,
the Min-Max criterion, and the classified images.
In a fourth aspect, a computer storage medium is also provided, where the computer storage medium may store a program that, when executed, performs some or all of the steps of the implementations of the image classification method and apparatus provided by the present invention.
To describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, a person of ordinary skill in the art may still derive other drawings from these drawings without creative effort.
FIG. 1 is a schematic flowchart of an image classification method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the process by which the human visual system forms object manifolds according to an embodiment of the present application;
FIG. 3 is a schematic diagram of achieving target feature invariance through transformations according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a multi-layer convolutional neural network model according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an intrinsic graph and a penalty graph according to an embodiment of the present application;
FIG. 6 is a structural block diagram of an image classification apparatus according to an embodiment of the present application;
FIG. 7 is a schematic diagram of an image classification device according to an embodiment of the present application.
The image classification method and apparatus provided by the present application are used to improve the accuracy of image classification. Specifically, the method draws on the manifold-untangling property of object recognition in the visual pathway of the human brain and combines it with convolutional neural networks, proposing an image classification method and apparatus for deep convolutional neural networks improved with the Min-Max criterion.
First, the manifold-untangling property of object recognition in the ventral pathway of the human visual system is introduced. The key to object recognition is invariance, that is, the ability to accurately identify a particular object under a wide variety of visual conditions. For a given visual stimulus, the activation responses of the neurons in a brain region of the ventral pathway can be regarded as a response vector, and the dimensionality of the vector space is the number of neurons in that region. When a target object undergoes identity-preserving transformations (for example, changes in position, scale, or pose), the resulting response vectors form a low-dimensional object manifold in the high-dimensional vector space, as shown in FIG. 2, where r1, r2, …, rN denote the individual neurons.
Every target manifold in the lower brain regions is highly curved, and the manifolds of different target objects are entangled with one another. Through stage-by-stage nonlinear transformations, the ventral pathway progressively flattens the manifolds of different targets and separates them from one another. In the final stage, the different target manifolds become linearly separable, as shown in FIG. 3.
When an object undergoes an identity-preserving transformation, its corresponding feature vector in feature space also changes; projecting the feature vectors into a high-dimensional feature space (whose dimensionality equals that of the feature vectors), the feature vectors corresponding to all identity-preserving transformations form a low-dimensional manifold, and good invariant features are obtained when the target manifolds belonging to the same class become compact and the manifolds of different classes of target objects are separated by large margins.
To improve the accuracy of image classification, so that target manifolds belonging to the same class have good internal compactness and the manifolds of different classes of target objects have large inter-class margins, an embodiment of the present application provides an image classification system comprising an image set, a convolutional neural network model, and the Min-Max criterion.
The image set refers to the images to be classified; before the image set is classified, it is divided in advance into a training set, a validation set, and a test set. In principle the convolutional neural network model can be any convolutional neural network model, for example Quick-CNN, NIN, or AlexNet.
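Purely as an illustration (the patent does not prescribe any particular toolkit), the pre-division of the image set into training, validation, and test subsets might look like the following sketch; the dataset choice (CIFAR-10 via torchvision), the 45k/5k split, and the mini-batch size are assumptions made for the example.

```python
# Hypothetical sketch: splitting an image set into training / validation / test subsets.
# Dataset, split ratio, and batch size are illustrative assumptions only.
import torch
from torch.utils.data import random_split, DataLoader
from torchvision import datasets, transforms

transform = transforms.ToTensor()
full_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# Hold out part of the official training split as a validation set for tuning the
# learning rate and the weight coefficient lambda of the Min-Max term.
train_set, val_set = random_split(full_train, [45000, 5000],
                                  generator=torch.Generator().manual_seed(0))

train_loader = DataLoader(train_set, batch_size=100, shuffle=True)   # mini-batch size n = 100
val_loader = DataLoader(val_set, batch_size=100, shuffle=False)
test_loader = DataLoader(test_set, batch_size=100, shuffle=False)
```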
FIG. 1 is a schematic flowchart of an image classification method according to an embodiment of the present application; the method includes the following steps:
Step 101: Acquire a training set of the images to be classified, where the images to be classified are divided in advance into a training set, a validation set, and a test set.
Step 102: Select a multi-layer convolutional neural network model. The convolutional neural network model includes at least two layers.
Step 103: Apply a regularization operation based on the Min-Max criterion to a selected layer and form a second convolutional neural network model, where the selected layer is a layer of the convolutional neural network model; for example, let the selected layer be the k-th layer of the convolutional neural network model.
The Min-Max criterion is constructed from an intrinsic graph and a penalty graph of the target manifolds: the intrinsic graph characterizes the internal compactness of the target manifolds, and the penalty graph characterizes the margins between target manifolds.
Step 104: Train the second convolutional neural network model with the training set and generate a third convolutional neural network model.
Step 105: Classify the test set of the images to be classified with the third convolutional neural network model, to complete the classification test of the images to be classified.
On the basis of the observation of the invariance property of object recognition, the image classification method provided by this embodiment constrains the features of the selected layer of the convolutional neural network with the Min-Max criterion, explicitly forcing the features learned during training to satisfy that target manifolds belonging to the same class have good intra-class compactness and target manifolds belonging to different classes have large inter-class margins (that is, the margins between different target manifolds are as large as possible), which in turn significantly improves the accuracy of image classification.
In a specific embodiment, in step 102 above, the process of selecting a multi-layer convolutional neural network model includes:
acquiring a mini-batch of training samples;
determining the convolutional neural network model according to the training samples and an objective function;
where the mini-batch of training samples is denoted {(Xi, ci)}, i = 1, 2, …, n: n is the size of the mini-batch; Xi is the original input data, that is, Xi is the i-th image of the training set; ci is the category label corresponding to the image, ci ∈ {1,2,…,C}, that is, ci is the category label of Xi; C is the total number of categories of the training set images; and the category label of each image is one particular value from {1,2,…,C}. The objective function of the selected convolutional neural network model is expressed as min_W Σi=1…n l(W, Xi, ci),
where W = (W(1),…,W(M); b(1),…,b(M)), W denotes all parameters of the selected convolutional neural network model, l(W, Xi, ci) denotes the loss function of a training sample, M denotes the total number of layers of the convolutional neural network model, W(m) denotes the weight parameters of the m-th layer of the convolutional neural network model, b(m) denotes the bias parameters of the m-th layer of the convolutional neural network model, and m is any value in {1,2,…,M}.
Further, after a convolutional neural network model has been selected, the method also includes:
dividing the convolutional neural network model into layers;
where the features of each layer of the layered convolutional neural network model are represented recursively as
Xi(m) = f(W(m) * Xi(m−1) + b(m)),
where Xi(m) denotes the features of the m-th layer of the convolutional neural network model, * denotes the convolution operation, and f(·) denotes a nonlinear activation function.
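As a concrete (and purely illustrative) reading of this recursion, the small model below exposes the features Xi(k) of a chosen layer alongside its output; the channel counts, kernel sizes, the ReLU activation used for f(·), and the class name SmallConvNet are assumptions, and the same sketch is reused by the later examples.

```python
# Minimal sketch of the layer-wise recursion X^(m) = f(W^(m) * X^(m-1) + b^(m)).
# Channel counts, kernel sizes, and the ReLU activation are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallConvNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv2d(3, 32, 5, padding=2),    # layer 1: W^(1), b^(1)
            nn.Conv2d(32, 32, 5, padding=2),   # layer 2
            nn.Conv2d(32, 64, 5, padding=2),   # layer 3
        ])
        self.fc = nn.Linear(64 * 4 * 4, num_classes)

    def forward(self, x, return_layer=None):
        feats = []
        for conv in self.convs:
            x = F.relu(conv(x))          # X^(m) = f(W^(m) * X^(m-1) + b^(m))
            x = F.max_pool2d(x, 2)
            feats.append(x)
        logits = self.fc(x.flatten(1))
        if return_layer is not None:
            return logits, feats[return_layer - 1]   # expose the k-th layer features
        return logits
```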
After the selected convolutional neural network model has been divided into layers, one layer is chosen and improved based on the Min-Max criterion. Preferably, the chosen layer (the selected layer), set to the k-th layer, is a layer close to the output of the convolutional neural network model (that is, an upper layer of the model); for example, the selected layer is no more than two layers away from the output layer of the convolutional neural network model, as shown in FIG. 4.
Applying the Min-Max criterion to the upper layers of the convolutional neural network model (for example, a CNN model) yields a better optimization effect, because the CNN model is optimized by the error back-propagation (BP) algorithm, and the derivative of the Min-Max criterion with respect to the features can influence, through the BP process, the learning of the features of every layer of the CNN model from top to bottom.
In addition, starting the Min-Max regularization operation from the upper (selected) layer of the chosen convolutional neural network model simplifies computation when training large-scale networks, avoids the heavy computation and low efficiency caused by increasing the network size and the training data size, and also avoids the large amounts of time, labor, and money that would otherwise be spent building large-scale labeled training datasets.
Further, in step 103 above, before the regularization constraint based on the Min-Max criterion is applied to the selected layer, the method also includes: acquiring the Min-Max criterion.
Specifically, acquiring the Min-Max criterion includes:
acquiring the intrinsic graph and the penalty graph of Min-Max, respectively, the intrinsic graph characterizing the internal compactness of the target manifolds and the penalty graph characterizing the margins between target manifolds; and computing the Min-Max criterion of the k-th-layer features from the intrinsic graph and the penalty graph, the k-th layer being the selected layer.
The Min-Max criterion of the k-th-layer features is expressed as
L(X(k),c)=L1(X(k),c)-L2(X(k),c)
where L1(X(k),c) denotes the intrinsic graph term and L2(X(k),c) denotes the penalty graph term; X(k) denotes the set of features of a mini-batch of training samples at the k-th layer, c denotes the set of category labels corresponding to the mini-batch, and i ∈ {1,2,…,n}.
The intrinsic graph is constructed as follows: {x1, x2, …, xn} are taken as the vertices of the intrinsic graph, and each vertex is connected by undirected edges to its k1 nearest-neighbor vertices that have the same label.
The penalty graph is constructed as follows: {x1, x2, …, xn} are taken as the vertices of the penalty graph, and marginal vertex pairs from manifolds of different classes are connected by undirected edges. The marginal vertex pairs of the class-c manifold are defined as the k2 nearest vertex pairs between the class-c manifold and the manifolds of all other classes.
According to the construction of the intrinsic graph, the compactness inside the manifolds can be expressed as L1(X(k),c) = Σi,j G(1)ij‖xi − xj‖².
According to the construction of the penalty graph, the margin between manifolds can be expressed as L2(X(k),c) = Σi,j G(2)ij‖xi − xj‖².
Here G(1)ij denotes the weight of the edge connecting vertices xi and xj in the intrinsic graph, ‖·‖ denotes the l2 norm of a vector, the neighbor set of training sample Xi is the set of indices of its k1 nearest-neighbor vertices having the same class label, G(2)ij denotes the weight of the edge connecting vertices xi and xj in the penalty graph, the marginal pairs are the k2 nearest-neighbor vertex pairs of the corresponding candidate set, and πc denotes the set of indices of the samples in the mini-batch that belong to class c. A smaller L1(X(k),c) indicates that the manifold interiors are more compact, and a larger L2(X(k),c) indicates larger margins between manifolds.
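To make the construction concrete, a minimal Python sketch is given below; the binary 0/1 edge weights, the 0.5 factor used to count each undirected edge once, the optional Gaussian-kernel weighting (gesturing at the kernel version), and the helper names are illustrative assumptions rather than details fixed by the patent.

```python
# Illustrative sketch: build the intrinsic graph G1 and penalty graph G2 for one
# mini-batch of k-th-layer features, then evaluate L = L1 - L2.
# Binary (or Gaussian-kernel) edge weights and the helper names are assumptions.
import numpy as np

def min_max_graphs(x, labels, k1=5, k2=10, sigma=None):
    """x: (n, d) features straightened into vectors; labels: (n,) class labels."""
    n = x.shape[0]
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)    # pairwise squared l2 distances
    w = np.exp(-d2 / (2 * sigma ** 2)) if sigma else np.ones((n, n))  # kernel-version option
    same = labels[:, None] == labels[None, :]

    # Intrinsic graph: connect each vertex to its k1 nearest neighbours with the same label.
    g1 = np.zeros((n, n))
    for i in range(n):
        cand = np.where(same[i] & (np.arange(n) != i))[0]
        for j in cand[np.argsort(d2[i, cand])[:k1]]:
            g1[i, j] = g1[j, i] = w[i, j]

    # Penalty graph: connect the k2 closest cross-class (marginal) vertex pairs per class.
    g2 = np.zeros((n, n))
    for c in np.unique(labels):
        rows, cols = np.where(labels == c)[0], np.where(labels != c)[0]
        order = np.argsort(d2[np.ix_(rows, cols)], axis=None)[:k2]
        for idx in order:
            i, j = rows[idx // len(cols)], cols[idx % len(cols)]
            g2[i, j] = g2[j, i] = w[i, j]
    return g1, g2

def min_max_value(x, g1, g2):
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    l1 = 0.5 * (g1 * d2).sum()    # within-manifold compactness (each undirected edge once)
    l2 = 0.5 * (g2 * d2).sum()    # margins between manifolds
    return l1 - l2                # Min-Max criterion L = L1 - L2
```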
In step 103 above, the regularization operation based on the Min-Max criterion is applied to the selected layer and a second convolutional neural network model is formed; the second convolutional neural network model is expressed by the objective function min_W Σi=1…n l(W, Xi, ci) + λ·L(X(k), c),
where the first term is the classification loss function of the second convolutional neural network model, L(X(k),c) is the Min-Max criterion of the k-th-layer features, and λ is a weight coefficient greater than 0. In practice, the value of λ needs to be tuned for different datasets; once the value of λ has been tuned, it is kept constant throughout the whole training process.
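The combined objective of the second model might be wired up roughly as in the following sketch; using cross-entropy as the per-sample classification loss l(W, Xi, ci), PyTorch autograd, and the helper min_max_graphs from the previous sketch are assumptions made for illustration.

```python
# Sketch of the second model's objective: classification loss + lambda * L(X^(k), c).
# Cross-entropy as l(W, Xi, ci) and the autograd-based wiring are assumptions.
import torch
import torch.nn.functional as F

def second_model_objective(model, images, labels, k, lam, graph_fn):
    logits, feat_k = model(images, return_layer=k)       # expose the k-th layer features
    x = feat_k.flatten(1)                                 # straighten each X_i^(k) into x_i
    g1, g2 = graph_fn(x.detach().cpu().numpy(), labels.cpu().numpy())
    g1 = torch.as_tensor(g1, dtype=x.dtype, device=x.device)
    g2 = torch.as_tensor(g2, dtype=x.dtype, device=x.device)
    d2 = torch.cdist(x, x) ** 2                           # pairwise squared distances
    min_max = 0.5 * ((g1 * d2).sum() - (g2 * d2).sum())   # L = L1 - L2
    return F.cross_entropy(logits, labels) + lam * min_max
```

Because the Min-Max term is differentiable with respect to the k-th-layer features, back-propagating this combined loss reproduces the additional sensitivity described below.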
Further, in step 104 above, training the second convolutional neural network model with the training set includes:
obtaining, from the objective function of the second convolutional neural network model, the sensitivity of the second convolutional neural network model with respect to the k-th-layer features;
training the second convolutional neural network model on the training set following the mini-batch-based stochastic gradient descent method, while using the pre-divided validation set of the images to be classified to tune the learning rate and other parameters.
When the back-propagation (BP) algorithm is used, the derivatives of the objective function with respect to the model parameters must be computed. Since directly computing these derivatives is difficult, the sensitivity of the objective function with respect to the features of each layer, that is, the derivative or gradient of the loss function with respect to the corresponding layer's features, is computed first, and the derivatives with respect to the corresponding parameters can then be obtained from the sensitivities. The sensitivity of the classification loss function with respect to the k-th-layer features can be computed with the back-propagation algorithm of a conventional neural network. The method provided by the present application only needs to compute the gradient of the Min-Max criterion with respect to the k-th-layer features and does not need to compute the sensitivity of the objective function with respect to the features of every layer. The specific computation is as follows:
The second convolutional neural network model is trained with the training set according to the sensitivity of the k-th-layer features and the mini-batch stochastic gradient descent method, where the sensitivity of the Min-Max criterion with respect to the k-th-layer features is computed as ∂L(X(k),c)/∂xi = [2H(Ψ(1) − Ψ(2))](:,i),
where H denotes the matrix formed by stacking the k-th-layer features, Ψ = D − G, D = diag(d11, d22, …, dnn), G(1)ij denotes the weight of the edge connecting vertices xi and xj in the intrinsic graph, G(2)ij denotes the weight of the edge connecting vertices xi and xj in the penalty graph, i = 1, 2, …, n, Ψ denotes the Laplacian matrix of the matrix G = (Gij)n×n, and the subscript (:,i) denotes the i-th column of the matrix.
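A hedged NumPy sketch of this sensitivity, written directly in terms of the two graph Laplacians Ψ(1) = D(1) − G(1) and Ψ(2) = D(2) − G(2), is shown below; the function name and the edge-counting convention (each undirected edge counted once, matching the earlier sketch) are assumptions, and in frameworks with automatic differentiation the same gradient can be obtained by differentiating the L1 − L2 expression.

```python
# Sketch: sensitivity of the Min-Max criterion with respect to the k-th-layer features,
# dL/dx_i = [2 * H * (Psi1 - Psi2)]_(:, i), where Psi = D - G is a graph Laplacian.
import numpy as np

def min_max_sensitivity(x, g1, g2):
    """x: (n, d) features; returns an (n, d) array whose i-th row is dL/dx_i."""
    h = x.T                                    # H: columns are the feature vectors x_i
    psi1 = np.diag(g1.sum(axis=1)) - g1        # Laplacian of the intrinsic graph
    psi2 = np.diag(g2.sum(axis=1)) - g2        # Laplacian of the penalty graph
    grad_h = 2.0 * h @ (psi1 - psi2)           # (d, n); column i is dL/dx_i
    return grad_h.T
```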
The sensitivity of the k-th-layer features is the gradient of the classification loss function of the second convolutional neural network model with respect to the k-th-layer features plus the gradient of the Min-Max criterion with respect to the k-th-layer features; the error sensitivities are then propagated back toward the input following the standard back-propagation algorithm.
By training the model with the objective function to which the Min-Max criterion has been added, the trained model can be made to satisfy that image features belonging to the same class are separated by small distances while image features belonging to different classes are separated by large distances, which benefits image classification.
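For completeness, a hypothetical mini-batch SGD loop that adds this Min-Max sensitivity to the error sensitivity of the k-th layer during back-propagation is sketched below; the optimizer settings, the hook mechanism, and the helpers SmallConvNet, train_loader, min_max_graphs, and min_max_sensitivity are illustrative assumptions carried over from the earlier sketches.

```python
# Hypothetical training loop: standard classification back-propagation plus the Min-Max
# sensitivity injected at the k-th layer (hyperparameter values are illustrative only).
import torch
import torch.nn.functional as F

model = SmallConvNet(num_classes=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
k, lam = 3, 1e-3     # selected layer close to the output, and the Min-Max weight lambda

for epoch in range(50):
    for images, labels in train_loader:
        logits, feat_k = model(images, return_layer=k)
        x = feat_k.flatten(1).detach().numpy()
        g1, g2 = min_max_graphs(x, labels.numpy())
        grad_mm = torch.as_tensor(min_max_sensitivity(x, g1, g2), dtype=feat_k.dtype)
        # Add lambda * dL/dx_i to the k-th layer's error sensitivity during backprop.
        feat_k.register_hook(lambda g, extra=grad_mm.view_as(feat_k): g + lam * extra)
        loss = F.cross_entropy(logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Tuning the learning rate and λ on the held-out validation set, as the description suggests, would wrap this loop in an outer search; that part is omitted here.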
If the Min-Max criterion is the kernel version of the Min-Max criterion, applying the regularization operation based on the Min-Max criterion to the selected layer includes:
obtaining the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features;
performing, on the k-th layer, the constraint operation based on the kernel-version Min-Max criterion according to the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features;
where the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features is expressed as:
Further, classifying the test set of the images to be classified with the third convolutional neural network model
includes: classifying the test set of the images to be classified using the model parameters in the third convolutional neural network model. The model parameters are W; the validation set of the images to be classified is used to tune parameters such as the learning rate, which is a parameter used during the training process (not a model parameter) and can be tuned with the validation set.
Based on the observation of the invariance property of object recognition, the present application constrains the upper-layer features of the convolutional neural network with the Min-Max criterion, explicitly forcing the learned features to satisfy that target manifolds belonging to the same class have good intra-class compactness and target manifolds belonging to different classes have large inter-class margins. The features are constrained directly and explicitly through the Min-Max criterion, so that the Min-Max criterion can technically ensure that the convolutional neural network learns invariant features that are as good as possible.
With the Min-Max constraint, the image classification accuracy of the improved model is significantly better than that of a model trained with the traditional BP method, so that a convolutional network model of lower complexity can reach the image classification accuracy of a deeper and more complex convolutional neural network model.
In another specific embodiment, in order to verify the effectiveness of the above method, the selected convolutional neural network model is verified experimentally.
For example, the experimental comparison results on the CIFAR-10 dataset are shown in Table 1 below:
| Method | Number of model parameters | Error rate (%) |
|---|---|---|
| Quick-CNN | 0.145M | 23.47 |
| Quick-CNN+Min-Max | 0.145M | 18.06 |
| Quick-CNN+k(Min-Max) | 0.145M | 17.59 |
Table 1
The experimental comparison results on the CIFAR-100 dataset are shown in Table 2 below:
| Method | Number of model parameters | Error rate (%) |
|---|---|---|
| Quick-CNN | 0.15M | 55.87 |
| Quick-CNN+Min-Max | 0.15M | 51.38 |
| Quick-CNN+k(Min-Max) | 0.15M | 50.83 |
Table 2
The experimental comparison results on the SVHN dataset are shown in Table 3 below:
| Method | Number of model parameters | Error rate (%) |
|---|---|---|
| Quick-CNN | 0.145M | 8.92 |
| Quick-CNN+Min-Max | 0.145M | 5.42 |
| Quick-CNN+k(Min-Max) | 0.145M | 4.85 |
Table 3
From the above experimental results and the feature visualizations, the following conclusions can be drawn:
Compared with the respective baseline models, the image classification accuracy of the improved models is very significantly improved.
The feature maps learned by the improved convolutional network models exhibit better intra-class compactness and inter-class separability, that is, the distances between the features of images belonging to the same class are small and the distances between the features of images belonging to different classes are large; compared with the baseline models, this property of the feature maps is very pronounced.
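The feature-map inspection mentioned above could be reproduced, for example, with a 2-D embedding of the k-th-layer features; the use of scikit-learn's t-SNE and matplotlib below is an assumption, since the patent only describes the visualization qualitatively.

```python
# Illustrative sketch: embed k-th-layer features in 2-D to inspect intra-class compactness
# and inter-class separation. t-SNE and the plotting choices are assumptions.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_layer_features(model, loader, k, max_batches=10):
    feats, labels = [], []
    for b, (images, y) in enumerate(loader):
        if b >= max_batches:
            break
        _, feat_k = model(images, return_layer=k)
        feats.append(feat_k.flatten(1).detach().numpy())
        labels.append(y.numpy())
    emb = TSNE(n_components=2, init="pca").fit_transform(np.concatenate(feats))
    plt.scatter(emb[:, 0], emb[:, 1], c=np.concatenate(labels), s=5, cmap="tab10")
    plt.title("k-th-layer features: same-class points should form tight, separated clusters")
    plt.show()
```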
The present application provides a method that explicitly applies Min-Max-criterion regularization to the features learned by the convolutional neural network, in contrast to previous regularization of models, which constrained only the model parameters. Moreover, the Min-Max criterion can be used with many types of convolutional neural networks, and the additional computational cost it incurs is negligible relative to the training of the whole network.
In addition, the present application also provides an image classification apparatus corresponding to the foregoing embodiment of the image classification method. As shown in FIG. 6, the apparatus 600 includes: an acquisition unit 601, a selection unit 602, a processing unit 603, a training unit 604, and a classification unit 605.
The acquisition unit 601 is configured to acquire a training set of images to be classified;
the selection unit 602 is configured to select a multi-layer convolutional neural network model;
the processing unit 603 is configured to apply a regularization operation based on the Min-Max criterion to a selected layer and form a second convolutional neural network model, where the selected layer is a layer of the convolutional neural network model; preferably, the selected layer is a layer close to the output of the convolutional neural network model, that is, the selected layer is no more than two layers away from the output layer of the convolutional neural network model.
The training unit 604 is configured to train the second convolutional neural network model with the training set and generate a third convolutional neural network model;
the classification unit 605 is configured to classify the test set of the images to be classified with the third convolutional neural network model.
Further, the selection unit 602 is also configured to: acquire a mini-batch of training samples; and determine the convolutional neural network model according to the training samples and an objective function.
Here W = (W(1),…,W(M); b(1),…,b(M)), W denotes all parameters of the convolutional neural network model, l(W, Xi, ci) denotes the loss function of a training sample, M denotes the total number of layers of the convolutional neural network model, W(m) denotes the weight parameters of the m-th layer of the convolutional neural network model, b(m) denotes the bias parameters of the m-th layer of the convolutional neural network model, and m is any value in {1,2,…,M}.
Further, the apparatus also includes a layering unit 606.
The layering unit 606 is configured to divide the convolutional neural network model into layers according to the feature recursion method.
The features of each layer of the layered convolutional neural network model are represented recursively as
Xi(m) = f(W(m) * Xi(m−1) + b(m)),
where Xi(m) denotes the features of the m-th layer of the convolutional neural network model, * denotes the convolution operation, and f(·) denotes a nonlinear activation function.
Further, the acquisition unit 601 is also configured to acquire the Min-Max criterion;
the acquisition unit 601 is specifically configured to acquire the intrinsic graph and the penalty graph of Min-Max, respectively, the intrinsic graph characterizing the internal compactness of the target manifolds and the penalty graph characterizing the margins between target manifolds, and to compute the Min-Max criterion of the k-th-layer features from the intrinsic graph and the penalty graph.
The Min-Max criterion of the k-th-layer features is expressed as
L(X(k),c)=L1(X(k),c)-L2(X(k),c)
where L1(X(k),c) denotes the intrinsic graph term and L2(X(k),c) denotes the penalty graph term; X(k) denotes the set of features of a mini-batch of training samples at the k-th layer, the k-th layer being the selected layer; c denotes the set of category labels corresponding to the mini-batch, and i ∈ {1,2,…,n}.
Further, the second convolutional neural network model is expressed by the objective function min_W Σi=1…n l(W, Xi, ci) + λ·L(X(k), c).
Further, the training unit 604 is specifically configured to:
obtain, from the objective function of the second convolutional neural network model, the sensitivity of the second convolutional neural network model with respect to the k-th-layer features;
train the second convolutional neural network model with the training set according to the sensitivity of the k-th-layer features and the mini-batch stochastic gradient descent method.
The sensitivity of the Min-Max criterion with respect to the k-th-layer features is computed as ∂L(X(k),c)/∂xi = [2H(Ψ(1) − Ψ(2))](:,i),
where H denotes the matrix formed by the k-th-layer features, Ψ = D − G, D = diag(d11, d22, …, dnn), G(1)ij denotes the weight of the edge connecting vertices xi and xj in the intrinsic graph, G(2)ij denotes the weight of the edge connecting vertices xi and xj in the penalty graph, i = 1, 2, …, n, Ψ denotes the Laplacian matrix of the matrix G = (Gij)n×n, and the subscript (:,i) denotes the i-th column of the matrix.
Further, the Min-Max criterion is the kernel version of the Min-Max criterion, the kernel version of the Min-Max criterion being a criterion generated by defining the Min-Max criterion through a Gaussian kernel function.
If the Min-Max criterion is the kernel version of the Min-Max criterion, the processing unit 603 is further configured to:
obtain the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features;
perform, on the k-th layer, the constraint operation based on the kernel-version Min-Max criterion according to the sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features.
The sensitivity of the kernel-version Min-Max criterion with respect to the k-th-layer features is expressed as:
Further, the classification unit is specifically configured to classify the test set of the images to be classified using the model parameters in the third convolutional neural network model.
Based on the observation of the invariance property of object recognition, the present application proposes a deep convolutional neural network image classification apparatus improved with the Min-Max criterion. By constraining the upper-layer features of the convolutional neural network with the Min-Max criterion, the features learned during training are explicitly forced to satisfy that target manifolds belonging to the same class have good intra-class compactness and target manifolds belonging to different classes have large inter-class margins.
To further improve the effectiveness of the proposed method, the embodiments of the present application also propose the kernel version of the Min-Max criterion, which has been verified experimentally.
Compared with deep convolutional neural network image classification systems trained with conventional methods, an image classification system trained with the method provided by the present application can significantly improve image classification accuracy. Compared with the respective baseline models, the image classification accuracy of the improved models is very significantly improved, and the feature maps learned by the improved models exhibit better intra-class compactness and inter-class separability, that is, the distances between the features of images belonging to the same class are small and the distances between the features of images belonging to different classes are large.
This embodiment also provides an image classification device. As shown in FIG. 7, the device 700 includes a processor 701 and a memory 702.
The processor 701 is configured to: acquire a training set of images to be classified; select a multi-layer convolutional neural network model; apply a regularization constraint based on the Min-Max criterion to a selected layer to form a second convolutional neural network model; train the second convolutional neural network model with the training set to generate a third convolutional neural network model; and classify the test set of the images to be classified with the third convolutional neural network model, where the selected layer is a layer of the convolutional neural network model.
The memory 702 is configured to store the training set of the images to be classified, the multi-layer convolutional neural network model, the Min-Max criterion, and the classified images.
Further, the processor 701 in the image classification device is also configured to perform the steps of the foregoing image classification method embodiment, and details are not repeated here.
The processor 701 includes a graphics processing unit (GPU), and may also be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP. The processor 701 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
The memory 702 may be a volatile memory, a non-volatile memory, or a combination thereof. The volatile memory may be a random-access memory (RAM); the non-volatile memory may be a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).
Identical or similar parts of the embodiments in this specification may be referred to one another. In particular, the apparatus and system embodiments are described relatively briefly because they are substantially similar to the method embodiment; for relevant details, reference may be made to the description of the method embodiment. The apparatus and system embodiments described above are merely illustrative, and the units described as separate components may or may not be physically separate. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. A person of ordinary skill in the art can understand and implement them without creative effort.
The protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (21)
- An image classification method, characterized in that the method comprises: acquiring a training set of images to be classified; selecting a multi-layer convolutional neural network model; applying a regularization constraint based on the minimum-maximum (Min-Max) criterion to a selected layer to form a second convolutional neural network model, wherein the selected layer is a layer of the convolutional neural network model; training the second convolutional neural network model with the training set to generate a third convolutional neural network model; and classifying a test set of the images to be classified with the third convolutional neural network model.
- The method according to claim 1, characterized in that selecting a convolutional neural network model comprises: acquiring a mini-batch of training samples; and determining the convolutional neural network model according to the training samples and an objective function; W=(W(1),…,W(M);b(1),…,b(M)), where W denotes all parameters of the convolutional neural network model, l(W,Xi,ci) denotes the loss function of a training sample, M denotes the total number of layers of the convolutional neural network model, W(m) denotes the weight parameters of the m-th layer of the convolutional neural network model, b(m) denotes the bias parameters of the m-th layer of the convolutional neural network model, and m is any value in {1,2,…,M}.
- The method according to claim 1, characterized in that, before the regularization constraint based on the Min-Max criterion is applied to the selected layer, the method further comprises: acquiring the Min-Max criterion; acquiring the Min-Max criterion comprises: acquiring an intrinsic graph and a penalty graph of Min-Max, respectively, the intrinsic graph characterizing the internal compactness of target manifolds and the penalty graph characterizing the margins between target manifolds; and computing the Min-Max criterion of k-th-layer features from the intrinsic graph and the penalty graph, the k-th layer being the selected layer; wherein the Min-Max criterion of the k-th-layer features is expressed as L(X(k),c)=L1(X(k),c)−L2(X(k),c)
- The method according to claim 5, characterized in that training the second convolutional neural network model with the training set comprises: obtaining, from an objective function of the second convolutional neural network model, a sensitivity of the second convolutional neural network model with respect to the k-th-layer features; and training the second convolutional neural network model with the training set according to the sensitivity of the k-th-layer features and a mini-batch stochastic gradient descent method; wherein the sensitivity of the k-th-layer features is computed as follows:
- The method according to claim 6, characterized in that the Min-Max criterion is a kernel version of the Min-Max criterion, the kernel version of the Min-Max criterion being a criterion generated by defining the Min-Max criterion through a Gaussian kernel function.
- The method according to claim 8, characterized in that classifying the test set of the images to be classified with the third convolutional neural network model comprises: classifying the test set of the images to be classified using model parameters in the third convolutional neural network model.
- The method according to any one of claims 1 to 9, characterized in that the selected layer is no more than two layers away from an output layer of the convolutional neural network model.
- An image classification apparatus, characterized in that the apparatus comprises: an acquisition unit configured to acquire a training set of images to be classified; a selection unit configured to select a multi-layer convolutional neural network model; a processing unit configured to apply a regularization constraint based on the Min-Max criterion to a selected layer and form a second convolutional neural network model, wherein the selected layer is a layer of the convolutional neural network model; a training unit configured to train the second convolutional neural network model with the training set and generate a third convolutional neural network model; and a classification unit configured to classify a test set of the images to be classified with the third convolutional neural network model.
- The apparatus according to claim 10, characterized in that the selection unit is further configured to: acquire a mini-batch of training samples; and determine the convolutional neural network model according to the training samples and an objective function; W=(W(1),…,W(M);b(1),…,b(M)), where W denotes all parameters of the convolutional neural network model, l(W,Xi,ci) denotes the loss function of a training sample, M denotes the total number of layers of the convolutional neural network model, W(m) denotes the weight parameters of the m-th layer of the convolutional neural network model, b(m) denotes the bias parameters of the m-th layer of the convolutional neural network model, and m is any value in {1,2,…,M}.
- The apparatus according to claim 11, characterized in that the acquisition unit is further configured to acquire the Min-Max criterion; the acquisition unit is specifically configured to acquire an intrinsic graph and a penalty graph of Min-Max, respectively, the intrinsic graph characterizing the internal compactness of target manifolds and the penalty graph characterizing the margins between target manifolds, and to compute the Min-Max criterion of k-th-layer features from the intrinsic graph and the penalty graph, the k-th layer being the selected layer; wherein the Min-Max criterion of the k-th-layer features is expressed as L(X(k),c)=L1(X(k),c)−L2(X(k),c)
- The apparatus according to claim 16, characterized in that the Min-Max criterion is a kernel version of the Min-Max criterion, the kernel version of the Min-Max criterion being a criterion generated by defining the Min-Max criterion through a Gaussian kernel function.
- The apparatus according to claim 18, characterized in that the classification unit is specifically configured to classify the test set of the images to be classified using model parameters in the third convolutional neural network model.
- The apparatus according to any one of claims 11 to 19, characterized in that the selected layer is no more than two layers away from an output layer of the convolutional neural network model.
- An image classification device, characterized in that the device comprises a processor and a memory, wherein the processor is configured to: acquire a training set of images to be classified; select a multi-layer convolutional neural network model; apply a regularization constraint based on the Min-Max criterion to a selected layer to form a second convolutional neural network model; train the second convolutional neural network model with the training set to generate a third convolutional neural network model; and classify a test set of the images to be classified with the third convolutional neural network model, wherein the selected layer is a layer of the convolutional neural network model; and the memory is configured to store the training set of the images to be classified, the multi-layer convolutional neural network model, the Min-Max criterion, and the classified images.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610553942.1A CN107622272A (zh) | 2016-07-13 | 2016-07-13 | 一种图像分类方法及装置 |
CN201610553942.1 | 2016-07-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018010434A1 true WO2018010434A1 (zh) | 2018-01-18 |
Family
ID=60952706
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/074427 WO2018010434A1 (zh) | 2016-07-13 | 2017-02-22 | 一种图像分类方法及装置 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107622272A (zh) |
WO (1) | WO2018010434A1 (zh) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108460772A (zh) * | 2018-02-13 | 2018-08-28 | 国家计算机网络与信息安全管理中心 | 基于卷积神经网络的广告骚扰传真图像检测系统及方法 |
CN109241903A (zh) * | 2018-08-30 | 2019-01-18 | 平安科技(深圳)有限公司 | 样本数据清洗方法、装置、计算机设备及存储介质 |
CN109886161A (zh) * | 2019-01-30 | 2019-06-14 | 江南大学 | 一种基于可能性聚类和卷积神经网络的道路交通标识识别方法 |
CN109934270A (zh) * | 2019-02-25 | 2019-06-25 | 华东师范大学 | 一种基于局部流形判别分析投影网络的分类方法 |
CN110347789A (zh) * | 2019-06-14 | 2019-10-18 | 平安科技(深圳)有限公司 | 文本意图智能分类方法、装置及计算机可读存储介质 |
CN110490227A (zh) * | 2019-07-09 | 2019-11-22 | 武汉理工大学 | 一种基于特征转换的少样本图像分类方法 |
CN110516728A (zh) * | 2019-08-20 | 2019-11-29 | 西安电子科技大学 | 基于去噪卷积神经网络的极化sar地物分类方法 |
WO2020055910A1 (en) * | 2018-09-10 | 2020-03-19 | Drisk, Inc. | Systems and methods for graph-based ai training |
CN111090764A (zh) * | 2019-12-20 | 2020-05-01 | 中南大学 | 基于多任务学习和图卷积神经网络的影像分类方法及装置 |
CN111160301A (zh) * | 2019-12-31 | 2020-05-15 | 同济大学 | 基于机器视觉的隧道病害目标智能识别及提取方法 |
CN111401473A (zh) * | 2020-04-09 | 2020-07-10 | 中国人民解放军国防科技大学 | 基于注意力机制卷积神经网络的红外目标分类方法 |
US10713258B2 (en) | 2013-07-26 | 2020-07-14 | Drisk, Inc. | Systems and methods for visualizing and manipulating graph databases |
CN111429005A (zh) * | 2020-03-24 | 2020-07-17 | 淮南师范学院 | 一种基于少量学生反馈的教学评估方法 |
US10776965B2 (en) | 2013-07-26 | 2020-09-15 | Drisk, Inc. | Systems and methods for visualizing and manipulating graph databases |
CN111797882A (zh) * | 2019-07-30 | 2020-10-20 | 华为技术有限公司 | 图像分类方法及装置 |
CN111814898A (zh) * | 2020-07-20 | 2020-10-23 | 上海眼控科技股份有限公司 | 图像分割方法、装置、计算机设备和存储介质 |
CN112699957A (zh) * | 2021-01-08 | 2021-04-23 | 北京工业大学 | 一种基于darts的图像分类优化方法 |
CN112990315A (zh) * | 2021-03-17 | 2021-06-18 | 北京大学 | 基于偏微分算子的等变3d卷积网络的3d形状图像分类方法 |
CN113779236A (zh) * | 2021-08-11 | 2021-12-10 | 齐维维 | 一种基于人工智能的问题分类的方法及装置 |
CN113850274A (zh) * | 2021-09-16 | 2021-12-28 | 北京理工大学 | 一种基于hog特征及dmd的图像分类方法 |
CN113936148A (zh) * | 2021-09-16 | 2022-01-14 | 北京理工大学 | 一种基于随机傅里叶特征变换的图像分类方法 |
CN118115803A (zh) * | 2024-03-11 | 2024-05-31 | 安徽大学 | 基于Diffusion模型的极化SAR图像分类方法 |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108684043B (zh) * | 2018-05-15 | 2021-09-28 | 南京邮电大学 | 基于最小风险的深度神经网络的异常用户检测方法 |
CN108764306B (zh) | 2018-05-15 | 2022-04-22 | 深圳大学 | 图像分类方法、装置、计算机设备和存储介质 |
TWI705414B (zh) * | 2018-05-29 | 2020-09-21 | 長庚醫療財團法人林口長庚紀念醫院 | 自體免疫抗體免疫螢光影像分類系統及其分類方法 |
EP3575986B1 (en) * | 2018-05-30 | 2024-07-10 | Robert Bosch GmbH | A lossy data compressor for vehicle control systems |
CN110580487A (zh) | 2018-06-08 | 2019-12-17 | Oppo广东移动通信有限公司 | 神经网络的训练方法、构建方法、图像处理方法和装置 |
CN108961267B (zh) * | 2018-06-19 | 2020-09-08 | Oppo广东移动通信有限公司 | 图片处理方法、图片处理装置及终端设备 |
CN108898082B (zh) * | 2018-06-19 | 2020-07-03 | Oppo广东移动通信有限公司 | 图片处理方法、图片处理装置及终端设备 |
CN110795976B (zh) | 2018-08-03 | 2023-05-05 | 华为云计算技术有限公司 | 一种训练物体检测模型的方法、装置以及设备 |
CN109376786A (zh) * | 2018-10-31 | 2019-02-22 | 中国科学院深圳先进技术研究院 | 一种图像分类方法、装置、终端设备及可读存储介质 |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105637540A (zh) * | 2013-10-08 | 2016-06-01 | 谷歌公司 | 用于强化学习的方法和设备 |
CN104102919A (zh) * | 2014-07-14 | 2014-10-15 | 同济大学 | 一种有效防止卷积神经网络过拟合的图像分类方法 |
CN105160400A (zh) * | 2015-09-08 | 2015-12-16 | 西安交通大学 | 基于l21范数的提升卷积神经网络泛化能力的方法 |
CN105243398A (zh) * | 2015-09-08 | 2016-01-13 | 西安交通大学 | 基于线性判别分析准则的改进卷积神经网络性能的方法 |
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10713258B2 (en) | 2013-07-26 | 2020-07-14 | Drisk, Inc. | Systems and methods for visualizing and manipulating graph databases |
US10776965B2 (en) | 2013-07-26 | 2020-09-15 | Drisk, Inc. | Systems and methods for visualizing and manipulating graph databases |
CN108460772A (zh) * | 2018-02-13 | 2018-08-28 | 国家计算机网络与信息安全管理中心 | 基于卷积神经网络的广告骚扰传真图像检测系统及方法 |
CN109241903A (zh) * | 2018-08-30 | 2019-01-18 | 平安科技(深圳)有限公司 | 样本数据清洗方法、装置、计算机设备及存储介质 |
CN109241903B (zh) * | 2018-08-30 | 2023-08-29 | 平安科技(深圳)有限公司 | 样本数据清洗方法、装置、计算机设备及存储介质 |
US12043280B2 (en) | 2018-09-10 | 2024-07-23 | Drisk, Inc. | Systems and methods for graph-based AI training |
US11507099B2 (en) | 2018-09-10 | 2022-11-22 | Drisk, Inc. | Systems and methods for graph-based AI training |
WO2020055910A1 (en) * | 2018-09-10 | 2020-03-19 | Drisk, Inc. | Systems and methods for graph-based ai training |
CN109886161B (zh) * | 2019-01-30 | 2023-12-12 | 江南大学 | 一种基于可能性聚类和卷积神经网络的道路交通标识识别方法 |
CN109886161A (zh) * | 2019-01-30 | 2019-06-14 | 江南大学 | 一种基于可能性聚类和卷积神经网络的道路交通标识识别方法 |
CN109934270B (zh) * | 2019-02-25 | 2023-04-25 | 华东师范大学 | 一种基于局部流形判别分析投影网络的分类方法 |
CN109934270A (zh) * | 2019-02-25 | 2019-06-25 | 华东师范大学 | 一种基于局部流形判别分析投影网络的分类方法 |
CN110347789A (zh) * | 2019-06-14 | 2019-10-18 | 平安科技(深圳)有限公司 | 文本意图智能分类方法、装置及计算机可读存储介质 |
CN110490227A (zh) * | 2019-07-09 | 2019-11-22 | 武汉理工大学 | 一种基于特征转换的少样本图像分类方法 |
CN110490227B (zh) * | 2019-07-09 | 2023-02-03 | 武汉理工大学 | 一种基于特征转换的少样本图像分类方法 |
CN111797882A (zh) * | 2019-07-30 | 2020-10-20 | 华为技术有限公司 | 图像分类方法及装置 |
CN110516728A (zh) * | 2019-08-20 | 2019-11-29 | 西安电子科技大学 | 基于去噪卷积神经网络的极化sar地物分类方法 |
CN110516728B (zh) * | 2019-08-20 | 2022-12-06 | 西安电子科技大学 | 基于去噪卷积神经网络的极化sar地物分类方法 |
CN111090764B (zh) * | 2019-12-20 | 2023-06-23 | 中南大学 | 基于多任务学习和图卷积神经网络的影像分类方法及装置 |
CN111090764A (zh) * | 2019-12-20 | 2020-05-01 | 中南大学 | 基于多任务学习和图卷积神经网络的影像分类方法及装置 |
CN111160301B (zh) * | 2019-12-31 | 2023-04-18 | 同济大学 | 基于机器视觉的隧道病害目标智能识别及提取方法 |
CN111160301A (zh) * | 2019-12-31 | 2020-05-15 | 同济大学 | 基于机器视觉的隧道病害目标智能识别及提取方法 |
CN111429005B (zh) * | 2020-03-24 | 2023-06-02 | 淮南师范学院 | 一种基于少量学生反馈的教学评估方法 |
CN111429005A (zh) * | 2020-03-24 | 2020-07-17 | 淮南师范学院 | 一种基于少量学生反馈的教学评估方法 |
CN111401473A (zh) * | 2020-04-09 | 2020-07-10 | 中国人民解放军国防科技大学 | 基于注意力机制卷积神经网络的红外目标分类方法 |
CN111814898A (zh) * | 2020-07-20 | 2020-10-23 | 上海眼控科技股份有限公司 | 图像分割方法、装置、计算机设备和存储介质 |
CN112699957A (zh) * | 2021-01-08 | 2021-04-23 | 北京工业大学 | 一种基于darts的图像分类优化方法 |
CN112699957B (zh) * | 2021-01-08 | 2024-03-29 | 北京工业大学 | 一种基于darts的图像分类优化方法 |
CN112990315A (zh) * | 2021-03-17 | 2021-06-18 | 北京大学 | 基于偏微分算子的等变3d卷积网络的3d形状图像分类方法 |
CN112990315B (zh) * | 2021-03-17 | 2023-10-20 | 北京大学 | 基于偏微分算子的等变3d卷积网络的3d形状图像分类方法 |
CN113779236A (zh) * | 2021-08-11 | 2021-12-10 | 齐维维 | 一种基于人工智能的问题分类的方法及装置 |
CN113936148A (zh) * | 2021-09-16 | 2022-01-14 | 北京理工大学 | 一种基于随机傅里叶特征变换的图像分类方法 |
CN113850274A (zh) * | 2021-09-16 | 2021-12-28 | 北京理工大学 | 一种基于hog特征及dmd的图像分类方法 |
CN118115803A (zh) * | 2024-03-11 | 2024-05-31 | 安徽大学 | 基于Diffusion模型的极化SAR图像分类方法 |
Also Published As
Publication number | Publication date |
---|---|
CN107622272A (zh) | 2018-01-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
- | | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17826773; Country of ref document: EP; Kind code of ref document: A1 |
- | | NENP | Non-entry into the national phase | Ref country code: DE |
- | | 122 | Ep: pct application non-entry in european phase | Ref document number: 17826773; Country of ref document: EP; Kind code of ref document: A1 |