
CN109993221B - Image classification method and device - Google Patents

Image classification method and device

Info

Publication number
CN109993221B
Authority
CN
China
Prior art keywords
category
hog
image
matrix
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910227705.XA
Other languages
Chinese (zh)
Other versions
CN109993221A (en)
Inventor
徐启南
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd filed Critical New H3C Big Data Technologies Co Ltd
Priority to CN201910227705.XA priority Critical patent/CN109993221B/en
Publication of CN109993221A publication Critical patent/CN109993221A/en
Application granted granted Critical
Publication of CN109993221B publication Critical patent/CN109993221B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147Distances to closest patterns, e.g. nearest neighbour classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application provides an image classification method and device, wherein the method comprises the following steps: acquiring a target image to be classified; inputting the target image into a pre-trained classification model, and outputting the category of the target image. The processing of the target image in the classification model comprises: extracting a target Histogram of Oriented Gradients (HOG) feature matrix of the target image; respectively calculating mutual information values between the target HOG feature matrix and the reference HOG feature matrix corresponding to each category; determining, based on the calculated mutual information values, the reference HOG feature matrices whose mutual information values with the target HOG feature matrix rank in the top K from largest to smallest, K being a positive integer; and, according to the categories respectively corresponding to the top-K reference HOG feature matrices, taking the category with the largest number of occurrences as the category of the target image. In this way, hardware processing resources can be saved and image processing efficiency improved while high accuracy is maintained.

Description

Image classification method and device
Technical Field
The application relates to the technical field of big data, in particular to an image classification method and device.
Background
Images are everywhere in modern life, they carry more and more information, and image processing and recognition technology is increasingly important. At present, image classification is involved in many application scenarios, such as object detection and segmentation, license plate recognition, and workpiece classification.
For image classification, deep learning models such as convolutional neural networks are usually adopted to extract image features and classify them, but because of their complex network structures such models have the following limitations: on the one hand, the amount of computation at inference time is large, more hardware processing resources are consumed, and image processing efficiency is low; on the other hand, the model must be trained with a huge training sample set before its predictions reach a certain accuracy.
Disclosure of Invention
In view of this, the present application provides an image classification method and apparatus, which can save hardware processing resources and improve image processing efficiency while maintaining high accuracy.
In a first aspect, the present application provides an image classification method, including:
acquiring a target image to be classified;
inputting the target image into a pre-trained classification model, and outputting the category of the target image; wherein, the processing procedure of the target image in the classification model comprises the following steps:
extracting a target Histogram of Oriented Gradients (HOG) feature matrix of the target image;
respectively calculating mutual information values between the target HOG feature matrix and the reference HOG feature matrix corresponding to each category;
determining, based on the calculated mutual information values, the reference HOG feature matrices whose mutual information values with the target HOG feature matrix rank in the top K from largest to smallest; K is a positive integer;
and, according to the categories respectively corresponding to the top-K reference HOG feature matrices, taking the category with the largest number of occurrences as the category of the target image.
In a second aspect, the present application provides an image classification apparatus, comprising:
the acquisition module is used for acquiring a target image to be classified;
the processing module is used for inputting the target image into a pre-trained classification model and outputting the category of the target image; wherein the processing module executes the processing procedure in the classification model, and the processing procedure comprises:
extracting a target Histogram of Oriented Gradients (HOG) feature matrix of the target image;
respectively calculating mutual information values between the target HOG feature matrix and the reference HOG feature matrix corresponding to each category;
determining, based on the calculated mutual information values, the reference HOG feature matrices whose mutual information values with the target HOG feature matrix rank in the top K from largest to smallest; K is a positive integer;
and, according to the categories respectively corresponding to the top-K reference HOG feature matrices, taking the category with the largest number of occurrences as the category of the target image.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the steps of the image classification method of the first aspect described above, or any possible implementation of the first aspect.
In a fourth aspect, this application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the image classification method described in the above first aspect, or any one of the possible implementation manners of the first aspect.
According to the image classification method and device, when the classification of the target image is predicted based on the classification model trained in advance, the target HOG characteristic matrix of the target image to be classified can be extracted firstly, and the extracted image characteristics are more accurate and the interference of other image information is reduced because the target HOG characteristic matrix can reflect the edge characteristics of the target image; further, a mutual information value between the target HOG feature matrix and the reference HOG feature matrix corresponding to each category can be calculated, and since the size of the mutual information value can reflect the similarity degree between the target HOG feature matrix and the reference HOG feature matrix, the category of the target image can be finally predicted by selecting the first K reference HOG feature matrices with the highest mutual information values and based on the categories corresponding to the first K reference HOG feature matrices respectively.
Compared with the conventional way of classifying images based on a deep learning model, the process of extracting image features and predicting categories with this classification model involves far less computation, so hardware processing resources can be saved and image processing efficiency improved while high accuracy is maintained. Moreover, because the classification model does not require a complex convolutional neural network structure the way a deep learning model does, its structure is simpler, so it can be trained with a limited number of training samples while still achieving high accuracy.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
FIG. 1 is a schematic diagram illustrating a process for training a classification model according to an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating a training process for a basic feature extraction module according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating a process for training a base class prediction module according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating an image classification method provided by an embodiment of the present application;
fig. 5 is a schematic structural diagram illustrating an image classification apparatus provided in an embodiment of the present application;
fig. 6 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
At present, for the problem of image classification, some deep learning models with complex network structures are generally adopted to extract image features and perform classification prediction based on the extracted image features, such as a convolutional neural network model. Generally, the more complex the network structure is, the finer the extracted image features are, and the higher the accuracy of the obtained classification prediction result is, but the higher the complexity of the network structure is, the larger the amount of computation at the time of extracting the image features is, and the more hardware processing resources are consumed, and the image processing efficiency is also lowered.
In addition, as the complexity of the network structure increases, the number of model parameters to be trained becomes large, and a large training sample set is needed to train the deep learning model to a certain accuracy; it is difficult to train a deep learning model with high prediction accuracy from a limited number of training samples. As a result, a large amount of manpower and material resources must be spent in the early stage to obtain enough training samples, and much time is consumed when training the deep learning model, so the training process consumes many hardware processing resources and is inefficient.
In order to solve the above problem, the present application provides an image classification method and apparatus, which can predict the class of a target image based on a classification model trained in advance. For example, the target image is a text image, a workpiece image, or the like, and recognition of a text category or a workpiece category may be implemented, and recognition of a category of another object, a person, or a scene may also be implemented, which is not limited in the present application.
When the category of the target image is predicted based on a pre-trained classification model, the Histogram of Oriented Gradients (HOG) feature matrix of the image is used: mutual information values between the HOG feature matrix of the target image to be classified and the reference HOG feature matrices are calculated, and the size of a mutual information value reflects well the degree of similarity between different images. An improved K-Nearest Neighbor (KNN) algorithm is then used to determine the reference HOG feature matrices whose mutual information values rank in the top K from largest to smallest, and the category of the target image is finally predicted according to the categories respectively corresponding to these top-K reference HOG feature matrices.
Compared with the existing way of classifying images based on a deep learning model, the process of extracting image features and predicting categories involves far less computation, so hardware processing resources can be saved and image processing efficiency improved while high accuracy is maintained. Moreover, because the classification model needs no complex convolutional layers, pooling layers, and the like, unlike a deep learning model, its structure is simpler, so it can be trained with a limited number of training samples while still achieving high accuracy.
In order to facilitate understanding of the technical solutions provided in the present application, the following describes the image classification method provided in the present application in detail with reference to specific embodiments.
First, a training process of the classification model in the embodiment of the present application will be described. The classification model comprises a feature extraction module for extracting HOG features, and a class prediction module for predicting classes based on the improved KNN algorithm. The feature extraction module and the category prediction module can be obtained by respectively training a training sample set and a testing sample set, the HOG features extracted by the feature extraction module can be more accurate through the training process, and the accuracy of the category results predicted by the category prediction module is higher.
Referring to fig. 1, a schematic flowchart of a process for training a classification model provided in an embodiment of the present application includes the following steps:
Step 101, obtaining a training sample set and a test sample set, wherein the training sample set comprises training sample subsets of different categories, the training sample subset of each category comprises a plurality of training sample images belonging to that category, the test sample set comprises a plurality of test sample images, and each test sample image corresponds to a preset category label.
In the embodiment of the application, in order to reduce the influence of environmental factors or image quality on the extraction of image features and the prediction of categories, when a training sample set and a test sample set are constructed, each training sample image in the training sample set and each test sample image in the test sample set can be obtained after preprocessing. For convenience of describing the preprocessing process, the training sample image and the test sample image are collectively referred to as sample images hereinafter, and by preprocessing the sample images, irrelevant information in the sample images can be eliminated, and useful information in the enhanced sample images can be retained, so that the subsequent process of extracting image features and prediction categories can be more accurate.
Wherein the preprocessing performed on each sample image includes, but is not limited to:
(1) and adjusting the size of the sample image to a preset size.
By uniformly adjusting each sample image into an image with the same size, feature extraction and analysis can be conveniently carried out on each sample image subsequently.
(2) And carrying out graying processing on the sample image.
If the sample image is a color image, that is, an RGB image including three color channels of red (R), green (G), and blue (B), the RGB image may be converted into a grayscale image after performing a graying process.
For example, for the ith pixel point in the sample image (i is a positive integer taken over 1 to N, where N is the total number of pixel points in the sample image), the following formula (1) may be used to calculate the pixel value Gray_i of the ith pixel point after the graying processing:
Gray_i = R_i * a1 + G_i * a2 + B_i * a3    formula (1)
where R_i, G_i, and B_i respectively represent the pixel values of the ith pixel point of the sample image on the R, G, and B color channels. In one example, a1, a2, and a3 may take the values 0.299, 0.587, and 0.114, respectively.
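As an illustrative sketch of the graying step of formula (1) (assuming the sample image is an H×W×3 NumPy RGB array; the function name and the NumPy dependency are assumptions, not part of the original disclosure):

```python
import numpy as np

def to_gray(rgb, a1=0.299, a2=0.587, a3=0.114):
    """Weighted sum of the R, G and B channel values of every pixel, as in formula (1)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return rgb[..., 0] * a1 + rgb[..., 1] * a2 + rgb[..., 2] * a3
```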
(3) And carrying out normalization processing on the sample image.
The normalization process can be understood as the normalization of the color space, and aims to adjust the contrast of the sample image, reduce the influence caused by the local shadow and illumination of the sample image, and simultaneously suppress the interference of noise.
For example, the Gamma correction method may be used to normalize the gray value of each pixel point of the sample image, for example using the following formula (2):
I(x, y) = I(x, y)^gamma    formula (2)
where (x, y) represents the coordinate position of a pixel point and I represents its gray value. For example, gamma may take the value 1/2, that is, the square root of the gray value of every pixel point of the original sample image is taken, to achieve gray normalization of the sample image.
In addition to the above three preprocessing processes (1) to (3), preprocessing such as image enhancement, image sharpening, and the like may be performed, which is not limited in the present application.
In the embodiment of the application, the sample images obtained after preprocessing can be divided into the training sample set and the test sample set according to a preset proportion; for example, with a preset proportion of 4:1, 80% of the sample images are used as the training sample images in the training sample set and 20% as the test sample images in the test sample set.
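A possible way to perform this 4:1 split, sketched under the assumption that the preprocessed sample images are held in a Python list (names are illustrative):

```python
import random

def split_samples(samples, train_ratio=0.8, seed=0):
    """Shuffle the preprocessed sample images and split them 4:1 into
    training sample images and test sample images."""
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]
```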
For the training sample set, since the category of each training sample image is known, all the training sample images may be classified into different training sample subsets according to the category, where the training sample subset of each category includes a plurality of training sample images belonging to the category.
For the test sample set, since the category of each test sample image is known, a corresponding preset category label can be configured for each test sample image, and the preset category label is used for representing the category to which the test sample image belongs, so that the predicted classification result can be verified by using the preset category label subsequently.
Step 102, training a basic feature extraction module and a basic category prediction module in a basic classification model respectively, based on the training sample images in the training sample subsets of the different categories and the test sample images in the test sample set, and obtaining the feature extraction module and the category prediction module respectively after it is determined that training is finished.
Here, the basic classification model, i.e. the classification model without training, includes a basic feature extraction module and a basic category prediction module. Illustratively, the basic feature extraction module is, for example, an algorithm module for extracting HOG features, and the basic class prediction module is, for example, a modified KNN algorithm module.
In the process of training the basic classification model, the model parameters of the basic category prediction module can first be fixed while the model parameters of the basic feature extraction module are adjusted over multiple training iterations, and the feature extraction module is obtained once the basic feature extraction module is determined to be trained. Then the model parameters of the feature extraction module can be fixed while the model parameters of the basic category prediction module are adjusted over multiple training iterations until the basic category prediction module is trained. When both the basic feature extraction module and the basic category prediction module have been trained, one round of training of the basic classification model is finished.
Further, with the model parameters obtained in the previous round of training fixed, the basic feature extraction module obtained in the previous round of training can be retrained; after the training of the basic feature extraction module in the current round is completed, its model parameters can be fixed and the basic category prediction module obtained in the previous round of training retrained, until the training of the basic category prediction module in the current round is completed. By analogy, after the basic classification model has undergone P rounds of training, the basic classification model obtained after the P rounds is regarded as the trained classification model, where P is an integer greater than 1.
The following describes in detail the training process of the basic feature extraction module and the training process of the basic category prediction module with reference to the flow diagrams shown in fig. 2 and fig. 3, respectively.
Referring to fig. 2, a schematic flow chart of a training process for the basic feature extraction module is shown, which includes the following steps:
step 201, inputting each test sample image and the training sample image in the training sample subset of each category into a basic feature extraction module, and respectively extracting a first HOG feature matrix of each test sample image and a second HOG feature matrix of each training sample image of each category.
Exemplarily, for each pixel point in a test sample image, a gradient operator can be used to perform convolution along the horizontal direction (x direction) and the vertical direction (y direction) to obtain the gradient components of each pixel point in the x and y directions, from which the gradient magnitude and gradient direction of each pixel point are calculated. A plurality of pixel points are then combined into a cell unit, and finally the first HOG feature matrix of the test sample image is extracted by counting the gradient distribution information of each cell unit, as sketched below. The same way may be adopted for extracting the second HOG feature matrix of each training sample image.
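The following is a simplified, illustrative sketch of such HOG extraction (central differences stand in for the gradient-operator convolution, block normalization is omitted, and the cell size and bin count are assumed values rather than the trained model parameters):

```python
import numpy as np

def hog_feature_matrix(gray, cell=8, bins=9):
    """Per-pixel gradients along x and y, gradient magnitude and direction,
    then one orientation histogram per cell of cell x cell pixels; the
    per-cell histograms form the rows of the returned feature matrix."""
    gray = np.asarray(gray, dtype=np.float64)
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    rows, cols = gray.shape[0] // cell, gray.shape[1] // cell
    feat = np.zeros((rows * cols, bins))
    for r in range(rows):
        for c in range(cols):
            block = np.s_[r * cell:(r + 1) * cell, c * cell:(c + 1) * cell]
            hist, _ = np.histogram(ang[block], bins=bins,
                                   range=(0.0, 180.0), weights=mag[block])
            feat[r * cols + c] = hist
    return feat
```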
Step 202, inputting the extracted first HOG feature matrix and the extracted second HOG feature matrix into a basic category prediction module, and predicting the category of each test sample image.
The process of predicting the class of each test sample image by the basic class prediction module can be referred to the related description in the process of training the basic class prediction module shown in fig. 3.
And 203, determining the accuracy of the training process of the round based on the predicted category of each test sample image and the preset category label of each test sample image.
In this step, each test sample image corresponds to one preset category label, so the accuracy of the current round of training can be determined by checking whether the predicted category of each test sample image is consistent with its preset category label. For example, if 100 test sample images are used in the training process, and the predicted categories of 80 test sample images are consistent with the corresponding preset category labels while those of the other 20 are not, the accuracy of the current round of training is 80%.
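A trivial sketch of this accuracy computation (names are illustrative):

```python
def accuracy(predicted, labels):
    """Fraction of test sample images whose predicted category equals the
    preset category label; 80 matches out of 100 gives 0.80."""
    hits = sum(p == t for p, t in zip(predicted, labels))
    return hits / len(labels)
```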
And 204, judging whether the accuracy meets a first preset condition. If the determination result is negative, that is, the accuracy does not satisfy the first preset condition, go to step 205; if the determination result is yes, that is, the accuracy rate satisfies the first predetermined condition, step 206 is executed.
For example, the accuracy exceeding a first set threshold may be used as the first preset condition; alternatively, the variation trend of the accuracy may be tracked, and reaching the turning point between the rising trend and the falling trend, that is, the extreme point of the accuracy, may be used as the first preset condition. Of course, the first preset condition may also be configured according to actual requirements, which is not limited in this application.
And step 205, adjusting model parameters of the basic feature extraction module, returning to step 201, re-acquiring the training sample image and the test sample image, and re-executing the training process until the accuracy rate is determined to meet the first preset condition.
Under the condition that the accuracy does not meet the first preset condition, the basic feature extraction module is not converged, and the adjusted basic feature extraction module can be used for carrying out the next training process by adjusting the model parameters of the basic feature extraction module.
The model parameters of the basic feature extraction module include, for example, the number of cell units divided when the HOG features are extracted, and the number of pixel points included in each cell unit.
And step 206, determining that the training of the basic feature extraction module is finished, and taking the model parameter after the last adjustment as the model parameter of the feature extraction module.
After the feature extraction module is trained, the basic category prediction module can be further trained. Referring to fig. 3, a schematic flowchart of a process for training the basic category prediction module includes the following steps:
step 301, inputting each test sample image and the training sample image in the training sample subset of each category into the feature extraction module after training is completed, and respectively extracting a first HOG feature matrix of each test sample image and a second HOG feature matrix of each training sample image of each category.
In this step, the trained feature extraction module is used to extract the first HOG feature matrix and the second HOG feature matrix. The process of extracting the first HOG feature matrix and the second HOG feature matrix may refer to the related description in step 201.
And step 302, inputting the extracted first HOG characteristic matrix and the extracted second HOG characteristic matrix into a basic category prediction module. Wherein, the processing executed in the basic category prediction module includes steps 3021 to 3023:
step 3021, determining a central HOG feature matrix corresponding to each category according to each N second HOG feature matrices corresponding to each category.
Considering that there are many training sample images of each category in the training sample set and correspondingly there are many second HOG feature matrices corresponding to each category, in order to reduce the amount of computation in the subsequent calculation of mutual information values and further reduce the processing pressure, in the embodiment of the present application, clustering processing may be performed on the training sample images corresponding to each category, that is, clustering processing may be performed on the extracted second HOG feature matrices corresponding to each category.
In some embodiments, for all second HOG feature matrices corresponding to each category, a central HOG feature matrix in the category may be determined according to every N second HOG feature matrices in the category. For example, assuming that 50 training sample images are respectively selected from the training sample subsets of each category in the training process of the current round, and if N is configured to be 10 in the initial state, for any category, a central HOG feature matrix may be determined by using the second HOG feature matrix of each 10 training sample images in the category, and finally, 5 central HOG feature matrices may be obtained for similarity comparison with the first HOG feature matrix of the test sample image, so that the amount of operation may be greatly reduced, and the processing efficiency may be improved.
It should be understood that each HOG feature matrix includes at least one feature parameter (that is, a matrix parameter), and each feature parameter is used to characterize an image feature of a corresponding image region in a training sample image, where an image region includes at least one pixel point. For example, suppose that a training sample image includes 9 × 9 pixel points, and the extracted HOG feature matrix is 3 × 3 (a matrix with 3 rows and 3 columns), one feature parameter in the HOG feature matrix may represent an image feature of a corresponding image region in the training sample image, and one image region includes 9 pixel points.
In one example, the central HOG feature matrix may be determined as follows: the values of the feature parameters at the same position in the N second HOG feature matrices are weighted-averaged to obtain the value of the feature parameter of the central HOG feature matrix at the corresponding position, so that the central HOG feature matrix determined in this way reflects the feature information of the N second HOG feature matrices, as sketched below. Of course, in practical applications, other ways may also be used to determine the central HOG feature matrix.
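A sketch of the weighted-average construction described above (assuming the second HOG feature matrices are NumPy arrays of equal shape; equal weights are used by default, which reduces to a plain element-wise mean):

```python
import numpy as np

def central_hog_matrix(matrices, weights=None):
    """Weighted average, position by position, of N second HOG feature matrices."""
    stack = np.stack([np.asarray(m, dtype=np.float64) for m in matrices])
    if weights is None:
        weights = np.full(len(matrices), 1.0 / len(matrices))
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()
    return np.tensordot(weights, stack, axes=1)

def central_matrices_for_category(matrices, n):
    """One central HOG feature matrix per group of N second HOG feature matrices
    of a category (e.g. 50 matrices with N = 10 give 5 centers)."""
    return [central_hog_matrix(matrices[k:k + n])
            for k in range(0, len(matrices), n)]
```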
Step 3022, for each first HOG feature matrix, determining the central HOG feature matrices whose mutual information values with that first HOG feature matrix rank in the top K from largest to smallest.
In a specific implementation, after the central HOG feature matrix corresponding to each category is determined, for the first HOG feature matrix of each test sample image, a mutual information value between the first HOG feature matrix of the test sample image and each central HOG feature matrix corresponding to each category may be respectively calculated, so as to determine a degree of similarity between the first HOG feature matrix and each central HOG feature matrix corresponding to each category.
Mutual information is an important information measure in information theory and represents the degree of association between two variables. In this application, the degree of association, that is, the degree of similarity, between the first HOG feature matrix and a central HOG feature matrix is reflected by the mutual information value between them. The higher the mutual information value, the higher the degree of similarity between the central HOG feature matrix and the first HOG feature matrix, and accordingly the more likely it is that the test sample image corresponding to the first HOG feature matrix belongs to the category corresponding to that central HOG feature matrix. Therefore, based on this property, for each first HOG feature matrix, the central HOG feature matrices whose mutual information values with that first HOG feature matrix rank in the top K from largest to smallest may be selected.
In some embodiments, for any one first HOG feature matrix a and any one center HOG feature matrix B, a mutual information value I (a, B) between the two may be calculated according to the following formula (3):
I(A, B) = H(A) + H(B) - H(A, B)    formula (3)
Wherein H (a) represents a first information entropy of the first HOG feature matrix, H (B) represents a second information entropy of the center HOG feature matrix, and H (a, B) represents a third information entropy between the first HOG feature matrix and the center HOG feature matrix.
H(A), H(B), and H(A, B) are calculated as follows:
(1) first information entropy of first HOG feature matrix H (A)
For the first HOG feature matrix, a first probability of each feature parameter appearing in the first HOG feature matrix may be calculated according to each feature parameter in the first HOG feature matrix, and a first information entropy of the first HOG feature matrix may be determined based on the calculated first probability.
For example, assume that the first HOG feature matrix is denoted by A. Each feature parameter in the first HOG feature matrix represents an image feature of a corresponding image region in the test sample image; since the sample image has been grayed, that image feature can be represented by a grayscale value, so a feature parameter value i can be any integer from 0 to 255.
The first information entropy H(A) of the first HOG feature matrix is calculated by the following formula (4):
H(A) = -Σ_i p(i) · log p(i)    formula (4)
where p(i) is the first probability, i.e. the probability that the feature parameter value i appears in the first HOG feature matrix. In one example, if the first HOG feature matrix is a 9-row, 9-column matrix containing 81 feature parameters, and the value 100 appears 9 times among them, then p(i=100) = 9/81.
By counting the probability with which each feature parameter value appears in the first HOG feature matrix, H(A) can finally be calculated using the above formula (4).
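A minimal sketch of formula (4) (it assumes the feature parameters are already quantized to integer gray levels 0..255; the logarithm base is not fixed by the text, so base 2 is an assumption):

```python
import numpy as np

def hog_entropy(matrix, levels=256):
    """Entropy of a HOG feature matrix: p(i) is the fraction of matrix entries
    equal to i, summed as in formula (4)."""
    vals = np.asarray(matrix, dtype=np.int64).ravel()
    counts = np.bincount(vals, minlength=levels)
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())
```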
(2) Second information entropy of center HOG feature matrix H (B)
For the center HOG feature matrix, a second probability of each feature parameter appearing in the center HOG feature matrix may be calculated according to each feature parameter of the center HOG feature matrix, and a second information entropy of the center HOG feature matrix may be determined based on the calculated second probability.
For example, assume that the central HOG feature matrix is denoted by B; its feature parameter values j can likewise be any integer from 0 to 255.
The second information entropy H(B) of the central HOG feature matrix is calculated by the following formula (5):
H(B) = -Σ_j p(j) · log p(j)    formula (5)
where p(j) is the second probability, i.e. the probability that the feature parameter value j appears in the central HOG feature matrix. For example, if the central HOG feature matrix is a 9-row, 9-column matrix containing 81 feature parameters, and the value 50 appears 9 times among them, then p(j=50) = 9/81.
By counting the probability with which each feature parameter value appears in the central HOG feature matrix, H(B) can finally be calculated using the above formula (5).
(3) Third information entropy H(A, B) between the first HOG feature matrix and the central HOG feature matrix
A third probability of the same feature parameter value occurring at the same position in the first HOG feature matrix and the central HOG feature matrix may be calculated, and the third information entropy between the first HOG feature matrix and the central HOG feature matrix may be determined based on the calculated third probability.
Illustratively, the third information entropy H(A, B) is calculated by the following formula (6):
H(A, B) = -Σ_{i,j} p(i, j) · log p(i, j)    formula (6)
where p(i, j) is the third probability, namely the probability that the feature parameters at the same position of the first HOG feature matrix and the central HOG feature matrix take the values i and j respectively with i = j, that is, the probability that the same feature parameter value appears at the same position. For example, assume that the first HOG feature matrix and the central HOG feature matrix are both 9-row, 9-column matrices, and that at the positions of row 1 column 2, row 1 column 3, row 1 column 4, and row 1 column 5 both matrices take the value 10; then p(i=10, j=10) = 4/81. By counting p(i, j) in this way, H(A, B) can be calculated according to the above formula (6).
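A sketch combining formulas (3) to (6) (integer-valued matrices and base-2 logarithms are assumed; H(A, B) is computed here as the standard joint entropy over all value pairs at the same position, whereas the text above counts only pairs with i = j, a restriction this sketch does not enforce):

```python
import numpy as np

def _entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(a, b, levels=256):
    """Formula (3): I(A, B) = H(A) + H(B) - H(A, B)."""
    a = np.asarray(a, dtype=np.int64).ravel()
    b = np.asarray(b, dtype=np.int64).ravel()
    joint = np.zeros((levels, levels))
    np.add.at(joint, (a, b), 1.0)
    joint /= joint.sum()
    h_a = _entropy(joint.sum(axis=1))   # H(A), formula (4)
    h_b = _entropy(joint.sum(axis=0))   # H(B), formula (5)
    h_ab = _entropy(joint.ravel())      # H(A, B), formula (6)
    return h_a + h_b - h_ab
```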
Step 3023, predicting the category of each test sample image based on the categories respectively corresponding to the determined top-K central HOG feature matrices. K and N are positive integers.
For each test sample image, the manner of predicting the class of the test sample image is implemented based on a modified KNN algorithm. The basic idea of the so-called KNN algorithm is as follows: if most of the K nearest neighbors of a sample in the feature space belong to a certain class, then the sample also belongs to this class and has the characteristics of the sample on this class. Wherein the nearest neighboring sample may be understood as the most similar sample.
Applied to this application, for the first HOG feature matrix of each test sample image, the determined top-K central HOG feature matrices are the K central HOG feature matrices most similar to that first HOG feature matrix; if most of these K central HOG feature matrices belong to a certain category, the first HOG feature matrix also belongs to that category, that is, the test sample image belongs to that category. Based on this idea, among the categories corresponding to the top-K central HOG feature matrices, the category with the largest number of occurrences is taken as the category of the test sample image, as sketched below.
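A sketch of this top-K majority vote (it reuses the mutual_information sketch above; holding each central HOG feature matrix together with its category as a (category, matrix) pair is an assumption about data layout, not part of the original disclosure):

```python
from collections import Counter

def predict_category(first_matrix, centers, k):
    """centers: list of (category, central HOG feature matrix) pairs.
    Rank the centers by mutual information with the first HOG feature matrix,
    keep the top K, and return the category occurring most often among them."""
    ranked = sorted(centers,
                    key=lambda item: mutual_information(first_matrix, item[1]),
                    reverse=True)
    top_k = [category for category, _ in ranked[:k]]
    return Counter(top_k).most_common(1)[0][0]
```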
It should be noted that, in the category prediction process, since the two steps of determining the central HOG feature matrix and determining the K central HOG feature matrices most similar to the first HOG feature matrix are key steps affecting the prediction accuracy, how to reasonably formulate the values of the parameters N and K is also a key factor affecting the prediction accuracy. In the conventional KNN algorithm, the value of K is a preconfigured empirical value, and if the value of K is set unreasonably, the prediction accuracy is affected. In the embodiment of the application, in the process of training the basic category prediction module, the value of K can be adjusted in a training mode, and the value of N can also be adjusted in a training mode, so that the accuracy in category prediction is high.
After step 302, step 303 is further executed:
and step 303, determining the accuracy of the training process of the round based on the predicted category of each test sample image and the preset category label of each test sample image.
The manner of determining the accuracy in this step is the same as that in step 203 described above, and will not be described here.
And step 304, judging whether the accuracy meets a second preset condition. If the determination result is negative, that is, the accuracy does not satisfy the second preset condition, executing step 305; if the determination result is yes, that is, the accuracy rate satisfies the second predetermined condition, step 306 is executed.
The configuration of the second preset condition and the configuration of the first preset condition in step 204 may be based on the same technical concept. However, in consideration of different training processes, the requirement for the accuracy may be different, and if both the first preset condition and the second preset condition are set to have the accuracy greater than the set threshold, the set threshold of the first preset condition may be set as the first set threshold, and the set threshold of the second preset condition may be set as the second set threshold. Wherein the first set threshold and the second set threshold may be different.
And 305, adjusting model parameters K and N of the basic category prediction module, returning to the step 301, re-acquiring the training sample image and the test sample image, and re-executing the training process until the accuracy rate is determined to meet a second preset condition.
In the embodiment of the application, since the basic feature extraction module is trained, under the condition that the accuracy does not meet the second preset condition, the model parameters K and N of the basic category prediction module can be adjusted, so that the adjusted basic category prediction module is used for carrying out the next round of training process.
And step 306, determining that the training of the basic category prediction module is finished, and taking the model parameter after the last adjustment as the model parameter of the category prediction module.
So far, after the training process shown in fig. 2 and fig. 3, the feature extraction module and the category prediction module after training can be obtained.
After step 102, step 103 is further executed:
Step 103, forming the classification model from the trained feature extraction module and the trained category prediction module.
Through the flows shown in fig. 1 to fig. 3, the training process of the classification model, that is, the adjustment of the model parameters of the feature extraction module and of the model parameters K and N of the category prediction module, can be completed.
In the embodiment of the application, after the model parameter N has been determined, for the training sample images in the training sample subset of each category, a central HOG feature matrix is determined from the second HOG feature matrices of every N training sample images. The plurality of central HOG feature matrices determined for each category are stored in the classification model as the reference HOG feature matrices corresponding to that category. The reference HOG feature matrices of a category reflect the edge feature information of images of that category, so the categories of target images to be classified can subsequently be predicted using these reference HOG feature matrices as references.
Next, a mode of predicting the category of the target image to be classified will be described in detail with reference to the classification model obtained by the above training.
Referring to fig. 4, a schematic flow chart of an image classification method provided in the embodiment of the present application is shown, including the following steps:
step 401, a target image to be classified is obtained.
And step 402, inputting the target image into a classification model trained in advance, and outputting the category of the target image.
In some embodiments of the present application, after obtaining a target image to be classified, the target image may be further preprocessed to obtain a preprocessed target image, where the preprocessing includes but is not limited to: adjusting the size of the target image to a preset size, and carrying out graying processing and normalization processing on the target image. The way of preprocessing the target image and the way of preprocessing the sample image are based on the same technical concept, and the specific process of preprocessing can refer to the related description in step 101 shown in fig. 1, and will not be described here. Furthermore, the preprocessed target image can be input into a classification model trained in advance, and the category of the target image is output.
The processing process of the target image in the classification model comprises the following steps 4021 to 4024:
Step 4021, extracting the target HOG feature matrix of the target image.
The way of extracting the HOG feature matrix of the target image and the way of extracting the HOG feature matrix of the sample image are based on the same technical concept, and the specific process of extracting the HOG feature matrix can refer to the related description in step 201 shown in fig. 2, and will not be further described here.
Step 4022, calculating mutual information values between the target HOG feature matrix and the reference HOG feature matrix corresponding to each category.
In a specific implementation, first, according to each feature parameter in the target HOG feature matrix, calculating a first probability of each feature parameter appearing in the target HOG feature matrix, and determining a first information entropy of the target HOG feature matrix based on the calculated first probability; the second probability of each characteristic parameter appearing in each reference HOG characteristic matrix can be calculated according to each characteristic parameter of each reference HOG characteristic matrix, and the second information entropy of each reference HOG characteristic matrix is determined based on the calculated second probability; and calculating a third probability of the same characteristic parameter appearing at the same position in the target HOG characteristic matrix and each reference HOG characteristic matrix, and determining a third information entropy between the target HOG characteristic matrix and each reference HOG characteristic matrix based on the calculated third probability.
Further, a mutual information value between the target HOG feature matrix and each reference HOG feature matrix is calculated based on the first information entropy, the second information entropy of each reference HOG feature matrix, and the third information entropy between the target HOG feature matrix and each reference HOG feature matrix.
The manner of determining the first information entropy, the second information entropy, the third information entropy and the mutual information value may refer to the related description in step 3022 shown in fig. 3, where the first information entropy of the target HOG feature matrix corresponds to H (a), the second information entropy of the reference HOG feature matrix corresponds to H (B), and the third information entropy between the target HOG feature matrix and the reference HOG feature matrix corresponds to H (a, B). The specific calculation process is not described further.
Step 4023, determining, based on the calculated mutual information values, the reference HOG feature matrices whose mutual information values with the target HOG feature matrix rank in the top K from largest to smallest; K is a positive integer.
Step 4024, according to the categories respectively corresponding to the top-K reference HOG feature matrices, taking the category with the largest number of occurrences as the category of the target image.
The specific implementation of step 4023 and step 4024 can refer to the related description in step 3023 shown in fig. 3, and will not be described here.
In an example, assuming K is 30, the categories corresponding to the 30 selected reference HOG feature matrices are determined; if 25 of them correspond to category M, category M is the category with the largest number of occurrences and is therefore taken as the category of the target image.
Compared with the conventional way of classifying images based on a deep learning model, the image classification method provided by the application involves far less computation in extracting image features and predicting image categories, so hardware processing resources can be saved and image processing efficiency improved while high accuracy is maintained. Moreover, because the classification model does not require a complex convolutional neural network structure the way a deep learning model does, its structure is simpler, so it can be trained with a limited number of training samples while still achieving high accuracy.
Based on the same application concept, an image classification device corresponding to the image classification method is further provided in the embodiment of the present application, and as the principle of solving the problem of the device in the embodiment of the present application is similar to the image classification method in the embodiment of the present application, the implementation of the device can refer to the implementation of the method, and repeated details are not repeated.
Referring to fig. 5, which is a schematic structural diagram of an image classification apparatus provided in an embodiment of the present application, the image classification apparatus 500 includes: an acquisition module 501 and a processing module 502; wherein,
an obtaining module 501, configured to obtain a target image to be classified;
a processing module 502, configured to input the target image into a pre-trained classification model, and output a category of the target image; wherein the processing procedure executed by the processing module 502 in the classification model includes:
extracting a target Histogram of Oriented Gradients (HOG) feature matrix of the target image;
respectively calculating mutual information values between the target HOG feature matrix and the reference HOG feature matrix corresponding to each category;
determining, based on the calculated mutual information values, the reference HOG feature matrices whose mutual information values with the target HOG feature matrix rank in the top K from largest to smallest; K is a positive integer;
and, according to the categories respectively corresponding to the top-K reference HOG feature matrices, taking the category with the largest number of occurrences as the category of the target image.
In some embodiments of the present application, after acquiring the target image to be classified, the acquiring module 501 is further configured to:
preprocessing the target image to obtain a preprocessed target image; wherein the pre-processing comprises: adjusting the size of the target image to a preset size, and carrying out graying processing and normalization processing on the target image;
the processing module 502, when inputting the target image into a pre-trained classification model and outputting the category of the target image, is specifically configured to:
and inputting the preprocessed target image into a pre-trained classification model, and outputting the category of the target image.
In some embodiments of the present application, when the processing module 502 calculates the mutual information values between the target HOG feature matrix and the reference HOG feature matrix corresponding to each category, it is specifically configured to:
calculating a first probability of each feature parameter appearing in the target HOG feature matrix according to each feature parameter in the target HOG feature matrix, and determining a first information entropy of the target HOG feature matrix based on the calculated first probability; and,
calculating a second probability of each feature parameter appearing in each reference HOG feature matrix according to each feature parameter of each reference HOG feature matrix, and determining a second information entropy of each reference HOG feature matrix based on the calculated second probability; and,
calculating a third probability of the same feature parameter appearing at the same position in the target HOG feature matrix and each reference HOG feature matrix, and determining a third information entropy between the target HOG feature matrix and each reference HOG feature matrix based on the calculated third probability;
calculating mutual information values between the target HOG feature matrix and each reference HOG feature matrix based on the first information entropy, the second information entropy of each reference HOG feature matrix and a third information entropy between the target HOG feature matrix and each reference HOG feature matrix.
In some embodiments of the present application, the obtaining module 501 is further configured to:
acquiring a training sample set and a test sample set, wherein the training sample set comprises training sample subsets of different categories, the training sample subset of each category comprises a plurality of training sample images belonging to the category, the test sample set comprises a plurality of test sample images, and each test sample image corresponds to a preset category label;
the image classification apparatus 500 further includes:
a model training module 503, configured to train a basic feature extraction module and a basic category prediction module in a basic classification model respectively based on the training sample images in the training sample subsets of different categories and the test sample images in the test sample set, obtain the feature extraction module and the category prediction module respectively after it is determined that training is completed, and form the classification model from the trained feature extraction module and the trained category prediction module.
In some embodiments of the present application, the model training module 503 is specifically configured to train the basic feature extraction module in the following manner:
inputting each test sample image and the training sample images in the training sample subset of each category into the basic feature extraction module, and respectively extracting a first HOG feature matrix of each test sample image and a second HOG feature matrix of each training sample image of each category;
inputting the extracted first HOG feature matrices and the extracted second HOG feature matrices into the basic category prediction module, and predicting the category of each test sample image;
determining the accuracy of the current round of training based on the predicted category of each test sample image and the preset category label of each test sample image;
if the accuracy does not meet a first preset condition, adjusting the model parameters of the basic feature extraction module and re-executing the training process until the accuracy meets the first preset condition; and if the accuracy meets the first preset condition, determining that the training of the basic feature extraction module is completed.
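Read as an executable sketch, training the basic feature extraction module amounts to searching over HOG parameters until the accuracy threshold is reached. The candidate cell sizes, the threshold of 0.9, and the helper signature predict_fn below are illustrative assumptions, not part of this embodiment.

```python
import numpy as np
from skimage.feature import hog


def train_feature_extractor(train_set, test_set, predict_fn, threshold=0.9):
    # train_set: {category: [grayscale images]}; test_set: [(grayscale image, label)].
    # predict_fn(test_hogs, train_hogs) must return one predicted category per test image.
    for cells in [(8, 8), (16, 16), (32, 32)]:  # illustrative candidate model parameters
        def extract(img):
            return hog(img, pixels_per_cell=cells, cells_per_block=(1, 1))

        train_hogs = {cat: [extract(img) for img in imgs] for cat, imgs in train_set.items()}
        test_hogs = [extract(img) for img, _ in test_set]
        predicted = predict_fn(test_hogs, train_hogs)
        accuracy = float(np.mean([p == label for p, (_, label) in zip(predicted, test_set)]))
        if accuracy >= threshold:  # the first preset condition is met
            return cells           # the feature extraction module is considered trained
    return (8, 8)                  # fallback if no candidate reaches the threshold
```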
In some embodiments of the present application, the model training module 503 is configured to train the basic category prediction module in the following manner:
inputting each test sample image and the training sample images in the training sample subset of each category into the trained feature extraction module, and respectively extracting a first HOG feature matrix of each test sample image and a second HOG feature matrix of each training sample image of each category;
inputting the extracted first HOG feature matrices and the extracted second HOG feature matrices into the basic category prediction module, and determining a central HOG feature matrix corresponding to each category according to every N second HOG feature matrices corresponding to that category; for each first HOG feature matrix, determining the central HOG feature matrices whose mutual information values with that first HOG feature matrix rank in the top K in descending order; and predicting the category of each test sample image based on the categories respectively corresponding to the determined top-K central HOG feature matrices; K and N are positive integers;
determining the accuracy of the current round of training based on the predicted category of each test sample image and the preset category label of each test sample image;
if the accuracy does not meet a second preset condition, adjusting the model parameters K and N of the basic category prediction module and repeating the training process until the accuracy meets the second preset condition; and if the accuracy meets the second preset condition, determining that the training of the basic category prediction module is finished.
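This embodiment does not spell out how the central HOG feature matrix is formed; the sketch below assumes it is the element-wise mean of each group of N second HOG feature matrices of a category, and searches a small grid of K and N values until the second preset condition is met (an illustrative accuracy threshold of 0.9). The mutual information function mi_fn is passed in, for example the entropy-based sketch given earlier.

```python
from collections import Counter

import numpy as np


def central_matrices(second_hogs_by_category, n):
    # Assumed definition: the element-wise mean of every group of N second HOG matrices.
    # In practice the mean would be re-quantized so the entropy-based MI stays meaningful.
    centers = []
    for category, matrices in second_hogs_by_category.items():
        for i in range(0, len(matrices), n):
            centers.append((category, np.mean(matrices[i:i + n], axis=0)))
    return centers


def train_class_predictor(first_hogs, labels, second_hogs_by_category, mi_fn, threshold=0.9):
    # Grid-search K and N until the (illustrative) second preset condition is met.
    for n in (1, 3, 5):
        centers = central_matrices(second_hogs_by_category, n)
        for k in (1, 3, 5, 7):
            predicted = []
            for target in first_hogs:
                scored = sorted(((mi_fn(target, c), cat) for cat, c in centers), reverse=True)
                top_k = [cat for _, cat in scored[:k]]
                predicted.append(Counter(top_k).most_common(1)[0][0])
            accuracy = float(np.mean([p == y for p, y in zip(predicted, labels)]))
            if accuracy >= threshold:
                return k, n        # training of the category prediction module is finished
    return 1, 1                    # fallback when no (K, N) pair reaches the threshold
```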
For the processing flow of each module in the apparatus and the interaction flow between the modules, reference may be made to the related description in the above method embodiments; details are not repeated here.
As shown in fig. 6, which is a schematic structural diagram of an electronic device provided in an embodiment of the present application, the electronic device 60 includes a processor 61, a memory 62 and a bus 63. The memory 62 is configured to store execution instructions and includes an internal memory 621 and an external memory 622. The internal memory 621 is configured to temporarily store operation data of the processor 61 and data exchanged with the external memory 622, such as a hard disk; the processor 61 exchanges data with the external memory 622 through the internal memory 621. When the electronic device 60 runs, the processor 61 communicates with the memory 62 through the bus 63, so that the processor 61 executes the following instructions:
acquiring a target image to be classified;
inputting the target image into a pre-trained classification model, and outputting the category of the target image; wherein, the processing procedure of the target image in the classification model comprises the following steps:
extracting a target histogram of oriented gradients (HOG) feature matrix of the target image;
respectively calculating mutual information values between the target HOG feature matrix and the reference HOG feature matrix corresponding to each category;
determining, based on the calculated mutual information values, the reference HOG feature matrices whose mutual information values with the target HOG feature matrix rank in the top K in descending order; K is a positive integer;
and taking, among the categories respectively corresponding to the top-K reference HOG feature matrices, the category with the largest occurrence frequency as the category of the target image.
For the specific processing procedure of the processor 61 in the electronic device, reference may be made to the related description in the above method embodiments; details are not repeated here.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the image classification method described in the above method embodiments.
The computer program product of the image classification method provided in the embodiments of the present application includes a computer-readable storage medium storing program code, and the instructions included in the program code may be used to execute the steps of the image classification method in the above method embodiments; reference may be made to the above method embodiments for details, which are not repeated here.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only one kind of logical division, and other divisions are possible in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some communication interfaces, and may be electrical, mechanical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any changes or substitutions that can be easily conceived by a person skilled in the art within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. An image classification method, comprising:
acquiring a target image to be classified;
inputting the target image into a pre-trained classification model, and outputting the category of the target image; wherein, the processing procedure of the target image in the classification model comprises the following steps:
extracting a target histogram of oriented gradients (HOG) feature matrix of the target image;
calculating a first probability of each feature parameter appearing in the target HOG feature matrix according to each feature parameter in the target HOG feature matrix, and determining a first information entropy of the target HOG feature matrix based on the calculated first probability; wherein each feature parameter represents the image feature of a corresponding image area in the target image, feature parameters with the same value are the same feature parameter, and each image area comprises at least one pixel point; and
calculating a second probability of each feature parameter appearing in each reference HOG feature matrix according to each feature parameter of each reference HOG feature matrix, and determining a second information entropy of each reference HOG feature matrix based on the calculated second probability; and
calculating a third probability of the same feature parameter appearing at the same position in the target HOG feature matrix and each reference HOG feature matrix, and determining a third information entropy between the target HOG feature matrix and each reference HOG feature matrix based on the calculated third probability;
calculating the mutual information value between the target HOG feature matrix and each reference HOG feature matrix based on the first information entropy, the second information entropy of each reference HOG feature matrix, and the third information entropy between the target HOG feature matrix and each reference HOG feature matrix;
determining, based on the calculated mutual information values, the reference HOG feature matrices whose mutual information values with the target HOG feature matrix rank in the top K in descending order; K is a positive integer;
and taking, among the categories respectively corresponding to the top-K reference HOG feature matrices, the category with the largest occurrence frequency as the category of the target image.
2. The image classification method of claim 1, characterized in that the method further comprises:
acquiring a training sample set and a test sample set, wherein the training sample set comprises training sample subsets of different categories, the training sample subset of each category comprises a plurality of training sample images belonging to the category, the test sample set comprises a plurality of test sample images, and each test sample image corresponds to a preset category label;
training a basic feature extraction module and a basic category prediction module in a basic classification model respectively based on the training sample images in the training sample subsets of different categories and the test sample images in the test sample set, and obtaining the feature extraction module and the category prediction module respectively after the training is determined to be finished;
and forming the classification model from the trained feature extraction module and the trained category prediction module.
3. The image classification method of claim 2, characterized in that the basic feature extraction module is trained in the following manner:
inputting each test sample image and the training sample images in the training sample subset of each category into the basic feature extraction module, and respectively extracting a first HOG feature matrix of each test sample image and a second HOG feature matrix of each training sample image of each category;
inputting the extracted first HOG feature matrices and the extracted second HOG feature matrices into the basic category prediction module, and predicting the category of each test sample image;
determining the accuracy of the current round of training based on the predicted category of each test sample image and the preset category label of each test sample image;
if the accuracy does not meet a first preset condition, adjusting the model parameters of the basic feature extraction module and re-executing the training process until the accuracy meets the first preset condition; and if the accuracy meets the first preset condition, determining that the training of the basic feature extraction module is completed.
4. The image classification method of claim 3, characterized in that the basic category prediction module is trained in the following manner:
inputting each test sample image and the training sample images in the training sample subset of each category into the trained feature extraction module, and respectively extracting a first HOG feature matrix of each test sample image and a second HOG feature matrix of each training sample image of each category;
inputting the extracted first HOG feature matrices and the extracted second HOG feature matrices into the basic category prediction module, and determining a central HOG feature matrix corresponding to each category according to every N second HOG feature matrices corresponding to that category; for each first HOG feature matrix, determining the central HOG feature matrices whose mutual information values with that first HOG feature matrix rank in the top K in descending order; and predicting the category of each test sample image based on the categories respectively corresponding to the determined top-K central HOG feature matrices; K and N are positive integers;
determining the accuracy of the current round of training based on the predicted category of each test sample image and the preset category label of each test sample image;
if the accuracy does not meet a second preset condition, adjusting the model parameters K and N of the basic category prediction module and repeating the training process until the accuracy meets the second preset condition; and if the accuracy meets the second preset condition, determining that the training of the basic category prediction module is finished.
5. An image classification apparatus, comprising:
the acquisition module is used for acquiring a target image to be classified;
the processing module is used for inputting the target image into a pre-trained classification model and outputting the category of the target image; wherein the processing module executes the processing procedure in the classification model, and the processing procedure comprises:
extracting a target histogram of oriented gradients (HOG) feature matrix of the target image;
calculating a first probability of each feature parameter appearing in the target HOG feature matrix according to each feature parameter in the target HOG feature matrix, and determining a first information entropy of the target HOG feature matrix based on the calculated first probability; wherein each feature parameter represents the image feature of a corresponding image area in the target image, feature parameters with the same value are the same feature parameter, and each image area comprises at least one pixel point; and
calculating a second probability of each feature parameter appearing in each reference HOG feature matrix according to each feature parameter of each reference HOG feature matrix, and determining a second information entropy of each reference HOG feature matrix based on the calculated second probability; and
calculating a third probability of the same feature parameter appearing at the same position in the target HOG feature matrix and each reference HOG feature matrix, and determining a third information entropy between the target HOG feature matrix and each reference HOG feature matrix based on the calculated third probability;
calculating the mutual information value between the target HOG feature matrix and each reference HOG feature matrix based on the first information entropy, the second information entropy of each reference HOG feature matrix, and the third information entropy between the target HOG feature matrix and each reference HOG feature matrix;
determining, based on the calculated mutual information values, the reference HOG feature matrices whose mutual information values with the target HOG feature matrix rank in the top K in descending order; K is a positive integer;
and taking, among the categories respectively corresponding to the top-K reference HOG feature matrices, the category with the largest occurrence frequency as the category of the target image.
6. The image classification device of claim 5, wherein the acquisition module is further configured to:
acquiring a training sample set and a test sample set, wherein the training sample set comprises training sample subsets of different categories, the training sample subset of each category comprises a plurality of training sample images belonging to the category, the test sample set comprises a plurality of test sample images, and each test sample image corresponds to a preset category label;
the device further comprises:
the model training module is used for respectively training a basic feature extraction module and a basic category prediction module in a basic classification model based on the training sample images in the training sample subsets of different categories and the test sample images in the test sample set, respectively obtaining the feature extraction module and the category prediction module after the training is determined to be finished, and forming the classification model from the trained feature extraction module and the trained category prediction module.
7. The image classification device of claim 6, wherein the model training module is specifically configured to train the basic feature extraction module in the following manner:
inputting each test sample image and the training sample images in the training sample subset of each category into the basic feature extraction module, and respectively extracting a first HOG feature matrix of each test sample image and a second HOG feature matrix of each training sample image of each category;
inputting the extracted first HOG feature matrices and the extracted second HOG feature matrices into the basic category prediction module, and predicting the category of each test sample image;
determining the accuracy of the current round of training based on the predicted category of each test sample image and the preset category label of each test sample image;
if the accuracy does not meet a first preset condition, adjusting the model parameters of the basic feature extraction module and re-executing the training process until the accuracy meets the first preset condition; and if the accuracy meets the first preset condition, determining that the training of the basic feature extraction module is completed.
8. The image classification device of claim 7, wherein the model training module is configured to train the basic category prediction module in the following manner:
inputting each test sample image and the training sample images in the training sample subset of each category into the trained feature extraction module, and respectively extracting a first HOG feature matrix of each test sample image and a second HOG feature matrix of each training sample image of each category;
inputting the extracted first HOG feature matrices and the extracted second HOG feature matrices into the basic category prediction module, and determining a central HOG feature matrix corresponding to each category according to every N second HOG feature matrices corresponding to that category; for each first HOG feature matrix, determining the central HOG feature matrices whose mutual information values with that first HOG feature matrix rank in the top K in descending order; and predicting the category of each test sample image based on the categories respectively corresponding to the determined top-K central HOG feature matrices; K and N are positive integers;
determining the accuracy of the current round of training based on the predicted category of each test sample image and the preset category label of each test sample image;
if the accuracy does not meet a second preset condition, adjusting the model parameters K and N of the basic category prediction module and repeating the training process until the accuracy meets the second preset condition; and if the accuracy meets the second preset condition, determining that the training of the basic category prediction module is finished.
CN201910227705.XA 2019-03-25 2019-03-25 Image classification method and device Active CN109993221B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910227705.XA CN109993221B (en) 2019-03-25 2019-03-25 Image classification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910227705.XA CN109993221B (en) 2019-03-25 2019-03-25 Image classification method and device

Publications (2)

Publication Number Publication Date
CN109993221A CN109993221A (en) 2019-07-09
CN109993221B true CN109993221B (en) 2021-02-09

Family

ID=67131394

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910227705.XA Active CN109993221B (en) 2019-03-25 2019-03-25 Image classification method and device

Country Status (1)

Country Link
CN (1) CN109993221B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111046910A (en) * 2019-11-12 2020-04-21 北京三快在线科技有限公司 Image classification, relation network model training and image annotation method and device
CN111046933B (en) * 2019-12-03 2024-03-05 东软集团股份有限公司 Image classification method, device, storage medium and electronic equipment
CN111310107A (en) * 2020-01-19 2020-06-19 武汉轻工大学 Matrix extraction device and method
CN111984812B (en) * 2020-08-05 2024-05-03 沈阳东软智能医疗科技研究院有限公司 Feature extraction model generation method, image retrieval method, device and equipment
CN114118191A (en) * 2020-08-31 2022-03-01 腾讯科技(深圳)有限公司 Image quality evaluation method, device, terminal and storage medium
CN112749668A (en) * 2021-01-18 2021-05-04 上海明略人工智能(集团)有限公司 Target image clustering method and device, electronic equipment and computer readable medium
CN117079058B (en) * 2023-10-11 2024-01-09 腾讯科技(深圳)有限公司 Image processing method and device, storage medium and electronic equipment

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103514456A (en) * 2013-06-30 2014-01-15 安科智慧城市技术(中国)有限公司 Image classification method and device based on compressed sensing multi-core learning
CN104281572A (en) * 2013-07-01 2015-01-14 中国科学院计算技术研究所 Target matching method and system based on mutual information
CN103678504A (en) * 2013-11-19 2014-03-26 西安华海盈泰医疗信息技术有限公司 Similarity-based breast image matching image searching method and system
CN109325507A (en) * 2018-10-11 2019-02-12 湖北工业大学 A kind of image classification algorithms and system of combination super-pixel significant characteristics and HOG feature

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Task-Driven Dictionary Learning Based on Mutual Information for Medical Image Classification; Idit Diamant et al.; IEEE Transactions on Biomedical Engineering; 2017-06-30; Vol. 64, No. 6; pp. 1380-1392 *
Research on feature-based image registration technology and its applications; Zheng Xuemei; http://www.wanfangdata.com.cn/details/detail.do?_type=degree&id=Y2230307; 2012-12-31; pp. 20-22 *

Also Published As

Publication number Publication date
CN109993221A (en) 2019-07-09

Similar Documents

Publication Publication Date Title
CN109993221B (en) Image classification method and device
CN109977943B (en) Image target recognition method, system and storage medium based on YOLO
Ye et al. Real-time no-reference image quality assessment based on filter learning
CN114419004B (en) Fabric flaw detection method, device, computer equipment and readable storage medium
KR102449841B1 (en) Method and apparatus for detecting target
US20190122367A1 (en) Automated tattoo recognition techniques
CN109376786A (en) A kind of image classification method, device, terminal device and readable storage medium storing program for executing
CN111428625A (en) Traffic scene target detection method and system based on deep learning
CN111986126B (en) Multi-target detection method based on improved VGG16 network
CN111311595B (en) No-reference quality evaluation method for image quality and computer readable storage medium
CN113159045A (en) Verification code identification method combining image preprocessing and convolutional neural network
CN103679187A (en) Image identifying method and system
CN113421223B (en) Industrial product surface defect detection method based on deep learning and Gaussian mixture
CN117746077B (en) Chip defect detection method, device, equipment and storage medium
CN111783896A (en) Image identification method and system based on kernel method
CN113034514A (en) Sky region segmentation method and device, computer equipment and storage medium
CN103714340B (en) Self-adaptation feature extracting method based on image partitioning
CN109165654B (en) Training method of target positioning model and target positioning method and device
Pichel et al. A new approach for sparse matrix classification based on deep learning techniques
CN111709305B (en) Face age identification method based on local image block
CN108520261B (en) Method and device for identifying peanut kernel number
CN114419313A (en) Image identification method and image identification system
CN111931757A (en) Finger vein quick sorting method and device based on MDLBP block histogram and PCA dimension reduction
CN114897782B (en) Gastric cancer pathological section image segmentation prediction method based on generation type countermeasure network
Xu et al. No-reference document image quality assessment based on high order image statistics

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant