
CN110020592B - Object detection model training method, device, computer equipment and storage medium - Google Patents


Info

Publication number
CN110020592B
CN110020592B
Authority
CN
China
Prior art keywords
object detection
training
module
detection model
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910108522.6A
Other languages
Chinese (zh)
Other versions
CN110020592A (en)
Inventor
巢中迪
庄伯金
王少军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910108522.6A priority Critical patent/CN110020592B/en
Priority to PCT/CN2019/091100 priority patent/WO2020155518A1/en
Publication of CN110020592A publication Critical patent/CN110020592A/en
Application granted granted Critical
Publication of CN110020592B publication Critical patent/CN110020592B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an object detection model training method and apparatus, a computer device and a storage medium, and relates to the field of artificial intelligence. The object detection model training method comprises the following steps: obtaining a training sample; inputting the training sample into an object detection model for model training, wherein the object detection model comprises a detection module, a classification module and a discrimination module; obtaining the detection loss generated by the detection module, the classification loss generated by the classification module and the discrimination loss generated by the discrimination module during model training; and updating the object detection model according to the detection loss, the classification loss and the discrimination loss to obtain a target object detection model. An object detection model trained by this object detection model training method can effectively improve object detection accuracy.

Description

Object detection model training method, device, computer equipment and storage medium
[ Field of Technology ]
The present invention relates to the field of artificial intelligence, and in particular, to a method and apparatus for training an object detection model, a computer device, and a storage medium.
[ Background Art ]
Object detection is one of the classical problems in computer vision: the task is to frame the positions of objects in an image and give their classes. Object detection has improved step by step from the traditional framework of hand-designed features plus a shallow classifier to end-to-end detection frameworks based on deep learning, but the detection methods in common use today, such as YOLO (You Only Look Once) and SSD (Single Shot MultiBox Detector), still suffer from low object detection accuracy.
[ invention ]
In view of the above, the embodiments of the present invention provide an object detection model training method, apparatus, computer device and storage medium, so as to address the low accuracy of the object detection methods in common use today.
In a first aspect, an embodiment of the present invention provides an object detection model training method, including:
obtaining a training sample;
inputting the training sample into an object detection model for model training, wherein the object detection model comprises a detection module, a classification module and a discrimination module;
obtaining detection loss generated by the detection module, classification loss generated by the classification module and discrimination loss generated by the discrimination module in a model training process;
and updating the object detection model according to the detection loss, the classification loss and the discrimination loss to obtain a target object detection model.
In accordance with aspects and any one of the possible implementations described above, there is further provided an implementation, before the inputting the training sample into an object detection model for model training, the method further includes:
obtaining an object detection model to be processed, wherein the object detection model to be processed comprises the detection module and the classification module;
adding the discrimination module into the object detection model to be processed, wherein the discrimination module is used for discriminating the result output by the detection module and/or the classification module;
and performing a model initialization operation on the object detection model to be processed after the discrimination module is added, so as to obtain the object detection model.
In aspects and any possible implementation manner as described above, there is further provided an implementation manner, where the inputting the training sample into an object detection model for model training includes:
inputting the training sample, and extracting the feature vector of the training sample through the object detection model;
and carrying out normalization processing on the feature vector to obtain a normalized feature vector, wherein the expression of the normalization processing is as follows: y = (x - MinValue)/(MaxValue - MinValue), where y is the normalized feature vector, x is the feature vector, MaxValue is the maximum value of the feature values in the feature vector, and MinValue is the minimum value of the feature values in the feature vector;
and carrying out model training on the object detection model according to the normalized feature vector.
Aspects and any one of the possible implementations as described above, further providing an implementation that obtains a detection loss generated by the detection module, a classification loss generated by the classification module, and a discrimination loss generated by the discrimination module during model training, including:
in the model training process, a first training feature vector output by the detection module is obtained, and a preset detection loss function is adopted to calculate the loss between the first training feature vector and a pre-stored first label vector, so that the detection loss is obtained;
in the model training process, a second training feature vector output by the classification module is obtained, and a preset classification loss function is adopted to calculate the loss between the second training feature vector and a second label vector stored in advance, so that the classification loss is obtained;
and in the model training process, obtaining a third training feature vector output by the discrimination module, and calculating the discrimination loss by adopting a preset discrimination loss function according to the third training feature vector.
In the aspect and any possible implementation manner described above, there is further provided an implementation manner, where updating the object detection model according to the detection loss, the classification loss, and the discrimination loss, to obtain a target object detection model includes:
updating network parameters in the object detection model by adopting a back propagation algorithm according to the detection loss, the classification loss and the discrimination loss;
and stopping updating the network parameters when the variation values of the network parameters are smaller than the iteration stopping threshold value, so as to obtain the target object detection model.
In a second aspect, an embodiment of the present invention provides an object detection model training apparatus, including:
the training sample acquisition module is used for acquiring training samples;
the model training module is used for inputting the training sample into an object detection model for model training, wherein the object detection model comprises a detection module, a classification module and a discrimination module;
the loss acquisition module is used for obtaining the detection loss generated by the detection module, the classification loss generated by the classification module and the discrimination loss generated by the discrimination module in the model training process;
and the target object detection model acquisition module is used for updating the object detection model according to the detection loss, the classification loss and the discrimination loss to obtain a target object detection model.
In a third aspect, an embodiment of the present invention provides an object detection method, including:
acquiring an image to be detected;
inputting the image to be detected into a target object detection model to carry out object detection to obtain an object detection result of the image to be detected, wherein the target object detection model is obtained by adopting the object detection model training method.
In a fourth aspect, an embodiment of the present invention provides an object detection apparatus, including:
the image acquisition module to be detected is used for acquiring an image to be detected;
the object detection result acquisition module is used for inputting the image to be detected into a target object detection model for object detection to obtain an object detection result of the image to be detected, wherein the target object detection model is obtained by adopting the object detection model training method.
In a fifth aspect, an embodiment of the present invention provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the object detection model training method described above when executing the computer program; alternatively, the processor implements the steps of the object detection method described above when executing the computer program.
In a sixth aspect, an embodiment of the present invention provides a computer-readable storage medium, including: a computer program which, when executed by a processor, implements the steps of the object detection model training method described above; alternatively, the computer program when executed by a processor implements the steps of the object detection method described above.
In the object detection model training method, apparatus, computer device and storage medium, a training sample is obtained, model training is performed on the object detection model using the training sample, and the object detection model is updated according to the detection loss, the classification loss and the discrimination loss generated during model training to obtain the target object detection model; the additional supervision provided by the discrimination loss enables the target object detection model to effectively improve object detection accuracy.
In the object detection method, the device, the computer equipment and the storage medium provided by the invention, the detection result with higher accuracy can be obtained by adopting the target object detection model to detect the image to be detected.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of the object detection model training method in an embodiment of the invention;
FIG. 2 is a schematic diagram of the object detection model training apparatus in an embodiment of the present invention;
FIG. 3 is a flow chart of an object detection method according to an embodiment of the invention;
FIG. 4 is a schematic diagram of the object detection apparatus in an embodiment of the present invention;
FIG. 5 is a schematic diagram of a computer device in accordance with an embodiment of the invention.
[ Detailed Description of the Invention ]
For a better understanding of the technical solution of the present invention, the following detailed description of the embodiments of the present invention refers to the accompanying drawings.
It should be understood that the described embodiments are merely some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects, meaning that three relationships may exist; for example, A and/or B may represent: A alone, A and B together, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects before and after it.
It should be understood that although the terms first, second, third, etc. may be used to describe the preset ranges, etc. in the embodiments of the present invention, these preset ranges should not be limited to these terms. These terms are only used to distinguish one preset range from another. For example, a first preset range may also be referred to as a second preset range, and similarly, a second preset range may also be referred to as a first preset range without departing from the scope of embodiments of the present invention.
Depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining" or "in response to detecting". Similarly, the phrase "if determined" or "if (stated condition or event) is detected" may be interpreted as "when determined" or "in response to determining" or "when (stated condition or event) is detected" or "in response to detecting (stated condition or event)", depending on the context.
Fig. 1 shows a flowchart of the object detection model training method in the present embodiment. The object detection model training method can be applied to an object detection model training system running on a computer device, and the system is used when an object detection model is trained. The computer device is a device capable of human-computer interaction with a user, including but not limited to computers, smartphones and tablets. As shown in Fig. 1, the object detection model training method includes the following steps:
s10: a training sample is obtained.
In one embodiment, training samples required for model training are obtained. Specifically, an image related to a certain type of scene may be selected as a training sample according to the need for object detection. For example, an image stored in the automobile data recorder may be used as a training sample. The pictures stored in the automobile data recorder can reflect the road condition in front of the automobile in the running process, the images can be used as training samples to train the target object detection model, so that the target object detection model obtained through training can detect the object in front of the automobile in the running process, and the automobile can make a preset reaction according to the received detection result. It can be understood that, before performing model training, the objects appearing in the images stored in the vehicle recorder need to be labeled in advance (the objects needing to be detected are labeled, and the objects not needing to be detected may not be labeled), and in addition, a deep neural network (such as a convolutional neural network) needs to be adopted in advance to extract deep features of the images belonging to the same category as the labeled objects, so as to identify the category of the objects when the object detection model (including the corresponding deep neural network for extracting the image features) detects.
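To make the labeling step concrete, here is a minimal sketch of what one pre-labeled dashcam training sample might look like; the field names and the (x, y, w, h) pixel bounding-box format are illustrative assumptions, not taken from the patent:

```python
# Hypothetical structure for one pre-labeled dashcam training sample.
# The field names and bounding-box format are illustrative assumptions.
def make_training_sample(image_path, annotations):
    """Bundle an image path with its pre-labeled objects.

    Each annotation marks one object that must be detected: a class
    label plus a bounding box (x, y, width, height) in pixels.
    """
    for ann in annotations:
        assert len(ann["bbox"]) == 4, "bbox must be (x, y, w, h)"
    return {"image": image_path, "annotations": annotations}

sample = make_training_sample(
    "dashcam/frame_0001.jpg",
    [
        {"label": "car", "bbox": (120, 80, 60, 40)},
        {"label": "pedestrian", "bbox": (300, 90, 20, 50)},
    ],
)
```

Objects that do not need to be detected simply receive no annotation, matching the labeling convention described above.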
S20: and inputting the training sample into an object detection model for model training, wherein the object detection model comprises a detection module, a classification module and a discrimination module.
The model training refers to training the target object detection model. The detection module is used for detecting objects in the image, and the classification module is used for identifying and classifying the detected objects. The discrimination module comprises a first discrimination module and/or a second discrimination module: the first discrimination module judges whether the result output by the detection module is correct, and the second discrimination module judges whether the result output by the classification module is correct. The first and second discrimination modules may both be present, or only one of them may be present, to serve as the discrimination module.
In one embodiment, the training samples are input into an object detection model for model training, wherein the object detection model comprises a discrimination module in addition to a detection module and a classification module. It will be appreciated that performing model training with the training samples is the process of inputting the training samples into the object detection model for detection.
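The three-module structure described above can be sketched as follows; the module internals here are toy stand-ins, since the patent leaves the concrete networks to base detectors such as YOLO or SSD:

```python
# Minimal sketch of the three-module object detection model described
# above. The module internals are toy stand-ins, not real networks.
class ObjectDetectionModel:
    def __init__(self, detector, classifier, discriminator):
        self.detector = detector            # detects objects (bounding boxes)
        self.classifier = classifier        # classifies the detected objects
        self.discriminator = discriminator  # judges the classifier's outputs

    def forward(self, features):
        boxes = self.detector(features)
        classes = self.classifier(features, boxes)
        verdicts = self.discriminator(classes)
        return boxes, classes, verdicts

# Toy stand-ins so the structure can be exercised end to end:
model = ObjectDetectionModel(
    detector=lambda f: [(0.5, 0.5, 0.2, 0.3)],        # one predicted box
    classifier=lambda f, boxes: ["car"] * len(boxes),
    discriminator=lambda classes: [1.0] * len(classes),
)
boxes, classes, verdicts = model.forward(features=[0.1, 0.9])
```

This shows only the data flow between the modules: the discrimination module receives the outputs of the other modules and judges them, which is what produces the discrimination loss used later.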
Further, before step S20, the method further includes:
s211: and obtaining an object detection model to be processed, wherein the object detection model to be processed comprises a detection module and a classification module.
In an embodiment, an object detection model to be processed is obtained. It can be understood that the object detection model to be processed may be a YOLO (You Only Look Once) detection model, an SSD (Single Shot MultiBox Detector) detection model, or the like. These models include a detection module and a classification module. The present embodiment is an improvement built on such object detection models to be processed.
S212: And adding a discrimination module into the object detection model to be processed, wherein the discrimination module is used for discriminating the result output by the detection module and/or the classification module.
In an embodiment, a discrimination module is added on top of the original object detection model to be processed, so as to discriminate the results it outputs. Adding the discrimination module helps to measure the detection accuracy of the object detection model to be processed, so that the model can be updated according to its detection errors and the detection accuracy improved.
S213: And performing a model initialization operation on the object detection model to be processed after the discrimination module is added, so as to obtain the object detection model.
The initialization operation of the model refers to initializing network parameters in the model, and initial values of the network parameters may be preset according to experience.
It can be understood that if the model is not initialized, the network parameters in the detection module and the classification module of the object detection model to be processed have, in effect, already been updated through many rounds of prior training, so updates driven by the discrimination module's judgments of their outputs would have little effect: the detection and classification modules would have learned for a long time before this training begins, while the discrimination module would participate only briefly, leaving the update incomplete. In contrast, after the initialization operation, the discrimination module judges the outputs of the detection module and/or classification module from the start of the training stage, and the network parameters can be updated promptly according to those outputs as training proceeds, achieving better detection accuracy.
Steps S211 to S213 provide an embodiment of obtaining the object detection model: a discrimination module is added to the object detection model to be processed and a model initialization operation is performed, which is favorable for improving the detection accuracy of the target object detection model obtained by subsequently training and updating this object detection model.
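Steps S211 to S213 can be sketched as follows; the parameter shapes and the initialization scale (0.01) are illustrative assumptions, not values from the patent:

```python
import random

# Sketch of steps S211-S213: take an object detection model to be
# processed (detection + classification parameters), add a discrimination
# module, then initialize all network parameters. Shapes and the init
# scale are illustrative assumptions.
def build_object_detection_model(pending_model, seed=0):
    rng = random.Random(seed)
    model = dict(pending_model)          # detection and classification params
    model["discriminator"] = [0.0] * 4   # newly added discrimination module
    # Initialize every parameter to a small random value so that the
    # discrimination module trains jointly with the other modules from
    # the very start of the training stage.
    for name, params in model.items():
        model[name] = [rng.gauss(0.0, 0.01) for _ in params]
    return model

pending = {"detector": [0.0] * 8, "classifier": [0.0] * 6}
model = build_object_detection_model(pending)
```

Re-initializing everything (rather than only the new module) reflects the rationale above: all three modules begin learning together, so the discrimination module's updates are not drowned out by pre-trained detection and classification parameters.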
Further, in step S20, the training sample is input into the object detection model to perform model training, which specifically includes:
s221: and inputting a training sample, and extracting the feature vector of the training sample through the object detection model.
In an embodiment, the object detection model includes a deep neural network, which may be a convolutional neural network, for extracting feature vectors of training samples. When a training sample is input into the object detection model, the object detection model adopts a deep neural network to extract feature vectors of the training sample, and a technical basis is provided for model training.
S222: Normalizing the feature vector to obtain a normalized feature vector, wherein the expression of the normalization process is: y = (x - MinValue)/(MaxValue - MinValue), where y is the normalized feature vector, x is the feature vector, MaxValue is the maximum value of the feature values in the feature vector, and MinValue is the minimum value of the feature values in the feature vector.
The feature value in the feature vector specifically refers to a pixel value.
In one embodiment, the feature vectors are normalized, i.e., the feature values in the feature vectors are scaled into the interval [0, 1]. It will be appreciated that image pixel values come in bit depths such as 2^8, 2^12 and 2^16 levels, so a single image can contain a large number of distinct pixel values, which makes computation inefficient. Compressing the feature values into the same range by normalization therefore improves computational efficiency and shortens model training time.
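The normalization expression y = (x - MinValue)/(MaxValue - MinValue) can be sketched directly (the constant-vector guard is an added assumption, since the expression is undefined when MaxValue equals MinValue):

```python
# Min-max normalization as in S222:
# y = (x - MinValue) / (MaxValue - MinValue), mapping features into [0, 1].
def normalize(features):
    min_v, max_v = min(features), max(features)
    if max_v == min_v:                 # guard: constant vector (assumption)
        return [0.0] * len(features)
    return [(x - min_v) / (max_v - min_v) for x in features]

# 8-bit pixel values (0..255) compressed into the same [0, 1] interval:
normalized = normalize([0, 64, 128, 255])
```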
S223: and carrying out model training on the object detection model according to the normalized feature vector.
In steps S221-S223, an embodiment is provided in which a training sample is input into an object detection model to perform model training, the extracted training sample features are normalized, and feature values in feature vectors are compressed in the same range interval, so that training duration can be obviously shortened, and training efficiency can be improved.
S30: the detection loss generated by the detection module, the classification loss generated by the classification module and the discrimination loss generated by the discrimination module are obtained in the model training process.
It will be appreciated that the detection module, the classification module and the discrimination module each perform a function, and each function is likely to produce errors, i.e., losses, when carried out. According to the detection loss generated by the detection module, the classification loss generated by the classification module and the discrimination loss generated by the discrimination module, the object detection model can be adjusted so that the resulting target object detection model makes as few errors as possible when these functions are performed again, thereby improving the detection accuracy of the object detection model.
Further, in step S30, the detection loss generated by the detection module, the classification loss generated by the classification module, and the discrimination loss generated by the discrimination module are obtained during the model training process, which specifically includes:
s31: in the model training process, a first training feature vector output by the detection module is obtained, and a preset detection loss function is adopted to calculate the loss between the first training feature vector and a pre-stored first label vector, so that the detection loss is obtained.
The first label vector is a pre-stored feature vector used for checking whether the first training feature vector is correct; it represents the real result.
In an embodiment, a preset detection loss function is used to calculate the loss between the first training feature vector and the pre-stored first label vector, so as to obtain the detection loss, and the network parameters of the model are updated according to the detection loss. Specifically, the detection loss function may include a loss function for the predicted center coordinates, expressed as:

$$L_{coord} = \lambda \sum_{i=1}^{I} \sum_{j=1}^{J} \mathbb{1}_{ij}^{obj} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right]$$

where λ is an adjustment factor (a preset parameter value), i indexes the grid cells obtained by segmentation during detection, I is the total number of grid cells, j indexes the bounding-box predictors, J is the total number of bounding-box predictors, and $\mathbb{1}_{ij}^{obj}$ indicates whether the j-th bounding-box predictor is responsible for the prediction: when there is a target in the i-th grid cell, $\mathbb{1}_{ij}^{obj}$ takes 1, and if no target is present in the i-th grid cell, $\mathbb{1}_{ij}^{obj}$ takes 0. $(x_i, y_i)$ is the position of the predicted bounding box and $(\hat{x}_i, \hat{y}_i)$ is the true position of the bounding box given by the training sample.
It can be understood that, for an object detection model such as YOLO, the input training sample needs to be subjected to image segmentation to obtain I grid cells, and J prediction bounding boxes are obtained per cell when predicting the object positions of the training sample; the superscript obj in $\mathbb{1}_{ij}^{obj}$ stands for "object" and indicates that an object is detected.
Further, the detection loss function may further include a loss function with respect to the width and height of the prediction bounding box, expressed as:

$$L_{wh} = \lambda \sum_{i=1}^{I} \sum_{j=1}^{J} \mathbb{1}_{ij}^{obj} \left[ \left( \sqrt{w_i} - \sqrt{\hat{w}_i} \right)^2 + \left( \sqrt{h_i} - \sqrt{\hat{h}_i} \right)^2 \right]$$

where $\sqrt{w_i}$ and $\sqrt{h_i}$ are the square roots of the predicted width and height respectively, and $\sqrt{\hat{w}_i}$ and $\sqrt{\hat{h}_i}$ are the true values of the square roots of the width and height given by the training sample (parameters that appeared above are not explained again).
It will be appreciated that the above measures the loss at detection both from the center coordinates predicted by the model and from the width and height of the prediction bounding box. The first training feature vector output by the detection module specifically comprises $(x_i, y_i)$ and $(\sqrt{w_i}, \sqrt{h_i})$, and the first label vector specifically comprises $(\hat{x}_i, \hat{y}_i)$ and $(\sqrt{\hat{w}_i}, \sqrt{\hat{h}_i})$. Through the detection loss function, the network parameters of the object detection model can be updated more accurately.
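A minimal sketch of the two detection loss terms above follows; the default λ of 5.0 is the usual YOLO choice, while the patent only calls λ a preset parameter value:

```python
from math import sqrt

# Sketch of the detection loss in S31: the center-coordinate term plus
# the width/height term, summed over grid cells i and predictors j.
# obj[i][j] plays the role of the indicator (1 if predictor j in cell i
# is responsible for a target, else 0). lam is the adjustment factor.
def detection_loss(pred, true, obj, lam=5.0):
    """pred/true: [i][j] -> (x, y, w, h); obj: [i][j] -> 0 or 1."""
    loss = 0.0
    for i in range(len(pred)):
        for j in range(len(pred[i])):
            if not obj[i][j]:
                continue
            x, y, w, h = pred[i][j]
            tx, ty, tw, th = true[i][j]
            loss += lam * ((x - tx) ** 2 + (y - ty) ** 2)   # center term
            loss += lam * ((sqrt(w) - sqrt(tw)) ** 2
                           + (sqrt(h) - sqrt(th)) ** 2)     # size term
    return loss

perfect = detection_loss(pred=[[(0.5, 0.5, 0.25, 0.25)]],
                         true=[[(0.5, 0.5, 0.25, 0.25)]],
                         obj=[[1]])
```

The square roots on width and height keep size errors on small boxes from being swamped by size errors on large ones, which is why the loss compares $\sqrt{w}$ and $\sqrt{h}$ rather than the raw values.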
S32: in the model training process, a second training feature vector output by the classification module is obtained, and a preset classification loss function is adopted to calculate the loss between the second training feature vector and a pre-stored second label vector, so that the classification loss is obtained.
The second label vector is a pre-stored feature vector used for checking whether the second training feature vector is correct; it represents the real result.
In an embodiment, a preset classification loss function is used to calculate the loss between the second training feature vector and the pre-stored second label vector, so as to obtain the classification loss, and the network parameters of the model are updated according to the classification loss. Specifically, the classification loss function may be expressed as:

$$L_{cls} = \sum_{i=1}^{I} \mathbb{1}_{i}^{obj} \left( p_i - \hat{p}_i \right)^2$$

where i indexes the grid cells obtained by segmentation during detection, I is the total number of grid cells, $\mathbb{1}_{i}^{obj}$ takes 1 when there is a target in the i-th grid cell and 0 otherwise, $p_i$ is the predicted classification, and $\hat{p}_i$ is the real classification given by the training sample. The second training feature vector output by the classification module specifically comprises $p_i$, and the second label vector specifically comprises $\hat{p}_i$. Through the classification loss function, the network parameters of the object detection model can be updated more accurately.
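A minimal sketch of the classification loss follows, with $p_i$ treated as a class-probability vector; the squared-error form is the standard YOLO choice and an assumption here, since the patent names the terms without reproducing the expression:

```python
# Sketch of the classification loss in S32: squared error between the
# predicted classification p_i and the true classification, summed over
# grid cells that contain a target. The squared-error form is assumed.
def classification_loss(pred, true, obj):
    """pred/true: [i] -> class-probability vector; obj: [i] -> 0 or 1."""
    loss = 0.0
    for i in range(len(pred)):
        if not obj[i]:
            continue
        loss += sum((p - t) ** 2 for p, t in zip(pred[i], true[i]))
    return loss

loss = classification_loss(
    pred=[[0.9, 0.1], [0.5, 0.5]],
    true=[[1.0, 0.0], [0.0, 1.0]],
    obj=[1, 0],   # only the first grid cell contains a target
)
```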
S33: In the model training process, a third training feature vector output by the discrimination module is obtained, and according to the third training feature vector, a preset discrimination loss function is adopted to calculate the discrimination loss.
The third training feature vector is the result output by the discrimination module.
In an embodiment, according to the third training feature vector, a preset discrimination loss function is used to calculate the discrimination loss, so that the network parameters of the model can be updated according to the discrimination loss. Specifically, taking a discrimination module that includes only the second discrimination module as an example, the discrimination loss function is computed over the discrimination module's outputs $D(p_i)$, where i indexes the grid cells obtained by segmentation during detection, I is the total number of grid cells, and $D(p_i)$ is the discrimination module's output for the predicted classification $p_i$. The discrimination loss function reflects the loss generated by the discrimination module during training, so that the network parameters of the object detection model can be updated more accurately.
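The text above names the discrimination loss function without reproducing its expression; as a hypothetical sketch only (not the patent's formula), a cross-entropy style discriminator loss over the outputs $D(p_i)$, with target 1 meaning the classification is judged correct, could look like:

```python
from math import log

# Hypothetical discriminator loss, NOT the patent's formula: penalize
# the discrimination module's outputs D(p_i) for deviating from 1,
# i.e. -sum(log D(p_i)), clamped away from log(0).
def discrimination_loss(d_outputs, eps=1e-12):
    """d_outputs: [i] -> D(p_i) in (0, 1); returns -sum(log D(p_i))."""
    return -sum(log(max(d, eps)) for d in d_outputs)

confident = discrimination_loss([1.0, 1.0])   # discriminator fully agrees
doubtful = discrimination_loss([0.5, 0.5])    # discriminator is unsure
```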
It should be understood that, when calculating the detection loss, the classification loss and the discrimination loss, the formulas in steps S31-S33 take the loss of a single training sample as an example. When model training is actually performed on a large number of samples, the detection losses, classification losses and discrimination losses of the individual training samples are summed to obtain the total detection loss, total classification loss and total discrimination loss, and the model is updated according to these totals.
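The per-sample accumulation described in this paragraph can be sketched as follows (the data layout is an illustrative assumption):

```python
# Hedged sketch: sum per-sample (detection, classification, discrimination)
# losses into the batch totals used to update the model.
def total_losses(per_sample_losses):
    """per_sample_losses: list of (det, cls, dis) tuples, one per training sample."""
    det = sum(l[0] for l in per_sample_losses)
    cls = sum(l[1] for l in per_sample_losses)
    dis = sum(l[2] for l in per_sample_losses)
    return det, cls, dis

# Two training samples, each contributing its three losses.
totals = total_losses([(0.5, 0.2, 0.1), (0.3, 0.4, 0.2)])
```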
Steps S31-S33 provide embodiments for obtaining the detection loss, the classification loss and the discrimination loss. Through these losses, the loss generated in the training process can be described accurately, so that the model can be updated more precisely.
S40: and updating the object detection model according to the detection loss, the classification loss and the discrimination loss to obtain a target object detection model.
Further, step S40 specifically includes:
s41: and updating network parameters in the object detection model by adopting a back propagation algorithm according to the detection loss, the classification loss and the discrimination loss.
The back-propagation algorithm is a supervised learning algorithm for multi-layer neural networks and is based on gradient descent.
In an embodiment, updating the object detection model using the back-propagation algorithm increases the update speed and improves training efficiency. The back-propagation algorithm is particularly effective when, as here, the total loss is composed of multiple terms: the detection loss, the classification loss and the discrimination loss.
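As a hedged illustration of a gradient-descent update driven by a combined loss, the toy example below uses a finite-difference gradient on a single scalar parameter in place of true back-propagation; the quadratic loss terms and all names are assumptions for demonstration only:

```python
# Hedged sketch: one gradient-descent step on a scalar parameter w.
# The symmetric finite difference stands in for back-propagated gradients.
def update_step(w, loss_fn, lr=0.1, eps=1e-6):
    grad = (loss_fn(w + eps) - loss_fn(w - eps)) / (2 * eps)
    return w - lr * grad

# Combined objective: detection + classification + discrimination terms,
# each modelled here as a simple quadratic around a different target.
def combined(w):
    return (w - 1) ** 2 + (w - 2) ** 2 + (w - 3) ** 2

w = 0.0
for _ in range(100):
    w = update_step(w, combined)
# w converges toward the minimiser of the combined loss (w = 2).
```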
S42: and stopping updating the network parameters when the change values of the network parameters are smaller than the iteration stopping threshold value, so as to obtain the target object detection model.
In an embodiment, when the variation values of the network parameters are smaller than the stop iteration threshold, that is, the variation values of the network parameters are within an acceptable error range, the updating process can be stopped, and the training is finished to obtain the target object detection model with higher detection accuracy.
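The stop condition of step S42 can be sketched as a simple per-parameter check (a minimal illustration; the parameter lists and threshold value are assumptions):

```python
# Hedged sketch: stop updating once every network parameter's change
# between iterations is smaller than the stop-iteration threshold.
def should_stop(old_params, new_params, threshold):
    return all(abs(n - o) < threshold for o, n in zip(old_params, new_params))

# All parameter changes are below 1e-3, so training would stop here.
done = should_stop([1.0, 2.0], [1.0005, 2.0003], threshold=1e-3)
```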
Steps S41-S42 provide an implementation for updating the object detection model, so that the updating process can be completed quickly and a target object detection model with high detection accuracy is obtained.
In the embodiment of the invention, a training sample is obtained, the object detection model is trained with the training sample, and the object detection model is updated according to the detection loss, the classification loss and the discrimination loss generated during training to obtain the target object detection model. Because the detection loss, the classification loss and the discrimination loss are combined when updating the object detection model, the trained target object detection model achieves better detection and classification performance.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
Based on the object detection model training method provided in the embodiment, the embodiment of the invention further provides a device embodiment for realizing the steps and the method in the method embodiment.
Fig. 2 shows a schematic block diagram of an object detection model training apparatus in one-to-one correspondence with the object detection model training method in the embodiment. As shown in fig. 2, the object detection model training apparatus includes a training sample acquisition module 10, a model training module 20, a loss acquisition module 30, and a target object detection model acquisition module 40. The implementation functions of the training sample acquiring module 10, the model training module 20, the loss acquiring module 30, and the target object detection model acquiring module 40 correspond to the steps corresponding to the object detection model training method in the embodiment one by one, and in order to avoid redundancy, the embodiment is not described in detail one by one.
A training sample acquisition module 10, configured to acquire a training sample.
The model training module 20 is configured to input a training sample into an object detection model for model training, where the object detection model includes a detection module, a classification module, and a discrimination module.
The loss acquisition module 30 is configured to obtain the detection loss generated by the detection module, the classification loss generated by the classification module, and the discrimination loss generated by the discrimination module during the model training.
The target object detection model obtaining module 40 is configured to update the object detection model according to the detection loss, the classification loss and the discrimination loss, and obtain a target object detection model.
Optionally, the object detection model training device further includes an object detection model acquisition unit to be processed, a discrimination module adding unit, and an initialization unit.
The to-be-processed object detection model acquisition unit is used for acquiring a to-be-processed object detection model, where the to-be-processed object detection model comprises a detection module and a classification module.
The judging module adding unit is used for adding a judging module into the object detection model to be processed, wherein the judging module is used for judging the result output by the detection module and/or the classification module.
And the initialization unit is used for carrying out model initialization operation on the object detection model to be processed added with the judging module to obtain the object detection model.
Optionally, the model training module 20 includes a feature vector extraction unit, a normalized feature vector acquisition unit, and a model training unit.
And the feature vector extraction unit is used for inputting the training sample and extracting the feature vector of the training sample through the object detection model.
The normalized feature vector obtaining unit is used for carrying out normalization processing on the feature vector to obtain a normalized feature vector, wherein the expression of the normalization processing is as follows: y= (x-MinValue)/(MaxValue-MinValue), y is a normalized feature vector, x is a feature vector, maxValue is a maximum value of feature values in the feature vector, and MinValue is a minimum value of feature values in the feature vector.
And the model training unit is used for carrying out model training on the object detection model according to the normalized feature vector.
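The min-max normalization applied by the normalized feature vector obtaining unit, y = (x − MinValue)/(MaxValue − MinValue), can be sketched as follows (function names are illustrative; the sketch assumes the feature values are not all equal, so the denominator is nonzero):

```python
# Hedged sketch: element-wise min-max normalisation of a feature vector,
# mapping the smallest feature value to 0 and the largest to 1.
def normalize(features):
    lo, hi = min(features), max(features)  # MinValue, MaxValue
    return [(x - lo) / (hi - lo) for x in features]

result = normalize([2.0, 4.0, 6.0])  # [0.0, 0.5, 1.0]
```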
Alternatively, the loss acquisition module 30 includes a detection loss acquisition unit, a classification loss acquisition unit, and a discrimination loss acquisition unit.
The detection loss acquisition unit is used for obtaining a first training feature vector output by the detection module in the model training process, calculating the loss between the first training feature vector and a pre-stored first label vector by adopting a preset detection loss function, and obtaining the detection loss.
The classification loss acquisition unit is used for obtaining a second training feature vector output by the classification module in the model training process, calculating the loss between the second training feature vector and a pre-stored second label vector by adopting a preset classification loss function, and obtaining the classification loss.
The judging loss obtaining unit is used for obtaining a third training feature vector output by the judging module in the model training process, and calculating the judging loss by adopting a preset judging loss function according to the third training feature vector.
Alternatively, the target object detection model acquisition module 40 includes a network parameter updating unit and a target object detection model acquisition unit.
And the network parameter updating unit is used for updating the network parameters in the object detection model by adopting a back propagation algorithm according to the detection loss, the classification loss and the discrimination loss.
And the target object detection model acquisition unit is used for stopping updating the network parameters to obtain the target object detection model when the change values of the network parameters are smaller than the iteration stopping threshold.
In the embodiment of the invention, a training sample is obtained, the object detection model is trained with the training sample, and the object detection model is updated according to the detection loss, the classification loss and the discrimination loss generated during training to obtain the target object detection model. Because the detection loss, the classification loss and the discrimination loss are combined when updating the object detection model, the trained target object detection model achieves better detection and classification performance.
Fig. 3 shows a flowchart of the object detection method in this embodiment. The object detection method can be applied to an object detection system for realizing object detection, and the object detection system can be deployed on computer equipment. The computer equipment is equipment capable of human-computer interaction with a user, including but not limited to computers, smartphones, tablets and the like. As shown in fig. 3, the object detection method includes the following steps:
s50: and acquiring an image to be detected.
S60: inputting the image to be detected into a target object detection model to carry out object detection, and obtaining an object detection result of the image to be detected, wherein the target object detection model is obtained by adopting the object detection model training method.
In steps S50-S60, a detection result with higher accuracy can be obtained by detecting the image to be detected by using the target object detection model.
Fig. 4 shows a schematic block diagram of an object detection apparatus in one-to-one correspondence with the object detection method in the embodiment. As shown in fig. 4, the object detection apparatus includes an image to be detected acquisition module 50 and an object detection result acquisition module 60. The implementation functions of the image to be detected acquisition module 50 and the object detection result acquisition module 60 correspond one-to-one to the steps of the object detection method in the embodiment; to avoid redundancy, they are not described in detail here.
The image to be detected acquisition module 50 is configured to acquire an image to be detected.
The object detection result acquisition module 60 is configured to input the image to be detected into the target object detection model for object detection, so as to obtain an object detection result of the image to be detected, where the target object detection model is obtained by using the above object detection model training method.
The present embodiment provides a computer readable storage medium, on which a computer program is stored, where the computer program when executed by a processor implements the object detection model training method in the embodiment, or where the computer program when executed by the processor implements the object detection method in the embodiment, and in order to avoid repetition, details are not described herein. Alternatively, the computer program may implement the functions of each module/unit in the object detection model training device in the embodiment when executed by the processor, or the computer program may implement the functions of each module/unit in the object detection device in the embodiment when executed by the processor, which is not described herein in detail for avoiding repetition.
Fig. 5 is a schematic diagram of a computer device according to an embodiment of the present invention. As shown in fig. 5, the computer device 70 of this embodiment includes: a processor 71, a memory 72, and a computer program 73 stored in the memory 72 and executable on the processor 71. The computer program 73, when executed by the processor 71, implements the object detection model training method in the embodiment, or implements the object detection method in the embodiment; the repetition is not described herein. Alternatively, the computer program 73, when executed by the processor 71, may implement the functions of each module/unit in the object detection model training device in the embodiment, or the functions of each module/unit in the object detection device in the embodiment, which are likewise not described herein in detail to avoid repetition.
The computer device 70 may be a desktop computer, a notebook computer, a palm top computer, a cloud server, or the like. Computer device 70 may include, but is not limited to, a processor 71, a memory 72. It will be appreciated by those skilled in the art that fig. 5 is merely an example of a computer device 70 and is not intended to limit the computer device 70, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., a computer device may also include an input-output device, a network access device, a bus, etc.
The processor 71 may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 72 may be an internal storage unit of the computer device 70, such as a hard disk or memory of the computer device 70. The memory 72 may also be an external storage device of the computer device 70, such as a plug-in hard disk provided on the computer device 70, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like. Further, the memory 72 may also include both internal storage units and external storage devices of the computer device 70. The memory 72 is used to store computer programs and other programs and data required by the computer device. The memory 72 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (9)

1. A method of training an object detection model, the method comprising:
obtaining a training sample;
inputting the training samples into an object detection model for model training, wherein the object detection model comprises a detection module, a classification module and a discrimination module, and the discrimination module is used for discriminating the result output by the detection module and the classification module for each training sample;
obtaining detection losses for all the training samples generated by the detection module, classification losses for all the training samples generated by the classification module and discrimination losses for all the training samples generated by the discrimination module in a model training process;
updating the object detection model according to the detection loss, the classification loss and the discrimination loss to obtain a target object detection model;
before the inputting the training sample into the object detection model for model training, the method further comprises:
obtaining an object detection model to be processed, wherein the object detection model to be processed comprises the detection module and the classification module;
adding the judging module into the object detection model to be processed;
and initializing the model of the object detection model to be processed after the judging module is added, so as to obtain the object detection model.
2. The method of claim 1, wherein the inputting the training sample into an object detection model for model training comprises:
inputting the training sample, and extracting the feature vector of the training sample through the object detection model;
and carrying out normalization processing on the feature vector to obtain a normalized feature vector, wherein the expression of the normalization processing is as follows: y= (x-MinValue)/(MaxValue-MinValue), y is the normalized feature vector, x is the feature vector, maxValue is the maximum value of the feature values in the feature vector, and MinValue is the minimum value of the feature values in the feature vector;
and carrying out model training on the object detection model according to the normalized feature vector.
3. The method of claim 1, wherein the deriving detection losses for all of the training samples generated by the detection module, classification losses for all of the training samples generated by the classification module, and discrimination losses for all of the training samples generated by the discrimination module during model training comprises:
in the model training process, a first training feature vector which is output by the detection module and aims at each training sample is obtained, and a preset detection loss function is adopted to calculate the loss between the first training feature vector and a first label vector which is stored in advance, so that the detection loss aiming at all the training samples is obtained;
in the model training process, obtaining a second training feature vector which is output by the classification module and aims at each training sample, and calculating the loss between the second training feature vector and a second label vector which is stored in advance by adopting a preset classification loss function to obtain the classification loss aiming at all the training samples;
and in the model training process, obtaining a third training feature vector which is output by the judging module and aims at each training sample, and calculating the judging loss aiming at all the training samples by adopting a preset judging loss function according to the third training feature vector.
4. A method according to any one of claims 1 to 3, wherein said updating said object detection model based on said detection loss, said classification loss and said discrimination loss to obtain a target object detection model comprises:
updating network parameters in the object detection model by adopting a back propagation algorithm according to the detection loss, the classification loss and the discrimination loss;
and stopping updating the network parameters when the variation values of the network parameters are smaller than the iteration stopping threshold value, so as to obtain the target object detection model.
5. An object detection method, comprising:
acquiring an image to be detected;
inputting the image to be detected into a target object detection model to carry out object detection to obtain an object detection result of the image to be detected, wherein the target object detection model is obtained by adopting the object detection model training method according to any one of claims 1-4.
6. An object detection model training apparatus, the apparatus comprising:
the training sample acquisition module is used for acquiring training samples;
the model training module is used for inputting the training samples into an object detection model for model training, wherein the object detection model comprises a detection module, a classification module and a discrimination module, and the discrimination module is used for discriminating the result output by the detection module and the classification module for each training sample;
the loss acquisition module is used for obtaining the detection loss of all the training samples generated by the detection module, the classification loss of all the training samples generated by the classification module and the discrimination loss of all the training samples generated by the discrimination module in the model training process;
the target object detection model acquisition module is used for updating the object detection model according to the detection loss, the classification loss and the discrimination loss to obtain a target object detection model;
the apparatus further comprises:
the device comprises a to-be-processed object detection model acquisition unit, a classification module and a classification module, wherein the to-be-processed object detection model acquisition unit is used for acquiring a to-be-processed object detection model;
the judging module adding unit is used for adding the judging module into the object detection model to be processed;
and the initialization unit is used for carrying out model initialization operation on the object detection model to be processed added into the judging module to obtain the object detection model.
7. An object detection device, the device comprising:
the image acquisition module to be detected is used for acquiring an image to be detected;
the object detection result obtaining module is configured to input the image to be detected into a target object detection model for object detection, so as to obtain an object detection result of the image to be detected, where the target object detection model is obtained by using the object detection model training method according to any one of claims 1-4.
8. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the object detection model training method according to any of claims 1 to 4 when the computer program is executed; alternatively, the processor, when executing the computer program, implements the steps of the object detection method according to claim 5.
9. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the object detection model training method according to any one of claims 1 to 4; alternatively, the computer program is executed by a processor to implement the steps of the object detection method according to claim 5.
CN201910108522.6A 2019-02-03 2019-02-03 Object detection model training method, device, computer equipment and storage medium Active CN110020592B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910108522.6A CN110020592B (en) 2019-02-03 2019-02-03 Object detection model training method, device, computer equipment and storage medium
PCT/CN2019/091100 WO2020155518A1 (en) 2019-02-03 2019-06-13 Object detection method and device, computer device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910108522.6A CN110020592B (en) 2019-02-03 2019-02-03 Object detection model training method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110020592A CN110020592A (en) 2019-07-16
CN110020592B true CN110020592B (en) 2024-04-09

Family

ID=67188871

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910108522.6A Active CN110020592B (en) 2019-02-03 2019-02-03 Object detection model training method, device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN110020592B (en)
WO (1) WO2020155518A1 (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442804A (en) * 2019-08-13 2019-11-12 北京市商汤科技开发有限公司 A kind of training method, device, equipment and the storage medium of object recommendation network
CN112200012A (en) * 2020-09-15 2021-01-08 汉海信息技术(上海)有限公司 Map data processing method and device and electronic equipment
CN112102194B (en) * 2020-09-15 2024-08-23 北京金山云网络技术有限公司 Training method and device for face restoration model
CN112200040A (en) * 2020-09-28 2021-01-08 北京小米松果电子有限公司 Occlusion image detection method, device and medium
CN112183358B (en) * 2020-09-29 2024-04-23 新石器慧通(北京)科技有限公司 Training method and device for target detection model
CN112418480A (en) * 2020-10-14 2021-02-26 上海眼控科技股份有限公司 Meteorological image prediction method, meteorological image prediction device, computer equipment and storage medium
CN112417955B (en) * 2020-10-14 2024-03-05 国能大渡河沙坪发电有限公司 Method and device for processing tour inspection video stream
CN112508052A (en) * 2020-11-09 2021-03-16 北京迈格威科技有限公司 Target detection network training method and device, electronic equipment and storage medium
CN112508097B (en) * 2020-12-08 2024-01-19 深圳市优必选科技股份有限公司 Image conversion model training method and device, terminal equipment and storage medium
CN112613539A (en) * 2020-12-11 2021-04-06 北京迈格威科技有限公司 Method, device, equipment and medium for constructing classification network and object detection model
CN112633351A (en) * 2020-12-17 2021-04-09 博彦多彩数据科技有限公司 Detection method, detection device, storage medium and processor
CN112561885B (en) * 2020-12-17 2023-04-18 中国矿业大学 YOLOv 4-tiny-based gate valve opening detection method
CN112633355A (en) * 2020-12-18 2021-04-09 北京迈格威科技有限公司 Image data processing method and device and target detection model training method and device
CN112580731B (en) * 2020-12-24 2022-06-24 深圳市对庄科技有限公司 Jadeite product identification method, system, terminal, computer equipment and storage medium
CN112634245A (en) * 2020-12-28 2021-04-09 广州绿怡信息科技有限公司 Loss detection model training method, loss detection method and device
CN112966565B (en) * 2021-02-05 2024-08-23 深圳市优必选科技股份有限公司 Object detection method, device, terminal equipment and storage medium
CN113033579B (en) * 2021-03-31 2023-03-21 北京有竹居网络技术有限公司 Image processing method, image processing device, storage medium and electronic equipment
CN113298122B (en) * 2021-04-30 2024-08-06 北京迈格威科技有限公司 Target detection method and device and electronic equipment
CN113591839B (en) * 2021-06-28 2023-05-09 北京有竹居网络技术有限公司 Feature extraction model construction method, target detection method and device
CN113627298A (en) * 2021-07-30 2021-11-09 北京百度网讯科技有限公司 Training method of target detection model and method and device for detecting target object
CN116935102B (en) * 2023-06-30 2024-02-20 上海蜜度科技股份有限公司 Lightweight model training method, device, equipment and medium
CN116958607B (en) * 2023-09-20 2023-12-22 中国人民解放军火箭军工程大学 Data processing method and device for target damage prediction

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845522A (en) * 2016-12-26 2017-06-13 华北理工大学 A kind of discriminant classification system in metallurgical globulation
CN107038448A (en) * 2017-03-01 2017-08-11 中国科学院自动化研究所 Target detection model building method
CN108009524A (en) * 2017-12-25 2018-05-08 西北工业大学 A kind of method for detecting lane lines based on full convolutional network
CN108376235A (en) * 2018-01-15 2018-08-07 深圳市易成自动驾驶技术有限公司 Image detecting method, device and computer readable storage medium
WO2019016540A1 (en) * 2017-07-18 2019-01-24 Vision Semantics Limited Target re-identification

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101592999B1 (en) * 2009-02-09 2016-02-11 삼성전자주식회사 Apparatus and method for recognzing hand shafe in portable terminal
JP5394959B2 (en) * 2010-03-23 2014-01-22 富士フイルム株式会社 Discriminator generating apparatus and method, and program
US8948522B2 (en) * 2011-08-04 2015-02-03 Seiko Epson Corporation Adaptive threshold for object detection
CN107944443A (en) * 2017-11-16 2018-04-20 深圳市唯特视科技有限公司 One kind carries out object consistency detection method based on end-to-end deep learning


Also Published As

Publication number Publication date
WO2020155518A1 (en) 2020-08-06
CN110020592A (en) 2019-07-16

Similar Documents

Publication Publication Date Title
CN110020592B (en) Object detection model training method, device, computer equipment and storage medium
CN108038474B (en) Face detection method, convolutional neural network parameter training method, device and medium
CN108229509B (en) Method and device for identifying object class and electronic equipment
CN110472675B (en) Image classification method, image classification device, storage medium and electronic equipment
CN108388879B (en) Target detection method, device and storage medium
JP2022141931A (en) Method and device for training living body detection model, method and apparatus for living body detection, electronic apparatus, storage medium, and computer program
CN111814810A (en) Image recognition method and device, electronic equipment and storage medium
CN109919002B (en) Yellow stop line identification method and device, computer equipment and storage medium
CN115797736B (en) Training method, device, equipment and medium for target detection model and target detection method, device, equipment and medium
CN110716792B (en) Target detector and construction method and application thereof
US11138464B2 (en) Image processing device, image processing method, and image processing program
CN111639653A (en) False detection image determining method, device, equipment and medium
CN113869449A (en) Model training method, image processing method, device, equipment and storage medium
CN110852327A (en) Image processing method, image processing device, electronic equipment and storage medium
CN110717407B (en) Face recognition method, device and storage medium based on lip language password
KR101545809B1 (en) Method and apparatus for detection license plate
CN112560791B (en) Recognition model training method, recognition method and device and electronic equipment
CN114219936A (en) Object detection method, electronic device, storage medium, and computer program product
CN110175500B (en) Finger vein comparison method, device, computer equipment and storage medium
CN111738319A (en) Clustering result evaluation method and device based on large-scale samples
CN112446428B (en) Image data processing method and device
CN116109907B (en) Target detection method, target detection device, electronic equipment and storage medium
CN111881833A (en) Vehicle detection method, device, equipment and storage medium
CN115298705A (en) License plate recognition method and device, electronic equipment and storage medium
CN118115932A (en) Image regressor training method, related method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant