Disclosure of Invention
The invention provides a curved surface QR code positioning method based on an SSD network model, aiming to solve the problem that existing QR code positioning methods have difficulty accurately locating the region of an image where a QR code lies under complex environmental conditions and curved surface distortion of the QR code.
In order to achieve the above purpose, the technical means adopted is as follows:
a curved surface QR code positioning method based on an SSD network model comprises the following steps:
S1, establishing a curved surface QR code data set and preprocessing the curved surface QR code data set;
S2, building a PPN-SSD framework, wherein the PPN-SSD framework takes an Inception network as the base network, with an auxiliary network layer added to serve as the feature extraction layer and classification layer of the network;
S3, performing model training by using the PPN-SSD framework to obtain a curved surface QR code positioning model;
and S4, positioning the curved surface QR code by using the trained curved surface QR code positioning model.
Preferably, the preprocessing in step S1 specifically includes: labeling the curved surface QR code data set, and filling the number, length, width, target category and target position information of each image into a csv format file, thereby converting the curved surface QR code data set into the csv format file; and dividing the converted file data into a training set and a test set.
Preferably, in the step S2, the PPN-SSD framework performs feature extraction by using a model of the Inception v3 network pre-trained on the ImageNet data set as the base network.
Preferably, in the step S2, the feature extraction layer of the PPN-SSD framework includes the convolutional layer "mixed1_c" and five maximum pooling layers: the "max_pool_1", "max_pool_2", "max_pool_3", "max_pool_4" and "max_pool_5" layers.
Preferably, in the step S2, the classification layer of the PPN-SSD framework employs a convolution shared across all scales to predict the category score and the candidate box position information of the target image.
Preferably, the step S3 specifically includes:
inputting the training set as training input data, and performing data augmentation operation on the training set;
initializing the network: using a pre-training model obtained by training the Inception v3 network on the ImageNet data set as the initial values of the network parameters to obtain the base network;
after passing through the intermediate layers of the base network, performing feature extraction by using the five maximum pooling layers;
based on the extracted feature map, predicting the category score and the candidate frame position information of the target image through the classification layer to obtain a predicted target frame;
calculating the error between the predicted target frame and the real target frame of the target image, and then updating the network weights through the back propagation algorithm according to the error function;
and performing iterative training until the loss value of the error function is smaller than a preset threshold or the maximum number of iterations is reached, then stopping training to obtain the curved surface QR code positioning model.
Preferably, in the step S3, the data augmentation operation includes: randomly selecting images of the training set and adjusting their contrast, brightness and saturation; rotating, distorting, cropping or occluding images in the training set; adding Gaussian noise to images in the training set; and flipping images in the training set vertically or mirroring them horizontally.
Preferably, in step S3, the error function is specifically:

L(x, c, l, g) = (1/N) (L_conf(x, c) + α·L_loc(x, l, g))

where N is the number of matched predicted target frames, L_conf is the confidence loss, L_loc is the localization loss, x denotes the target object, c the confidence, l the predicted target frame, g the real target frame, and α is the weight given to the localization loss.
Preferably, the step S4 further includes:
carrying out curved surface QR code positioning on the test set by using the trained curved surface QR code positioning model, and setting an IoU coefficient threshold, where the IoU coefficient of a detection result is the area intersection-over-union between the predicted target frame and the real target frame; if the IoU coefficient of the detection result is greater than the set IoU coefficient threshold, the detection result is output; otherwise, the detection result is discarded.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
the method is designed aiming at the curved surface QR code positioning in the complex environment, adopts a forward-propagation deep convolutional neural network PPN-SSD frame, and can extract the semantic features of deep images in a self-adaptive manner according to object categories due to the strong feature learning capability, so that the target detection algorithm has good adaptability and robustness. The curved surface QR code positioning model can accurately position the area of the QR code in the image under the influence of factors such as disordered background, over-long distance between image acquisition equipment and the QR code, partial shielding, poor illumination condition, motion blur and the like, and has strong adaptability when the QR code has curved surface distortion. The method solves the problem that the existing algorithm is difficult to position the curved surface QR code, considers the influence of complex environmental factors, has high positioning accuracy and high detection speed, and realizes the real-time detection of the curved surface QR code.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
A curved surface QR code positioning method based on an SSD network model, as shown in fig. 1 and 2, includes the following steps:
S1, establishing a curved surface QR code data set and preprocessing the curved surface QR code data set;
Because the curved surface QR code positioning method provided by this embodiment considers the influence of complex environmental factors, an image data set covering as many complex environments as possible is established in this step; that is, curved surface QR code images are captured under complex conditions such as a cluttered background, an excessive distance between the image acquisition device and the QR code, partial occlusion, poor illumination and motion blur, and the image files are named sequentially starting from 10001.jpg.
The established curved surface QR code data set is labeled, and the number, length, width, target category and target position information of each image are filled into a csv format file, thereby converting the curved surface QR code data set into the csv format file; the converted file data is then divided into a training set and a test set.
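As a minimal sketch of this labeling step (the field names, file name and sample values are illustrative assumptions, not the patent's exact schema), each image contributes one row of the csv file:

```python
import csv

# Minimal sketch of the labeling step: one row per image holding its
# number, length/width, target category and target box position.
# Field names and sample values are illustrative assumptions.
with open("curved_qr_labels.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["number", "width", "height",
                     "category", "xmin", "ymin", "xmax", "ymax"])
    writer.writerow(["10001", 480, 480, "qr_code", 103, 87, 265, 249])
```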
S2, building a PPN-SSD framework, wherein the PPN-SSD framework takes an Inception network as the base network, with an auxiliary network layer added to serve as the feature extraction layer and classification layer of the network;
the PPN (Powing Pyramid network) -SSD (Single Shot Multi Box Detector) framework is a forward-propagation deep convolutional neural network, a series of default prediction boxes with different sizes and proportions are generated on each point of each extracted feature map to obtain a target candidate region, and then a convolution shared on all scales is used for predicting category scores and candidate box position information. The built PPN-SSD framework is composed of a network layer of an inclusion network model, a convolution layer and five pooling layers. The concrete construction is as follows:
a base network stage: the dimension of the input data is 480 × 480 × 3; after input, the data is processed by the Inception network, namely a series of operations through convolutional layers, Batch Normalization layers, ReLU layers and pooling layers, and the size of the output feature map is 35 × 35 × 512;
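A minimal sketch of this stage, assuming a Keras implementation: a pretrained Inception v3 is truncated at an intermediate "mixed" block to serve as the base network. The cut point ("mixed2") is an illustrative assumption; the exact layer matching the embodiment's 35 × 35 × 512 output depends on the Inception variant used.

```python
import tensorflow as tf

# Hedged sketch: truncate a pretrained Inception v3 at an intermediate
# "mixed" block to obtain the PPN-SSD base network. The cut point and
# the 480x480x3 input are assumptions, not the patent's exact config.
full = tf.keras.applications.InceptionV3(
    include_top=False, weights="imagenet", input_shape=(480, 480, 3))
base_network = tf.keras.Model(
    full.input, full.get_layer("mixed2").output)
```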
a maximum pooling stage: this stage comprises five maximum pooling layers, namely the "max_pool_1", "max_pool_2", "max_pool_3", "max_pool_4" and "max_pool_5" layers; each layer applies max pooling with a 2 × 2 filter and a stride of 2, and the output feature map sizes are 17 × 17 × 512, 8 × 8 × 512, 4 × 4 × 512, 2 × 2 × 512 and 1 × 1 × 512 respectively;
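A sketch of this pooling stage under the same assumptions; starting from a 35 × 35 × 512 stand-in input, the five 2 × 2, stride-2 max pooling layers reproduce the pyramid sizes listed above:

```python
from tensorflow.keras import Input, layers

# Five 2x2, stride-2 max pooling layers applied to a stand-in for the
# 35x35x512 base feature map, yielding the 17x17, 8x8, 4x4, 2x2 and
# 1x1 maps of the feature pyramid (all 512 channels).
x = Input(shape=(35, 35, 512))
pyramid = [x]
for i in range(1, 6):
    x = layers.MaxPooling2D(pool_size=2, strides=2,
                            name=f"max_pool_{i}")(x)
    pyramid.append(x)
print([tuple(p.shape[1:3]) for p in pyramid])
# [(35, 35), (17, 17), (8, 8), (4, 4), (2, 2), (1, 1)]
```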
In the PPN-SSD framework of this embodiment, the convolutional layer "mixed1_c" and the five maximum pooling layers "max_pool_1", "max_pool_2", "max_pool_3", "max_pool_4" and "max_pool_5" are selected as the feature extraction layers.
A series of default prediction frames with different sizes and proportions is generated at each point of each extracted feature map to obtain target candidate regions.
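A sketch of this default-frame generation in normalized coordinates; the scale and aspect-ratio values are illustrative assumptions:

```python
import numpy as np

# At every cell of an s x s feature map, lay down default boxes of a
# given scale and several aspect ratios, as (cx, cy, w, h) in
# normalized coordinates. Scale and ratios are assumptions.
def default_boxes(s, scale, ratios=(1.0, 2.0, 0.5)):
    boxes = []
    for i in range(s):
        for j in range(s):
            cx, cy = (j + 0.5) / s, (i + 0.5) / s  # cell center
            for r in ratios:
                w, h = scale * np.sqrt(r), scale / np.sqrt(r)
                boxes.append((cx, cy, w, h))
    return np.array(boxes)

boxes = default_boxes(s=17, scale=0.2)  # 17*17*3 = 867 default boxes
```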
The obtained target candidate regions are then predicted by the convolution shared across all scales to obtain category scores and candidate frame position information; redundant prediction frames are eliminated, and the detection results whose scores are greater than a set threshold are retained.
The classification layer of the PPN-SSD framework of the present embodiment employs one convolution shared across all scales to predict the category score and candidate box position information of the target image.
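A sketch of such a shared head in Keras; the number of classes (QR code vs. background) and the boxes per point are illustrative assumptions. Reusing one Conv2D object across all pyramid levels is what shares its weights across scales:

```python
from tensorflow.keras import Input, layers

# One convolution whose weights are reused on every pyramid scale,
# predicting per-default-box class scores plus 4 box offsets.
# num_classes and boxes_per_point are illustrative assumptions.
num_classes, boxes_per_point = 2, 6
shared_head = layers.Conv2D(boxes_per_point * (num_classes + 4),
                            kernel_size=3, padding="same")
pyramid = [Input(shape=(s, s, 512)) for s in (35, 17, 8, 4, 2, 1)]
predictions = [shared_head(p) for p in pyramid]  # same kernel everywhere
```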
S3, performing model training by using the PPN-SSD framework to obtain a curved surface QR code positioning model. In the training stage, the algorithm first matches the default prediction frames with the labeled real target frames; if the matching value is greater than a set threshold, the match is considered successful and the predicted target frame is taken as a positive sample; otherwise, it is a negative sample.
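A minimal sketch of this matching step, assuming the matching value is the IoU between boxes in (x1, y1, x2, y2) form and a 0.5 threshold:

```python
def iou(a, b):
    # Intersection over union of two boxes given as (x1, y1, x2, y2).
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

# A default frame whose IoU with any labeled real target frame exceeds
# the threshold is a positive sample, otherwise negative. The 0.5
# threshold is an assumption.
def match(default_boxes, gt_boxes, thresh=0.5):
    return [max(iou(d, g) for g in gt_boxes) > thresh
            for d in default_boxes]
```

As shown in fig. 3, the specific training process is as follows: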
Inputting the training set as training input data, and performing data augmentation on the training set, where the data augmentation operation includes: randomly selecting images of the training set and adjusting their contrast, brightness and saturation; rotating, distorting, cropping or occluding images in the training set; adding Gaussian noise to images in the training set; and flipping images in the training set vertically or mirroring them horizontally, thereby enlarging the data set. A sketch of such a pipeline follows.
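The sketch below uses tf.image; the jitter ranges and noise level are illustrative assumptions, and rotation, distortion and occlusion are omitted for brevity:

```python
import tensorflow as tf

def augment(image):
    # Hedged augmentation sketch: random photometric jitter, Gaussian
    # noise and flips; parameter ranges are assumptions. Expects a
    # float image in [0, 1] with 3 channels.
    image = tf.image.random_contrast(image, 0.8, 1.2)
    image = tf.image.random_brightness(image, 0.2)
    image = tf.image.random_saturation(image, 0.8, 1.2)
    image += tf.random.normal(tf.shape(image), stddev=0.02)
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    return tf.clip_by_value(image, 0.0, 1.0)
```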
Initializing the network: using a pre-training model obtained by training the Inception v3 network on the ImageNet data set as the initial values of the network parameters to obtain the base network;
after passing through the intermediate layers of the base network, performing feature extraction by using the five maximum pooling layers;
based on the extracted feature map, predicting the category score and the candidate frame position information of the target image through the classification layer to obtain a predicted target frame;
calculating the error between the predicted target frame and the real target frame of the target image, and then updating the network weights through the back propagation algorithm according to the error function;
performing iterative training until the loss value of the error function is smaller than a preset threshold or the maximum number of iterations is reached, then stopping training to obtain the optimal network model, namely the curved surface QR code positioning model;
Here the loss of the PPN-SSD framework consists of two parts, localization and classification, so the error function is defined as a weighted sum of the localization error and the confidence error, calculated as follows:
L(x, c, l, g) = (1/N) (L_conf(x, c) + α·L_loc(x, l, g))

where N is the number of matched predicted target frames, L_conf is the confidence loss, L_loc is the localization loss, x denotes the target object, c the confidence, l the predicted target frame, g the real target frame, and α is the weight given to the localization loss.
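In plain Python, the error function above reduces to the following sketch; α = 1.0 is a common SSD choice and an assumption here:

```python
# Confidence loss plus alpha-weighted localization loss, averaged over
# the N matched default frames. alpha = 1.0 is an assumption.
def error_function(conf_loss, loc_loss, num_matched, alpha=1.0):
    if num_matched == 0:  # no matched frames: loss taken as zero
        return 0.0
    return (conf_loss + alpha * loc_loss) / num_matched
```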
S4, positioning the curved surface QR code by using the trained curved surface QR code positioning model;
In this embodiment, a curved surface QR code positioning test is performed on the test set by using the trained curved surface QR code positioning model, and the detection results of the model on the test set are output directly so as to evaluate the performance of the network. The IoU coefficient threshold is set to 0.5; if the IoU coefficient of a detection result (as shown in fig. 4, i.e. the area intersection-over-union between the predicted target frame and the real target frame) is greater than the set IoU coefficient threshold, the detection result is output; otherwise, the detection result is discarded.
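A sketch of this output filtering, reusing the iou() helper from the matching sketch in step S3 and the 0.5 threshold of this embodiment:

```python
# Keep a detection only if its IoU with the real target frame exceeds
# the threshold; box format (x1, y1, x2, y2) as assumed earlier.
def filter_detections(detections, gt_box, iou_thresh=0.5):
    return [d for d in detections if iou(d, gt_box) > iou_thresh]
```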
Through tests, the curved surface QR code positioning method can cope with various complex environmental conditions, such as a cluttered background, an excessive distance between the image acquisition device and the QR code, partial occlusion, poor illumination and motion blur; it realizes real-time detection, achieves high positioning accuracy, and exhibits good adaptability and robustness.
The terms describing positional relationships in the drawings are for illustrative purposes only and are not to be construed as limiting the patent;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.