CN112418214A - Vehicle identification code identification method and device, electronic equipment and storage medium

Info

Publication number: CN112418214A
Application number: CN202011232933.5A
Authority: CN
Inventors: 关鹏; 赏宇; 孙玉川
Original assignee: Beijing 58 Information Technology Co Ltd
Current assignee: Beijing Love Car Technology Co ltd
Priority date: 2020-11-06
Filing date: 2020-11-06
Publication date: 2021-02-26
Anticipated expiration: 2040-11-06
Also published as: CN112418214B

Abstract

The invention provides a vehicle identification code identification method and device, electronic equipment and a storage medium. The method comprises the following steps: acquiring an original image to be identified, wherein the original image contains a vehicle identification code; intercepting the area where the vehicle identification code is located from the original image to obtain a target image, and obtaining a classification identifier of the target image, wherein the classification identifier characterizes the region attribute of the position, on the vehicle, of the vehicle identification code in the target image; and identifying the vehicle identification code contained in the target image through a text recognition model adapted to the classification identifier. The text recognition model is trained on a plurality of sample pictures with known contents; the sample pictures are obtained by randomly combining at least one character sample; the character samples are obtained by intercepting characters from at least one vehicle identification code sample under the classification identifier; and each character sample contains a single character or a plurality of consecutive characters. The method thereby improves both the VIN code identification efficiency and the accuracy of the identification result.

Description

Vehicle identification code identification method and device, electronic equipment and storage medium

Technical Field

The invention relates to the technical field of computers, in particular to a vehicle identification code identification method and device, electronic equipment and a storage medium.

Background

The Vehicle Identification Number (VIN), also called the frame number, is determined according to the national vehicle management standard and encodes information such as the manufacturer, the year, the vehicle type and body code, the engine code and the assembly place of the vehicle. Correct interpretation of the VIN code is important for correctly identifying the vehicle model so that correct diagnosis and repair can be performed.

In the related art, when vehicle-related information is identified based on the VIN code, the user needs to manually input the 17-character VIN code. Because the VIN code follows no regular pattern, the user must enter it carefully; filling in the VIN code therefore takes a long time and has a high error rate, which affects both the identification efficiency and the identification result accuracy of the VIN code.

Disclosure of Invention

The embodiment of the invention provides a vehicle identification code identification method, a vehicle identification code identification device, electronic equipment and a storage medium, and aims to solve the problems of low VIN code identification efficiency and low identification result accuracy caused by the fact that a VIN code currently needs to be manually input by a user.

In order to solve the technical problem, the invention is realized as follows:

in a first aspect, an embodiment of the present invention provides a vehicle identification code identification method, including:

acquiring an original image to be identified, wherein the original image comprises a vehicle identification code;

intercepting the area where the vehicle identification code is located from the original image to obtain a target image, and obtaining a classification identifier of the target image, wherein the classification identifier is used for representing the area attribute of the position where the vehicle identification code is located in the vehicle in the target image;

identifying the vehicle identification code contained in the target image through a text recognition model adapted to the classification identifier;

the text recognition model is obtained by training on a plurality of sample pictures with known contents, the sample pictures are obtained by randomly combining at least one character sample, the character sample is obtained by intercepting characters in at least one vehicle identification code sample under the classification identifier, and the character sample contains a single character or a plurality of consecutive characters.

Optionally, the step of capturing an area where the vehicle identification code is located from the original image to obtain a target image, and obtaining a classification identifier of the target image includes:

performing text detection on the original image to locate and intercept the area where the vehicle identification code is located from the original image to obtain a target image;

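and classifying the target image to obtain a classification identifier of the target image.

Optionally, the step of capturing an area where the vehicle identification code is located from the original image to obtain a target image, and obtaining a classification identifier of the target image includes:

classifying the original image to obtain a classification identifier of the original image, wherein the classification identifier of the original image is used for representing the region attribute of the position, on the vehicle, of the vehicle identification code in the original image;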

performing text detection on the original image through a text detection model matched with the classification identifier to locate and intercept the area where the vehicle identification code is located from the original image to obtain a target image, and taking the classification identifier of the original image as the classification identifier of the target image;

the text detection model is obtained by training on a plurality of training samples with known text detection results, and the classification identifier of the training samples is consistent with that of the original image.

Optionally, the region attribute comprises at least one of a window glass, an automobile nameplate, and a driving license.

Optionally, the step of recognizing the vehicle identification code included in the target image through a text recognition model adapted to the classification identifier includes:

in response to the region attribute represented by the classification identifier being a window glass or an automobile nameplate, identifying a vehicle identification code contained in the target image through a first text identification model;

and identifying the vehicle identification code contained in the target image through a second text recognition model in response to the region attribute represented by the classification identifier being a driving license.

Optionally, before the step of identifying the vehicle identification code included in the target image through the text recognition model adapted to the classification identifier, the method further includes:

for any one classification identifier, acquiring at least one vehicle identification code sample under the classification identifier;

detecting and intercepting characters in each vehicle identification code sample to obtain a character sample set corresponding to the classification identifier, wherein the character sample set comprises a plurality of character samples;

randomly combining the character samples contained in the character sample set to obtain a plurality of sample pictures;

and training a text recognition model adapted to the classification identifier using the sample pictures.

Optionally, the step of detecting and intercepting the characters in each vehicle identification code sample to obtain a character sample set corresponding to the classification identifier includes:

for any one vehicle identification code sample, performing graying processing on the vehicle identification code sample, and splitting out the target row where the vehicle identification code is located from the grayed vehicle identification code sample by a horizontal projection method;

performing binarization processing on the target row, and performing contour detection and contour denoising on the binarized target row to obtain the character samples contained in the vehicle identification code sample;

and constructing the character sample set of the classification identifier from the character samples contained in all vehicle identification code samples under the same classification identifier.
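As a non-limiting illustration, the graying, horizontal projection, binarization and denoising steps above may be sketched as follows. The sketch works on plain nested lists rather than a real image library. The fixed threshold of 128, the rule "keep the tallest inked run of rows", and the use of connected-component labelling as a stand-in for contour detection are all assumptions made for illustration; none of the function names come from the embodiment.

```python
from collections import deque

def to_gray(rgb):
    """Graying: convert rows of (r, g, b) tuples to 0-255 gray levels."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row] for row in rgb]

def split_target_row(gray, ink_threshold=128):
    """Horizontal projection: count dark pixels per row and keep the tallest
    run of rows that contain ink (assumed here to be the VIN row)."""
    profile = [sum(1 for v in row if v < ink_threshold) for row in gray]
    best, start = (0, 0), None
    for y, count in enumerate(profile + [0]):  # trailing 0 closes the last run
        if count and start is None:
            start = y
        elif not count and start is not None:
            if y - start > best[1] - best[0]:
                best = (start, y)
            start = None
    return gray[best[0]:best[1]]

def binarize(gray, threshold=128):
    """Binarization: dark ink becomes 1, background becomes 0."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]

def character_boxes(binary, min_area=3):
    """Stand-in for contour detection and denoising: label 4-connected
    components and drop those smaller than min_area pixels.
    Returns (left, top, right, bottom) boxes sorted left to right."""
    h = len(binary)
    w = len(binary[0]) if binary else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            queue, area = deque([(y0, x0)]), 0
            seen[y0][x0] = True
            top, bottom, left, right = y0, y0, x0, x0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((left, top, right + 1, bottom + 1))
    return sorted(boxes)
```

Because touching characters form a single connected component, a box produced this way may cover several consecutive characters, which is consistent with a character sample containing a single character or a plurality of consecutive characters.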

Optionally, the character lengths of the sample pictures corresponding to the same classification identifier are not all the same, and the text recognition model includes a convolutional neural network, a recurrent neural network, and a connectionist temporal classification (CTC) network, which are sequentially cascaded.

Optionally, the recurrent neural network comprises a gated recurrent unit (GRU) network.
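As a non-limiting illustration of why a CTC output stage tolerates sample pictures of differing character lengths, the following sketch shows greedy CTC decoding: the best label is taken for each time step, consecutive repeats are merged, and blank labels are dropped. The blank index of 0, the function name, and the alphabet (digits and capital letters without I, O and Q) are assumptions for illustration; the per-frame scores would in practice come from the recurrent network, which is not modelled here.

```python
BLANK = 0  # assumed index of the CTC blank label
VIN_ALPHABET = "0123456789ABCDEFGHJKLMNPRSTUVWXYZ"  # assumed label set (no I, O, Q)

def ctc_greedy_decode(frame_scores, alphabet=VIN_ALPHABET):
    """Decode per-frame scores into text.

    frame_scores: one list of scores per time step; index 0 is the blank,
    index k > 0 corresponds to alphabet[k - 1].
    """
    # Best label for every frame.
    best_path = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    decoded, previous = [], BLANK
    for label in best_path:
        # Emit a character only when the label changes and is not the blank.
        if label != previous and label != BLANK:
            decoded.append(alphabet[label - 1])
        previous = label
    return "".join(decoded)
```

The number of frames depends only on the picture width, while the decoded string can be any shorter length, so the same model can be trained on sample pictures containing different numbers of characters.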

In a second aspect, an embodiment of the present invention provides a vehicle identification code recognition apparatus, including:

an image acquisition module, which is used for acquiring an original image to be recognized, wherein the original image comprises a vehicle identification code;

the VIN code detection module is used for intercepting the area where the vehicle identification code is located from the original image to obtain a target image and acquiring the classification identifier of the target image, wherein the classification identifier is used for representing the area attribute of the position where the vehicle identification code is located in the vehicle in the target image;

a VIN code recognition module, which is used for recognizing the vehicle identification code contained in the target image through a text recognition model adapted to the classification identifier;

Optionally, the VIN code detection module includes:

the first VIN code detection submodule is used for carrying out text detection on the original image so as to position and intercept the area where the vehicle identification code is located from the original image to obtain a target image;

and the first classification submodule is used for classifying the target image to obtain a classification identifier of the target image.

Optionally, the VIN code detection module includes:

the second classification submodule is used for classifying the original image to obtain a classification identifier of the original image, wherein the classification identifier of the original image is used for representing the regional attribute of the position of the vehicle identification code in the original image in the vehicle;

the second VIN code detection submodule is used for performing text detection on the original image through a text detection model matched with the classification identifier so as to position and intercept the area where the vehicle identification code is located from the original image to obtain a target image, and the classification identifier of the original image is used as the classification identifier of the target image;

Optionally, the VIN code identification module includes:

the first VIN code recognition submodule is used for recognizing the vehicle identification code contained in the target image through a first text recognition model in response to the region attribute represented by the classification identifier being a window glass or an automobile nameplate;

and the second VIN code recognition submodule is used for recognizing the vehicle identification code contained in the target image through a second text recognition model in response to the region attribute represented by the classification identifier being a driving license.

Optionally, the apparatus further comprises:

the VIN code sample acquisition module is used for acquiring at least one vehicle identification code sample under any classification identifier;

the character sample set construction module is used for detecting and intercepting characters in each vehicle identification code sample to obtain a character sample set corresponding to the classification identifier, and the character sample set comprises a plurality of character samples;

the sample picture construction module is used for randomly combining the character samples contained in the character sample set to obtain a plurality of sample pictures;

and the model training module is used for training a text recognition model adapted to the classification identifier using the sample pictures.

Optionally, the character sample set constructing module includes:

the VIN code line extraction submodule is used for, for any one vehicle identification code sample, performing graying processing on the vehicle identification code sample and splitting out the target row where the vehicle identification code is located from the grayed vehicle identification code sample by a horizontal projection method;

the character sample intercepting submodule is used for carrying out binarization processing on the target row and carrying out contour detection and contour denoising on the target row after binarization processing to obtain a character sample contained in the vehicle identification code sample;

and the character sample set constructing submodule is used for constructing a character sample set of the classification identifier according to the character samples contained in all the vehicle identification code samples under the same classification identifier.

In a third aspect, an embodiment of the present invention additionally provides an electronic device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the vehicle identification code recognition method according to the first aspect.

In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements the steps of the vehicle identification code identification method according to the first aspect.

In the embodiment of the invention, the region attributes of the positions, on the vehicle, of the VIN codes in the original images are distinguished, and text recognition models adapted to the different region attributes are provided; when the text recognition models are trained, a plurality of sample pictures can be produced by splitting and recombining existing vehicle identification code samples. The accuracy of the VIN code identification result is thereby improved.

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments of the present invention are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on these drawings without inventive labor.

FIG. 1 is a flow chart illustrating steps of a method for identifying a vehicle identification code according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a VIN code identification process for an original image according to an embodiment of the present invention;

FIG. 3 is a flow chart of steps of another vehicle identification code identification method in an embodiment of the present invention;

FIG. 4 is a schematic flow chart of intercepting characters in a VIN code sample according to an embodiment of the present invention;

FIG. 5 is a schematic flow chart of combining sample pictures based on a character sample set according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a vehicle identification code recognition apparatus according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of another vehicle identification code recognition apparatus according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of a hardware structure of an electronic device in the embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1, a flow chart illustrating steps of a vehicle identification code recognition method according to an embodiment of the present invention is shown.

Step 110, obtaining an original image to be identified, wherein the original image comprises a vehicle identification code;

Step 120, intercepting the area where the vehicle identification code is located from the original image to obtain a target image, and obtaining a classification identifier of the target image, wherein the classification identifier is used for representing the region attribute of the position, on the vehicle, of the vehicle identification code in the target image;

Step 130, identifying the vehicle identification code contained in the target image through a text recognition model adapted to the classification identifier; the text recognition model is obtained by training on a plurality of sample pictures with known contents, the sample pictures are obtained by randomly combining at least one character sample, the character sample is obtained by intercepting characters in at least one vehicle identification code sample under the classification identifier, and the character sample contains a single character or a plurality of consecutive characters.

Compared with a traditional OCR (Optical Character Recognition) scene (printed matter, scanned documents, etc.), the VIN code recognition scene mainly extracts and recognizes text information from images such as photographs and video frames, and therefore mainly faces the following challenges: complex imaging, such as noise, blur, light variation, and appearance; complex characters, affected by factors such as typeface, font size, color, abrasion, and arbitrary stroke width and stroke direction; and complex backgrounds, with factors such as missing layout and background interference.

In order to solve the above problems, in the embodiment of the present invention, the VIN code identification process may be optimized in the following two aspects. First, text line extraction: the text line information is extracted using the idea of general object detection. Second, text line recognition: conventional OCR recognizes a text line one segmented character at a time; although training based on convolutional neural networks and the like has effectively improved the recognition rate, such recognition copes poorly with touching, blurred or deformed characters and cannot recognize accurately when the segmentation is inaccurate.

Specifically, the original image to be recognized may be any image that can be directly obtained and contains the VIN code, such as a picture taken by an electronic device such as a camera or a mobile phone, a video frame in a captured video, or a screenshot of a video or picture. The original image may also contain content other than the VIN code. For example, the VIN code may appear on the front windshield, the nameplate and the driving license of an automobile, each of which may also carry other information, so that a picture taken of the front windshield, the nameplate or the driving license generally captures that other information as well.

Therefore, in the embodiment of the invention, in order to improve the accuracy of the VIN code identification result, after the original image to be identified containing the vehicle identification code is obtained, the area where the vehicle identification code is located can be intercepted from the original image to obtain the target image. In the subsequent VIN code identification process, only the area where the VIN code is located is then identified, which effectively avoids interference from information other than the VIN code in the original image. Specifically, the area where the vehicle identification code is located may be intercepted from the original image in any available manner to obtain the target image, which is not limited in the embodiment of the present invention. For example, the idea of general object detection can be used to extract the text line of the VIN code in the original image, so as to obtain the target image. Specifically, the text line of the VIN code in the original image may be extracted by a machine learning model trained in advance for detecting the VIN code, which is not limited in this embodiment of the present invention.

In addition, as mentioned above, the VIN code may exist at different positions such as the front windshield, the nameplate and the driving license of the vehicle. Because the materials at these positions differ, attributes of the VIN code such as font size, font shape and color, and attributes such as the background color behind the VIN code, may differ as well, and these differences may affect the accuracy of the VIN code identification result. Therefore, in the embodiment of the present invention, different text recognition models may be set for the different positions. In order to assign a suitable text recognition model to the target image extracted from the current original image, a classification identifier of the target image is acquired, wherein the classification identifier is used for characterizing the region attribute of the position, on the vehicle, of the vehicle identification code in the target image, that is, whether it is located on the front windshield, the automobile nameplate or the driving license.

In the embodiment of the present invention, the classification identifier of the target image may be obtained in any available manner, which is not limited in the embodiment of the present invention. For example, the classification identifier of the target image may be obtained by a classification model trained in advance; the classification model may be any machine learning model or a combination of multiple machine learning models, and the training samples of the classification model may include target images under multiple different classification identifiers. For example, the ResNet50 network can be used as the classification model.

After the target image is obtained by intercepting the original image and the classification identifier of the target image is obtained, the vehicle identification code contained in the target image can be identified through the text recognition model adapted to that classification identifier.

The text recognition model adapted to a given classification identifier is obtained by training on a plurality of sample pictures with known contents (namely, with the specific contents of the VIN codes known). In practical applications, the hardest part of training the text recognition model is obtaining the training data: a large amount of data is needed to ensure accuracy, and for scenes such as window glass, sample collection is difficult. In order to generate a large number of samples from a small number of samples, characters in at least one vehicle identification code sample under each classification identifier can be intercepted to obtain character samples under that classification identifier, and character samples can then be drawn and randomly combined to obtain sample pictures for training the text recognition model adapted to that classification identifier. A large number of sample pictures is thus generated from a small number of vehicle identification code samples.

In the embodiment of the present invention, the characters in the vehicle identification code sample may be intercepted in any available manner, and the embodiment of the present invention is not limited thereto. Moreover, in the process of intercepting the characters in the vehicle identification code sample, characters may touch one another, so an intercepted character sample may contain a single character or a plurality of consecutive characters. In addition, the number of character samples forming a sample picture can be set as required, and may be a fixed value or a variable value, which is not limited in the embodiment of the present invention.
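As a non-limiting illustration, random combination of character samples into labelled sample pictures may be sketched as follows. Each character sample is modelled as a pair of its known text and a glyph given as a list of pixel rows; the assumption that all glyphs have already been scaled to a common height, the length range of 1 to 17 character samples, and the function names are illustrative choices, not details of the embodiment.

```python
import random

def combine_samples(char_samples, count, rng=None):
    """Draw `count` character samples at random (with replacement) and join
    their glyphs side by side. Returns (picture, label)."""
    rng = rng or random.Random()
    picked = [rng.choice(char_samples) for _ in range(count)]
    label = "".join(text for text, _ in picked)
    height = len(picked[0][1])  # assumes every glyph has the same height
    picture = [[px for _, glyph in picked for px in glyph[y]] for y in range(height)]
    return picture, label

def make_sample_pictures(char_samples, n_pictures, min_len=1, max_len=17, seed=None):
    """Generate many labelled sample pictures from a small character sample set.
    The number of character samples per picture varies between min_len and max_len."""
    rng = random.Random(seed)
    return [
        combine_samples(char_samples, rng.randint(min_len, max_len), rng)
        for _ in range(n_pictures)
    ]
```

Because the label is assembled from the known text of each drawn character sample, every generated picture has known contents without further annotation.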

FIG. 2 is a schematic diagram illustrating a VIN code identification process in an original image. First, the region where the vehicle identification code is located is intercepted from the original image to obtain a target image, and the classification identifier of the target image is obtained; for example, the region circled by the rectangular frame in the original image in FIG. 2 is the region where the target image is located. The vehicle identification code contained in the target image can then be identified through the text recognition model adapted to the classification identifier.

In addition, in practical applications, the VIN code follows certain business rules; for example, the VIN code includes a check digit that can be used to check whether the current VIN code is erroneous. Therefore, during VIN code identification, in order to improve the accuracy of the identification result and reduce VIN identification errors, the identification result can be given a second check based on the VIN business rules. If the VIN code identification result conforms to the VIN business rules, it can be confirmed as the correct VIN code for the corresponding original image and output to subsequent operations. If the VIN code identification result does not conform to the VIN business rules, the process may return to step 120 or step 130 and perform VIN code identification again, until an identification result conforming to the VIN business rules is obtained, or until a preset upper limit of repetitions is reached, in which case a VIN code identification failure result is returned.
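As a non-limiting illustration, the second check and the bounded retry may be sketched as follows. The embodiment does not spell out the check rule; the sketch assumes the widely used scheme in which the ninth character is a weighted modulo-11 check digit (as used for North American VINs). The function names, the `recognize_once` callable and the default of three attempts are hypothetical.

```python
# Assumed transliteration table (letters I, O and Q are not valid VIN characters).
VALUES = {
    **{str(d): d for d in range(10)},
    **dict(zip("ABCDEFGH", range(1, 9))),
    **dict(zip("JKLMN", range(1, 6))),
    "P": 7, "R": 9,
    **dict(zip("STUVWXYZ", range(2, 10))),
}
# Assumed position weights; position 9 (the check digit itself) has weight 0.
WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)

def check_digit_ok(vin):
    """Return True if `vin` has 17 valid characters and a matching check digit."""
    vin = vin.upper()
    if len(vin) != 17 or any(c not in VALUES for c in vin):
        return False
    remainder = sum(VALUES[c] * w for c, w in zip(vin, WEIGHTS)) % 11
    return vin[8] == ("X" if remainder == 10 else str(remainder))

def recognize_with_retry(recognize_once, max_attempts=3):
    """Call `recognize_once()` (a stand-in for re-running step 120 or 130)
    until the result passes the check, or give up after max_attempts."""
    for _ in range(max_attempts):
        vin = recognize_once()
        if check_digit_ok(vin):
            return vin
    return None  # corresponds to returning a VIN identification failure result
```

A check of this kind catches most single-character misreads, but it cannot confirm that a passing result is the VIN actually shown in the image.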

In the embodiment of the invention, the region attributes of the positions, on the vehicle, of the VIN codes in the original images are distinguished, and text recognition models adapted to the different region attributes are provided; when the text recognition models are trained, a plurality of sample pictures can be produced by splitting and recombining existing vehicle identification code samples. Therefore, the beneficial effects of improving the identification efficiency of the VIN code and the accuracy of the identification result are achieved.

Optionally, in the embodiment of the present invention, the step of capturing the area where the vehicle identification code is located from the original image to obtain the target image, and obtaining the classification identifier of the target image may specifically include:

step A121, performing text detection on the original image to locate and intercept the area where the vehicle identification code is located from the original image to obtain a target image;

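step A122, classifying the target image to obtain a classification identifier of the target image.

Optionally, in the embodiment of the present invention, the step of capturing the area where the vehicle identification code is located from the original image to obtain the target image, and obtaining the classification identifier of the target image may alternatively include:

step B121, classifying the original image to obtain a classification identifier of the original image, wherein the classification identifier of the original image is used for representing the region attribute of the position, on the vehicle, of the vehicle identification code in the original image;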

step B122, performing text detection on the original image through a text detection model matched with the classification identifier to position and intercept the area where the vehicle identification code is located from the original image to obtain a target image, and taking the classification identifier of the original image as the classification identifier of the target image; the text detection model is obtained by training a plurality of training samples with known text detection results, and the classification identification of the training samples is consistent with that of the original image.

In the embodiment of the invention, text detection can first be performed on the original image so as to locate and intercept the area where the vehicle identification code is located from the original image to obtain the target image. The intercepted target image is then classified, for example using the above-mentioned ResNet50 network, to obtain the classification identifier of the target image. In this case, original images under different classification identifiers can all be processed by the same text detection model, and the original image need not be classified before detection, which improves the text detection efficiency.

In the embodiment of the present invention, the original image may be subjected to text detection in any available manner, which is not limited in the embodiment of the present invention. For example, text detection may be performed by AdvancedEAST, and so on.

In addition, the original image may be classified first, and a classification identifier of the original image is obtained, where the classification identifier of the original image is used to characterize a region attribute of a position of the vehicle identification code in the original image in a vehicle; then, text detection is carried out on the original image through a text detection model matched with the classification identification, so that the area where the vehicle identification code is located is positioned and intercepted from the original image, a target image is obtained, and the classification identification of the original image is used as the classification identification of the target image; the text detection model is obtained by training a plurality of training samples with known text detection results, and the classification identification of the training samples is consistent with that of the original image.

In this case, the original image has already been classified and its classification identifier obtained, so that text detection can be performed on the original image through a text detection model matched with that classification identifier, in order to locate and intercept the area where the vehicle identification code is located from the original image to obtain the target image; the classification identifier of the original image is then used as the classification identifier of the target image. In order to adapt to the different classification identifiers, a text detection model adapted to each classification identifier may be trained in advance, each such model being obtained by training on a plurality of training samples with known text detection results under the corresponding classification identifier.

The text detection model may be any one of machine learning models or a combination of multiple machine learning models, and the embodiment of the present invention is not limited thereto. For example, the text detection model may be set as a combination of CNN (Convolutional Neural Networks) and RNN (Recurrent Neural Networks), and CNN and RNN are cascaded in sequence.

As can be seen from the above, there are two ways of intercepting the area where the vehicle identification code is located from the original image, obtaining the target image, and obtaining the classification identifier of the target image. The first way needs only one trained text detection model, whereas the second way needs a trained text detection model for each classification identifier. Relatively speaking, the model training cost of the first way is low, but the accuracy of its detection result is easily affected because it cannot adapt to each classification identifier in a targeted manner; the model training cost of the second way is higher, but so is the accuracy of its detection result. In the embodiment of the present invention, either way may be selected according to requirements, and the embodiment of the present invention is not limited thereto.

Optionally, in an embodiment of the present invention, the region attribute includes at least one of a window glass, an automobile nameplate, and a driving license.

Referring to fig. 3, in an embodiment of the present invention, the step 130 may further include:

step 131, in response to the fact that the region attribute represented by the classification identifier is a vehicle window glass or a vehicle nameplate, recognizing a vehicle identification code contained in the target image through a first text recognition model;

step 132, in response to that the area attribute represented by the classification identifier is a driving license, identifying a vehicle identification code contained in the target image through a second text identification model.

The automobile nameplate containing the VIN code, also called the vehicle nameplate, is generally arranged at an easily observed place at the front of the automobile, for example above the front passenger door inside the automobile. Compared with the VIN code on a driving license, which is made of paper, the VIN code on the automobile nameplate is closer in appearance to the VIN code on the window glass. Therefore, in the embodiment of the invention, in order to improve the accuracy of the VIN code recognition result while reducing the training cost of the text recognition model, the window glass and the automobile nameplate may share one text recognition model, namely the first text recognition model, and the driving license may use another text recognition model, namely the second text recognition model.

If the region attribute represented by the classification identifier is a window glass or a nameplate of the automobile, the vehicle identification code contained in the target image can be identified through the first text recognition model, and if the region attribute represented by the classification identifier is a driving license, the vehicle identification code contained in the target image can be identified through the second text recognition model.
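The routing described in steps 131 and 132 can be sketched as a small lookup table. This is only an illustration: the identifier strings, the `make_router` and `recognize_vin` names, and the idea of passing the models in as plain callables are all assumptions made for the example and do not appear in the embodiment.

```python
from typing import Callable, Dict

# Hypothetical identifier values; the embodiment does not fix any encoding.
WINDOW_GLASS = "window_glass"
NAMEPLATE = "nameplate"
DRIVING_LICENSE = "driving_license"

# A recognizer takes a target image and returns the recognized text.
Recognizer = Callable[[object], str]


def make_router(first_model: Recognizer, second_model: Recognizer) -> Dict[str, Recognizer]:
    """Map each region attribute to the text recognition model that serves it."""
    return {
        WINDOW_GLASS: first_model,      # step 131
        NAMEPLATE: first_model,         # step 131
        DRIVING_LICENSE: second_model,  # step 132
    }


def recognize_vin(target_image: object, classification_id: str,
                  router: Dict[str, Recognizer]) -> str:
    """Recognize the VIN with the model matched to the classification identifier."""
    try:
        model = router[classification_id]
    except KeyError:
        raise ValueError(f"no text recognition model for {classification_id!r}") from None
    return model(target_image)
```

Keeping the mapping in one table means that adding a further region attribute, or moving an attribute to a different model, changes a single entry rather than the control flow.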

Correspondingly, the character samples combined into the sample pictures for training the first text recognition model may be obtained by intercepting characters in at least one vehicle identification code sample under the window glass and the automobile nameplate, and the character samples combined into the sample pictures for training the second text recognition model may be obtained by intercepting characters in at least one vehicle identification code sample under the driving license.

The first text recognition model and the second text recognition model are both text recognition models, and their model structures may be the same or different, which is not limited in the embodiment of the present invention.

Referring to fig. 3, in the embodiment of the present invention, before step 130, the method may further include:

step 10, aiming at any one classification mark, obtaining at least one vehicle identification code sample under the classification mark;

step 20, detecting and intercepting characters in each vehicle identification code sample to obtain a character sample set corresponding to the classification identifier, wherein the character sample set comprises a plurality of character samples;

step 30, randomly combining to obtain a plurality of sample pictures according to the character samples contained in the character sample set;

and step 40, training a text recognition model suitable for the classification identification through the sample picture.

As described above, in order to improve the accuracy of the VIN code recognition result, text recognition models adapted to various classification identifiers may be trained respectively. Then, sample pictures suitable for training each text recognition model need to be respectively constructed, and in order to improve the diversity of the sample pictures, a character sample set corresponding to each classification identifier may be respectively set.

Specifically, for any classification identifier, at least one vehicle identification code sample under the corresponding classification identifier can be obtained, and then characters in each vehicle identification code sample are detected and intercepted, so that a character sample set corresponding to the classification identifier is obtained, wherein the character sample set comprises a plurality of character samples. In the embodiment of the present invention, the vehicle identification code sample under each category identifier may be obtained in any available manner, which is not limited to the embodiment of the present invention.

The vehicle identification code sample may be understood as a sample of a complete VIN code, or may be a sample of a partial segment in one VIN code, and the embodiment of the present invention is not limited thereto. For example, generally speaking, the VIN code is composed of seventeen characters, and the vehicle identification code sample may include a complete VIN code composed of seventeen characters, or may include an incomplete VIN code composed of less than seventeen characters, that is, a partial character in a VIN code.

Furthermore, in the embodiment of the present invention, the characters in each of the vehicle identification code samples may be detected and intercepted in any available manner, which is not limited in the embodiment of the present invention. When characters are detected, each detected character may be marked with a rectangular frame, and the character is then intercepted according to the marked rectangular frame to obtain a character sample.

When the text recognition model adapted to a certain classification identifier is trained, a plurality of sample pictures can be obtained by randomly combining the character samples contained in the character sample set corresponding to that classification identifier, and the text recognition model adapted to the classification identifier is then trained on these sample pictures. In addition, because the specific characters contained in each character sample can be identified during character detection, the characters contained in a combined sample picture are known as soon as the character samples forming it are known.
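The random combination of step 30 can be sketched as follows. The sketch assumes that each character sample is stored as a pair of its known text and a small gray-scale crop held as a list of pixel rows, and that all crops have already been resized to one height; the `combine_samples` name and this storage layout are assumptions made for the example.

```python
import random
from typing import List, Tuple

Image = List[List[int]]          # rows of gray values, 0..255
CharSample = Tuple[str, Image]   # (known characters, cropped pixels)


def combine_samples(char_set: List[CharSample], count: int,
                    rng: random.Random) -> Tuple[str, Image]:
    """Pick `count` character samples at random and join them left to right.

    Returns the known label of the combined picture together with its pixels.
    """
    chosen = [rng.choice(char_set) for _ in range(count)]
    height = len(chosen[0][1])
    if any(len(img) != height for _, img in chosen):
        raise ValueError("character samples must share one height")
    # The label is known without any recognition step: it is simply the
    # concatenation of the labels of the chosen character samples.
    label = "".join(text for text, _ in chosen)
    # Concatenate row r of every chosen crop to form row r of the picture.
    picture = [sum((img[r] for _, img in chosen), []) for r in range(height)]
    return label, picture
```

Because the label is built from the labels of the pieces, no manual annotation of the combined pictures is needed, which is the property the paragraph above relies on.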

Certainly, in the embodiment of the present invention, sample pictures recombined from the above-mentioned character samples may be used to train the text recognition model only for some of the classification identifiers (for example, the classification identifier characterizing a window glass such as the front windshield, whose VIN code is difficult to collect). For the other classification identifiers (for example, the classification identifier characterizing the driving license, whose VIN code is easy to collect), the text recognition model applicable to the corresponding classification identifier may be trained directly on VIN code samples with known characters, which is not limited in the embodiment of the present invention.

Moreover, if a plurality of class identifiers are applicable to the same text recognition model, when the text recognition model is trained, the text recognition model can be trained through sample pictures under the corresponding plurality of class identifiers.

Optionally, in an embodiment of the present invention, the step 20 may further include:

step 21, aiming at any one vehicle identification code sample, carrying out graying processing on the vehicle identification code sample, and splitting a target row where the vehicle identification code is located from the vehicle identification code sample after graying processing by a horizontal projection method;

step 22, performing binarization processing on the target row, and performing contour detection and contour denoising on the binarized target row to obtain a character sample contained in the vehicle identification code sample;

and step 23, constructing a character sample set of the classification identifier according to character samples contained in all vehicle identification code samples under the same classification identifier.

In practical applications, the vehicle identification code sample may include only the vehicle identification code, or may also include other reference information related to the vehicle identification code, such as a barcode. The vehicle identification code is generally a single line of characters. Therefore, when detecting the character samples included in the vehicle identification code sample, the target row where the vehicle identification code is located may first be split from the vehicle identification code sample, in order to improve the accuracy of the detection result and thus the accuracy of the character samples.

Specifically, for any one vehicle identification code sample, graying processing may be performed on the vehicle identification code sample, and the target row where the vehicle identification code is located may be split from the grayed vehicle identification code sample by a horizontal projection method.

The horizontal projection method imagines many horizontal lines drawn across the text image, some passing through a line of text and some passing through the gaps between lines of text. For each horizontal line, the number of black pixels it crosses is recorded (the text is assumed to be black), giving one value per y coordinate; plotted against y, these values form a curve in which the length at each point is the number of black pixels at that y coordinate. Within a line of text the value is non-zero because characters are present; in the blank area between lines of text the value is 0. The resulting profile therefore alternates between runs of non-zero values and runs of 0. The values can be traversed in order, with a 0 indicating a gap between lines: if a run of non-zero values (a line of text) is followed by a run of 0 (a gap) and then by another run of non-zero values (another line of text), the image contains multiple lines of text; otherwise it does not. The y coordinates whose value is 0 also give the positions of the segmentation points between lines of text, at which the image can be split.
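A minimal sketch of this traversal is given below. It assumes the image has already been reduced to pure black (0) text on a white (255) background and is held as a list of pixel rows; the function names are illustrative only.

```python
from typing import List, Tuple


def horizontal_projection(binary: List[List[int]]) -> List[int]:
    """Count the black (0) pixels in every row of a black-on-white image."""
    return [sum(1 for px in row if px == 0) for row in binary]


def split_text_lines(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Return one (top, bottom) row range per line of text, bottom exclusive."""
    lines = []
    start = None  # y coordinate where the current run of non-zero values began
    for y, count in enumerate(horizontal_projection(binary)):
        if count > 0 and start is None:
            start = y                   # entered a line of text
        elif count == 0 and start is not None:
            lines.append((start, y))    # reached the gap after a line of text
            start = None
    if start is not None:               # text runs to the bottom edge
        lines.append((start, len(binary)))
    return lines
```

An image is multi-line exactly when `split_text_lines` returns more than one range. Selecting which range is the VIN row (as opposed to, say, a barcode) is outside this sketch.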

In addition, before the horizontal projection, morphological processing may be performed on the text image of the vehicle identification code sample; the most common operations are erosion and dilation. Erosion shrinks the colored regions of the image to some degree, which smooths the ragged parts of their edges; applied to characters, it thins them so that densely packed characters do not merge with each other. Dilation expands the colored regions of the image to some degree, which fills small holes; applied to characters, it can merge them into solid blocks to some extent. The opening and closing operations are combinations of erosion and dilation.
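To make the four operations concrete, the sketch below implements them on a binary mask with a 3x3 square structuring element. The structuring element, the 1-for-foreground convention, and the choice to clip the window at the image border are assumptions of the example; a production pipeline would more likely call an image library than loop in pure Python.

```python
from typing import Callable, List

Mask = List[List[int]]  # 1 = foreground (character stroke), 0 = background


def _apply(mask: Mask, combine: Callable[[List[int]], int]) -> Mask:
    """Replace each pixel by combine() of its 3x3 neighbourhood.

    The window is clipped at the image border, so pixels outside the image
    are ignored rather than treated as background.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = combine(window)
    return out


def erode(mask: Mask) -> Mask:
    """Keep a pixel only if its whole neighbourhood is foreground."""
    return _apply(mask, min)


def dilate(mask: Mask) -> Mask:
    """Set a pixel if any pixel in its neighbourhood is foreground."""
    return _apply(mask, max)


def opening(mask: Mask) -> Mask:
    """Erosion followed by dilation: removes small specks."""
    return dilate(erode(mask))


def closing(mask: Mask) -> Mask:
    """Dilation followed by erosion: fills small holes."""
    return erode(dilate(mask))
```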

In addition, to make character detection easier, binarization may be performed on the target row. Binarizing an image means setting the gray value of each pixel to either 0 or 255, so that the whole image shows a clear black-and-white effect. An image contains a target object, a background, and noise. To extract the target object directly from a multi-valued digital image, the most common method is to set a threshold T and use it to divide the pixels of the image into two groups: pixels larger than T and pixels smaller than T. This special case of gray-scale transformation is called binarization of the image. The value of T may be set as required, which is not limited in the embodiment of the present invention. In the embodiment of the present invention, the vehicle identification code sample may be an image, and the target row split from it may be understood as an image containing only the target row.
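Thresholding with T can be written in a few lines. In the sketch below, the fallback of using the mean gray value when no threshold is given is an assumption chosen for brevity, and it is not the adaptive binarization mentioned for Fig. 4; pixels exactly equal to T are mapped to 0, which is also a choice of the example.

```python
from typing import List, Optional


def binarize(gray: List[List[int]], threshold: Optional[float] = None) -> List[List[int]]:
    """Map every pixel to 0 or 255: pixels above the threshold become 255.

    If no threshold is given, the mean gray value of the image is used
    (an assumption of this sketch, not a requirement of the method).
    """
    if threshold is None:
        pixels = [px for row in gray for px in row]
        threshold = sum(pixels) / len(pixels)
    return [[255 if px > threshold else 0 for px in row] for row in gray]
```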

Contour detection and contour denoising are then performed on the binarized target row to obtain the character samples contained in the vehicle identification code sample. In the embodiment of the present invention, the contour detection and the contour denoising may be performed in any available manner, which is not limited in the embodiment of the present invention.
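Since any available manner may be used, the sketch below substitutes a simple stand-in: it labels 8-connected groups of black pixels, takes the bounding rectangle of each group, and discards groups with too few pixels as noise. This is connected-component labeling rather than a true contour tracer, and the `min_pixels` cutoff and the function name are assumptions of the example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def find_character_boxes(binary: List[List[int]], min_pixels: int = 3) -> List[Box]:
    """Return bounding boxes of black (0) blobs, sorted from left to right.

    Blobs with fewer than `min_pixels` pixels are dropped as noise.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if binary[y0][x0] != 0 or seen[y0][x0]:
                continue
            # Flood-fill one 8-connected blob starting at (y0, x0).
            stack = [(y0, x0)]
            seen[y0][x0] = True
            ys, xs = [], []
            while stack:
                y, x = stack.pop()
                ys.append(y)
                xs.append(x)
                for ny in (y - 1, y, y + 1):
                    for nx in (x - 1, x, x + 1):
                        if (0 <= ny < h and 0 <= nx < w
                                and not seen[ny][nx] and binary[ny][nx] == 0):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            if len(ys) >= min_pixels:  # denoising: ignore tiny specks
                boxes.append((min(xs), min(ys), max(xs) + 1, max(ys) + 1))
    return sorted(boxes)
```

Each returned box can be used to crop one character sample from the target row. Characters that touch each other fall into one box, which corresponds to a character sample containing several continuous characters.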

After the character samples, each containing a single character or a plurality of continuous characters, are intercepted from the target row and before the character sample set is generated, each character sample may be manually corrected and labeled, and the character sample set is formed from the corrected character samples.

Fig. 4 is a schematic flow chart illustrating a process of intercepting characters in VIN code samples using OpenCV or a similar tool. Here the input VIN code sample contains a VIN code and a barcode, and the VIN code sample is subjected in sequence to graying, horizontal projection splitting, adaptive binarization, contour detection, contour denoising, and other processing, so that character samples containing single characters or continuous characters are obtained.

Optionally, in an embodiment of the present invention, the lengths of the characters included in the sample pictures corresponding to the same classification identifier are not completely consistent, and the text recognition model includes a convolutional neural network, a recurrent neural network, and a connectionist temporal classification (CTC) network, which are sequentially cascaded.

In practical applications, the number of characters contained in a VIN code is constrained, so the number of characters is itself something the text recognition model should recognize. When a text recognition model for recognizing the VIN code is trained, if all the sample pictures used contain the same number of characters, the sensitivity of the trained model to the number of characters is easily impaired; when some other non-VIN string composed of a plurality of characters is input, it may be recognized as a VIN code, which affects the accuracy of the VIN code recognition result.

Therefore, in the embodiment of the present invention, in order to improve the sensitivity of the text recognition model to the number of characters, when constructing a sample picture corresponding to any classification identifier, sample pictures with different lengths and different contents can be generated by different combinations based on the character samples in the character sample set corresponding to the classification identifier, that is, the lengths of characters included in the sample pictures corresponding to the same classification identifier are not completely the same. In addition, when the text recognition model is trained, positive samples with the character length meeting the VIN code requirement can be marked, and negative samples can be marked in other cases.
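The length variation and the positive/negative marking can be sketched on the label text alone. The sketch assumes a VIN length of seventeen characters, as stated earlier in the description; the `make_labeled_samples` name, the `max_pieces` bound, and the use of a plain boolean as the mark are assumptions of the example.

```python
import random
from typing import List, Tuple

VIN_LENGTH = 17  # the description states that a VIN code has seventeen characters


def make_labeled_samples(char_texts: List[str], n: int, rng: random.Random,
                         max_pieces: int = 20) -> List[Tuple[str, bool]]:
    """Build `n` random labels of varying length and mark each one.

    Each label is the concatenation of a random number of character-sample
    texts. The flag is True (positive sample) only when the total character
    length meets the VIN requirement, and False (negative sample) otherwise.
    """
    samples = []
    for _ in range(n):
        pieces = rng.randint(1, max_pieces)
        text = "".join(rng.choice(char_texts) for _ in range(pieces))
        samples.append((text, len(text) == VIN_LENGTH))
    return samples
```

With uniformly random piece counts, positives are a minority of the generated set; a real training set would probably rebalance the two classes, which this sketch does not attempt.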

Fig. 5 is a schematic flow chart of generating sample pictures with different lengths and different contents by different combinations based on a character sample set. The upper dotted line rectangular frame is a schematic diagram of a character sample contained in a character sample set, and the lower dotted line rectangular frame is a schematic diagram of a sample picture obtained based on a character sample combination shown by the upper dotted line rectangular frame.

The sample pictures thus have random content, random length, and a rich sample volume. In addition, in the embodiment of the present invention, sample augmentation, such as flip transformation, random cropping, scale transformation, contrast transformation, and noise disturbance, may be performed on the combined sample pictures to further improve their diversity and improve the model training effect.
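Two of the listed augmentations, contrast transformation and noise disturbance, are sketched below on a gray-scale picture held as a list of pixel rows. The particular formulas (scaling around mid-gray 128, additive Gaussian noise) are common choices assumed for the example, not ones specified by the embodiment.

```python
import random
from typing import List

Image = List[List[int]]  # rows of gray values, 0..255


def _clamp(value: float) -> int:
    """Round to an integer and clip into the valid gray range 0..255."""
    return max(0, min(255, int(round(value))))


def adjust_contrast(gray: Image, factor: float) -> Image:
    """Scale each pixel's distance from mid-gray (128) by `factor`."""
    return [[_clamp(128 + factor * (px - 128)) for px in row] for row in gray]


def add_noise(gray: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    return [[_clamp(px + rng.gauss(0, sigma)) for px in row] for row in gray]
```

Neither transformation changes the characters shown, so the known label of the sample picture stays valid after augmentation.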

The text recognition model may include a convolutional neural network (CNN), a recurrent neural network (RNN), and a connectionist temporal classification (CTC) network, which are sequentially cascaded. That is, the text recognition model may adopt a CTC (Connectionist Temporal Classification) loss function, and may include L convolutional neural networks, M recurrent neural networks, and N connectionist temporal classification networks which are sequentially cascaded, where L, M, and N are positive integers whose specific values may be set as required, which is not limited in the embodiment of the present invention.
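The description covers the CTC loss used for training but not how the per-frame outputs are turned into text. One common choice, shown here purely as an assumption, is greedy (best-path) decoding: take the highest-scoring class in each frame, merge consecutive repeats, and drop the blank class. The convention that class 0 is the blank and class k maps to `alphabet[k - 1]` is also an assumption of the sketch.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Decode per-frame class scores into text by best-path CTC decoding."""
    # Best class per frame.
    best: List[int] = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    chars = []
    prev = BLANK
    for idx in best:
        # Emit a character only when it is not blank and differs from the
        # previous frame; a blank between two equal classes keeps both.
        if idx != BLANK and idx != prev:
            chars.append(alphabet[idx - 1])
        prev = idx
    return "".join(chars)
```

Because the decoded string can have any length up to the number of frames, the character count of the output can be compared against the expected VIN length after decoding.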

Optionally, in an embodiment of the present invention, the recurrent neural network comprises a gated recurrent unit network.

In practical applications, an LSTM (Long Short-Term Memory) network, a GRU (Gated Recurrent Unit) network, and the like are all recurrent neural networks. A GRU has fewer parameters than an LSTM and therefore tends to converge more easily.
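The difference in parameter count can be checked with a short calculation. The sketch uses the textbook formulation in which each gate or candidate block has one input weight matrix, one recurrent weight matrix, and one bias vector; a GRU has three such blocks and an LSTM has four. Specific frameworks may count biases differently, so the absolute numbers are an assumption, while the ratio follows from the block counts.

```python
def recurrent_params(input_size: int, hidden_size: int, blocks: int) -> int:
    """Parameters of one recurrent layer with `blocks` gate/candidate blocks.

    Each block holds an input weight matrix (hidden x input), a recurrent
    weight matrix (hidden x hidden), and one bias vector (hidden).
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return blocks * per_block


def gru_params(input_size: int, hidden_size: int) -> int:
    """GRU: update gate, reset gate, and candidate state (3 blocks)."""
    return recurrent_params(input_size, hidden_size, 3)


def lstm_params(input_size: int, hidden_size: int) -> int:
    """LSTM: input, forget, and output gates plus cell candidate (4 blocks)."""
    return recurrent_params(input_size, hidden_size, 4)
```

Under this counting a GRU layer always has three quarters of the parameters of an LSTM layer of the same input and hidden size.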

Referring to fig. 6, a schematic structural diagram of a vehicle identification code recognition device according to an embodiment of the present invention is shown.

The vehicle identification code recognition device of the embodiment of the invention comprises: an image acquisition module 210, a VIN code detection module 220, and a VIN code identification module 230.

The functions of the modules and the interaction relationship between the modules are described in detail below.

The image obtaining module 210 is configured to obtain an original image to be identified, where the original image includes a vehicle identification code;

a VIN code detection module 220, configured to intercept a region where a vehicle identification code is located from the original image, obtain a target image, and obtain a classification identifier of the target image, where the classification identifier is used to characterize a region attribute of a position, in a vehicle, of the vehicle identification code in the target image;

a VIN code identification module 230, configured to identify a vehicle identification code included in the target image through a text recognition model applicable to the classification identifier;

Optionally, the VIN code detection module 220 may further include:

Referring to fig. 7, in the embodiment of the present invention, the VIN code identification module 230 further includes:

the first VIN code identification submodule 231 is configured to identify, through a first text identification model, a vehicle identification code included in the target image in response to that the region attribute represented by the classification identifier is a window glass or an automobile nameplate;

and the second VIN code identification submodule 232 is configured to identify, through a second text identification model, a vehicle identification code included in the target image in response to that the area attribute represented by the classification identifier is a driving license.

Referring to fig. 7, in the embodiment of the present invention, the apparatus may further include:

a VIN code sample obtaining module 310, configured to obtain, for any classification identifier, at least one vehicle identification code sample under the classification identifier;

the character sample set construction module 320 is configured to detect and intercept characters in each vehicle identification code sample to obtain a character sample set corresponding to the classification identifier, where the character sample set includes a plurality of character samples;

the sample picture construction module 330 is configured to randomly combine to obtain a plurality of sample pictures according to the character samples included in the character sample set;

and the model training module 340 is configured to train a text recognition model suitable for the classification identifier through the sample picture.

Optionally, in this embodiment of the present invention, the character sample set constructing module 320 may further include:

The vehicle identification code recognition device provided by the embodiment of the invention can realize each process realized in the method embodiments of fig. 1 and fig. 3, and is not repeated here to avoid repetition.

Preferably, an embodiment of the present invention further provides an electronic device, including a processor, a memory, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements each process of the above-mentioned vehicle identification code identification method embodiment and can achieve the same technical effects, which are not described again here to avoid repetition.

The embodiment of the invention also provides a computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when being executed by a processor, the computer program realizes each process of the vehicle identification code identification method embodiment, and can achieve the same technical effect, and in order to avoid repetition, the details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

Fig. 8 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present invention.

The electronic device 500 includes, but is not limited to: a radio frequency unit 501, a network module 502, an audio output unit 503, an input unit 504, a sensor 505, a display unit 506, a user input unit 507, an interface unit 508, a memory 509, a processor 510, and a power supply 511. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 8 does not constitute a limitation of the electronic device, and that the electronic device may include more or fewer components than shown, or some components may be combined, or a different arrangement of components. In the embodiment of the present invention, the electronic device includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.

It should be understood that, in the embodiment of the present invention, the radio frequency unit 501 may be used for receiving and sending signals while sending and receiving messages or during a call. Specifically, it receives downlink data from a base station and passes the data to the processor 510 for processing, and it transmits uplink data to the base station. In general, the radio frequency unit 501 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 501 can also communicate with a network and other devices through a wireless communication system.

The electronic device provides wireless broadband internet access to the user via the network module 502, such as assisting the user in sending and receiving e-mails, browsing web pages, and accessing streaming media.

The audio output unit 503 may convert audio data received by the radio frequency unit 501 or the network module 502 or stored in the memory 509 into an audio signal and output as sound. Also, the audio output unit 503 may also provide audio output related to a specific function performed by the electronic apparatus 500 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 503 includes a speaker, a buzzer, a receiver, and the like.

The input unit 504 is used to receive an audio or video signal. The input Unit 504 may include a Graphics Processing Unit (GPU) 5041 and a microphone 5042, and the Graphics processor 5041 processes image data of a still picture or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 506. The image frames processed by the graphic processor 5041 may be stored in the memory 509 (or other storage medium) or transmitted via the radio frequency unit 501 or the network module 502. The microphone 5042 may receive sounds and may be capable of processing such sounds into audio data. The processed audio data may be converted into a format output transmittable to a mobile communication base station via the radio frequency unit 501 in case of the phone call mode.

The electronic device 500 also includes at least one sensor 505, such as light sensors, motion sensors, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 5061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 5061 and/or a backlight when the electronic device 500 is moved to the ear. As one type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used to identify the posture of an electronic device (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), and vibration identification related functions (such as pedometer, tapping); the sensors 505 may also include fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc., which are not described in detail herein.

The display unit 506 is used to display information input by the user or information provided to the user. The Display unit 506 may include a Display panel 5061, and the Display panel 5061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.

The user input unit 507 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 507 includes a touch panel 5071 and other input devices 5072. Touch panel 5071, also referred to as a touch screen, may collect touch operations by a user on or near it (e.g., operations by a user on or near touch panel 5071 using a finger, stylus, or any suitable object or attachment). The touch panel 5071 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 510, and receives and executes commands sent by the processor 510. In addition, the touch panel 5071 may be implemented in various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. In addition to the touch panel 5071, the user input unit 507 may include other input devices 5072. In particular, other input devices 5072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail herein.

Further, the touch panel 5071 may be overlaid on the display panel 5061, and when the touch panel 5071 detects a touch operation thereon or nearby, the touch operation is transmitted to the processor 510 to determine the type of the touch event, and then the processor 510 provides a corresponding visual output on the display panel 5061 according to the type of the touch event. Although in fig. 8, the touch panel 5071 and the display panel 5061 are two independent components to implement the input and output functions of the electronic device, in some embodiments, the touch panel 5071 and the display panel 5061 may be integrated to implement the input and output functions of the electronic device, and is not limited herein.

The interface unit 508 is an interface for connecting an external device to the electronic apparatus 500. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 508 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the electronic apparatus 500 or may be used to transmit data between the electronic apparatus 500 and external devices.

The memory 509 may be used to store software programs as well as various data. The memory 509 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the electronic device, and the like. Further, the memory 509 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.

The processor 510 is a control center of the electronic device, connects various parts of the whole electronic device by using various interfaces and lines, performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 509 and calling data stored in the memory 509, thereby performing overall monitoring of the electronic device. Processor 510 may include one or more processing units; preferably, the processor 510 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into processor 510.

The electronic device 500 may further include a power supply 511 (e.g., a battery) for supplying power to various components, and preferably, the power supply 511 may be logically connected to the processor 510 via a power management system, so as to implement functions of managing charging, discharging, and power consumption via the power management system.

In addition, the electronic device 500 may include some functional modules that are not shown; these are not described in detail herein.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented by hardware, although in many cases the former is the better implementation. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, or optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.

The above description covers only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims

1. A vehicle identification code recognition method, comprising:

2. The method according to claim 1, wherein the step of capturing the area where the vehicle identification code is located from the original image to obtain a target image and obtaining the classification identifier of the target image comprises:

3. The method according to claim 1, wherein the step of capturing the area where the vehicle identification code is located from the original image to obtain a target image and obtaining the classification identifier of the target image comprises:

4. The method of claim 1, wherein the regional attribute comprises at least one of a window glass, a vehicle nameplate, and a driver license.

5. The method of claim 4, wherein the step of identifying the vehicle identification code contained in the target image by a text recognition model adapted to the classification identifier comprises:

6. The method according to any one of claims 1-5, further comprising, before the step of identifying a vehicle identification code contained in the target image by a text recognition model adapted to the classification identification:

7. The method according to claim 6, wherein the step of detecting and intercepting the characters in each vehicle identification code sample to obtain the set of character samples corresponding to the classification identifier comprises:

8. The method according to any one of claims 1 to 5, wherein the lengths of the characters contained in the sample pictures corresponding to the same classification identifier are not all the same, and the text recognition model comprises a convolutional neural network, a recurrent neural network, and a connectionist temporal classification (CTC) network which are cascaded in sequence.

9. The method of claim 8, wherein the recurrent neural network comprises a gated recurrent unit (GRU) network.
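By way of illustration only, the final classification stage of the cascade recited in claim 8 is assumed here to be a connectionist temporal classification (CTC) layer, which is what lets one model emit character strings of differing lengths. The sketch below shows greedy CTC decoding of per-frame scores such as a recurrent network might output. It is not the claimed implementation: the alphabet, the blank index, and the function names are assumptions introduced here (the alphabet omits I, O and Q, which are not used in vehicle identification codes).

```python
from typing import List, Sequence

# Assumed label set: index 0 is the CTC blank, followed by the digits
# and the capital letters other than I, O and Q.
BLANK = 0
ALPHABET = "-0123456789ABCDEFGHJKLMNPRSTUVWXYZ"


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]]) -> str:
    """Take the best label per frame, merge repeats, then drop blanks."""
    decoded: List[str] = []
    previous = BLANK
    for scores in frame_scores:
        # Index of the highest-scoring label for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the label is not blank and differs from the
        # label of the previous frame.
        if best != BLANK and best != previous:
            decoded.append(ALPHABET[best])
        previous = best
    return "".join(decoded)


def one_hot(index: int) -> List[float]:
    """Build a score row that puts all weight on one label (for demos)."""
    row = [0.0] * len(ALPHABET)
    row[index] = 1.0
    return row
```

For example, frames whose best labels are L, L, blank, L, 1 decode to "LL1": the first two frames merge into one character, and the blank separates them from the second L.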

10. A vehicle identification code recognition apparatus, comprising:

11. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, carries out the steps of the vehicle identification code recognition method according to any one of claims 1 to 9.

12. A computer-readable storage medium, characterized in that a computer program is stored thereon, wherein the computer program, when executed by a processor, carries out the steps of the vehicle identification code recognition method according to any one of claims 1 to 9.