CN112288372A

CN112288372A - Express bill identification method capable of simultaneously identifying one-dimensional bar code and three-section code character

Info

Publication number: CN112288372A
Application number: CN202011227735.XA
Authority: CN
Inventors: 赵楠楠; 邱林; 魏玉飞; 赵一帆; 张锋; 陈智博
Original assignee: Liaoning Heibeijian Technology Co ltd
Current assignee: Liaoning Heibeijian Technology Co ltd
Priority date: 2020-11-06
Filing date: 2020-11-06
Publication date: 2021-01-29
Anticipated expiration: 2040-11-06
Also published as: CN112288372B

Abstract

The invention provides an express waybill identification method capable of simultaneously identifying one-dimensional bar codes and three-segment code characters, comprising the following steps: step one, image acquisition of the express waybill: each parcel is placed with its waybill side facing up in the middle of a trolley, with only one parcel per trolley; an industrial camera is fixed at a certain height above the plane of the trolley; when the trolley carrying the parcel moves to a specified position, an image is captured and transmitted to the industrial personal computer over Ethernet; step two, coarse positioning of the express waybill; step three, rectangular bar code positioning and inclination angle correction on the express waybill; step four, multi-scale positioning of the three-segment code characters; step five, three-segment code character recognition. The method corrects the inclination angle of express waybill images with complex backgrounds, positions the one-dimensional bar code on the express waybill, and positions and recognizes the three-segment code characters (hereinafter referred to as three-segment codes) on the express waybill; the overall recognition time of all algorithms is less than 150 ms.

Description

Express bill identification method capable of simultaneously identifying one-dimensional bar code and three-section code character

Technical Field

The invention relates to the technical field of express sorting, in particular to an express waybill identification method capable of simultaneously identifying one-dimensional bar codes and three-segment code characters.

Background

With the development of electronic commerce, people's consumption habits have changed greatly and the demand for logistics services has grown steadily, driving continuous improvement in the level of logistics automation. At present, in the domestic logistics industry only large sorting centers use automatic sorting; small sorting centers generally sort parcels manually one by one, which is inefficient and unreliable and cannot meet the growing demand for logistics. To reduce the low sorting efficiency, high labor cost, parcel damage and other drawbacks of traditional manual sorting, the demand for small automatic express sorting systems is becoming increasingly urgent.

Typically, express parcels are sorted on the basis of automatic or manual identification of the information on the express waybill. Here, the express waybill information includes three-segment code characters (generally composed of printed digits and capital English letters) and a one-dimensional bar code, both of which encode the sending location and receiving location of the parcel, as shown in fig. 1. Automatic sorting in large domestic sorting centers generally relies on image recognition technology to recognize the one-dimensional bar code; the three-segment code characters are also important information to be recognized, but are hardly used at present. Moreover, small sorting centers are limited by floor area and cost, and most of them currently identify the three-segment code characters on the express waybill manually. Therefore, research on how to identify the one-dimensional bar code and the three-segment code characters simultaneously is an important direction for improving automatic express sorting efficiency, whether for a large sorting center or a small sorting station.

Disclosure of Invention

In order to solve the technical problem described in the background, the invention provides an express waybill identification method capable of simultaneously identifying a one-dimensional bar code and three-segment code characters. The method corrects the inclination angle of express waybill images with complex backgrounds, positions the one-dimensional bar code on the express waybill, and positions and recognizes the three-segment code characters (hereinafter referred to as three-segment codes) on the express waybill; the overall recognition time of all algorithms is less than 150 ms.

In order to achieve the purpose, the invention adopts the following technical scheme:

an express bill identification method capable of simultaneously identifying one-dimensional bar codes and three-section code characters comprises the following steps:

step one, image acquisition of the express waybill; each parcel is placed with its waybill side facing up in the middle of a trolley, with only one parcel per trolley; the industrial camera is fixed at a height of 1.3-1.5 m above the plane of the trolley; when the trolley carrying the parcel moves to a specified position, an image is captured and transmitted to the industrial personal computer over Ethernet;

step two, coarse positioning of the express waybill;

firstly, the express waybill is preliminarily located in the acquired image; the maximum rectangle that tightly encloses the contour is found through binarization, morphological operations and contour calculation on the image; Sobel edge detection is performed on the image region contained in the maximum rectangle, followed by binarization, and the number of edge points is counted; if the number of edge points is greater than 3000, the content of the maximum rectangle is considered sufficiently complex, and the region is taken as the coarse position of the express waybill;

then, affine transformation is used to correct the inclination angle of the maximum rectangle, yielding a candidate image to be identified; this region to be identified is only the result of coarse positioning, and further inclination angle correction and waybill information positioning are needed;

step three, rectangular bar code positioning and inclination angle correction on the express waybill;

the processing comprises Sobel edge detection, morphological operations, binarization and a region growing algorithm;

1) firstly, edge detection is performed on the image in the horizontal and vertical directions using the Sobel operator, and the two results are added to better highlight the edge information in the image; exploiting the striped, block-like and densely distributed character of the bar code, a morphological closing operation with a larger structuring element (such as 5 pixels) is first used to turn the bar code part of the image into a connected region so that the bar code stands out more clearly, and a morphological opening operation with a smaller structuring element (such as 3 pixels) is then used to remove the interference of simple single lines in the image;

2) many tiny holes appear in the image after the opening and closing operations and interfere with subsequent operations, so hole filling is performed;

3) the minimum bounding rectangle of each white region in the hole-filled image is calculated, and the information of each bounding rectangle is obtained, including the coordinates of its four vertices, its inclination angle and the number of white pixels in the rectangular region; the distribution of the rectangle inclination angles over 0-90 degrees is counted, and the angle at which the distribution is concentrated is used to perform a preliminary inclination angle correction;

4) in the image after the preliminary inclination angle correction, the bar code lies approximately in the 0-degree or 90-degree direction, with an error within plus or minus 10 degrees; candidate bar code rectangles are screened in the 0-degree and 90-degree directions respectively, the final position and final angle of the bar code are determined, and a last rotation is performed to obtain an accurately segmented bar code image in the forward (0 degrees) or reverse (180 degrees) direction, which is sent to the ZBar recognition function for recognition;

step four, multi-scale positioning of three-segment code characters;

1) firstly, analysis of the information layout on the express waybill shows that the three-segment code characters may be located either above or below the bar code; therefore, taking the position of the bar code as a reference, the three-segment code characters are searched for in the upward and downward directions respectively, and the upper and lower partial images are cropped and subjected to adaptive binarization;

2) then, for all white parts in the two images, the minimum bounding rectangle, i.e. the contour, is calculated, the area of each contour is calculated, and the contours are screened to remove those whose rectangle area is less than 10 pixels or more than 1200 pixels;

3) after the rectangular contours of step 2) are obtained, calculation is first carried out in the horizontal direction, i.e. probability distribution statistics are performed on the heights of all rectangles, where the rectangle height corresponds to the height of the three-segment code digits; on the express waybill image, the height of the three-segment code digits differs from the heights of other characters such as addresses, but the height values of each kind are concentrated, so the concentrated heights are used as the search scales; for example, if on an image the height of the three-segment code is 20 pixels and the height of the address text is 15 pixels, probability statistics over all rectangle heights yield the two values 20 and 15, i.e. two different scales; character rectangles are then searched for at each scale separately; the digits of the three-segment code have rectangular bounding boxes and are concentrated in one or two rows, and on the basis of this characteristic the three-segment code can be accurately positioned;

on some express waybills the bar code and the three-segment code are at 90 degrees to each other; to handle this case, the algorithm is run once more in the vertical direction, i.e. probability distribution statistics are performed on the widths of all rectangles, where the rectangle width corresponds to the height of the three-segment code;

4) according to the finally determined position of the three-segment code, the characters are segmented and sent to a CNN neural network for recognition;

step five, three-segment code character recognition;

The three-segment code character recognition adopts the Tiny-CNN deep learning framework, and the network comprises an input layer, convolution layers, pooling layers and an output layer; the training set consists of more than 20,000 actually acquired grayscale character images segmented from express waybills;

the number of input layer nodes equals the width (18) × height (18) of the input image, 324 in total; the number of nodes of convolution layer 1 is 6 × 16; the number of nodes of pooling layer 1 is 6 × 8; the number of nodes of convolution layer 2 is 12 × 6; the number of nodes of pooling layer 2 is 12 × 3; the number of nodes of the fully connected layer is 120; and the number of nodes of the output layer is the total of the 10 Arabic numerals, the 26 capital English letters and the connector "-", i.e. 37.

Further, in step five, the activation function used by each node is the Tanh function, the number of training epochs is 100, and default values are used for the other parameters.

Further, in step five, when characters are fed to the neural network for learning, the segmented character images differ in size because of the different scales, so the input images need to be unified to the same size, namely 18 × 18; during recognition, the character with the maximum confidence is taken as the result.

Further, in step five, when the neural network is trained, Chinese characters and other non-character images may be present among the segmented characters, so they are not output as a recognition result during recognition.

Further, in step five, when characters are fed to the neural network for learning, since the digits "1" and "0" are very close in shape to the capital letters "I" and "O", "1" and "I" are treated as one class and "0" and "O" as one class.

Compared with the prior art, the invention has the beneficial effects that:

1) The express waybill identification method capable of simultaneously identifying the one-dimensional bar code and the three-segment code characters can correct the inclination angle of express waybill images with complex backgrounds, position the one-dimensional bar code on the express waybill, and position and recognize the three-segment code characters on the express waybill; the overall recognition time of all algorithms is less than 150 ms.

2) To adapt to the various complex backgrounds of express waybill images, the invention is the first to propose calculating the inclination angle of the express waybill by counting the distribution of rectangle angles, whereas other related techniques mostly use the Hough transform. By its basic algorithmic principle, this method is faster to compute than the Hough transform.

3) The invention is the first to propose a multi-scale three-segment code positioning algorithm, which can accurately position and recognize three-segment code digits against any background image, at any inclination angle and of any size, whereas other techniques either calibrate the position of the three-segment code manually or require the image to be recognized to be placed upright.

Drawings

FIG. 1 is a schematic diagram of a bar code and a three-segment code;

FIG. 2-a is an original view of image one and the coarse positioning result;

FIG. 2-b is a binary image of the coarse localization profile of image one;

FIG. 2-c is the original image of image two and the coarse positioning result;

FIG. 2-d is a binary image of the coarse localization profile of image two;

FIG. 3-a is the area to be identified cut out on image one;

FIG. 3-b is the area to be identified cut out on image two;

FIG. 4-a is a grayscale image to be recognized after coarse positioning;

FIG. 4-b is a view with fine holes;

FIG. 4-c is the image after hole filling;

FIG. 5 is a waybill image after the inclination angle correction;

FIG. 6 is a diagram of bar code positioning results;

FIG. 7 is a binarized image above a bar code;

FIG. 8 is a binarized image below a bar code;

FIG. 9 is a rectangular screening view above a bar code;

FIG. 10 is a rectangular screening view under a bar code;

FIG. 11-a is a first set of statistics in the horizontal direction;

FIG. 11-b is a second set of statistics in the horizontal direction;

FIG. 11-c is a first set of statistical plots in the vertical direction;

FIG. 11-d is a second set of statistical plots in the vertical direction;

FIG. 12 is a diagram of a final positioning of three-segment code characters;

FIG. 13 is a diagram of a CNN neural network model;

FIG. 14 is a CNN neural network training set image;

FIG. 15 is a diagram presenting the results.

Detailed Description

The following detailed description of the present invention will be made with reference to the accompanying drawings.

An express waybill identification method capable of identifying one-dimensional bar codes and three-section code characters simultaneously is realized by the following steps:

Step one, image acquisition of express waybills

Each parcel is placed with its waybill side facing up in the middle of a trolley, with only one parcel per trolley. The industrial camera is fixed at a height of 1.3-1.5 m above the plane of the trolley. When the trolley carrying the parcel moves to a specified position, an image is captured and transmitted to the industrial personal computer over Ethernet.

Step two, coarse positioning of the express waybill

1) Firstly, the express waybill is preliminarily located in the acquired image. The maximum rectangle that tightly encloses the contour is found through binarization and morphological operations on the image. Sobel edge detection is performed on the image region contained in the maximum rectangle, followed by binarization, and the number of edge points is counted; if the number of edge points is greater than 3000, the content of the maximum rectangle is considered sufficiently complex, and the region is taken as the coarse position of the express waybill, as shown in fig. 2. Fig. 2(a) and 2(b) show the coarse positioning result for image one. Fig. 2(a) is the original grayscale image; after screening by the preprocessing algorithm, the white region in fig. 2(b) is obtained, and the minimum rectangle enclosing the white region is then calculated, giving the white rectangular frame in fig. 2(a). Fig. 2(c) and 2(d) show the coarse positioning result for image two. Unlike the result for image one, the extracted region here may be the whole parcel bearing the waybill information, owing to the ambient brightness or the material of the parcel itself, but this is still regarded as having found a coarse position of the express waybill.

2) Then, candidate images to be recognized are obtained by affine transformation, as shown in fig. 3. Fig. 3(a) and 3(b) are the affine transformation results for image one and image two, respectively. Note that the region to be identified shown in fig. 3 is merely the result of coarse positioning, and further inclination angle correction and waybill information positioning are required.
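The edge-point test used in the coarse positioning can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the patented implementation: the function names `sobel_edge_count` and `is_waybill_candidate` are hypothetical, the image is a list of rows of grayscale values, and the binarization threshold of 128 on the gradient magnitude is an assumed value. Only the limit of 3000 edge points comes from the text.

```python
def sobel_edge_count(gray, threshold=128):
    """Count interior pixels whose Sobel response |gx| + |gy| exceeds threshold.

    gray is a list of rows of grayscale values; threshold is an assumed
    binarization level, not a value given in the text.
    """
    h, w = len(gray), len(gray[0])
    count = 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column (weights 1, 2, 1).
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row (weights 1, 2, 1).
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            if abs(gx) + abs(gy) > threshold:
                count += 1
    return count


def is_waybill_candidate(gray, min_edge_points=3000, threshold=128):
    """A region counts as complex enough when it has more than 3000 edge points."""
    return sobel_edge_count(gray, threshold) > min_edge_points
```

A uniformly colored region yields no edge points and is rejected, while a region densely covered with text or bar code stripes easily exceeds the limit.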

Step three, rectangular bar code positioning and inclination angle correction on the express waybill

In an express waybill image with a complex background, the human eye can quickly judge the position of the bar code, largely because of its distinctive alternating black and white stripes; this characteristic can be used to preprocess the image. The processing comprises algorithms such as Sobel edge detection, morphological operations, binarization and region growing. The Sobel operator is used to perform edge detection on the image in the horizontal and vertical directions, and the two results are added to better highlight the edge information in the image. Exploiting the striped, block-like and densely distributed character of the bar code, a morphological closing operation with a larger structuring element (such as 5 pixels) is first used to turn the bar code part of the image into a connected region so that the bar code stands out more clearly, and a morphological opening operation with a smaller structuring element (such as 3 pixels) is then used to remove the interference of simple single lines in the image; at this point many fine holes appear in the image after the opening and closing operations, which interfere to some extent with subsequent calculation, so the holes need to be filled. For example, the image obtained from one of the original images by the coarse positioning of step two is shown in fig. 4(a). After processing to make the bar code rectangle more prominent, fig. 4(b) is obtained; its interior is full of fine holes, and filling the holes gives fig. 4(c).
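The effect of a large closing followed by a small opening can be illustrated on a single image row. The sketch below is a simplified one-dimensional illustration, not the patented implementation: the helper names `dilate`, `erode`, `closing` and `opening` are hypothetical, the structuring elements are assumed to be flat and centered, and the widths 5 and 3 are taken from the example sizes mentioned above.

```python
def dilate(row, k):
    """Binary dilation of a 1-D row with a flat, centered element of odd width k."""
    r = k // 2
    return [1 if any(row[max(0, i - r):i + r + 1]) else 0 for i in range(len(row))]


def erode(row, k):
    """Binary erosion of a 1-D row; pixels outside the row are ignored."""
    r = k // 2
    return [1 if all(row[max(0, i - r):i + r + 1]) else 0 for i in range(len(row))]


def closing(row, k):
    """Dilation followed by erosion: fills gaps narrower than k."""
    return erode(dilate(row, k), k)


def opening(row, k):
    """Erosion followed by dilation: removes runs narrower than k."""
    return dilate(erode(row, k), k)


# One isolated single line (index 3) and three bars separated by 2-pixel gaps.
row = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
       1, 1, 0, 0, 1, 1, 0, 0, 1, 1,
       0, 0, 0, 0]
merged = closing(row, 5)     # bars become one connected run, single line survives
cleaned = opening(merged, 3)  # the single line is removed, the merged run remains
```

In this toy row the closing joins the three bars into one run of ten pixels, and the opening then deletes the isolated single-pixel line while leaving the run untouched, which mirrors the intended effect on the two-dimensional edge image.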

Then, the minimum bounding rectangle of each white region in fig. 4(c) is calculated, and the information of each bounding rectangle is obtained, including the coordinates of its four vertices, its inclination angle and the number of white pixels in the rectangular region. The distribution of the rectangle inclination angles over 0-90 degrees is counted, and the tilted waybill image is corrected using the angle at which the distribution is concentrated; the waybill image after inclination angle correction is shown in fig. 5.
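The angle statistics described above can be sketched as a simple histogram vote. This is an illustrative sketch under assumptions, not the patented implementation: the function name `dominant_tilt_angle` is hypothetical, the 5-degree bin width is an assumed value, and each rectangle contributes one unweighted vote (the text does not say whether the white-pixel count is used as a weight). Angles close to the 0/90-degree wrap-around are not treated specially here.

```python
from collections import Counter


def dominant_tilt_angle(angles_deg, bin_width=5):
    """Estimate the tilt as the mean of the angles in the most populated bin.

    Angles are folded into [0, 90) before binning; bin_width is an assumed
    parameter, not a value given in the text.
    """
    folded = [a % 90 for a in angles_deg]
    bins = Counter(int(a // bin_width) for a in folded)
    best_bin, _ = bins.most_common(1)[0]
    members = [a for a in folded if int(a // bin_width) == best_bin]
    return sum(members) / len(members)


# Rectangle angles from a hypothetical image: most regions agree on about 32-34 degrees.
angles = [32, 33, 31, 34, 75, 2, 33]
tilt = dominant_tilt_angle(angles)
```

Because only a counting pass over the rectangles is needed, the cost grows with the number of white regions rather than with the number of edge pixels, which is consistent with the speed advantage over the Hough transform claimed in the text.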

In the image after the preliminary inclination angle correction, the bar code lies approximately in the 0-degree or 90-degree direction, with an error within plus or minus 10 degrees; candidate bar code rectangles are screened in the 0-degree and 90-degree directions respectively, the final position and final angle of the bar code are determined, and a last rotation is performed to obtain an accurately segmented bar code image in the forward (0 degrees) or reverse (180 degrees) direction, which is sent to the ZBar recognition function for recognition. The final position of the bar code is shown in fig. 6.
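The screening by direction can be sketched as a small classifier over rectangle angles. This is a hedged illustration only: the function name `direction_of` is hypothetical, and the text does not state which further criteria (for example size or aspect ratio) are used to choose the final bar code rectangle, so only the plus or minus 10 degree tolerance is modeled.

```python
def direction_of(angle_deg, tolerance=10.0):
    """Return 0 or 90 when the angle lies within tolerance of that direction.

    Angles are taken modulo 180 degrees, so 176 and -7 both count as near 0.
    Returns None for rectangles that fit neither direction.
    """
    a = angle_deg % 180.0
    if a <= tolerance or a >= 180.0 - tolerance:
        return 0
    if abs(a - 90.0) <= tolerance:
        return 90
    return None


# Hypothetical candidate angles after the preliminary correction.
candidates = [4.0, 176.0, 95.0, 45.0]
near_0 = [a for a in candidates if direction_of(a) == 0]
near_90 = [a for a in candidates if direction_of(a) == 90]
```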

Step four, three-segment code character positioning under multi-scale

The three-segment code characters on the express waybill consist of the ten digits 0 to 9 and the 26 English letters. However, as can be seen from photographed express waybill images, the images contain a wide variety of characters of different sizes, which introduces considerable interference and makes recognition difficult. Therefore, a multi-scale method is used to position the three-segment code characters.

Firstly, analysis of the information layout on the express waybill shows that the three-segment code characters may be located either above or below the bar code; therefore, taking the position of the bar code as a reference, the three-segment code characters are searched for in the upward and downward directions respectively, and the upper and lower partial images are cropped and subjected to adaptive binarization. The results are shown in fig. 7 and 8.

Then, for all white parts in the two images, the minimum bounding rectangle, i.e. the contour, is calculated, the area of each contour is calculated, and the contours are screened to remove those whose rectangle area is smaller than 10 pixels or larger than 1200 pixels. The contour screening results are shown in fig. 9 and 10.
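The area screening amounts to a single filter over the bounding rectangles. In the sketch below the function name `filter_by_area` and the `(x, y, width, height)` tuple layout are assumptions made for illustration; the limits of 10 and 1200 pixels come from the text.

```python
def filter_by_area(rects, min_area=10, max_area=1200):
    """Keep rectangles (x, y, width, height) whose area is within the limits.

    Rectangles with an area below min_area or above max_area are removed.
    """
    return [r for r in rects if min_area <= r[2] * r[3] <= max_area]


# Areas: 6 (noise speck), 240 (character-sized), 1600 (large blob), 10 (boundary case).
rects = [(0, 0, 2, 3), (5, 5, 12, 20), (0, 0, 40, 40), (1, 1, 2, 5)]
kept = filter_by_area(rects)
```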

Next, the minimum bounding rectangles obtained are processed in the horizontal direction, i.e. probability distribution statistics are performed on the heights of all rectangles, where the rectangle height corresponds to the height of the three-segment code digits. On the express waybill image, the height of the three-segment code digits differs from the heights of other characters such as addresses, but the height values of each kind are concentrated, so the concentrated heights are used as the search scales. For example, if on an image the height of the three-segment code is 20 pixels and the height of the address text is 15 pixels, probability statistics over all rectangle heights yield the two values 20 and 15, i.e. two different scales. Character rectangles are then searched for at each scale separately. The digits of the three-segment code have rectangular bounding boxes and are concentrated in one or two rows, and on the basis of this characteristic the three-segment code can be accurately positioned. In addition, on some express waybills the bar code and the three-segment code are at 90 degrees to each other; to handle this case, the algorithm is run once more in the vertical direction, i.e. probability statistics are performed on the widths of all rectangles, where the rectangle width corresponds to the height of the three-segment code, and after the multi-scale data are obtained the three-segment code digits are positioned at each scale. For example, in one of the images, two scales are found in the horizontal direction, as shown in fig. 11(a) and 11(b), and two scales are found in the vertical direction, as shown in fig. 11(c) and 11(d); the three-segment code in this image is clearly in the horizontal direction.
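The multi-scale search can be sketched as a histogram over rectangle heights followed by row grouping at each scale. This is an illustrative sketch under assumptions, not the patented implementation: the names `find_scales`, `rects_at_scale` and `group_rows`, the `(x, y, width, height)` tuple layout, the minimum share of 20% and the height tolerance of 1 pixel are all hypothetical choices.

```python
from collections import Counter


def find_scales(heights, min_share=0.2, tol=1):
    """Return the heights (in pixels) at which the height distribution is concentrated.

    A height qualifies when rectangles within tol pixels of it make up at
    least min_share of all rectangles; both parameters are assumed values.
    """
    counts = Counter(heights)
    total = len(heights)
    scales = []
    for h, _ in counts.most_common():
        support = sum(c for v, c in counts.items() if abs(v - h) <= tol)
        if support / total >= min_share and all(abs(h - s) > 2 * tol for s in scales):
            scales.append(h)
    return scales


def rects_at_scale(rects, scale, tol=1):
    """Select rectangles (x, y, width, height) whose height matches the scale."""
    return [r for r in rects if abs(r[3] - scale) <= tol]


def group_rows(rects, scale):
    """Group rectangles into text rows: tops closer than half the scale share a row."""
    rows = []
    for r in sorted(rects, key=lambda r: r[1]):
        if rows and abs(r[1] - rows[-1][-1][1]) < scale / 2:
            rows[-1].append(r)
        else:
            rows.append([r])
    return [sorted(row, key=lambda r: r[0]) for row in rows]


# Three digits about 20 pixels high and three address characters about 15 pixels high.
rects = [(10, 50, 12, 20), (24, 51, 12, 20), (38, 50, 12, 21),
         (10, 100, 9, 15), (20, 100, 9, 15), (30, 101, 9, 14)]
scales = sorted(find_scales([r[3] for r in rects]))
rows_at_20 = group_rows(rects_at_scale(rects, 20), 20)
```

For the vertical pass described in the text, the same functions could be applied with the rectangle widths in place of the heights and the x coordinates in place of the y coordinates.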

And finally, determining the final position of the three-segment code according to the calculation result in the horizontal or vertical direction, segmenting the character, and sending the character to a CNN neural network for recognition. The three-segment code positioning result graph is shown in fig. 12.

Step five, three-segment code character recognition

The three-segment code character recognition adopts the Tiny-CNN deep learning framework, and the network comprises an input layer, convolution layers, pooling layers and an output layer; the training set consists of more than 20,000 actually acquired grayscale character images segmented from express waybills. The network model is shown in fig. 13, and the training set images are shown in fig. 14. The number of input layer nodes equals the width (18) × height (18) of the input image, 324 in total; the number of nodes of convolution layer 1 is 6 × 16; the number of nodes of pooling layer 1 is 6 × 8; the number of nodes of convolution layer 2 is 12 × 6; the number of nodes of pooling layer 2 is 12 × 3; the number of nodes of the fully connected layer is 120; and the number of nodes of the output layer is the total of the 10 Arabic numerals, the 26 capital English letters and the connector "-", i.e. 37. The activation function used by each node is the Tanh function, the number of training epochs is 100, and default values are used for the other parameters.
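The layer sizes listed above can be checked with a short shape calculation. The sketch below rests on assumptions that the text does not state: 3 × 3 convolution kernels with stride 1 and no padding, 2 × 2 pooling, and a reading of entries such as "6 × 16" as 6 feature maps with a side length of 16. The helper names `conv_out`, `pool_out` and `trace_shapes` are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Side length of a feature map after a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window):
    """Side length of a feature map after non-overlapping pooling."""
    return size // window


def trace_shapes(side=18, kernel=3, window=2):
    """Trace the feature-map side length through conv1, pool1, conv2, pool2.

    kernel and window are assumed values chosen because they reproduce the
    side lengths 16, 8, 6 and 3 that appear in the text.
    """
    c1 = conv_out(side, kernel)
    p1 = pool_out(c1, window)
    c2 = conv_out(p1, kernel)
    p2 = pool_out(c2, window)
    return c1, p1, c2, p2


input_nodes = 18 * 18          # 324 input nodes, as stated in the text
output_nodes = 10 + 26 + 1     # digits, capital letters and the connector "-"
shapes = trace_shapes()
```

Under these assumptions the traced side lengths match the second factor of each layer size given in the text, which suggests, but does not confirm, that the network uses small unpadded kernels.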

Three points are worth noting when training the neural network. Firstly, the segmented character images differ in size because of the different scales, so the input images need to be unified to the same size, namely 18 × 18, and during recognition the character with the maximum confidence is taken as the result; secondly, Chinese characters and other non-character images may be present among the segmented characters, and these are not output as a recognition result during recognition; thirdly, since the digits "1" and "0" are very close in shape to the capital letters "I" and "O", "1" and "I" are treated as one class and "0" and "O" as one class.
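The three points above can be sketched as a small pre- and post-processing layer around the classifier. This is an illustration under assumptions, not the patented implementation: the names `resize_nearest`, `decode`, `LABELS` and `MERGE` are hypothetical, nearest-neighbor resizing is an assumed interpolation method, the label order is arbitrary, and rejecting non-characters with a confidence threshold of 0.5 is one possible mechanism that the text does not specify.

```python
# 10 digits, 26 capital letters and the connector "-": 37 labels in an assumed order.
LABELS = list("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-")
# Look-alike classes are merged by mapping the letter onto the digit (assumption).
MERGE = {"I": "1", "O": "0"}


def resize_nearest(img, size=18):
    """Resize a grayscale image (list of rows) to size x size by nearest neighbor."""
    h, w = len(img), len(img[0])
    return [[img[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def decode(scores, min_conf=0.5):
    """Return the label with the maximum confidence, or None when it is too low.

    min_conf is an assumed rejection threshold for Chinese characters and
    other non-character images.
    """
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < min_conf:
        return None
    label = LABELS[best]
    return MERGE.get(label, label)


# A hypothetical network output that is most confident in the letter "I".
scores = [0.0] * len(LABELS)
scores[LABELS.index("I")] = 0.9
result = decode(scores)
```

With the merge table applied at decoding time, a confident "I" is reported as "1", and an input on which no class reaches the threshold produces no output character.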

Fig. 15 is a result presentation graph.

The above embodiments are implemented on the premise of the technical solution of the present invention, and detailed embodiments and specific operation procedures are given, but the scope of the present invention is not limited to the above embodiments. The methods used in the above examples are conventional methods unless otherwise specified.

Claims

1. An express waybill identification method capable of identifying one-dimensional bar codes and three-section code characters simultaneously is characterized by comprising the following steps:

step two, roughly positioning the express bill;

1) firstly, edge detection is performed on the image in the horizontal and vertical directions using the Sobel operator, and the two results are added to better highlight the edge information in the image; exploiting the striped, block-like and densely distributed character of the bar code, a morphological closing operation with a larger structuring element is first used to turn the bar code part of the image into a connected region so that the bar code stands out more clearly, and a morphological opening operation with a smaller structuring element is then used to remove the interference of simple single lines in the image;

4) in the image after the preliminary inclination angle correction, the bar code lies approximately in the 0-degree or 90-degree direction, with an error within plus or minus 10 degrees; candidate bar code rectangles are screened in the 0-degree and 90-degree directions respectively, the final position and final angle of the bar code are determined, and a last rotation is performed to obtain an accurately segmented bar code image in the forward or reverse direction, which is sent to the ZBar recognition function for recognition;

step four, three-segment code character positioning under multi-scale

3) after the rectangular contours of step 2) are obtained, calculation is first carried out in the horizontal direction, i.e. probability distribution statistics are performed on the heights of all rectangles, where the rectangle height corresponds to the height of the three-segment code digits; on the express waybill image the heights of the three-segment code digits and the heights of other characters such as addresses are each concentrated; the concentrated heights are used as the search scales, and the three-segment code is accurately positioned on this basis;

4) according to the final position of the three sections of codes determined finally, the characters are segmented and sent to a CNN neural network for recognition;

step five, three-segment code character recognition

2. The express waybill identification method capable of simultaneously identifying one-dimensional bar codes and three-segment code characters as claimed in claim 1, wherein in step five, the activation function used by each node is the Tanh function, the number of training epochs is 100, and default values are used for the other parameters.

3. The express waybill identification method capable of simultaneously identifying one-dimensional bar codes and three-segment code characters as claimed in claim 1, wherein in step five, when characters are fed to the neural network for learning, the segmented character images differ in size because of the different scales, so the input images need to be unified to the same size, namely 18 × 18; and during recognition the character with the maximum confidence is taken as the result.

4. The express waybill identification method capable of simultaneously identifying one-dimensional bar codes and three-segment code characters as claimed in claim 1, wherein in step five, when the neural network is trained, Chinese characters and other non-character images may be present among the segmented characters, so they are not output as a recognition result during recognition.

5. The express waybill identification method capable of simultaneously identifying one-dimensional bar codes and three-segment code characters as claimed in claim 1, wherein in step five, when characters are fed to the neural network for learning, since the digits "1" and "0" are very close in shape to the capital letters "I" and "O", "1" and "I" are treated as one class and "0" and "O" are treated as one class.