
CN110942057B - A container number identification method, device and computer equipment - Google Patents

A container number identification method, device and computer equipment

Info

Publication number
CN110942057B
CN110942057B
Authority
CN
China
Prior art keywords
container number
decoding result
image
identified
target area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811113365.XA
Other languages
Chinese (zh)
Other versions
CN110942057A (en)
Inventor
桂一鸣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201811113365.XA priority Critical patent/CN110942057B/en
Publication of CN110942057A publication Critical patent/CN110942057A/en
Application granted granted Critical
Publication of CN110942057B publication Critical patent/CN110942057B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a container number identification method, a device, and computer equipment. The container number identification method comprises: locating the target area where the container number is located in an image to be identified containing the container number; performing spatial transformation on the target area to obtain a transformed target area; performing feature extraction on the transformed target area to obtain a first feature map; inputting the first feature map into a pre-trained container number identification model, which serializes the first feature map to obtain a feature sequence, encodes the feature sequence to obtain an encoding result, decodes the encoding result, and outputs a decoding result; and determining the container number in the image to be identified according to the decoding result. The container number identification method provided by the application can accurately identify the container number in the image to be identified.

Description

Container number identification method and device and computer equipment
Technical Field
The present application relates to the field of image recognition, and in particular, to a method, an apparatus, and a computer device for recognizing a container number.
Background
In gate operations, each container is typically assigned a box number so that individual containers can be identified by it. In recent years, to reduce manual transcription errors and labor costs, container numbers are often read by automatic identification technology.
The related art discloses a container number recognition method that locates the target area where the container number is located in the image to be recognized, performs character segmentation on the target area, recognizes each segmented character separately to obtain a plurality of recognition results, and combines those results into the container number.
When this method is used to identify the container number, it depends heavily on character segmentation of the target area; under conditions such as poor lighting, contamination, and large inclination, the segmentation is inaccurate, which in turn leads to low recognition accuracy.
Disclosure of Invention
In view of the above, the present application provides a container number identification method, an apparatus, and a computer device, so as to provide a container number identification method with high identification accuracy.
The first aspect of the application provides a container number identification method, which comprises the following steps:
locating the target area where the container number is located in an image to be identified containing the container number, and performing spatial transformation on the target area to obtain a transformed target area;
performing feature extraction on the transformed target area to obtain a first feature map;
inputting the first feature map into a pre-trained container number recognition model, serializing the first feature map by the container number recognition model to obtain a feature sequence, encoding the feature sequence to obtain an encoding result, decoding the encoding result, and outputting a decoding result;
and determining the container number in the image to be identified according to the decoding result.
A second aspect of the application provides a container number identification device, said device comprising a detection module, an identification module and a processing module, wherein,
The detection module is used for positioning a target area where the container number is located from an image to be identified containing the container number;
the identification module is used for carrying out space transformation on the target area to obtain a transformed target area;
The identification module is also used for extracting the characteristics of the transformed target area to obtain a first characteristic diagram;
The identification module is further configured to input the first feature map into a pre-trained container number identification model, serialize the first feature map by the container number identification model to obtain a feature sequence, encode the feature sequence to obtain an encoding result, decode the encoding result, and output a decoding result;
and the processing module is used for determining the container number in the image to be identified according to the decoding result.
A third aspect of the application provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of any of the methods provided in the first aspect of the application.
A fourth aspect of the application provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the methods provided in the first aspect of the application when the program is executed.
According to the container number identification method, device, and computer equipment provided by the application, the target area where the container number is located is first located in the image to be identified containing the container number, the target area is spatially transformed to obtain a transformed target area, and feature extraction is performed on the transformed target area to obtain a first feature map. The first feature map is then input into a pre-trained container number identification model, which serializes it into a feature sequence, encodes the feature sequence to obtain an encoding result, decodes the encoding result, and outputs a decoding result, from which the container number in the image to be identified is determined. The container number can therefore be identified based on the target area without character segmentation, and the identification accuracy is high.
Drawings
Fig. 1 is a flowchart of a first embodiment of a container number identification method provided by the present application;
FIG. 2 is a schematic illustration of a container number according to an exemplary embodiment of the present application;
FIG. 3 is a schematic diagram of an implementation of serializing a first feature diagram according to an exemplary embodiment of the present application;
FIG. 4 is a schematic diagram of an attention model shown in an exemplary embodiment of the present application;
fig. 5 is a flowchart of a second embodiment of a container number identification method provided by the present application;
FIG. 6 is a schematic diagram of a detection network according to an exemplary embodiment of the present application;
fig. 7 is a flowchart of a third embodiment of a container number identification method provided by the present application;
FIG. 8 is a schematic diagram of an implementation of a method for identifying a container number according to an exemplary embodiment of the present application;
FIG. 9 is a hardware block diagram of a computing device in which a container number identification device is located, according to an exemplary embodiment of the present application;
Fig. 10 is a schematic structural diagram of a container number identification device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the application. The term "if" as used herein may be interpreted as "upon", "when", or "in response to determining", depending on the context.
The application provides a container number identification method, a device and computer equipment, and aims to provide a container number identification method with high identification accuracy.
The container number identification method and device provided by the application can be applied to computer equipment, for example to an image acquisition device (such as a camera) or to a server; the present application does not limit this.
The following specific embodiments describe the technical solution of the present application in detail. These embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some of them.
Fig. 1 is a flowchart of a first embodiment of a container number identification method provided by the present application. Referring to fig. 1, the method provided in this embodiment may include:
S101, locating the target area where the container number is located in an image to be identified containing the container number, and performing spatial transformation on the target area to obtain a transformed target area.
It should be noted that the image to be identified is a snapshot captured by the image acquisition device; it may be an image containing the container number captured by the device in real time, or an image containing the container number stored by the device.
Specifically, fig. 2 is a schematic diagram of a container number according to an exemplary embodiment of the present application. Referring to fig. 2, container numbers may be distributed horizontally or vertically; in either case, the composition structure of the container number can be expressed in XYZ form, where X is a 4-character master box number (all letters), Y is a 7-digit number, and Z is a 4-character ISO number (which may be absent). It should be noted that the last digit of Y is a check code, which can be calculated from the 4 letters in X and the first 6 digits in Y.
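This composition structure lends itself to a direct structural check before any verification of the check code. The following is a minimal sketch, not part of the patent, expressing the XYZ form as a regular expression; the pattern and the helper name are illustrative assumptions.

```python
import re

# Assumed encoding of the XYZ structure described above: X is 4 letters,
# Y is 7 digits (the last being the check code), and the 4-character ISO
# number Z may be absent.
BOX_NUMBER_RE = re.compile(r"^(?P<X>[A-Z]{4})(?P<Y>[0-9]{7})(?P<Z>[A-Z0-9]{4})?$")

def parse_box_number(text: str):
    """Split a candidate string into its X / Y / Z fields, or return None."""
    m = BOX_NUMBER_RE.match(text)
    return (m.group("X"), m.group("Y"), m.group("Z")) if m else None

print(parse_box_number("CSQU3054383"))  # ('CSQU', '3054383', None)
```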
Further, the target area where the container number is located can be located in the image to be identified containing the container number using a relevant target detection algorithm, which this embodiment does not limit. For example, in one embodiment, the target area where the container number is located may be located from the image to be identified based on a YOLO (You Only Look Once) model.
For another example, the target area where the container number is located may also be located from the image to be identified containing the container number by a detection network.
For example, a deep convolutional neural network may be constructed for container number detection, with the image as its input and the coordinate position of the container number on the image as its output. A quadrangle represents the region coordinates of each row or column; when the container number is composed of multiple rows or columns, multiple quadrangle coordinates are output, and a quadrangle may be inclined, indicating that the container number has a certain direction.
The constructed deep convolutional neural network may be a network modified from an SSD network. For example, in one embodiment, the network may comprise a 29-layer fully convolutional structure, in which the first 13 layers inherit from a VGG-16 network (the last fully connected layers of VGG-16 are converted into convolutional layers), followed by a 10-layer fully convolutional structure, and then by Text-box layers implemented by 6 fully convolutional structures. The Text-box layer is the key component of this modified SSD network: the 6 fully convolutional structures of the 6 Text-box layers are connected to 6 feature maps of the preceding network, and at each feature-map position a Text-box layer predicts an n-dimensional vector (which may comprise 2 dimensions for text/non-text classification, 4 dimensions for the horizontal bounding rectangle, 5 dimensions for the rotated bounding rectangle, and 8 dimensions for the quadrilateral).
It should be noted that, the detailed description about the detection network will be described in detail in the following embodiments, which are not repeated here.
Further, an STN may be used to spatially transform the target area to obtain the transformed target area. In a specific implementation, the target area is input to a pre-trained Spatial Transformer Network (STN), which spatially transforms the target area and outputs the transformed target area.
In particular, the STN can perform spatial transformations (including, but not limited to, translation, scaling, and rotation) on the target area without requiring key-point calibration. Performing identification on the transformed target area therefore improves recognition accuracy.
It should be noted that, for the specific structure and specific implementation principle of the STN network, reference may be made to the description in the related art, and no further description is given here.
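For readers unfamiliar with STNs, the following is a minimal sketch of the idea assuming a PyTorch-style implementation: a small localization head regresses the six parameters of an affine transform, which is then applied to the input region through a sampling grid. The layer sizes are illustrative assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTransformer(nn.Module):
    """Minimal STN sketch: regress an affine transform, then resample."""

    def __init__(self):
        super().__init__()
        self.localization = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.fc_loc = nn.Linear(10 * 4 * 4, 6)
        # Initialize to the identity transform so training starts stably.
        self.fc_loc.weight.data.zero_()
        self.fc_loc.bias.data.copy_(torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.fc_loc(self.localization(x).flatten(1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

stn = SpatialTransformer()
warped = stn(torch.randn(1, 3, 64, 192))  # the transformed target area
```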
S102, extracting features of the transformed target area to obtain a first feature map.
Specifically, the feature extraction can be performed on the transformed target region by a conventional method, for example a scale-invariant feature transform (SIFT) algorithm. Of course, the feature extraction may also be performed on the transformed target area by a neural network; for example, in an embodiment, a specific implementation of this step may include:
The transformed target area is input into a neural network for feature extraction. A designated layer in the neural network performs feature extraction on the transformed target area, the designated layer comprising a convolution layer, or a convolution layer together with at least one of a pooling layer and a fully connected layer, and the output of the designated layer is determined to be the first feature map.
In particular, the neural network for feature extraction may include a convolution layer that filters the input transformed target area; in that case, the filtering result output by the convolution layer is the extracted first feature map. The neural network may further include a pooling layer and/or a fully connected layer. For example, in one embodiment, the neural network comprises a convolution layer, a pooling layer, and a fully connected layer, where the convolution layer filters the input transformed target area, the pooling layer compresses the filtering result, and the fully connected layer aggregates the compression result; in that case, the aggregation result output by the fully connected layer is the extracted first feature map.
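As a concrete illustration of the designated-layer arrangement just described, the sketch below stacks convolution layers (filtering) and pooling layers (compression); the channel counts, kernel sizes, and input size are assumptions, and the output of the last designated layer is taken as the first feature map.

```python
import torch
import torch.nn as nn

# A minimal sketch, under assumed layer sizes, of the designated layers:
# convolution filters the transformed target area, pooling compresses it.
feature_extractor = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),   # filtering (convolution)
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),        # compression (pooling)
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=2, stride=2),
)

# The output of the designated (last) layer is taken as the first feature map.
first_feature_map = feature_extractor(torch.randn(1, 3, 32, 128))
print(first_feature_map.shape)  # torch.Size([1, 128, 8, 32])
```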
S103, inputting the first feature map into a pre-trained container number recognition model; the model serializes the first feature map to obtain a feature sequence, encodes the feature sequence to obtain an encoding result, decodes the encoding result, and outputs a decoding result.
Specifically, fig. 3 is a schematic diagram of an implementation of serializing the first feature diagram according to an exemplary embodiment of the present application. Referring to fig. 3, the process of serializing the first feature map may include:
(1) Sliding a preset sliding window over the first feature map with a preset moving step, so as to segment out the local feature map at each window position;
(2) Determining all of the segmented local feature maps as the feature sequence.
Specifically, in one embodiment, the container number identification model may be an attention model, where the attention model may include a convolution network, and the step (1) may be implemented through the convolution network.
In addition, the size of the preset sliding window is adapted to the first feature map. For example, when the dimension of the first feature map is A×B×C (where A and B are the height and width of the first feature map, respectively, and C is the number of channels it contains), the size of the sliding window may be set to A×A. The preset moving step is set according to actual needs and is not limited in this embodiment; for example, in one embodiment, the preset moving step is 2.
Further, referring to fig. 3, in a specific implementation, a preset sliding window may be disposed at one end of the first feature map, and a local feature map of the position of the sliding window may be segmented, so as to move the sliding window based on a preset movement step length, and segment the local feature map of the position of the sliding window after movement. In this way, this process is repeated until the sliding window moves to the other end of the first signature. And finally, determining all the segmented local feature graphs as feature sequences.
It should be noted that when the first feature map is segmented using the preset sliding window and moving step, if the remaining portion at the end cannot be covered by the sliding window, the first feature map may be padded. Further, since the first feature map includes a plurality of channels, each segmented local feature map also includes a plurality of channels.
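The serialization step can be sketched as follows, assuming the first feature map is an A×B×C tensor (height, width, channels), the window is A×A as above, and the width is padded when the last window does not fit; the function and parameter names are illustrative.

```python
import torch
import torch.nn.functional as F

def serialize_feature_map(fmap: torch.Tensor, step: int = 2) -> list:
    """Slide an A-wide window along the width of an A x B x C first feature
    map and collect the local feature maps (a sketch; the step of 2 follows
    the example in the text)."""
    a, b, c = fmap.shape
    remainder = (b - a) % step
    if remainder:  # pad the width so the final window position is covered
        fmap = F.pad(fmap, (0, 0, 0, step - remainder))
        b = fmap.shape[1]
    return [fmap[:, i:i + a, :] for i in range(0, b - a + 1, step)]

sequence = serialize_feature_map(torch.randn(8, 33, 128))
print(len(sequence), sequence[0].shape)  # 14 torch.Size([8, 8, 128])
```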
Further, fig. 4 is a schematic diagram of an attention model according to an exemplary embodiment of the present application. Referring to fig. 4, the attention model may further include an input layer, a hidden layer, and an output layer connected in sequence, where X1, X2, X3, ..., Xm denote the feature sequence input to the input layer; αt,1, αt,2, αt,3, ..., αt,m denote the weight parameters of the features in the feature sequence at time t (the dimension of each weight parameter is the same as the dimension of its feature); Ct denotes the encoding result at time t; st-1 and st denote the hidden layer states at the corresponding times (the initial hidden layer state is 0); and yt, yt+1 denote the decoding results at the corresponding times.
Referring to fig. 4, the implementation of encoding the feature sequence to obtain an encoding result, and of decoding that result to output the decoding result, is described in detail below. The process may include:
(1) Calculating the weight parameter of each feature in the feature sequence at each time.
Specifically, this step is implemented by the input layer, and the weight parameter of each feature at each time may be calculated according to a first formula:
αt,i = φ(W·st-1 + U·Xi)
wherein αt,i is the weight parameter of the ith feature in the feature sequence at time t; Xi is the ith feature in the feature sequence; st-1 is the hidden layer state at time t-1; φ is an activation function; and W and U are model parameters of the attention model.
(2) Calculating the encoding result at each time according to the feature sequence and the weight parameters of the features at that time.
Specifically, this step is implemented by the hidden layer: the feature sequence is weighted and summed using the weight parameters of the features at each time, and the resulting weighted sum is determined to be the encoding result at that time.
With reference to the foregoing description, the process may be expressed by a second formula:
Ct = Σi αt,i·Xi
wherein Ct is the encoding result at time t, obtained by weighting each feature Xi by αt,i and summing over the feature sequence.
(3) Calculating the context-dependent hidden layer state at each time according to the feature sequence and the encoding result at each time.
Specifically, this step is implemented by the hidden layer, and the context-dependent hidden layer state at each time may be calculated according to a third formula:
st = LSTM(st-1, Ct, yt-1)
That is, the hidden layer state at time t is related to the hidden layer state at time t-1, the encoding result Ct at time t, and the decoding result yt-1 output by the attention model at time t-1.
(4) Obtaining the decoding result at each time according to the context-dependent hidden layer state at that time.
Specifically, this step is implemented by the output layer, and the decoding result at each time may be calculated by a fourth formula:
yt = softmax(st)
Specifically, the decoding result at each time includes the confidence of each candidate character at that time and the character identified at that time, the identified character being the candidate character with the highest confidence.
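Putting the four formulas together, one decoding step can be sketched as below. This is a hedged reading in which the first formula is treated as additive attention followed by a softmax over the feature positions; all parameter names and shapes are illustrative assumptions, not the patent's specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def attention_decode_step(X, state, y_prev, W, U, v, cell, out_proj):
    """One decoding step of the attention model (a sketch under assumptions).
    X: (m, d) feature sequence; state: LSTM state (h_prev, c_prev), each (1, h);
    y_prev: (k,) previous decoding result over k candidate characters."""
    h_prev, c_prev = state
    # First formula: weight parameter of each feature at time t.
    alpha = F.softmax(torch.tanh(X @ U + h_prev @ W) @ v, dim=0)     # (m,)
    # Second formula: encoding result Ct = sum_i alpha_{t,i} * X_i.
    C_t = alpha @ X                                                  # (d,)
    # Third formula: s_t = LSTM(s_{t-1}, C_t, y_{t-1}).
    h_t, c_t = cell(torch.cat([C_t, y_prev]).unsqueeze(0), (h_prev, c_prev))
    # Fourth formula: y_t = softmax(s_t); the identified character is argmax.
    y_t = F.softmax(out_proj(h_t), dim=-1).squeeze(0)                # (k,)
    return y_t, (h_t, c_t)

# Illustrative shapes (all assumptions): m=16 features of dim d=256,
# hidden size h=128, k=37 candidate characters (letters, digits, blank).
m, d, h, k = 16, 256, 128, 37
W, U, v = torch.randn(h, 64), torch.randn(d, 64), torch.randn(64)
cell, out_proj = nn.LSTMCell(d + k, h), nn.Linear(h, k)
y, state = attention_decode_step(torch.randn(m, d),
                                 (torch.zeros(1, h), torch.zeros(1, h)),
                                 torch.zeros(k), W, U, v, cell, out_proj)
```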
According to the method provided by this embodiment, the attention model is used to recognize the spatially transformed target area, so the container number can be identified based on the target area without character segmentation, and the accuracy is high.
S104, determining the container number in the image to be identified according to the decoding result.
Specifically, in an embodiment, each character identified in the decoding result may be directly combined in sequence, and the combined result is determined as the container number in the image to be identified.
According to the method provided by this embodiment, the target area where the container number is located is first located in the image to be identified containing the container number, the target area is spatially transformed, and feature extraction is performed on the transformed target area to obtain a first feature map. The first feature map is then input into a pre-trained container number identification model, which serializes it into a feature sequence, encodes the feature sequence to obtain an encoding result, decodes the encoding result, and outputs a decoding result, from which the container number in the image to be identified is determined. The container number is thus identified based on the target area where the whole container number is located, without character segmentation, and the identification accuracy is high. In addition, the method can identify rotated, bent, and inclined container numbers (that is, container numbers with large deformation) without additional manual annotation, and therefore has wide applicability.
Fig. 5 is a flowchart of a second embodiment of the container number identification method provided by the present application. Referring to fig. 5, based on the above embodiment, in the method provided by this embodiment, the specific implementation of step S101, locating the target area where the container number is located in the image to be identified containing the container number, may include:
S501, inputting the image to be identified into a pre-trained detection network; the detection network performs multi-level feature extraction on the image to be identified to obtain a specified number of second feature maps, the dimensions of which differ, then performs classification and position regression on each second feature map and outputs classification results and position information of a plurality of candidate areas.
In particular, the detection network may be implemented by a convolutional layer. Further, the specified number is set according to actual needs. For example, in one embodiment, the specified number is 6.
Fig. 6 is a schematic diagram of a detection network according to an exemplary embodiment of the present application. Referring to fig. 6, the detection network may include a 29-layer fully convolutional structure, in which the first 13 layers inherit from the VGG-16 network (the last fully connected layers of VGG-16 are converted into convolutional layers), followed by a 10-layer fully convolutional structure (e.g. Conv6 to Conv11_2 in fig. 6, where Conv8_2, Conv9_2, Conv10_2, and Conv11_2 are each preceded by a fully convolutional layer not shown in fig. 6). Referring to fig. 6, this 23-layer fully convolutional structure performs the multi-level feature extraction that yields the 6 second feature maps.
Further, with continued reference to fig. 6, the 23-layer fully convolutional structure is followed by 6 Text-box layers, each of which may be implemented by a fully convolutional structure. Each Text-box layer is connected to the preceding fully convolutional structure and classifies and regresses the second feature map output by that structure, outputting classification results and position information of a plurality of candidate areas.
In addition, referring to fig. 6, an NMS layer follows the Text-box layers and performs non-maximum suppression on the plurality of candidate areas, based on the classification result and position information of each candidate area, to obtain the target area where the container number is located.
It should be noted that, in an embodiment, the classification result may be a foreground and background classification result. In addition, the position information of a candidate region can be characterized by a 12-dimensional vector including coordinates (8 dimensions) of four points of a quadrangle including the candidate region, and coordinates of a center point, a width, and a height (4 dimensions) of a circumscribed horizontal rectangle of the quadrangle. Of course, in an embodiment, the location information may also include coordinates and width (5 dimensions) of two diagonal points of the rotation boundary rectangle corresponding to the candidate region.
It should be noted that because the dimensions of the second feature maps differ, their receptive fields also differ; the final target area is therefore effectively obtained by classification and position regression over feature maps with several different receptive fields, which gives strong multi-scale detection capability.
The following briefly describes the implementation principle of the Text-box layer for classifying the second feature map and performing position regression.
Specifically, the Text-box layer comprises three parts: a candidate box layer, a classification layer, and a position regression layer. The candidate box layer takes each pixel in the second feature map as a center and generates, according to a preset rule, a number of candidate boxes of different sizes at that position, which are then provided to the classification layer and the position regression layer for category judgment and position refinement.
Further, the classification layer outputs the probability that each candidate box belongs to the foreground or the background, and the position regression layer outputs the position information of each candidate box. Both layers are implemented with convolution layers, and to suit text detection the convolution layers may employ 3×5 convolution kernels.
For example, in one embodiment, the dimension of one of the second feature maps is 40×42×128, where 40×42 is its height and width and 128 is its number of channels. The candidate box layer takes each pixel in this feature map as a center and generates 20 candidate boxes of different sizes at that position according to a preset rule (for the specific principle, see the related art; details are not repeated here).
In addition, the dimension of the convolution kernel in the classification layer is 40×3×5×128, where 40 is the number of convolution kernels and 3×5 the kernel size; the kernel stride is 1. The classification layer thus convolves the second feature map to obtain a first convolution processing result of dimension 40×40×42. The first target convolution processing result corresponding to each pixel (40 convolution values) represents the classification results of the 20 candidate boxes at that pixel, the classification result of each box being the two-dimensional foreground/background classification.
Further, the convolution kernel of the position regression layer has dimension 240×3×5×128, where 240 is the number of kernels and 3×5 the kernel size; the stride is 1. The position regression layer thus convolves the second feature map to obtain a second convolution processing result of dimension 240×40×42. The second target convolution processing result corresponding to each pixel (240 convolution values) represents the position information of the 20 candidate boxes at that pixel; referring to the foregoing description, the position information of each box occupies 12 dimensions, so, for example, the first 12 of the 240 values are the position information of the first candidate box.
The above-mentioned candidate boxes are understood as candidate regions.
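The example dimensions above can be reproduced with two convolutions, as in the following sketch; the padding of (1, 2) is an assumption made so that the 40×42 spatial size is preserved, as the example implies.

```python
import torch
import torch.nn as nn

# Sketch of one Text-box layer's heads over a 128-channel 40x42 second
# feature map: 20 candidate boxes per position, 3x5 kernels, stride 1.
num_boxes = 20
second_feature_map = torch.randn(1, 128, 40, 42)   # N x C x H x W

cls_head = nn.Conv2d(128, num_boxes * 2, kernel_size=(3, 5), stride=1, padding=(1, 2))
reg_head = nn.Conv2d(128, num_boxes * 12, kernel_size=(3, 5), stride=1, padding=(1, 2))

cls_out = cls_head(second_feature_map)  # (1, 40, 40, 42): fg/bg score pair per box
reg_out = reg_head(second_feature_map)  # (1, 240, 40, 42): 12-dim position per box
```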
S502, performing non-maximum suppression processing on the plurality of candidate areas based on the classification result and the position information of each candidate area to obtain a target area where the container number is located.
With continued reference to fig. 6, in the example shown in fig. 6, this step may be accomplished by detecting the NMS layer in the network.
It should be noted that, for the specific implementation principle and implementation procedure of the non-maximum suppression, reference may be made to the description in the related art, and no further description is given here.
According to the method provided by this embodiment, the image to be identified is input into a pre-trained detection network, which performs multi-level feature extraction to obtain a specified number of second feature maps of differing dimensions, classifies and regresses each second feature map, and outputs classification results and position information of a plurality of candidate areas; non-maximum suppression over those candidate areas, based on the classification result and position information of each, then yields the target area where the container number is located. Because the second feature maps differ in dimension, and hence in receptive field, the final target area is effectively obtained by classification and position regression over several receptive fields, giving strong multi-scale detection capability and accurate positioning of the target area where the container number is located.
Optionally, in a possible implementation of the present application, the specific implementation of step S101, locating the target area where the container number is located in the image to be identified containing the container number, may include:
(1) And adjusting the size of the image to be identified to obtain a plurality of target images with different sizes.
For example, in an embodiment, interpolation processing or downsampling processing may be performed on the image to be identified to obtain the target images with different sizes.
(2) For each target image, inputting the target image into a pre-trained detection network; the detection network performs multi-level feature extraction on the target image to obtain a specified number of second feature maps of differing dimensions, then performs classification and position regression on each second feature map and outputs classification results and position information of a plurality of candidate areas.
(3) And performing non-maximum suppression processing on the plurality of candidate areas based on the classification result and the position information of each candidate area to obtain a target area where the container number in the target image is located.
In particular, the specific implementation process and implementation principle of the steps (2) and (3) may refer to the description in the foregoing embodiments, which are not repeated herein.
(4) And determining the target area of the container number in the image to be identified according to the target area of the container number in the target image.
Specifically, non-maximum suppression may be performed across the target areas where the container number is located in the plurality of target images, so as to obtain the target area where the container number is located in the image to be identified.
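The multi-scale procedure can be sketched as follows; resize, detect, and nms are assumed helper functions standing in for the resizing step, the detection network, and the final non-maximum suppression, and the scale set is an illustrative assumption.

```python
def detect_multi_scale(image, resize, detect, nms, scales=(0.5, 1.0, 2.0)):
    """Sketch of the multi-scale locating procedure above: run the detection
    network on several resized target images, map the resulting target areas
    back to original-image coordinates, and merge them with a final NMS."""
    all_boxes, all_scores = [], []
    for s in scales:
        boxes, scores = detect(resize(image, s))    # target areas at this size
        all_boxes.extend(box / s for box in boxes)  # back to original coordinates
        all_scores.extend(scores)
    return nms(all_boxes, all_scores)               # final target area(s)
```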
According to the method provided by this embodiment, the size of the image to be identified is adjusted to obtain a plurality of target images of different sizes, and the target area where the container number is located in the image to be identified is then located based on those target images. In this way, the positioning accuracy can be further improved.
It should be noted that, the networks used in the present application are all pre-trained networks. The training process of the network may include:
(1) Constructing a network;
For example, for the detection network, the input is set as an image and the output as the position information of the area where the container number is located. A quadrangle represents the region coordinates of each row or column; when the box number is composed of multiple rows or columns, multiple quadrangle coordinates are output, and a quadrangle may be inclined, indicating that the box number has a certain direction.
For another example, for the identification network, the input may be set as the area where the container number is located and the output as the box number character string, written as a single row in XYZ form, where X represents the 4-character master box number, Y the 7-digit number, and Z the 4-character ISO number.
(2) Obtaining a training sample;
For example, in this example, when training the detection network, the label information of a training sample is the position information of the area where the container number is located. It should be noted that a complete box number may be composed of multiple rows or columns; each row or column should be annotated with quadrangle coordinates, the quadrangle enclosing all characters in that row or column without leaving much blank space, and the quadrangle may be inclined, indicating that the box number has a certain direction.
For another example, when training the container number recognition network, the label information of a training sample is the box number character string. A complete box number may be composed of multiple rows or columns, but the annotation is uniformly written as a single row in XYZ form, where X represents the 4-character master box number, Y the 7-digit number, and Z the 4-character ISO number; the last digit of Y is the check digit, which can be calculated from the 4 letters in X and the first 6 digits of Y and should be verified during annotation.
(3) Training the network using the training set to obtain a trained network.
Specifically, the network parameters may be initialized to specified values, and the obtained training samples are then used to train the network to obtain the trained network.
Specifically, the process comprises two stages: forward propagation, in which a training sample is input and propagated forward through the network to extract data features and compute the loss function; and backward propagation, in which, starting from the last layer of the network, the loss is propagated backward layer by layer and the network parameters are modified by gradient descent, until the loss function converges.
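A minimal sketch of this two-stage training loop is given below; the optimizer choice (plain gradient descent) follows the text, while the learning rate and epoch count are illustrative assumptions.

```python
import torch

def train(network, loader, loss_fn, epochs=10, lr=1e-3):
    """Sketch of the two-stage procedure above: forward propagation computes
    features and the loss; backward propagation then updates the network
    parameters by gradient descent until the loss converges."""
    optimizer = torch.optim.SGD(network.parameters(), lr=lr)
    for _ in range(epochs):
        for samples, labels in loader:
            loss = loss_fn(network(samples), labels)  # forward propagation
            optimizer.zero_grad()
            loss.backward()                           # backward propagation
            optimizer.step()                          # gradient-descent update
    return network
```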
Fig. 7 is a flowchart of a third embodiment of the container number identification method provided by the present application. Referring to fig. 7, in the method provided by this embodiment, based on the above embodiments, the process in step S104 of determining the container number in the image to be identified according to the decoding result may include:
S701, judging whether the decoding result meets a specified check rule.
Specifically, the specific implementation process of this step may include:
(1) And judging whether the composition structure of the decoding result is matched with the composition structure of the container number.
Specifically, referring to the foregoing description, the composition structure of the container number can be expressed in XYZ form, where X is a 4-character master box number (all letters), Y is a 7-digit number, and Z is a 4-character ISO number (which may be absent). In this step, when the first 4 characters identified in the decoding result are letters and the 5th to 11th characters identified in the decoding result are numbers, it is determined that the composition structure of the decoding result matches the composition structure of the container number; otherwise, it is determined that they do not match.
(2) If not, determining that the decoding result does not satisfy the verification rule.
Specifically, when it is determined in step (1) that the composition structure of the decoding result does not match the composition structure of the container number, the decoding result is determined not to satisfy the verification rule. For example, in one embodiment, when a number appears among the first 4 characters identified in the decoding result, the composition structure of the decoding result does not match that of the container number, and the decoding result is therefore determined not to satisfy the verification rule. For another example, when a letter appears among the 5th to 11th characters identified in the decoding result, the composition structure does not match and the decoding result is likewise determined not to satisfy the verification rule.
(3) If so, calculating the check value of the decoding result according to a preset rule.
(4) And judging whether the check value is equal to the check code identified in the decoding result.
(5) And if the check value is equal to the check code identified in the decoding result, determining that the decoding result meets the check rule, otherwise, determining that the decoding result does not meet the check rule.
Specifically, the check value of the decoding result may be calculated according to the following method: (1) according to a preset correspondence between letters and numbers, the first 4 characters identified in the decoding result are converted into numbers to obtain a converted decoding result; (2) the check value of the decoding result is calculated according to the formula:
S = (c1×2^0 + c2×2^1 + … + c10×2^9) mod 11
wherein S is the check value of the decoding result and cn is the nth character in the converted decoding result.
Specifically, when the first 4 characters identified in the decoding result are converted into numbers, the conversion can be performed according to a preset correspondence between letters and numbers. For example, table 1 shows the preset letter-to-number correspondence relationship in an exemplary embodiment of the present application:
Table 1 correspondence between preset letters and numbers
Further, referring to the foregoing description, the 11th character of the container number is the check code, so the 11th character identified in the decoding result is the identified check code. In this step, it is judged whether the calculated check value equals the identified check code; if it does, the decoding result is determined to satisfy the verification rule, and otherwise it is determined not to satisfy it.
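The rule described above matches the public ISO 6346 check-digit scheme, so a sketch can be given under that assumption: Table 1 is taken to hold the standard ISO 6346 letter values, in which multiples of 11 are skipped, and a check value of 10 is treated as 0.

```python
# Assumed content of Table 1: the standard ISO 6346 letter values,
# A=10, B=12, C=13, ... with multiples of 11 (11, 22, 33) skipped.
LETTER_VALUES, v = {}, 10
for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
    if v % 11 == 0:
        v += 1
    LETTER_VALUES[ch] = v
    v += 1

def check_value(decoded: str) -> int:
    """S = (sum over n = 1..10 of cn * 2^(n-1)) mod 11, with a result of 10
    treated as 0 (the usual ISO 6346 convention, assumed here)."""
    cs = [LETTER_VALUES[c] if c.isalpha() else int(c) for c in decoded[:10]]
    s = sum(c << n for n, c in enumerate(cs)) % 11
    return 0 if s == 10 else s

decoded = "CSQU3054383"                          # a well-known valid example
print(check_value(decoded) == int(decoded[10]))  # True: check rule satisfied
```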
S702, if yes, determining a first combination result obtained by sequentially combining the characters identified in the decoding result as a container number in the image to be identified.
S703, if not, correcting the decoding result to obtain a corrected decoding result, and determining a second combination result of each character in the corrected decoding result after sequential combination as the container number in the image to be identified, wherein the corrected decoding result meets the verification rule.
Specifically, in one possible implementation manner, the specific implementation process of this step may include:
(1) Executing step (2) when the composition structure of the decoding result does not match the composition structure of the container number, and executing step (5) when it matches.
(2) And carrying out first correction on the decoding result so that the composition structure of the decoding result after the first correction is matched with the composition structure of the container number.
Specifically, if a number appears among the first 4 characters identified in the decoding result, in an embodiment it may be judged whether that number is a misrecognized character recorded in a pre-established cross-type character misrecognition table; if so, the number is replaced with the letter the table associates with it, and otherwise with the letter having the highest confidence among the candidate characters at that time. Of course, in another embodiment, the number may be directly replaced with the letter having the highest confidence among the candidate characters at that time.
Further, if a letter appears among the 5th to 11th characters identified in the decoding result, in an embodiment it may be judged whether that letter is a misrecognized character recorded in the pre-established cross-type character misrecognition table; if so, the letter is replaced with the number the table associates with it, and otherwise with the number having the highest confidence among the candidate characters at that time. Of course, in an embodiment, the letter may be directly replaced with the number having the highest confidence among the candidate characters at that time.
For example, table 2 shows a pre-established cross-type character misrecognition table according to an exemplary embodiment of the present application. Referring to table 2, during recognition "0" is easily misrecognized as "O", and "O" as "0". For example, in one embodiment, when "0" appears among the first 4 characters identified in the decoding result, "0" is replaced with "O".
TABLE 2 Cross-type character misrecognition table

0 ↔ O
1 ↔ I
2 ↔ Z
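The first correction can be sketched as follows, using the cross-type pairs of Table 2; candidates[i], a mapping from candidate characters to confidences at position i, is an assumed interface onto the decoding result.

```python
# Sketch of the first correction: force the decoded string to match the
# XYZ structure using Table 2, falling back to the most-confident
# character of the required type.
DIGIT_TO_LETTER = {"0": "O", "1": "I", "2": "Z"}
LETTER_TO_DIGIT = {v: k for k, v in DIGIT_TO_LETTER.items()}

def best_of_type(cand: dict, want_letter: bool) -> str:
    """Most confident candidate character of the required type."""
    pool = {c: p for c, p in cand.items() if c.isalpha() == want_letter}
    return max(pool, key=pool.get)

def first_correction(decoded: str, candidates) -> str:
    chars = list(decoded)
    for i, ch in enumerate(chars[:11]):
        want_letter = i < 4               # X must be letters, Y must be digits
        if ch.isalpha() != want_letter:   # wrong character type at position i
            table = DIGIT_TO_LETTER if want_letter else LETTER_TO_DIGIT
            chars[i] = table.get(ch, best_of_type(candidates[i], want_letter))
    return "".join(chars)
```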
(3) And when the first corrected decoding result meets the verification rule, determining a combined result of sequentially combining the characters in the first corrected decoding result as the container number in the image to be identified.
(4) Executing step (5) when the first corrected decoding result does not satisfy the verification rule.
(5) And carrying out second correction on the decoding result or the first corrected decoding result to obtain a second corrected decoding result, and determining a combined result obtained by sequentially combining all characters in the second corrected decoding result as a container number in the image to be identified, wherein the second corrected decoding result meets the verification rule.
In one possible implementation, the character with the lowest confidence among the characters of the decoding result (or of the first corrected decoding result) may be modified to obtain the second corrected decoding result. Specifically, according to the verification rule, the target character that the lowest-confidence position must take for the rule to hold can be calculated, and the lowest-confidence character is then replaced with that target character to obtain the second corrected decoding result.
In addition, in another possible implementation, it may be judged, according to a same-type character misrecognition table, whether any misrecognized character recorded in that table appears among the first 10 characters of the decoding result (or of the first corrected decoding result). If exactly one such character appears, it is replaced with the character the table associates with it to obtain a corrected decoding result, and it is then judged whether the corrected decoding result satisfies the verification rule. If it does, the combined result of sequentially combining its characters is determined to be the recognized container number; if it does not, the lowest-confidence character in the decoding result (or in the first corrected decoding result) is modified as described above to obtain the second corrected decoding result.
Further, if at least two misrecognized characters recorded in the same-type character misrecognition table appear among the 11 characters, any single one of them may be replaced with its associated character, yielding a plurality of corrected decoding results. It is then judged whether a target decoding result satisfying the verification rule exists among them; if so, that target decoding result is determined to be the second corrected decoding result, and the combined result of sequentially combining its characters is determined to be the identified container number. If not, any two of the misrecognized characters may be replaced with their associated characters to obtain at least one further corrected decoding result, and the judgment is repeated; if a target decoding result satisfying the verification rule exists, it is determined to be the second corrected decoding result, and its sequentially combined characters are determined to be the container number in the image to be identified. If none exists, the lowest-confidence character in the decoding result (or in the first corrected decoding result) is modified as described above to obtain the second corrected decoding result.
In this example, if no misrecognized character recorded in the same-type character misrecognition table appears, the lowest-confidence character in the decoding result (or in the first corrected decoding result) may be modified directly, as described above, to obtain the second corrected decoding result.
For example, table 3 shows a pre-established same-type character misrecognition table according to an exemplary embodiment. Referring to table 3, "M" is easily misrecognized as "N" during recognition; therefore, if "M" is identified in the decoding result, "M" may be modified to "N".
TABLE 3 Same-type misrecognized character table

Actual character    Easily misrecognized as
M                   N
O                   D
U                   J
E                   F
L                   I
6                   8
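A minimal sketch of this table-driven correction is given below; it treats each pair in Table 3 as interchangeable in both directions (an assumption) and reuses satisfies_rule from the check-digit sketch above.

```python
# Same-type confusion pairs from Table 3, applied symmetrically.
PAIRS = [('M', 'N'), ('O', 'D'), ('U', 'J'), ('E', 'F'), ('L', 'I'), ('6', '8')]
CONFUSED = {}
for a, b in PAIRS:
    CONFUSED[a], CONFUSED[b] = b, a

def single_swap_candidates(number: str):
    """Yield corrected decoding results with one confusable character swapped."""
    for i, c in enumerate(number[:10]):
        if c in CONFUSED:
            yield number[:i] + CONFUSED[c] + number[i + 1:]

def correct_with_table(number: str):
    # satisfies_rule is the verification check from the previous sketch.
    for candidate in single_swap_candidates(number):
        if satisfies_rule(candidate):
            return candidate  # target decoding result
    return None
```

Because every pair couples two letters or two digits, a swap never breaks the 4-letter/7-digit composition structure, so only the check digit needs re-testing.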
According to the method provided by this embodiment, it is judged whether the decoding result satisfies the specified verification rule. When it does, the first combination result obtained by sequentially combining the characters identified in the decoding result is determined to be the container number in the image to be identified. When it does not, the decoding result is corrected to obtain a corrected decoding result that satisfies the verification rule, and the second combination result obtained by sequentially combining the characters in the corrected decoding result is determined to be the container number in the image to be identified. In this way, recognition accuracy can be further improved.
Fig. 8 is a schematic diagram illustrating an implementation of a container number recognition method according to an exemplary embodiment of the present application. Referring to fig. 8, the STN network, the feature-extraction network, and the container number recognition model are integrated into a single recognition network: when a target area is input into this recognition network, the network outputs a decoding result, from which the container number in the image to be identified can be determined.
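As a structural sketch of this integration, the following assumes a PyTorch-style composition; the three sub-modules are placeholders standing in for the STN, the feature-extraction network, and the container number recognition model described earlier, not the patent's actual architecture.

```python
import torch.nn as nn

class RecognitionNetwork(nn.Module):
    """Integrated network of fig. 8: target area in, decoding result out."""

    def __init__(self, stn: nn.Module, backbone: nn.Module, recognizer: nn.Module):
        super().__init__()
        self.stn = stn                # spatial transformation of the target area
        self.backbone = backbone      # feature extraction -> first feature map
        self.recognizer = recognizer  # sequencing + encode/decode -> decoding result

    def forward(self, target_area):
        transformed = self.stn(target_area)
        feature_map = self.backbone(transformed)
        return self.recognizer(feature_map)
```

Wrapping the three stages in one module means a single forward pass maps a cropped target area directly to a decoding result, matching the end-to-end behavior described for fig. 8.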
The application also provides an embodiment of the container number recognition device corresponding to the embodiment of the container number recognition method.
The embodiment of the container number identification device can be applied to a computer device. The apparatus embodiments may be implemented by software, or by hardware, or by a combination of hardware and software. Taking software implementation as an example, the apparatus in a logical sense is formed by the processor of the computer device where it is located reading the corresponding computer program instructions from non-volatile storage into memory for running. In terms of hardware, fig. 9 shows a hardware configuration diagram of the computer device where a container number identification apparatus according to an exemplary embodiment of the present application is located. In addition to the memory 910, the processor 920, the storage 930, and the network interface 940 shown in fig. 9, the computer device where the apparatus is located generally includes other hardware according to the actual function of the container number identification apparatus, which will not be described here.
Fig. 10 is a schematic structural diagram of a container number identification device according to an embodiment of the present application. Referring to fig. 10, the container number identification device provided in this embodiment may include a detection module 100, a recognition module 200 and a processing module 300, wherein:
The detection module 100 is configured to locate a target area where the container number is located from an image to be identified including the container number;
The identifying module 200 is configured to spatially transform the target area to obtain a transformed target area;
the identification module 200 is further configured to perform feature extraction on the transformed target area to obtain a first feature map;
the recognition module 200 is further configured to input the first feature map into a pre-trained container number recognition model, sequence the first feature map with the container number recognition model to obtain a feature sequence (a sliding-window sketch of this sequencing step follows this module description), encode the feature sequence to obtain an encoding result, and decode the encoding result to output a decoding result;
and the processing module 300 is configured to determine the container number in the image to be identified according to the decoding result.
The device of the present embodiment may be used to implement the technical solution of the method embodiment shown in fig. 1, and its implementation principle and technical effects are similar, and are not described here again.
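As a concrete reading of the sequencing step performed by the recognition module, the sketch below splits a feature map into a feature sequence with a sliding window; the window width and stride are assumed values for illustration, not the patent's preset ones.

```python
import numpy as np

def to_feature_sequence(feature_map: np.ndarray, win: int = 4, step: int = 2):
    """Slide a window of width `win` along the W axis of a (C, H, W)
    feature map with stride `step`, yielding the local feature maps
    that together form the feature sequence."""
    _, _, w = feature_map.shape
    return [feature_map[:, :, i:i + win] for i in range(0, w - win + 1, step)]
```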
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the methods provided in the first aspect of the present application.
In particular, computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including, by way of example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
With continued reference to fig. 9, the present application further provides a computer device, including a memory 910, a processor 920, and a computer program stored in the memory 910 and executable on the processor 920, where the processor 920 implements the steps of any one of the methods provided in the first aspect of the present application when the program is executed.
The foregoing description is merely of preferred embodiments of the present application and is not intended to limit it; any modification, equivalent replacement, improvement or the like made within the spirit and principles of the application shall fall within its scope of protection.

Claims (8)

1. A container number identification method, characterized in that the method comprises:

locating a target area where the container number is located from an image to be identified that contains the container number, and performing spatial transformation on the target area to obtain a transformed target area, the spatial transformation comprising translation, scaling and rotation;

performing feature extraction on the transformed target area to obtain a first feature map;

inputting the first feature map into a pre-trained container number recognition model, and sliding a preset sliding window over the first feature map according to a preset moving step to segment out the local feature map at each position of the sliding window; determining all the segmented local feature maps as a feature sequence, encoding the feature sequence to obtain an encoding result, and decoding the encoding result to output a decoding result;

judging whether the decoding result satisfies a specified verification rule, wherein the judging whether the decoding result satisfies the specified verification rule comprises: judging whether the composition structure of the decoding result matches the composition structure of a container number; if not, determining that the decoding result does not satisfy the verification rule; if it matches, calculating a check value of the decoding result according to a preset rule; judging whether the check value is equal to the check code identified in the decoding result; if the check value is equal to the check code identified in the decoding result, determining that the decoding result satisfies the verification rule, otherwise determining that the decoding result does not satisfy the verification rule;

if the specified verification rule is satisfied, determining a first combination result obtained by sequentially combining the characters identified in the decoding result as the container number in the image to be identified;

if the specified verification rule is not satisfied, correcting the decoding result to obtain a corrected decoding result, wherein the corrected decoding result satisfies the verification rule, and determining a second combination result obtained by sequentially combining the characters in the corrected decoding result as the container number in the image to be identified;

wherein the locating a target area where the container number is located from an image to be identified that contains the container number comprises: outputting, by a detection network, the coordinate position of the container number on the image to be identified, the coordinate position comprising the area coordinates of each row or column of the container number represented by quadrilaterals, the quadrilaterals including inclined quadrilaterals indicating that the container number has an inclination angle.

2. The method according to claim 1, characterized in that the locating a target area where the container number is located from an image to be identified that contains the container number comprises:

inputting the image to be identified into a pre-trained detection network, performing, by the detection network, multi-level feature extraction on the image to be identified to obtain a specified number of second feature maps, performing classification and position regression on each of the second feature maps, and outputting classification results and position information of a plurality of candidate areas, wherein the specified number of second feature maps differ in dimension;

performing non-maximum suppression on the plurality of candidate areas based on the classification result and position information of each candidate area to obtain the target area where the container number is located.

3. The method according to claim 1, characterized in that the locating a target area where the container number is located from an image to be identified that contains the container number comprises:

resizing the image to be identified to obtain a plurality of target images of different sizes;

for each target image, inputting the target image into a pre-trained detection network, performing, by the detection network, multi-level feature extraction on the target image to obtain a specified number of second feature maps, performing classification and position regression on each of the second feature maps, and outputting classification results and position information of a plurality of candidate areas, wherein the specified number of second feature maps differ in dimension;

performing non-maximum suppression on the plurality of candidate areas based on the classification result and position information of each candidate area to obtain the target area where the container number in the target image is located;

determining the target area where the container number in the image to be identified is located according to the target area where the container number in the target image is located.

4. The method according to claim 1, characterized in that the performing spatial transformation on the target area to obtain a transformed target area comprises:

inputting the target area into a pre-trained STN network, performing, by the STN network, spatial transformation on the target area, and outputting the transformed target area.

5. The method according to claim 1, characterized in that the encoding the feature sequence to obtain an encoding result and decoding the encoding result to output a decoding result comprises:

calculating a weight parameter of each feature in the feature sequence at each moment;

calculating the encoding result at each moment according to the weight parameters of the features in the feature sequence at that moment and the feature sequence;

calculating a context-related hidden layer state at each moment according to the feature sequence and the encoding result at each moment;

obtaining the decoding result at each moment according to the context-related hidden layer state at each moment.

6. A container number identification device, characterized in that the device comprises a detection module, a recognition module and a processing module, wherein:

the detection module is configured to locate a target area where the container number is located from an image to be identified that contains the container number;

the recognition module is configured to perform spatial transformation on the target area to obtain a transformed target area, the spatial transformation comprising translation, scaling and rotation;

the recognition module is further configured to perform feature extraction on the transformed target area to obtain a first feature map;

the recognition module is further configured to input the first feature map into a pre-trained container number recognition model, slide a preset sliding window over the first feature map according to a preset moving step to segment out the local feature map at each position of the sliding window, determine all the segmented local feature maps as a feature sequence, encode the feature sequence to obtain an encoding result, and decode the encoding result to output a decoding result;

the processing module is configured to judge whether the decoding result satisfies a specified verification rule, wherein the judging whether the decoding result satisfies the specified verification rule comprises: judging whether the composition structure of the decoding result matches the composition structure of a container number; if not, determining that the decoding result does not satisfy the verification rule; if it matches, calculating a check value of the decoding result according to a preset rule; judging whether the check value is equal to the check code identified in the decoding result; if the check value is equal to the check code identified in the decoding result, determining that the decoding result satisfies the verification rule, otherwise determining that the decoding result does not satisfy the verification rule;

if the specified verification rule is satisfied, determine a first combination result obtained by sequentially combining the characters identified in the decoding result as the container number in the image to be identified;

if the specified verification rule is not satisfied, correct the decoding result to obtain a corrected decoding result, wherein the corrected decoding result satisfies the verification rule, and determine a second combination result obtained by sequentially combining the characters in the corrected decoding result as the container number in the image to be identified;

wherein the detection module is specifically configured to output, by a detection network, the coordinate position of the container number on the image to be identified, the coordinate position comprising the area coordinates of each row or column of the container number represented by quadrilaterals, the quadrilaterals including inclined quadrilaterals indicating that the container number has an inclination angle.

7. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method according to any one of claims 1-5.

8. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1-5 when executing the program.
CN201811113365.XA 2018-09-25 2018-09-25 A container number identification method, device and computer equipment Active CN110942057B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811113365.XA CN110942057B (en) 2018-09-25 2018-09-25 A container number identification method, device and computer equipment


Publications (2)

Publication Number Publication Date
CN110942057A CN110942057A (en) 2020-03-31
CN110942057B (en) 2024-12-06

Family

ID=69904808

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811113365.XA Active CN110942057B (en) 2018-09-25 2018-09-25 A container number identification method, device and computer equipment

Country Status (1)

Country Link
CN (1) CN110942057B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626982A (en) * 2020-04-13 2020-09-04 中国外运股份有限公司 Method and device for identifying batch codes of containers to be detected
CN112116586B (en) * 2020-09-29 2025-01-03 腾讯科技(深圳)有限公司 Method, device, computer equipment and storage medium for determining box fork slot area
CN113052156B (en) * 2021-03-12 2023-08-04 北京百度网讯科技有限公司 Optical character recognition method, device, electronic equipment and storage medium
CN113408512A (en) * 2021-06-03 2021-09-17 云从科技集团股份有限公司 Method, system, device and medium for checking container by using robot
CN114267032A (en) * 2021-12-10 2022-04-01 广东省电子口岸管理有限公司 Container positioning identification method, device, equipment and storage medium
CN114529893A (en) * 2021-12-22 2022-05-24 电子科技大学成都学院 Container code identification method and device
CN115527209A (en) * 2022-09-22 2022-12-27 宁波港信息通信有限公司 Method, device and system for identifying shore bridge box number and computer equipment
CN116229280B (en) * 2023-01-09 2024-06-04 广东省科学院广州地理研究所 Method and device for identifying collapse sentry, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203539A (en) * 2015-05-04 2016-12-07 杭州海康威视数字技术股份有限公司 The method and apparatus identifying container number
CN107679531A (en) * 2017-06-23 2018-02-09 平安科技(深圳)有限公司 Licence plate recognition method, device, equipment and storage medium based on deep learning
CN108205673A (en) * 2016-12-16 2018-06-26 塔塔顾问服务有限公司 Method and system for container code identification

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101770569A (en) * 2008-12-31 2010-07-07 汉王科技股份有限公司 Dish name recognition method based on OCR
US8977059B2 (en) * 2011-06-03 2015-03-10 Apple Inc. Integrating feature extraction via local sequential embedding for automatic handwriting recognition
CN102841928B (en) * 2012-07-18 2015-12-09 中央人民广播电台 File security sending, receiving method and device between net
CN107133616B (en) * 2017-04-02 2020-08-28 南京汇川图像视觉技术有限公司 Segmentation-free character positioning and identifying method based on deep learning
CN107423732A (en) * 2017-07-26 2017-12-01 大连交通大学 Vehicle VIN recognition methods based on Android platform
CN107527059B (en) * 2017-08-07 2021-12-21 北京小米移动软件有限公司 Character recognition method and device and terminal
CN107563245A (en) * 2017-08-24 2018-01-09 广东欧珀移动通信有限公司 The generation of graphic code and method of calibration, device and terminal, readable storage medium storing program for executing
CN107798327A (en) * 2017-10-31 2018-03-13 北京小米移动软件有限公司 Character identifying method and device
CN107871126A (en) * 2017-11-22 2018-04-03 西安翔迅科技有限责任公司 Model recognizing method and system based on deep-neural-network
CN108009515B (en) * 2017-12-14 2022-04-22 杭州远鉴信息科技有限公司 Power transmission line positioning and identifying method of unmanned aerial vehicle aerial image based on FCN
CN108062754B (en) * 2018-01-19 2020-08-25 深圳大学 Segmentation and recognition method and device based on dense network image
CN108399419B (en) * 2018-01-25 2021-02-19 华南理工大学 Method for recognizing Chinese text in natural scene image based on two-dimensional recursive network
CN108491836B (en) * 2018-01-25 2020-11-24 华南理工大学 An overall recognition method for Chinese text in natural scene images
CN108416318A (en) * 2018-03-22 2018-08-17 电子科技大学 Diameter radar image target depth method of model identification based on data enhancing

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203539A (en) * 2015-05-04 2016-12-07 杭州海康威视数字技术股份有限公司 The method and apparatus identifying container number
CN108205673A (en) * 2016-12-16 2018-06-26 塔塔顾问服务有限公司 Method and system for container code identification
CN107679531A (en) * 2017-06-23 2018-02-09 平安科技(深圳)有限公司 Licence plate recognition method, device, equipment and storage medium based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Visual attention models for scene text recognition; Ghosh, S.K., et al.; 2017 14th IAPR International Conference on Document Analysis and Recognition; page 994, section 1 to page 945, section 2, fig. 2 *

Also Published As

Publication number Publication date
CN110942057A (en) 2020-03-31

Similar Documents

Publication Publication Date Title
CN110942057B (en) A container number identification method, device and computer equipment
CN106909924B (en) Remote sensing image rapid retrieval method based on depth significance
CN110570433B (en) Image semantic segmentation model construction method and device based on generation countermeasure network
CN110188827B (en) Scene recognition method based on convolutional neural network and recursive automatic encoder model
CN109118473B (en) Angular point detection method based on neural network, storage medium and image processing system
CN107305630B (en) Text sequence recognition method and device
CN108038435B (en) Feature extraction and target tracking method based on convolutional neural network
CN111046859B (en) Character recognition method and device
CN108399625B (en) A SAR Image Orientation Generation Method Based on Deep Convolutional Generative Adversarial Networks
CN106845341B (en) Unlicensed vehicle identification method based on virtual number plate
CN110909618B (en) Method and device for identifying identity of pet
CN111079683A (en) Remote sensing image cloud and snow detection method based on convolutional neural network
CN106815323B (en) Cross-domain visual retrieval method based on significance detection
CN108805157A (en) Classifying Method in Remote Sensing Image based on the random supervision discrete type Hash in part
CN111523537A (en) Character recognition method, storage medium and system
CN106203373B (en) A kind of human face in-vivo detection method based on deep vision bag of words
CN107533671B (en) Pattern recognition device, pattern recognition method, and recording medium
CN111353325B (en) Key point detection model training method and device
US8934716B2 (en) Method and apparatus for sequencing off-line character from natural scene
US10580127B2 (en) Model generation apparatus, evaluation apparatus, model generation method, evaluation method, and storage medium
CN114565789B (en) Text detection method, system, device and medium based on set prediction
CN110942073A (en) Container trailer number identification method and device and computer equipment
CN109902751B (en) Dial Digit Character Recognition Method Fusion Convolutional Neural Network and Half-word Template Matching
WO2007026951A1 (en) Image search method and device
CN108694411B (en) A method for identifying similar images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant