CN117671617A - Real-time lane recognition method in container port environment - Google Patents
Real-time lane recognition method in container port environment
- Publication number
- CN117671617A CN117671617A CN202311654539.4A CN202311654539A CN117671617A CN 117671617 A CN117671617 A CN 117671617A CN 202311654539 A CN202311654539 A CN 202311654539A CN 117671617 A CN117671617 A CN 117671617A
- Authority
- CN
- China
- Prior art keywords
- feature
- target feature
- scene
- layer
- lane
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 39
- 238000001514 detection method Methods 0.000 claims abstract description 66
- 230000000007 visual effect Effects 0.000 claims abstract description 31
- 230000008447 perception Effects 0.000 claims abstract description 28
- 238000011176 pooling Methods 0.000 claims abstract description 18
- 238000012545 processing Methods 0.000 claims abstract description 13
- 238000012937 correction Methods 0.000 claims abstract description 10
- 238000013135 deep learning Methods 0.000 claims abstract description 9
- 238000007781 pre-processing Methods 0.000 claims abstract description 8
- 230000004927 fusion Effects 0.000 claims description 17
- 238000012549 training Methods 0.000 claims description 9
- 238000004422 calculation algorithm Methods 0.000 claims description 8
- 230000007246 mechanism Effects 0.000 claims description 8
- 238000012876 topography Methods 0.000 claims description 5
- 238000009826 distribution Methods 0.000 claims description 4
- 239000000284 extract Substances 0.000 claims description 4
- 230000009466 transformation Effects 0.000 claims description 4
- 238000004590 computer program Methods 0.000 claims description 3
- 238000013461 design Methods 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 230000008901 benefit Effects 0.000 description 3
- 238000004364 calculation method Methods 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 2
- 238000013136 deep learning model Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 239000012633 leachable Substances 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 206010039203 Road traffic accident Diseases 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 238000005299 abrasion Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 238000010191 image analysis Methods 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 238000003672 processing method Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/54—Surveillance or monitoring of activities, e.g. for recognising suspicious objects of traffic, e.g. cars on the road, trains or boats
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/766—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Image Analysis (AREA)
Abstract
The invention belongs to the technical field of intelligent container transportation and discloses a real-time lane recognition method in a container port environment. Based on a port environment image captured by a vehicle-mounted monocular camera, the image is preprocessed to obtain a first target feature and a second target feature, where the first target feature includes, but is not limited to, key visual features of roads, vehicles and road signs, and the second target feature is a scene feature. The lane line detection model comprises a visual feature processing module and a scene prior perception module: the first target feature is extracted by feature pooling and subjected to enhancement and outlier correction, and the second target feature is extracted by the scene prior perception module. The first target feature and the second target feature are input into the lane line detection module for deep learning, the lane line detection result for the current road is output in real time, and real-time feedback and warnings are issued according to the result.
Description
Technical Field
The invention relates to the technical field of intelligent transportation of containers, in particular to a real-time lane recognition method in a container port environment.
Background
In recent years, the development of automated container ports has required highly robust lane line detection to ensure efficient cargo flow and safe vehicle travel. However, the port environment is complex, with challenges such as visual occlusion, worn lane lines, and ground markings that resemble lane lines but carry different semantics. Existing techniques perform unstably under complex road conditions, high-resolution cameras are susceptible to illumination interference, and deep learning algorithms require large amounts of data and computing resources. A highly adaptive lane line detection technology is therefore needed to support reliable navigation for port and logistics operations.
In view of this, the present invention provides a real-time lane recognition method in a container port environment.
Disclosure of Invention
In order to overcome the problems in the prior art, the invention provides a real-time lane recognition method in a container port environment, which addresses complex port environment conditions and mixed traffic of various vehicle types, ensures that vehicles travel accurately and safely, and reduces traffic accidents caused by blurred or missing lane lines.
According to one aspect of the present invention, there is provided a real-time lane recognition method in a container port environment, comprising the steps of:
preprocessing a port environment image based on the port environment image captured by the vehicle-mounted monocular camera to obtain a first target feature and a second target feature; wherein the first target feature includes, but is not limited to, road, vehicle, and road sign key visual features; the second target feature is a scene feature;
the lane line detection model comprises a visual characteristic processing module and a scene priori perception module; extracting a first target feature based on feature pooling, and performing enhancement and outlier correction processing on the first target feature; extracting a second target feature based on the scene prior perception module;
and inputting the first target feature and the second target feature into a lane line detection module for deep learning, outputting a lane line detection result of the current road in real time, and feeding back and warning in real time according to the result.
As a preferred aspect of the present invention, the first target feature is a key visual feature including, but not limited to, road features, vehicle features, and road sign features; and taking all the key visual features as a comparison group in the lane line detection model, carrying out data enhancement on the key visual features, expanding a training data set, and carrying out outlier correction on the data.
As a preferred aspect of the present invention, the second target feature is a scene environment feature, including but not limited to port topography, port markings and port objects; the scene prior perception module recognizes port topography, port markings or port objects in the port environment.
As a preferable scheme of the invention, the lane line detection model adopts a pyramid structure to extract and integrate characteristic layers of different layers, wherein the characteristic layers specifically comprise an L0 characteristic layer, an L1 characteristic layer and an L2 characteristic layer, and the L0 characteristic layer, the L1 characteristic layer and the L2 characteristic layer are respectively arranged from high level to low level;
based on a feature fusion strategy of a spatial attention mechanism, pooling and fusing the features of the L0 feature layer and the L2 feature layer; the L2 feature layer extracts basic feature information of the first target image, and the L0 feature layer inputs a scene prior perception module to perform scene classification tasks.
As a preferred scheme of the invention, the L2 feature layer further performs feature pooling based on learnable anchor points to obtain key visual features, and lane lines in the container port environment are adaptively detected based on the key visual features;
in the lane line detection task, each anchor point is defined as a four-dimensional vector reg_i = {S_ix, S_iy, θ_i, len_i}, where (S_ix, S_iy) are the normalized x, y coordinates of the starting point, θ_i indicates the direction, and len_i represents the length; the learnable anchor parameters reg_i are dynamically updated during training by a back-propagation algorithm.
As a preferable scheme of the invention, the features of the L0 feature layer and the L2 feature layer are pooled and fused based on a feature fusion strategy of a spatial attention mechanism, and the specific steps are as follows:
performing feature transformation on the L0 feature layer and the L2 feature layer through the full connection layer, and then aligning in the space dimension through reshape operation;
after the full connection layer and the reshape, summing the two groups of feature images to obtain an initial fusion feature image;
calculating pixel-level self-attention scores for the initial fusion feature images to obtain a spatial attention weight image;
using the softmax layer, the self-attention scores are normalized into a probability distribution in the range of 0 to 1;
element-level multiplication of the spatial attention weighting map is performed on the feature maps of the L0 feature layer and the L2 feature layer to produce a final fused feature map.
As a preferred scheme of the present invention, the specific application logic of the scene prior perception module is:
the full connection layer maps the original output of the scene priori aware module to a new scene representation space;
based on the connection of the new scene representation space and the main output of the lane detection model, two new outputs, namely a classification task and a regression task, are obtained.
According to another aspect of the present invention, there is provided a real-time lane recognition system in a container port environment, based on implementation of a real-time lane recognition method in a container port environment, comprising:
the image acquisition module is used for preprocessing the port environment image based on the port environment image captured by the vehicle-mounted monocular camera to obtain a first target feature and a second target feature; wherein the first target feature includes, but is not limited to, road, vehicle, and road sign key visual features; the second target feature is a scene feature;
the lane line detection model comprises a visual characteristic processing module and a scene priori perception module; extracting a first target feature based on feature pooling, and performing enhancement and outlier correction processing on the first target feature; extracting a second target feature based on the scene prior perception module;
the lane detection module inputs the first target feature and the second target feature into the lane line detection module for deep learning, outputs the lane line detection result of the current road in real time, and feeds back and warns in real time aiming at the result.
According to another aspect of the invention, there is provided a computer program product stored on a computer readable medium, comprising a computer readable program for, when executed on an electronic device, providing a user input interface for implementing the method of real-time lane recognition in a container port environment.
According to yet another aspect of the present invention, there is provided a computer readable storage medium storing instructions that, when executed on a computer, cause the computer to perform the method of real-time lane identification in a container port environment.
The technical effects and advantages of the real-time lane recognition method in a container port environment disclosed by the invention are as follows:
The method is optimized for the characteristics of container ports; by introducing the scene prior perception module and the attention mechanism, the lane line detection model understands complex scenes better, which improves both the robustness of the model and the accuracy of lane line detection;
The invention uses images captured by a monocular camera as training data, which effectively reduces data acquisition costs. Camera video stream data is generally easy to obtain and is already ubiquitous on horizontal-transport vehicles and mobile devices in ports and logistics parks, so no expensive additional sensors or devices need to be purchased.
Once the deep learning model is trained, subsequent use and maintenance costs are relatively low. Compared with traditional model-driven methods, frequent parameter tuning and model updates are not required; only new scene data needs to be provided as input.
Through embedded deployment optimization, the algorithm is easily integrated into the existing port/logistics park horizontal operation vehicle system without large-scale infrastructure transformation and system reconstruction, so that the deployment cost is reduced, and the application and benefit of the technology can be realized more quickly.
Drawings
FIG. 1 is a flow chart of a method for identifying real-time lanes in a container port environment according to the present invention;
FIG. 2 is a block diagram of the algorithm design of the lane line detection model of the present invention;
FIG. 3 is an exemplary diagram of various scene lanes in a container port environment;
FIG. 4 is a view showing the proportions of each scene of the training image of the lane line detection model of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Despite substantial progress in lane line detection technology over the past decades, significant challenges remain in dealing with unique environments such as enclosed harbor areas. On the one hand, the port environment mainly accommodates heavy vehicles, resulting in serious wear and obstruction of the lane markings during congestion, which is more pronounced than ordinary roads (as shown in fig. 3 (a) and (b)). This situation requires a lane detection model with greater robustness. On the other hand, port environments exhibit some unique features, such as complex ground markings (as shown in fig. 3 (c), (d) and (e)), which may be visually similar to lane lines, but have entirely different semantics. There is a need to develop new lane detection methods that not only accurately identify lanes, but also effectively distinguish between these different types of ground markings.
Example 1
Referring to fig. 1-2, the method for identifying real-time lanes in a container port environment according to the present embodiment includes the following steps:
capturing a port environment image of a port environment using an onboard monocular camera; preprocessing the collected port environment image, analyzing the image to obtain a first target image, and using the first target image for input data of a lane line detection model;
it should be noted that: the method for detecting the lane lines based on the existing monocular camera has practicability and economy in ports and logistics parks. The method has the advantages that the existing port environment information is directly extracted, based on information analysis extracted by the existing equipment, expensive sensors or equipment are not required to be purchased additionally, in practical application, the acquisition cost of video stream data is relatively low, the video stream data is easier to popularize in ports/logistics parks, the practicability is high, therefore, the first target image is used for input data of a lane line detection model, the implementation can be fast, the cost is reduced, the operation efficiency and the safety are improved, and the demands of ports and logistics fields are met.
The lane line detection model comprises a visual characteristic pooling module and a scene priori perception module;
extracting a first target feature based on feature pooling, and performing enhancement and outlier correction processing on the first target feature;
it should be noted that: wherein the first target feature is a key visual feature including, but not limited to, road features, vehicle features, and road sign features; taking all the key visual features as a comparison group in a lane line detection model, carrying out data enhancement on the key visual features, expanding a training data set, and carrying out outlier correction on the data; the lane line detection model is optimized through the enhanced and corrected data, so that the accuracy of key visual feature extraction is improved, the content in the image is better understood, and important information is provided for subsequent tasks such as image analysis, target detection and recognition;
the lane line detection model adopts a pyramid structure to extract and integrate characteristic layers of different layers, wherein the characteristic layers specifically comprise an L0 characteristic layer, an L1 characteristic layer and an L2 characteristic layer, and the L0 characteristic layer, the L1 characteristic layer and the L2 characteristic layer are respectively arranged from high level to low level;
wherein pooling and fusion are performed based on the features of the L0 feature layer and the L2 feature layer; the basic feature information of the first target image is extracted through the L2 feature layer, and the basic feature information is input into a scene priori perception module through the L0 feature layer to carry out scene classification tasks.
It should be noted that: in this way the expressive power of the model is increased. The method is beneficial to improving the performance and the robustness of the model, and the model can extract the basic structural information of the image by using low-level features and understand more complex semantic information of the image by using high-level features.
In addition, in order to enhance understanding of the lane line detection model on the lane line application scene, the features output by the corresponding L0 feature layer in the visual feature pooling module are input to the scene priori perception module, and the scene classification task is completed based on the scene priori perception module, so that key visual features are extracted. Such a scene prior perception module may help the model better understand the background and context information of the image, thereby more accurately classifying and identifying objects in the image.
In general, such feature extraction model designs combine features of different levels, extract basic structural information of an image using low-level features, and understand more complex semantic information of the image through high-level features to improve accuracy and robustness of the model. Meanwhile, the scene priori perception module is introduced to enhance the understanding capability of the model to the scene.
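As one way to realize the layered design described above, the following sketch extracts three feature levels from a generic CNN backbone; the use of a torchvision ResNet-18 and the mapping of its stages to the L0/L1/L2 layers are assumptions for illustration, not the backbone specified by the invention.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class PyramidBackbone(nn.Module):
    """Produces three feature levels: L2 (low-level structure),
    L1 (intermediate) and L0 (high-level semantics)."""
    def __init__(self):
        super().__init__()
        resnet = models.resnet18(weights=None)
        self.stem = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu,
                                  resnet.maxpool, resnet.layer1)
        self.stage2 = resnet.layer2   # -> L2 feature layer
        self.stage3 = resnet.layer3   # -> L1 feature layer
        self.stage4 = resnet.layer4   # -> L0 feature layer

    def forward(self, x):
        x = self.stem(x)
        l2 = self.stage2(x)    # basic structural information
        l1 = self.stage3(l2)
        l0 = self.stage4(l1)   # semantic features, later fed to the scene prior module
        return l0, l1, l2

l0, l1, l2 = PyramidBackbone()(torch.randn(1, 3, 360, 640))
```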
The L2 feature layer also performs feature pooling based on learnable anchor points to acquire key visual features, and lane lines in the container port environment are adaptively detected based on these key visual features;
in the lane line detection task, each anchor point is defined as a four-dimensional vector reg_i = {S_ix, S_iy, θ_i, len_i}, where (S_ix, S_iy) are the normalized x, y coordinates of the starting point, θ_i indicates the direction, and len_i represents the length; the learnable anchor parameters reg_i are dynamically updated during training by a back-propagation algorithm.
It should be noted that: because the lane position distribution has a certain statistical regularity, this strategy enables the model to adapt more flexibly to the lane line characteristics of the closed container port environment. The anchor points are dynamically adjusted during training by the back-propagation algorithm so that they align more accurately with the lane lines. This not only enables more precise lane localization and avoids other ground markings being mistaken for lane lines, but also improves the computational efficiency of the model.
A direct initialization strategy is chosen in lane line detection so that the algorithm converges faster. In the pooling process, we refer to the pooling method in LaneATT, and a single-stage detector is implemented through the anchor point itself. This design allows our model to more efficiently use global information in the feature map rather than just boundary information, further improving model performance.
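A minimal sketch of how anchor points of the form reg_i = (S_ix, S_iy, θ_i, len_i) can be stored as learnable parameters so that back-propagation updates them together with the network weights; the anchor count and the initialization scheme are assumptions of this sketch.

```python
import math
import torch
import torch.nn as nn

class LearnableAnchors(nn.Module):
    """Each anchor is reg_i = (S_ix, S_iy, theta_i, len_i); registering the tensor
    as an nn.Parameter lets the optimizer update it like any other weight."""
    def __init__(self, num_anchors: int = 192):
        super().__init__()
        init = torch.rand(num_anchors, 4)      # normalized start x, y in [0, 1)
        init[:, 2] = init[:, 2] * math.pi      # direction theta in [0, pi)
        init[:, 3] = 0.5                       # initial normalized length
        self.reg = nn.Parameter(init)          # updated by back-propagation

    def forward(self) -> torch.Tensor:
        return self.reg

anchors = LearnableAnchors()
print(anchors.reg.shape)   # torch.Size([192, 4]); gradients flow into reg during training
```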
Based on the strategy of the spatial attention mechanism, the effective pooling and fusion of the L0 feature layer and the L2 feature layer features are realized, and the specific steps are as follows:
carrying out feature transformation on the L0 feature layer and the L2 feature layer through the full connection layer, and then adjusting the shapes of the L0 feature layer and the L2 feature layer to be aligned in dimension through reshape operation;
after the full connection layer and the reshape, summing the two groups of feature images to obtain an initial fusion feature image;
calculating pixel-level self-attention scores for the initial fusion feature images to obtain a spatial attention weight image;
using the softmax layer, the self-attention scores are normalized into a probability distribution in the range of 0 to 1, where the attention weight of each pixel reflects its importance in feature fusion;
element-level multiplication of the spatial attention weighting map is performed on the feature maps of the L0 feature layer and the L2 feature layer to achieve selective feature fusion. Thus, the eigenvalues of each pixel are adjusted according to their corresponding weights in the spatial attention weighting map to produce the final fused eigenvector.
It should be noted that: the feature fusion strategy significantly enhances the expressive power of the model and the ability to identify lane line features. The spatial attention mechanism plays a key role in the feature fusion stage, and can learn autonomously and assign weights to different input feature maps to produce the final fused feature map. The basic premise of this approach is that high-level features carry a lot of semantic information, while low-level features encapsulate a lot of detailed information. By fusing the two types of features, we can obtain feature representations which are rich in semantics and retain details, which is critical to the lane detection task.
In this way, our model can dynamically allocate different attention weights according to the importance of each feature, thereby improving the efficiency of feature fusion and the perceptibility of the model to lane lines in a complex environment. In addition, the introduction of the spatial attention mechanism enhances the interpretability of our model, so that we can better understand the mode of operation of the model.
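One possible reading of the fusion steps above, written as a PyTorch module; the channel sizes, the 1x1 convolution used to score pixels, and the way the two weighted maps are recombined are assumptions of this sketch rather than details fixed by the invention.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAttentionFusion(nn.Module):
    """Fuses the L0 (semantic) and L2 (detail) feature maps with a
    pixel-level spatial attention weight map."""
    def __init__(self, c0, c2, c_out, size):
        super().__init__()
        self.size = size                                   # common (H, W) grid
        self.fc0 = nn.Linear(c0, c_out)                    # feature transformation of L0
        self.fc2 = nn.Linear(c2, c_out)                    # feature transformation of L2
        self.score = nn.Conv2d(c_out, 1, kernel_size=1)    # pixel-level attention score

    def _project(self, x, fc):
        x = F.interpolate(x, size=self.size, mode="bilinear", align_corners=False)
        x = fc(x.permute(0, 2, 3, 1))                      # fully connected layer per pixel
        return x.permute(0, 3, 1, 2)                       # reshape back to (B, C, H, W)

    def forward(self, l0, l2):
        f0, f2 = self._project(l0, self.fc0), self._project(l2, self.fc2)
        fused = f0 + f2                                    # initial fusion feature map
        scores = self.score(fused)                         # (B, 1, H, W) self-attention scores
        attn = torch.softmax(scores.flatten(2), dim=-1).view_as(scores)  # weights in 0..1
        return attn * f0 + attn * f2                       # element-wise weighted fusion

fusion = SpatialAttentionFusion(c0=512, c2=128, c_out=256, size=(45, 80))
out = fusion(torch.randn(1, 512, 12, 20), torch.randn(1, 128, 45, 80))
```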
Extracting a second target feature based on the scene prior perception module;
the second target feature is a scene environmental feature including, but not limited to, port terrain, port markers, and port objects. The scene prior perception module comprises port topography, port marks or port objects in the port environment, and the scene prior perception module is used for carrying out recognition and analysis so as to enhance the understanding of the model to the environment.
It should be noted that: the scene priori aware module provides a higher level of context semantic information for lane detection. The task of this module is to identify the scene in which the entire input image is located, e.g. "container area", "main road", etc. Such scene information is very valuable for lane detection tasks, as different scenes typically have their specific lane line layout and style.
The purpose of this module is to determine whether the current feature belongs to a lane line or another marking, and to provide prior conditions on the overall features of the lane structure. A fully connected layer maps the original output of the scene prior perception module (i.e. the scene logits) C to a new representation space; the output FC(C) of the fully connected layer is a richer scene representation that provides more scene information. This new scene representation is then concatenated with the main output of the lane detection model, yielding two new outputs: L_cls for the classification task and L_reg for the regression task. The classification output L_cls can be used to determine whether the current feature belongs to a lane line or another marking, while the regression output L_reg can be used to estimate overall features such as the position and shape of the lane line. This design allows the model to handle different types of tasks simultaneously and improves its efficiency and performance.
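By way of illustration, the sketch below maps the scene logits C through a fully connected layer, concatenates FC(C) with the lane features, and produces the classification output L_cls and the regression output L_reg; all dimensions and the number of scene classes are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class ScenePriorHeads(nn.Module):
    """FC(C) yields a richer scene representation; concatenating it with the lane
    features produces a classification head (L_cls) and a regression head (L_reg)."""
    def __init__(self, scene_dim=6, lane_dim=64, scene_embed=32, num_classes=2):
        super().__init__()
        self.scene_fc = nn.Linear(scene_dim, scene_embed)               # FC(C)
        self.cls_head = nn.Linear(lane_dim + scene_embed, num_classes)  # L_cls
        self.reg_head = nn.Linear(lane_dim + scene_embed, 4)            # L_reg: (S_x, S_y, theta, len)

    def forward(self, scene_logits, lane_feat):
        scene_repr = torch.relu(self.scene_fc(scene_logits))   # new scene representation
        joint = torch.cat([lane_feat, scene_repr], dim=-1)
        return self.cls_head(joint), self.reg_head(joint)

cls_out, reg_out = ScenePriorHeads()(torch.randn(8, 6), torch.randn(8, 64))
```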
And inputting the first target feature and the second target feature into a lane line detection module for deep learning, outputting a lane line detection result of the current road in real time, and feeding back and warning in real time according to the result.
It should be noted that: the lane line detection module outputs a lane line detection result of the current road in real time. These results may be used for navigation systems, autopilot systems or assistance decisions for the driver of the vehicle. The system may also provide real-time feedback and warnings to ensure safe driving of the vehicle if an abnormal situation or safety problem is detected.
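A minimal sketch of how per-frame detection results could drive such real-time feedback and warnings; the `model.detect` call, the frame source and the warning rule are hypothetical placeholders, not part of the described system.

```python
def monitor_lanes(model, video_stream, min_lanes=2, warn=print):
    """Run the detector on each camera frame and raise a warning when fewer
    lane lines than expected are found (e.g. worn or occluded markings)."""
    for frame in video_stream:             # frames from the vehicle-mounted monocular camera
        lanes = model.detect(frame)        # hypothetical inference API returning a list of lanes
        if len(lanes) < min_lanes:
            warn(f"lane warning: only {len(lanes)} lane line(s) detected")
        yield lanes                        # handed on to navigation / driver assistance
```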
The embodiment mainly utilizes computer vision and deep learning technology to improve the accuracy and the robustness of lane line detection in the port environment. By integrating preprocessing, scene perception and deep learning models, reliable detection and real-time feedback of lane lines in a complex environment can be realized, and safety and efficiency of port operation can be improved.
Example 2
In the present embodiment, on the basis of embodiment 1, the lane line detection model is analyzed; the proportion of each scene in the model's training images is shown, for example, in FIG. 4.
In addition, the current real-time lane detection model is PortLaneNet; PortLaneNet takes RGB images captured by a monocular camera as input, and the predicted lanes are denoted as L = {l_1, l_2, …, l_N}.
In comparison, the index values obtained by PortLaneNet lane detection are the highest and the correlation is the strongest. As Table 1 shows, compared with methods such as LaneATT [7], DI Lane [9], GANet [15] and CLRNet [8], the PortLaneNet detection method offers more outstanding accuracy and robustness, which helps to improve the safety and efficiency of port operations.
Table 1: comparison table of different real-time lane detection models
The specific application logic for the lane detection model is:
step 1: preparing a manually marked lane line data set;
step 2: marking the position parameters and the Euler angle parameters of the lane lines in the image to obtain a marked lane line image;
step 3: using the rapid lane line detection model to forward transmit a lane line image with a mark to obtain a recognition result and a loss function of the lane line image;
step 4: and reversely updating parameters in the rapid lane line detection model by using the loss function in the step 3.
The above formulas are dimensionless and evaluated numerically; they reflect the latest real conditions, obtained by software simulation over a large amount of collected data, and the preset parameters and threshold values in the formulas are set by those skilled in the art according to the actual situation.
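A minimal sketch of steps 3 and 4 above as a single PyTorch training iteration; the model, loss function and optimizer are placeholders, and the helper name train_step is an assumption of this sketch.

```python
import torch

def train_step(model, optimizer, images, targets, loss_fn):
    """One iteration: forward pass over the marked lane line images, loss
    computation, then backward update of the detection model's parameters."""
    optimizer.zero_grad()
    preds = model(images)               # step 3: forward propagation
    loss = loss_fn(preds, targets)      # step 3: loss on position / Euler angle labels
    loss.backward()                     # step 4: back-propagate the loss
    optimizer.step()                    # step 4: update the model parameters
    return loss.item()
```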
Example 3
For parts not described in detail in this embodiment, reference is made to embodiment 1. This embodiment provides a real-time lane recognition system in a container port environment, implemented based on the real-time lane recognition method in a container port environment, comprising:
the image acquisition module is used for preprocessing the port environment image based on the port environment image captured by the vehicle-mounted monocular camera to obtain a first target feature and a second target feature; wherein the first target feature includes, but is not limited to, road, vehicle, and road sign key visual features; the second target feature is a scene feature;
the lane line detection model comprises a visual characteristic processing module and a scene priori perception module; extracting a first target feature based on feature pooling, and performing enhancement and outlier correction processing on the first target feature; extracting a second target feature based on the scene prior perception module;
the lane detection module inputs the first target feature and the second target feature into the lane line detection module for deep learning, outputs the lane line detection result of the current road in real time, and feeds back and warns in real time aiming at the result.
Example 4
For parts not described in detail in this embodiment, reference is made to the description of embodiment 1. This embodiment provides a computer program product stored on a computer readable medium, comprising a computer readable program which, when executed on an electronic device, provides a user input interface to implement the real-time lane recognition method in a container port environment.
Example 5
For parts not described in detail in this embodiment, reference is made to the description of embodiment 1. This embodiment provides a computer-readable storage medium storing instructions which, when executed on a computer, cause the computer to perform the real-time lane recognition method in a container port environment.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Finally: the foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (10)
1. The real-time lane recognition method in the container port environment is characterized by comprising the following steps:
preprocessing a port environment image based on the port environment image captured by the vehicle-mounted monocular camera to obtain a first target feature and a second target feature; wherein the first target feature includes, but is not limited to, road, vehicle, and road sign key visual features; the second target feature is a scene feature;
the lane line detection model comprises a visual characteristic processing module and a scene priori perception module; extracting a first target feature based on feature pooling, and performing enhancement and outlier correction processing on the first target feature; extracting a second target feature based on the scene prior perception module;
and inputting the first target feature and the second target feature into a lane line detection module for deep learning, outputting a lane line detection result of the current road in real time, and feeding back and warning in real time according to the result.
2. The method of claim 1, wherein the first target feature is a key visual feature including, but not limited to, road features, vehicle features and road sign features; and taking all the key visual features as a comparison group in the lane line detection model, carrying out data enhancement on the key visual features, expanding a training data set, and carrying out outlier correction on the data.
3. A method of real time lane identification in a container port environment according to claim 2 wherein the second target feature is a scene environment feature including, but not limited to, port topography, port logo and port object. The scene prior perception module comprises port topography, port marks or port objects in a port environment.
4. The method for identifying the real-time lane in the container port environment according to claim 2, wherein the lane line detection model adopts a pyramid structure to extract and integrate different layers of characteristic layers, the characteristic layers specifically comprise an L0 characteristic layer, an L1 characteristic layer and an L2 characteristic layer, and the L0 characteristic layer, the L1 characteristic layer and the L2 characteristic layer are respectively arranged from high level to low level;
based on a feature fusion strategy of a spatial attention mechanism, pooling and fusing the features of the L0 feature layer and the L2 feature layer; the L2 feature layer extracts basic feature information of the first target image, and the L0 feature layer inputs a scene prior perception module to perform scene classification tasks.
5. The method of claim 4, wherein the L2 feature layer further comprises obtaining key visual features based on feature pooling of the learnable anchor points, and adaptively detecting lane lines in the container port environment based on the key visual features;
in the lane line detection task, each anchor point is defined as a four-dimensional vector reg_i = {S_ix, S_iy, θ_i, len_i}, where (S_ix, S_iy) are the normalized x, y coordinates of the starting point, θ_i indicates the direction, and len_i represents the length; the learnable anchor parameters reg_i are dynamically updated during training by a back propagation algorithm.
6. The method for recognizing real-time lanes in container port environment according to claim 5, wherein features of the L0 feature layer and the L2 feature layer are pooled and fused based on a feature fusion strategy of a spatial attention mechanism, specifically comprising the steps of:
performing feature transformation on the L0 feature layer and the L2 feature layer through the full connection layer, and then aligning in the space dimension through reshape operation;
after the full connection layer and the reshape, summing the two groups of feature images to obtain an initial fusion feature image;
calculating pixel-level self-attention scores for the initial fusion feature images to obtain a spatial attention weight image;
using the softmax layer, we regularize these self-attention scores into probability distributions in the range of 0 to 1;
element-level multiplication of the spatial attention weighting map is performed on the feature maps of the L0 feature layer and the L2 feature layer to produce a final fused feature map.
7. The method for recognizing real-time lanes in a container port environment according to claim 6, wherein the specific application logic of the scene priori aware module is:
the full connection layer maps the original output of the scene priori aware module to a new scene representation space;
based on the connection of the new scene representation space and the main output of the lane detection model, two new outputs, namely a classification task and a regression task, are obtained.
8. A real-time lane recognition system in a container port environment, characterized in that it is based on the implementation of a real-time lane recognition method in a container port environment according to any one of claims 1-7, comprising:
the image acquisition module is used for preprocessing the port environment image based on the port environment image captured by the vehicle-mounted monocular camera to obtain a first target feature and a second target feature; wherein the first target feature includes, but is not limited to, road, vehicle, and road sign key visual features; the second target feature is a scene feature;
the lane line detection model comprises a visual characteristic processing module and a scene priori perception module; extracting a first target feature based on feature pooling, and performing enhancement and outlier correction processing on the first target feature; extracting a second target feature based on the scene prior perception module;
the lane detection module inputs the first target feature and the second target feature into the lane line detection module for deep learning, outputs the lane line detection result of the current road in real time, and feeds back and warns in real time aiming at the result.
9. A computer program product stored on a computer readable medium, characterized by: comprising a computer readable program which, when executed on an electronic device, provides a user input interface to implement a method of real-time lane recognition in a container port environment according to any one of claims 1 to 8.
10. A computer-readable storage medium, characterized by: instructions stored which, when executed on a computer, cause the computer to perform a method for real-time lane recognition in a container port environment according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311654539.4A CN117671617A (en) | 2023-12-05 | 2023-12-05 | Real-time lane recognition method in container port environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311654539.4A CN117671617A (en) | 2023-12-05 | 2023-12-05 | Real-time lane recognition method in container port environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117671617A true CN117671617A (en) | 2024-03-08 |
Family
ID=90080303
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311654539.4A Pending CN117671617A (en) | 2023-12-05 | 2023-12-05 | Real-time lane recognition method in container port environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117671617A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118135527A (en) * | 2024-05-10 | 2024-06-04 | 北京中科慧眼科技有限公司 | Road scene perception method and system based on binocular camera and intelligent terminal |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | Springrobot: A prototype autonomous vehicle and its algorithms for lane detection | |
Alvarez et al. | Combining priors, appearance, and context for road detection | |
US9435885B2 (en) | Road-terrain detection method and system for driver assistance systems | |
CN102682292B (en) | Method based on monocular vision for detecting and roughly positioning edge of road | |
Yuan et al. | Robust lane detection for complicated road environment based on normal map | |
CN101701818B (en) | Method for detecting long-distance barrier | |
CN111178253A (en) | Visual perception method and device for automatic driving, computer equipment and storage medium | |
CN103206957B (en) | The lane detection and tracking method of vehicular autonomous navigation | |
Dewangan et al. | Towards the design of vision-based intelligent vehicle system: methodologies and challenges | |
CN106446785A (en) | Passable road detection method based on binocular vision | |
Yebes et al. | Learning to automatically catch potholes in worldwide road scene images | |
Zhang et al. | Road marking segmentation based on siamese attention module and maximum stable external region | |
CN112613434B (en) | Road target detection method, device and storage medium | |
CN112001378B (en) | Lane line processing method and device based on feature space, vehicle-mounted terminal and medium | |
CN117671617A (en) | Real-time lane recognition method in container port environment | |
Kuan et al. | Pothole detection and avoidance via deep learning on edge devices | |
Liu et al. | Real-time traffic light recognition based on smartphone platforms | |
Coronado et al. | Detection and classification of road signs for automatic inventory systems using computer vision | |
CN106650814B (en) | Outdoor road self-adaptive classifier generation method based on vehicle-mounted monocular vision | |
Dongtao et al. | Traffic sign detection method of improved ssd based on deep learning | |
TW201816745A (en) | Multifunctional intelligent driving auxiliary recording method and system improves the safety of vehicle during driving | |
CN111144361A (en) | Road lane detection method based on binaryzation CGAN network | |
CN113378992B (en) | Vehicle positioning method and system based on position identification | |
Han et al. | Research on negative obstacle detection method based on image enhancement and improved anchor box YOLO | |
Singh et al. | Smart traffic monitoring through real-time moving vehicle detection using deep learning via aerial images for consumer application |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||