Detailed Description
In order that the above-mentioned objects, features and advantages of the present application may be more clearly understood, the solution of the present application will be further described below. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, but the present application may be practiced in other ways than those described herein; it is to be understood that the embodiments described in this specification are only some embodiments of the present application and not all embodiments.
The embodiment of the application provides a road image labeling method for lane line identification, which is used to realize automatic labeling of lane lines in road images. Fig. 1 is a flowchart of a road image labeling method for lane line identification according to an embodiment of the present application. As shown in Fig. 1, the road image labeling method provided by the embodiment of the application includes steps S101 to S103.
S101: acquiring three-dimensional coordinate data of a lane line in a road, and acquiring the pose of a camera that captures the road.
The three-dimensional coordinate data of the lane line is data representing the position characteristics of the lane line in the road in a three-dimensional coordinate system. The three-dimensional coordinate system may be a vehicle coordinate system or a world coordinate system, which is not particularly limited in the embodiments of the present application.
The three-dimensional coordinate data of the lane line may be represented in the form of coordinates of points on the lane line, or may be represented by a spatial coordinate expression along the extending direction of the lane line, which is not particularly limited in the embodiment of the present application. In practical applications, if the three-dimensional coordinate data of the lane line is obtained by an image processing method, it is preferably represented in the form of coordinate points; if it is obtained from data such as a high-precision map, it is preferably represented as a spatial coordinate expression.
In the specific application of the embodiment of the application, the three-dimensional coordinate data of the lane line in the road may be acquired in the following ways.
1. A method based on three-dimensional reconstruction of the lane line from road images: taking at least two road images capturing the same area as the basis, a three-dimensional point cloud of the road is recovered, and the three-dimensional coordinate data of the lane line is extracted from the three-dimensional point cloud data of the road. Specifically, this comprises steps S1011 to S1012.
S1011: acquiring at least two road images.
The three-dimensional reconstruction method of the lane line based on the road image needs to utilize a plurality of road images shot in the same area to restore the three-dimensional characteristics of the road, so at least two road images need to be acquired. The aforementioned road image should be an image formed by photographing the same road area. It should be noted that the camera poses corresponding to the respective road images should be different.
In an application of the embodiment of the application, the road image labeling method for lane line identification is executed at a remote server; the remote server is in communication connection with a vehicle client and receives collected data generated and reported by the various sensors of the vehicle client. The collected data includes road images captured by the vehicle camera, position data of the vehicle, traveling direction data of the vehicle, vehicle attitude data, and the like. It is therefore possible to determine the vehicles traveling within a specific geographic range based on the vehicle position and traveling direction data, and to take the images captured by the cameras of those vehicles within that geographic range as the road images for three-dimensional reconstruction.
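By way of illustration, the following minimal Python sketch shows how reported frames might be filtered by geographic range; the `Frame` record and its field names are assumptions for illustration, not part of the embodiment.

```python
# Sketch (assumption): select reported frames whose GPS position falls inside a
# target bounding box; `lat`, `lon` and `heading` are illustrative field names.
from dataclasses import dataclass

@dataclass
class Frame:
    image_path: str   # road image captured by the vehicle camera
    lat: float        # vehicle latitude at capture time
    lon: float        # vehicle longitude at capture time
    heading: float    # traveling direction, degrees clockwise from north

def frames_in_range(frames, lat_min, lat_max, lon_min, lon_max):
    """Return frames captured inside the bounding box, for 3D reconstruction."""
    return [f for f in frames
            if lat_min <= f.lat <= lat_max and lon_min <= f.lon <= lon_max]
```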
S1012: three-dimensionally reconstructing the road from the at least two road images by a three-dimensional reconstruction method, and determining the three-dimensional coordinate data of the lane line.
In a specific implementation, determining the three-dimensional coordinate data of the lane line by three-dimensionally reconstructing the road from the at least two road images comprises the following steps: (1) acquiring matching feature points in the at least two road images; (2) determining, according to the matching feature points, the pose of the camera that formed each road image; (3) constructing a spatially dense point cloud representing the road according to the at least two road images and the corresponding camera poses; (4) obtaining a lane line point cloud from the spatially dense point cloud; and (5) determining the three-dimensional coordinate data of the lane line from the lane line point cloud.
(1) Acquiring the matching feature points in the at least two road images.
Acquiring the matching feature points in each road image specifically includes feature point extraction and feature point matching. Feature points are image pixels or pixel regions with representative characteristics in an image; in specific applications, most feature points are points where the gray levels of the neighboring pixels change sharply.
At present, methods for acquiring image feature points include: A. the weighted-average Harris-Laplace feature point extraction algorithm; B. feature extraction based on the scale-invariant feature transform, which detects local image features, searches for extreme points across spatial scales, and takes the extreme points as feature points; C. feature extraction based on speeded-up robust features, which borrows the simplification ideas of scale-invariant features and rapidly extracts feature points by means of integral images and Haar wavelets.
After the feature points of each road image are obtained, the feature points need to be matched to determine the matching feature points.
In this embodiment of the application, methods for determining matching feature points from the feature points of each road image may include: A. feature matching using normalized cross-correlation, which resists global brightness and contrast changes and is fast in processing; B. feature point matching based on scale-invariant features, which computes a feature vector for each feature point from its neighborhood and then determines matching feature points from the Euclidean distances between the feature vectors; C. matching based on speeded-up robust features.
It should be noted that, in practical applications, each road image needs to be preprocessed before the matching feature points are extracted. The goal of preprocessing is to improve the visual effect and definition of the image, selectively highlighting useful information and suppressing useless information. Image preprocessing includes smoothing of the image, for example by morphological filtering, bilateral filtering, adaptive mean filtering, adaptive median filtering or adaptive weighted filtering.
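By way of illustration, the following Python sketch combines the preprocessing and matching steps described above using OpenCV; it assumes SIFT is available (OpenCV 4.4+ or opencv-contrib), and the filter and ratio-test parameters are illustrative choices, not values prescribed by the embodiment.

```python
# A minimal sketch: bilateral smoothing, SIFT extraction, and descriptor
# matching with Lowe's ratio test; parameters are illustrative assumptions.
import cv2

def matched_feature_points(img_a_path, img_b_path):
    a = cv2.imread(img_a_path, cv2.IMREAD_GRAYSCALE)
    b = cv2.imread(img_b_path, cv2.IMREAD_GRAYSCALE)
    # Preprocessing: bilateral filtering smooths noise while keeping lane edges.
    a = cv2.bilateralFilter(a, d=9, sigmaColor=75, sigmaSpace=75)
    b = cv2.bilateralFilter(b, d=9, sigmaColor=75, sigmaSpace=75)
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(a, None)
    kp_b, des_b = sift.detectAndCompute(b, None)
    # Match descriptors by Euclidean distance, keep unambiguous matches only.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = matcher.knnMatch(des_a, des_b, k=2)
    good = [m for m, n in pairs if m.distance < 0.75 * n.distance]
    pts_a = [kp_a[m.queryIdx].pt for m in good]
    pts_b = [kp_b[m.trainIdx].pt for m in good]
    return pts_a, pts_b
```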
(2) Determining, according to the matching feature points, the pose of the camera that formed each road image.
Determining the camera poses from the matching feature points is a sparse point cloud reconstruction process: from the matching feature points it obtains both the pose of the camera that captured each road image and the three-dimensional coordinates of the sparse point cloud (the spatial points corresponding to the matching feature points).
According to the pinhole imaging model, the relationship between the position of a pixel point formed after the camera captures an image and the corresponding three-dimensional space coordinate point is

$$z_c \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = K\,[\,R \mid t\,]\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}, \qquad K = \begin{bmatrix} f & 0 & u \\ 0 & f & v \\ 0 & 0 & 1 \end{bmatrix},$$

where x and y are respectively the abscissa and ordinate of the pixel point in the image, and $z_c$ is the depth of the point in the camera coordinate system; K is the intrinsic parameter matrix of the camera, in which f is the focal length of the camera and u and v are the abscissa and ordinate of the camera principal point in pixels; $[R \mid t]$ is the pose matrix of the camera, in which R is the rotation matrix of the camera coordinate system relative to the world coordinate system and t is the translation vector of the origin of the camera coordinate system relative to the world coordinate system; and X, Y and Z are the spatial coordinates of the object point.
If K, R and t are known, a range of candidate X, Y and Z values can be predicted for each feature point of each road image; bundle adjustment over the matched feature points across the road images then determines the X, Y and Z values. In practical applications K is generally known while R and t are not; given sufficient road images and feature points, the parameters of R and t can also be solved. In this way, both the three-dimensional coordinates of the sparse point cloud and the camera pose $[R \mid t]$ can be determined.
In the specific application of the embodiment of the present application, the intrinsic parameter matrix K of each camera needs to be acquired. Methods for acquiring the camera intrinsic matrix K include the Tsai calibration method, the dot-template calibration method, Zhang Zhengyou's planar calibration method, and camera self-calibration.
In the specific application of the embodiment of the application, the camera poses and the three-dimensional coordinates of the sparse point cloud can be calculated by bundle adjustment based on the principle of minimizing the reprojection error. In practical applications, hierarchical reconstruction can be used to provide an effective initial value for the bundle adjustment, and incremental, global, or hybrid bundle adjustment can be adopted to calculate the pose of each camera and the three-dimensional coordinates of the sparse point cloud.
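As an illustrative sketch of the reprojection-error principle, the following Python code refines a single camera pose with SciPy by minimizing reprojection error, assuming known intrinsics K and known 3D points; a full bundle adjustment would additionally optimize all camera poses and the 3D points jointly.

```python
# A minimal sketch, not a full bundle adjustment: optimize one camera's
# rotation (Rodrigues vector) and translation against observed pixels.
import numpy as np
import cv2
from scipy.optimize import least_squares

def reprojection_residuals(params, K, pts3d, pts2d):
    # params: 6-vector [rvec, tvec]; residuals are per-point pixel errors.
    rvec, tvec = params[:3], params[3:6]
    proj, _ = cv2.projectPoints(pts3d, rvec, tvec, K, None)
    return (proj.reshape(-1, 2) - pts2d).ravel()

def refine_pose(K, pts3d, pts2d, rvec0, tvec0):
    # pts3d: (N, 3) float array; pts2d: (N, 2) observed pixel coordinates.
    x0 = np.hstack([np.ravel(rvec0), np.ravel(tvec0)])
    result = least_squares(reprojection_residuals, x0, args=(K, pts3d, pts2d))
    return result.x[:3], result.x[3:6]  # refined rotation and translation
```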
(3) Constructing a spatially dense point cloud representing the road according to the at least two road images and the corresponding camera poses. Reconstruction of the spatially dense point cloud computes, pixel by pixel and with the camera poses known, the three-dimensional coordinates in a spatial coordinate system corresponding to each pixel point of the road images, thereby obtaining the spatially dense point cloud in that coordinate system.
The basic principle of spatially dense point cloud computation is to find points in space with image consistency: among images representing the same scene, if a selected three-dimensional point lies on the surface of an object, then after the point is projected onto each image according to the intrinsic and extrinsic parameters of the cameras, the small regions centered on the projection points in the respective images should be very similar.
The measure of image consistency between two images can be obtained by comparing the regions around corresponding points using the Sum of Squared Differences (SSD), the Sum of Absolute Differences (SAD), or Normalized Cross-Correlation (NCC).
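A minimal Python sketch of the three consistency measures over two equally sized image patches follows; window extraction around the projection points is omitted.

```python
# SSD, SAD and NCC over two equally sized patches (NumPy arrays).
import numpy as np

def ssd(p, q):
    return float(np.sum((p.astype(float) - q.astype(float)) ** 2))

def sad(p, q):
    return float(np.sum(np.abs(p.astype(float) - q.astype(float))))

def ncc(p, q):
    # Zero-mean normalized cross-correlation; close to 1 means consistent.
    p = p.astype(float).ravel(); q = q.astype(float).ravel()
    p -= p.mean(); q -= q.mean()
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q) + 1e-12))
```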
In practical applications, a voxel-based method, a point cloud diffusion-based method, or a depth-map-fusion-based method can be adopted to determine the spatially dense point cloud; the depth-map-fusion-based method is the most commonly used.
Through the foregoing steps (1) to (3), a spatially dense point cloud representing road features may be determined, and then (4) and (5) may be performed.
(4) Obtaining a lane line point cloud from the spatially dense point cloud.
After the spatially dense point cloud is determined, the image with the clearest lane line can be selected from all the road images, the pixels corresponding to the lane line can be determined, and the spatially dense point cloud points corresponding to those pixels are taken as the lane line point cloud.
In some applications, where the lane line point cloud has a significant height difference relative to the road plane in the spatially dense point cloud, the lane line point cloud may also be determined by analysis of the spatially dense point cloud coordinate data.
(5) Determining the three-dimensional coordinate data of the lane line from the lane line point cloud.
In a specific implementation, to determine the three-dimensional coordinate data of the lane line from the lane line point cloud, the lane line point cloud data may be filtered, segmented and fused to determine a small number of three-dimensional coordinate points representing the lane line, and data fitting is then performed on these points to obtain the three-dimensional coordinate data representing the lane line.
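The following is a minimal Python sketch of one possible filtering-and-fitting scheme, assuming the lane line point cloud is an (N, 3) array extending mainly along the X axis; the outlier rule and polynomial degree are illustrative assumptions.

```python
# Sketch: filter lateral outliers, then fit the lane line as a spatial
# coordinate expression y(x), z(x); real pipelines would also segment by lane.
import numpy as np

def fit_lane_line(points, degree=3):
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    # Filtering: drop points far from the median lateral position.
    keep = np.abs(y - np.median(y)) < 3.0 * (np.std(y) + 1e-9)
    x, y, z = x[keep], y[keep], z[keep]
    # Fitting: low-order polynomials along the extending direction.
    poly_y = np.polynomial.Polynomial.fit(x, y, degree)
    poly_z = np.polynomial.Polynomial.fit(x, z, degree)
    return poly_y, poly_z   # evaluate with poly_y(x0), poly_z(x0)
```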
2. Data query-based method
The data query method determines the three-dimensional coordinate data of the lane line by querying high-precision map data.
A high-precision map contains a large amount of driving-related auxiliary information, including road data such as the position, type, width, gradient and curvature of the road lane lines. By querying these data in the high-precision map, the three-dimensional coordinate data of the lane line can be determined.
In the embodiment of the application, after the position of the vehicle is determined from the vehicle position information, the high-precision map is queried according to the position and the traveling direction of the vehicle, and the three-dimensional coordinate data of the corresponding lane line can then be determined.
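As a hedged illustration only, the following Python sketch shows one way lane lines stored with sample points might be retrieved near the vehicle position; the `HdMapLane` record and the radius query are hypothetical, not an actual high-precision map API.

```python
# Hypothetical sketch: a lane record holding sample points, and a radius
# query around the vehicle; field names are illustrative assumptions.
from dataclasses import dataclass
import numpy as np

@dataclass
class HdMapLane:
    lane_id: str
    points_xyz: np.ndarray   # (N, 3) sample points of the lane line

def lanes_near(lanes, vehicle_xy, radius=50.0):
    """Return lane lines with at least one sample point within `radius` meters."""
    out = []
    for lane in lanes:
        d = np.linalg.norm(lane.points_xyz[:, :2] - np.asarray(vehicle_xy), axis=1)
        if d.min() <= radius:
            out.append(lane)
    return out
```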
In step S101, there are several methods of acquiring the pose of the camera.
(1) Method for acquiring camera pose based on matching feature points
Determining the camera poses of the road images from the matching feature points is a sparse point cloud reconstruction process that obtains, from the matching feature points, the pose of the camera that captured each road image. This is the same determination method as used above when acquiring the three-dimensional coordinate data of the lane line; refer to the foregoing description for details.
(2) A method of determining the camera pose from vehicle positioning and attitude data.
In a specific embodiment, the vehicle is provided with sensors capable of acquiring vehicle positioning information and attitude information; in a specific application, these sensors may include one or more of an inertial sensor, a wheel speed sensor and a navigation sensor. From the sensor data they generate, the position data and the pose data of the vehicle can be computed. Subsequently, based on the position and pose data of the vehicle and the transformation between the camera coordinate system and the vehicle coordinate system, the pose of the camera can be determined.
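A minimal Python sketch of this transformation composition follows, assuming 4x4 homogeneous transforms: the vehicle pose comes from the positioning sensors, and the fixed camera-to-vehicle extrinsic comes from calibration.

```python
# Sketch: compose homogeneous transforms to get the camera pose in the world.
import numpy as np

def camera_pose(T_world_vehicle: np.ndarray, T_vehicle_camera: np.ndarray) -> np.ndarray:
    """Both inputs are 4x4 homogeneous transforms; returns T_world_camera."""
    return T_world_vehicle @ T_vehicle_camera

def extrinsics(T_world_camera: np.ndarray):
    """The [R | t] used for projection maps world points into the camera
    frame, i.e. the inverse of T_world_camera."""
    R = T_world_camera[:3, :3].T
    t = -R @ T_world_camera[:3, 3]
    return R, t
```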
S102: performing projection transformation on the three-dimensional coordinate data according to the pose to obtain two-dimensional projection data projected into the road image.
According to the pinhole imaging model formula given in step S101, after the pose of the camera is determined, and with the intrinsic parameters of the camera and the three-dimensional coordinate data of the lane line known, the two-dimensional lane line projection data (x, y) projected into the road image can be determined.
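A minimal Python sketch of step S102 follows, assuming K is the 3x3 intrinsic matrix and (R, t) map world coordinates into the camera frame; in practice, points behind the camera or outside the image bounds would also need to be clipped.

```python
# Sketch: project (N, 3) lane line world points to (N, 2) image pixels.
import numpy as np

def project_lane_points(K, R, t, pts3d):
    cam = R @ pts3d.T + t.reshape(3, 1)   # world -> camera frame
    uv = K @ cam                          # camera frame -> image plane
    return (uv[:2] / uv[2]).T             # perspective division: (x, y)
```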
S103: adopting the two-dimensional projection data as lane line labeling data of the road image.
In step S103, the two-dimensional projection data is used as the labeling data of the corresponding road image; the determined two-dimensional projection data serves as a label associated with the corresponding road image.
With the road image labeling method for lane line identification provided by the embodiment of the application, the three-dimensional coordinate data of the lane line and the pose of the camera are used directly, and the two-dimensional projection data obtained by projective transformation of the three-dimensional coordinate data serves as the lane line labeling data. This method automates lane line labeling and solves the problem of the high cost of existing manually labeled data.
Fig. 2 is a flowchart of a road image labeling method for lane line identification according to another embodiment of the present application. As shown in the figure, in some applications of the embodiments of the present application, the road image labeling method for lane line identification includes steps S104 and S105 in addition to the aforementioned steps S101 to S103.
S104: processing the road image with a historical lane line identification model to obtain lane line identification data.
The historical lane line identification model is obtained by training a deep learning model with historical sample data; the lane line identification model processes road images captured by the vehicle camera and determines the lane lines in the road.
The embodiment of the application does not limit the deep learning model used to construct the lane line identification model; any available deep learning algorithm model may be adopted.
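Purely as an illustrative sketch, the following Python code shows what step S104 could look like if the historical model were a semantic segmentation network saved with PyTorch; the model file, class index, and mask post-processing are all assumptions, not details of the embodiment.

```python
# Hypothetical sketch: run a pickled segmentation model over one image and
# extract lane line pixels; class index 1 = lane line is an assumption.
import numpy as np
import torch

def recognize_lane_lines(model_path, image_chw: np.ndarray):
    # Assumes the entire nn.Module was saved with torch.save(model, path).
    model = torch.load(model_path, map_location="cpu")
    model.eval()
    with torch.no_grad():
        x = torch.from_numpy(image_chw).float().unsqueeze(0)   # (1, C, H, W)
        mask = model(x).squeeze(0).argmax(dim=0).numpy()       # per-pixel class
    ys, xs = np.nonzero(mask == 1)
    return np.stack([xs, ys], axis=1)   # (M, 2) lane line pixel coordinates
```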
S105: judging whether the road image is a bad sample image according to the lane line labeling data and the lane line identification data.
In the embodiment of the present application, step S105 may specifically be performed as steps S1051 to S1054.
S1051: calculating the difference between the lane line labeling data and the lane line identification data.
In practical applications there may be multiple pieces of lane line labeling data and lane line identification data. Calculating their difference therefore means determining, for each piece of lane line labeling data, the corresponding (closest) piece of lane line identification data, computing the difference between each corresponding pair, and taking the average of these differences as the difference between the lane line labeling data and the lane line identification data.
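A minimal Python sketch of this matching-and-averaging computation follows, assuming each lane line is given as an (N, 2) polyline of pixel coordinates.

```python
# Sketch: match each labeled line to its nearest recognized line and
# average the point-to-set distances; the result feeds the S1052 threshold.
import numpy as np

def line_distance(a, b):
    """Mean distance from points of line `a` to the nearest point of line `b`."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)   # (Na, Nb)
    return float(d.min(axis=1).mean())

def labeling_vs_recognition(labeled_lines, recognized_lines):
    diffs = [min(line_distance(lab, rec) for rec in recognized_lines)
             for lab in labeled_lines]
    return float(np.mean(diffs))
```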
S1052: judging whether the difference is larger than a set threshold value or not; if yes, executing S1053; if not, S1054 is performed.
S1053: and determining the road image as a bad sample image.
S1054: and determining the road image as a non-bad sample image.
In the embodiment of the application, the bad sample image is an image of which the obtained lane line identification data does not meet the set requirement after the historical lane line identification model is adopted for processing. In the embodiment of the present application, the setting requirement is embodied by a difference between the lane marking data and the lane identification data of the setting requirement. Because the bad sample image can not be processed by the historical lane line recognition model to obtain more reasonable lane line recognition data, the bad sample image can be used as a basis for retraining the lane line recognition model.
In practical application, if the difference is larger than a set threshold, the lane line identification data is determined to be larger than the lane line marking data; and because the lane line marking data is more accurate data, the lane line identification data determined by the historical lane line identification model is determined to be inaccurate, and the historical lane line identification model is determined reversely and cannot process the road image well, so that the road image can be used as a bad sample image, and the bad sample image can be used as a sample of a subsequently retrained lane line identification model to improve the identification capability of the lane line identification model.
In addition to the road image labeling method for lane line identification, the embodiment of the application also provides a road image labeling device for lane line identification, based on the same inventive concept as the method.
Fig. 3 is a schematic structural diagram of a road image labeling device for lane line identification according to an embodiment of the present application. As shown in Fig. 3, the road image labeling device for lane line identification according to the embodiment of the present application includes a data acquisition unit 11, a projection calculation unit 12, and a calibration unit 13.
The data acquisition unit 11 is used for acquiring three-dimensional coordinate data of a lane line in a road and acquiring the pose of a camera shooting the road.
The three-dimensional coordinate data of the lane line is data representing the position characteristics of the lane line in the road in a three-dimensional coordinate system. The three-dimensional coordinate system may be a vehicle coordinate system or a world coordinate system, and the embodiment of the present application is not particularly limited.
The three-dimensional coordinate data of the lane line may be represented in the form of coordinates of points on the lane line, or may be represented by a spatial coordinate expression along the extending direction of the lane line, which is not particularly limited in the embodiment of the present application. In practical applications, if the three-dimensional coordinate data of the lane line is obtained by an image processing method, it is preferably represented in the form of coordinate points; if it is obtained from data such as a high-precision map, it is preferably represented as a spatial coordinate expression.
In some applications of the embodiments of the present application, the data acquisition unit 11 acquires three-dimensional coordinate data of a lane line in a road in the following ways.
1. A method based on three-dimensional reconstruction of the lane line from road images: based on at least two road images containing image information of the same lane line, a three-dimensional point cloud of the road is recovered, and the three-dimensional coordinate data of the lane line is extracted from the three-dimensional point cloud data of the road.
Specifically, the data acquiring unit 11 includes a picture acquiring subunit and a three-dimensional reconstruction subunit.
The image acquisition subunit is used for acquiring at least two road images; the three-dimensional reconstruction method of the lane line based on the road image needs to utilize a plurality of road images shot in the same area to restore the three-dimensional characteristics of the road, so at least two road images need to be acquired. The aforementioned road image should be an image formed by photographing the same road area.
It should be noted that the camera shooting poses corresponding to the respective road images should be different.
In one application of the embodiment of the application, the road image marking device for lane line identification is deployed at a remote server, and the remote server is in communication connection with a vehicle client and receives collected data generated and reported by various sensors of the vehicle client. The collected data includes road images captured by a vehicle camera, position data of the vehicle, traveling direction data of the vehicle, vehicle attitude data, and the like. Therefore, it is possible to determine a vehicle traveling to a specific geographic range based on the position data of the vehicle and the traveling direction data of the vehicle, and to take an image captured by a camera of the vehicle in the specific geographic range as a road image for three-dimensional reconstruction.
The three-dimensional reconstruction subunit is used for three-dimensionally reconstructing the road from the at least two road images by a three-dimensional reconstruction method and determining the three-dimensional coordinate data of the lane line.
In specific implementation, the three-dimensional reconstruction subunit comprises a matching feature point acquisition subunit, a pose acquisition subunit, a dense point cloud acquisition subunit, a lane line point cloud acquisition subunit and a three-dimensional coordinate data calculation subunit.
The matching feature point acquisition subunit is used for acquiring the matching feature points in the at least two road images. Acquiring the matching feature points in each road image specifically includes feature point extraction and feature point matching. Feature points are image pixels or pixel regions with representative characteristics in an image; in specific applications, most feature points are points where the gray levels of the neighboring pixels change sharply.
At present, methods for acquiring image feature points include: A. the weighted-average Harris-Laplace feature point extraction algorithm; B. feature extraction based on the scale-invariant feature transform, which detects local image features, searches for extreme points across spatial scales, and takes the extreme points as feature points; C. feature extraction based on speeded-up robust features, which borrows the simplification ideas of scale-invariant features and rapidly extracts feature points by means of integral images and Haar wavelets.
After the feature points of each road image are obtained, the feature points need to be matched to determine the matching feature points.
In this embodiment of the application, methods for determining matching feature points from the feature points of each road image may include: A. feature matching using normalized cross-correlation, which resists global brightness and contrast changes and is fast in processing; B. feature point matching based on scale-invariant features, which computes a feature vector for each feature point from its neighborhood and then determines matching feature points from the Euclidean distances between the feature vectors; C. matching based on speeded-up robust features.
It should be noted that, in practical applications, each road image needs to be preprocessed before the matching feature points are extracted. The goal of preprocessing is to improve the visual effect and definition of the image, selectively highlighting useful information and suppressing useless information. Image preprocessing includes smoothing of the image, for example by morphological filtering, bilateral filtering, adaptive mean filtering, adaptive median filtering or adaptive weighted filtering.
The pose acquisition subunit is used for determining, according to the matching feature points, the pose of the camera that formed each road image.
Determining the camera poses from the matching feature points is a sparse point cloud reconstruction process: from the matching feature points it obtains both the pose of the camera that captured each road image and the three-dimensional coordinates of the sparse point cloud (the spatial points corresponding to the matching feature points).
According to the pinhole imaging model, the relationship between the position of a pixel point formed after the camera captures an image and the corresponding three-dimensional space coordinate point is

$$z_c \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = K\,[\,R \mid t\,]\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}, \qquad K = \begin{bmatrix} f & 0 & u \\ 0 & f & v \\ 0 & 0 & 1 \end{bmatrix},$$

where x and y are respectively the abscissa and ordinate of the pixel point in the image, and $z_c$ is the depth of the point in the camera coordinate system; K is the intrinsic parameter matrix of the camera, in which f is the focal length of the camera and u and v are the abscissa and ordinate of the camera principal point in pixels; $[R \mid t]$ is the pose matrix of the camera, in which R is the rotation matrix of the camera coordinate system relative to the world coordinate system and t is the translation vector of the origin of the camera coordinate system relative to the world coordinate system; and X, Y and Z are the spatial coordinates of the object point.
If K, R and t are known, a range of candidate X, Y and Z values can be predicted for each feature point of each road image; bundle adjustment over the matched feature points across the road images then determines the X, Y and Z values. In practical applications K is generally known while R and t are not; given sufficient road images and feature points, the parameters of R and t can also be solved. In this way, both the three-dimensional coordinates of the sparse point cloud and the camera pose $[R \mid t]$ can be determined.
In the specific application of the embodiment of the present application, the intrinsic parameter matrix K of each camera needs to be acquired. Methods for acquiring the camera intrinsic matrix K include the Tsai calibration method, the dot-template calibration method, Zhang Zhengyou's planar calibration method, and camera self-calibration.
In the specific application of the embodiment of the application, the camera poses and the three-dimensional coordinates of the sparse point cloud can be calculated by bundle adjustment based on the principle of minimizing the reprojection error. In practical applications, hierarchical reconstruction can be used to provide an effective initial value for the bundle adjustment, and incremental, global, or hybrid bundle adjustment can be adopted to calculate the pose of each camera and the three-dimensional coordinates of the sparse point cloud.
The three-dimensional reconstruction subunit determines the three-dimensional coordinate data of the lane line by three-dimensionally reconstructing the road from the at least two road images, performing the following steps in sequence: (1) acquiring matching feature points in the at least two road images; (2) determining, according to the matching feature points, the pose of the camera that formed each road image; (3) constructing a spatially dense point cloud representing the road according to the at least two road images and the corresponding camera poses; (4) obtaining a lane line point cloud from the spatially dense point cloud; and (5) determining the three-dimensional coordinate data of the lane line from the lane line point cloud.
The dense point cloud acquisition subunit is used for constructing a spatially dense point cloud representing the road according to the at least two road images and the corresponding camera poses. Reconstruction of the spatially dense point cloud computes, pixel by pixel and with the camera poses known, the three-dimensional coordinates in a spatial coordinate system corresponding to each pixel point of the road images, thereby obtaining the spatially dense point cloud in that coordinate system. The basic principle of spatially dense point cloud computation is to find points in space with image consistency: among images representing the same scene, if a selected three-dimensional point lies on the surface of an object, then after the point is projected onto each image according to the intrinsic and extrinsic parameters of the cameras, the small regions centered on the projection points in the respective images should be very similar.
The measure of image consistency between two images can be obtained by comparing the regions around corresponding points using the Sum of Squared Differences (SSD), the Sum of Absolute Differences (SAD), or Normalized Cross-Correlation (NCC).
In practical applications, a voxel-based method, a point cloud diffusion-based method, or a depth-map-fusion-based method can be adopted to determine the spatially dense point cloud; the depth-map-fusion-based method is the most commonly used.
The lane line point cloud acquisition subunit is used for obtaining the lane line point cloud from the spatially dense point cloud.
After the spatially dense point cloud is determined, the image with the clearest lane line can be selected from all the road images, the pixels corresponding to the lane line can be determined, and the spatially dense point cloud points corresponding to those pixels are taken as the lane line point cloud.
In some applications, where the lane line point cloud has a significant height difference relative to the road plane in the spatially dense point cloud, the lane line point cloud may also be determined by analysis of the spatially dense point cloud coordinate data.
The three-dimensional coordinate data calculation subunit is used for determining the three-dimensional coordinate data of the lane line from the lane line point cloud.
In a specific implementation, to determine the three-dimensional coordinate data of the lane line from the lane line point cloud, the lane line point cloud data may be filtered, segmented and fused to determine a small number of three-dimensional coordinate points representing the lane line, and data fitting is then performed on these points to obtain the three-dimensional coordinate data representing the lane line.
In other embodiments, the data acquisition unit 11 determines the three-dimensional coordinate data of the lane line by querying high-precision map data.
A high-precision map contains a large amount of driving-related auxiliary information, including road data such as the position, type, width, gradient and curvature of the road lane lines. By querying these data in the high-precision map, the three-dimensional coordinate data of the lane line can be determined.
In the embodiment of the application, after the position of the vehicle is determined from the vehicle position information, the high-precision map is queried according to the position and the traveling direction of the vehicle, and the three-dimensional coordinate data of the corresponding lane line can then be determined.
The data acquisition unit 11 acquires the pose of the camera in the following ways.
(1) Method for acquiring camera pose based on matching feature points
Determining the camera poses of the road images from the matching feature points is a sparse point cloud reconstruction process that obtains, from the matching feature points, the pose of the camera that captured each road image. This is the same determination method as used above when acquiring the three-dimensional coordinate data of the lane line; refer to the foregoing description for details.
(2) A method of determining the camera pose from vehicle positioning and attitude data.
In a specific embodiment, the vehicle is provided with sensors capable of acquiring vehicle positioning information and attitude information; in a specific application, these sensors may include one or more of an inertial sensor, a wheel speed sensor and a navigation sensor. From the sensor data they generate, the position data and the pose data of the vehicle can be computed. Subsequently, based on the position and pose data of the vehicle and the transformation between the camera coordinate system and the vehicle coordinate system, the pose of the camera can be determined.
The projection calculation unit 12 is configured to perform projection transformation on the three-dimensional coordinate data according to the pose, so as to obtain two-dimensional projection data projected into the road image.
According to the pinhole imaging model formula given above, after the pose of the camera is determined, and with the intrinsic parameters of the camera and the three-dimensional coordinate data of the lane line known, the two-dimensional lane line projection data (x, y) projected into the road image can be determined.
The calibration unit 13 is configured to use the two-dimensional projection data as the lane line labeling data of the road image: it uses the two-dimensional projection data as the labeling data of the corresponding road image and associates the determined two-dimensional projection data, as a label, with the corresponding road image.
In one application of the embodiment of the present application, the road image labeling device for lane line identification includes a model calculation unit 14 and a bad sample identification unit 15 in addition to the aforementioned data acquisition unit 11, projection calculation unit 12 and calibration unit 13.
The model calculation unit 14 is configured to process the road image by using a historical lane line identification model to obtain lane line identification data.
The historical lane line identification model is obtained by training a deep learning model with historical sample data; the lane line identification model processes road images captured by the vehicle camera and determines the lane lines in the road.
The embodiment of the application does not limit the deep learning model used to construct the lane line identification model; any available deep learning algorithm model may be adopted.
The bad sample identification unit 15 is configured to determine whether the road image is a bad sample image according to the lane line marking data and the lane line identification data.
In a specific application, the bad sample identification unit 15 determines whether the road image is a bad sample image by calculating the difference between the lane line labeling data and the lane line identification data and judging whether the difference is greater than a set threshold.
If the difference is greater than the set threshold, the road image is determined to be a bad sample image; otherwise, the road image is determined not to be a bad sample image.
In practical applications there may be multiple pieces of lane line labeling data and lane line identification data. Calculating their difference therefore means determining, for each piece of lane line labeling data, the corresponding (closest) piece of lane line identification data, computing the difference between each corresponding pair, and taking the average of these differences as the difference between the lane line labeling data and the lane line identification data.
In the embodiment of the application, a bad sample image is an image for which the lane line identification data obtained after processing by the historical lane line identification model does not meet a set requirement. In the embodiment of the present application, the set requirement is embodied as the difference between the lane line labeling data and the lane line identification data. Because the historical lane line identification model cannot produce reasonable lane line identification data for a bad sample image, bad sample images can serve as the basis for retraining the lane line identification model.
In practical applications, if the difference is greater than the set threshold, the lane line identification data is determined to deviate substantially from the lane line labeling data. Since the lane line labeling data is the more accurate data, the lane line identification data determined by the historical lane line identification model is judged inaccurate, which in turn indicates that the historical lane line identification model cannot handle this road image well. The road image can therefore serve as a bad sample image, and bad sample images can be used as samples for subsequently retraining the lane line identification model to improve its identification capability.
The embodiment of the application also provides an electronic device for implementing the road image labeling method for lane line identification. Fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. As shown in Fig. 4, the electronic device comprises at least one processor 21, at least one memory 22 and at least one communication interface 23.
The memory 22 in this embodiment may be either volatile memory or nonvolatile memory, or a combination of the two. In some embodiments, memory 22 stores the following elements: executable units or data structures, or a subset thereof, or an expanded set thereof: an operating system and an application program. The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic tasks and processing hardware-based tasks. The application programs include application programs for various application tasks.
In the embodiment of the present application, the processor 21 executes the steps of the road image labeling method for lane line identification by calling a program or an instruction (specifically, a program or an instruction stored in an application program) stored in the memory 22.
In the embodiment of the present application, the processor 21 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The steps of the road image labeling method for lane line identification provided by the embodiment of the application can be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software units in a decoding processor. The software units may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory 22, and the processor 21 reads the information in the memory 22 and performs the steps of the method in combination with its hardware.
The communication interface 23 is used for implementing information transmission between the intelligent driving control system and the external device, for example, to obtain various vehicle sensor data, and generate and issue corresponding control instructions to the vehicle actuator.
The memory and processor components in the electronic device are coupled together by a bus system 24, and the bus system 24 is used to enable communication among the components. In the embodiment of the present application, the bus system may be a CAN bus or another type of bus. In addition to a data bus, the bus system 24 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, the various buses are labeled as bus system 24 in Fig. 4.
The embodiments of the present application further provide a non-transitory computer-readable storage medium, where the non-transitory computer-readable storage medium stores a program or an instruction, and the program or the instruction enables a computer to execute the steps of the embodiment of the road image labeling method for lane line identification, which are not described herein again to avoid repeated descriptions.
It is noted that, in this document, relational terms such as "first" and "second" may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.