WO2024193208A1

WO2024193208A1 - Deep learning-based signal light recognition method and apparatus

Info

Publication number: WO2024193208A1
Application number: PCT/CN2024/072682
Authority: WO
Inventors: 陈安猛; 陈远鹏; 胡文博; 冷静; 宣经纬; 薛鹏; 张军良; 张文海
Original assignee: 合众新能源汽车股份有限公司
Priority date: 2023-03-23
Filing date: 2024-01-17
Publication date: 2024-09-26
Also published as: CN116503832A

Abstract

A deep learning-based signal light recognition method and apparatus, relating to the technical field of autonomous driving, for use in signal light recognition in autonomous driving. The main objective is to predict distance information between a signal light and an ego vehicle, and to assist in decision-making for subsequent planning of the vehicle. The main technical solution is as follows: acquiring target images at a preset time interval; on the basis of the target images, obtaining, by using a preset deep neural network prediction model, lamp panel information and light source information of a signal light to be recognized; on the basis of the lamp panel information and the light source information, forming a corresponding signal light queue by using a preset rule, and adding the signal light queue into a preset signal light queue set; on the basis of destination information and current position information of a vehicle, determining the signal light queue corresponding to a target signal light from the preset signal light queue set; and adjusting the speed of the vehicle by using a preset speed change rule on the basis of distance information between the target signal light and the vehicle and a light source category stored in the signal light queue corresponding to the target signal light.

Description

A signal light recognition method and device based on deep learning

Technical Field

The present invention relates to the field of autonomous driving technology, and in particular to a signal light recognition method and device based on deep learning.

Background Art

In an autonomous driving scenario, a vehicle needs to promptly identify environmental information around the road, such as zebra crossings, traffic lights, etc. The vehicle can ensure driving safety by performing autonomous driving based on the identified environmental information. Among them, the recognition of traffic lights is particularly important for the safety of autonomous driving.

At present, signal light recognition is performed through 2D target detection. However, because signal lights on the road differ in model and size, it is difficult to obtain the distance between the signal light and the vehicle from scale information through 2D target detection alone. In autonomous driving scenarios, the lack of accurate distance information causes decision errors and lags.

Summary of the invention

In view of the above problems, the present invention provides a signal light recognition method and device based on deep learning. The main purpose is, for the scenario in which an autonomous driving vehicle passes through an intersection with signal lights, to predict the distance information between the signal light and the vehicle and to assist in decision-making for subsequent planning of the vehicle.

In order to solve the above technical problems, the present invention proposes the following solutions:

In a first aspect, the present invention provides a signal light recognition method based on deep learning, the method comprising:

Acquire a target image at a preset time interval, wherein the target image contains at least one signal light to be identified;

Based on the target image, a preset deep neural network prediction model is used to obtain the lamp panel position, light source position, light source type and light source depth of the signal light to be identified;

Based on the lamp panel position, light source position, light source type and light source depth of the signal light to be identified, a signal light queue corresponding to the signal light to be identified is formed using a preset rule and added to the preset signal light queue set; wherein the signal light queue stores, for the signal light to be identified and within a preset time range, the lamp panel position, light source position, light source type and distance information between the signal light to be identified and the vehicle;

Based on the destination information and current location information of the vehicle, determining a signal light queue corresponding to a target signal light from the preset signal light queue set, wherein the target signal light is a signal light set at the intersection through which the vehicle is to pass;

Based on the distance information between the target signal light and the vehicle and the light source category stored in the signal light queue corresponding to the target signal light, the speed of the vehicle is adjusted using a preset speed change rule so as to pass through the intersection that the vehicle is preparing to pass.

In a second aspect, the present invention provides a signal light recognition device based on deep learning, the device comprising:

A first acquisition unit, configured to acquire a target image at a preset time interval, wherein the target image includes at least one signal light to be identified;

A prediction unit, configured to obtain, based on the target image, a lamp panel position, a light source position, a light source type, and a light source depth of the signal light to be identified by using a preset deep neural network prediction model;

A forming unit, configured to form a signal light queue corresponding to the signal light to be identified by using a preset rule based on the light panel position, light source position, light source type and light source depth of the signal light to be identified, and add the signal light queue to a preset signal light queue set; wherein the signal light queue stores the light panel position, light source position, light source type and distance information between the signal light to be identified and the vehicle within a preset time range corresponding to the signal light to be identified;

A first determining unit is used to determine a signal light queue corresponding to a target signal light from the preset signal light queue set based on the destination information and current position information of the vehicle, wherein the target signal light is a signal light set at the intersection through which the vehicle is to pass;

An adjustment unit is used to adjust the speed of the vehicle using a preset speed change rule based on the distance information between the target signal light and the vehicle and the light source category stored in the signal light queue corresponding to the target signal light, so as to pass the intersection that the vehicle is preparing to pass.

In order to achieve the above-mentioned purpose, according to the third aspect of the present invention, a storage medium is provided, wherein the storage medium includes a stored program, wherein when the program is running, the device where the storage medium is located is controlled to execute the traffic light recognition method based on deep learning described in the first aspect.

In order to achieve the above object, according to a fourth aspect of the present invention, there is provided an electronic device, comprising a memory, a processor, and a computer program stored in the memory and operable on the processor, wherein the processor, when executing the program, implements all or part of the steps of the signal light recognition device based on deep learning as described in the second aspect.

Through the above technical solution, the present invention provides a signal light recognition method and device based on deep learning. Signal light recognition is currently performed through 2D target detection. However, because signal lights on the road differ in model and size, it is difficult to obtain the distance between the signal light and the vehicle from scale information through 2D target detection alone, and in autonomous driving scenarios the lack of accurate distance information causes decision errors and lags. To this end, the present invention obtains a target image at a preset time interval, wherein the target image contains at least one signal light to be identified; based on the target image, a preset deep neural network prediction model is used to obtain the lamp panel position, light source position, light source category and light source depth of the signal light to be identified; based on the lamp panel position, light source position, light source category and light source depth of the signal light to be identified, a signal light queue corresponding to the signal light to be identified is formed using a preset rule and added to a preset signal light queue set; wherein the signal light queue stores, within a preset time range, the lamp panel position, light source position, light source category and distance information between the signal light to be identified and the vehicle; based on the destination information and current position information of the vehicle, a signal light queue corresponding to a target signal light is determined from the preset signal light queue set, wherein the target signal light is a signal light set at the intersection through which the vehicle is to pass; based on the distance information between the target signal light and the vehicle and the light source category stored in the signal light queue corresponding to the target signal light, a preset speed change rule is used to adjust the vehicle speed so as to pass through the intersection through which the vehicle is to pass. The present invention combines depth estimation with target detection to predict signal lights with depth information. Now that telephoto lenses are standard equipment, signal lights can be detected earlier and used as an auxiliary distance reference in scenarios where high-precision maps or GPS are unavailable, thereby enhancing the robustness of the autonomous driving system and avoiding shortcomings of 2D signal light recognition, such as the decision errors and lags caused by the lack of distance information.

The above description is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented according to the contents of the specification, and so that the above and other purposes, features and advantages of the present invention are more apparent, specific embodiments of the present invention are set out below.

BRIEF DESCRIPTION OF THE DRAWINGS

Various other advantages and benefits will become apparent to those skilled in the art upon reading the following detailed description of the preferred embodiments. The drawings are intended only to illustrate the preferred embodiments and are not to be considered as limiting the present invention. In addition, the same reference symbols are used to represent the same components throughout the drawings. In the drawings:

FIG. 1 shows a flow chart of a signal light recognition method based on deep learning provided by an embodiment of the present invention;

FIG. 2 shows a flow chart of another signal light recognition method based on deep learning provided by an embodiment of the present invention;

FIG. 3 shows a block diagram of a signal light recognition device based on deep learning provided by an embodiment of the present invention;

FIG. 4 shows a block diagram of another signal light recognition device based on deep learning provided by an embodiment of the present invention;

FIG. 5 shows an application scenario diagram corresponding to another signal light recognition device based on deep learning provided by an embodiment of the present invention.

DETAILED DESCRIPTION

The exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although the exemplary embodiments of the present disclosure are shown in the accompanying drawings, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth herein. On the contrary, these embodiments are provided in order to enable a more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

At present, signal light recognition is accomplished through 2D target detection. However, because signal lights on the road differ in type and size, it is difficult to obtain the distance between the signal light and the vehicle from scale information through 2D target detection alone. In autonomous driving scenarios, the lack of accurate distance information causes decision errors and lags. To address this problem, the inventors conceived of combining depth estimation with target detection to predict signal lights with depth information. Now that telephoto lenses are standard equipment, signal lights can be detected earlier and can serve as an auxiliary distance reference in scenarios where high-precision maps or GPS are unavailable, thereby enhancing the robustness of the autonomous driving system.

To this end, an embodiment of the present invention provides a signal light recognition method based on deep learning. The method can predict the distance information between the signal light and the vehicle when the autonomous driving vehicle passes through an intersection with a signal light, and assist in decision-making for the subsequent planning of the vehicle. The specific execution steps are shown in FIG1, including:

101. Acquire a target image at a preset time interval.

Wherein, the target image contains at least one signal light to be identified.

Traffic lights are a category of traffic safety product and an important tool for strengthening road traffic management, reducing traffic accidents, improving road use efficiency, and improving traffic conditions. They are suitable for crossroads, T-shaped intersections and similar junctions, and are controlled by road traffic signal controllers to guide vehicles and pedestrians to pass safely and in an orderly manner. Traffic lights are composed of red, green, and yellow lights. A red light indicates that passage is prohibited, a green light indicates that passage is allowed, and a yellow light indicates slow travel or a warning. The "Road Traffic Law Implementation Regulations" divide traffic lights into: motor vehicle lights, non-motor vehicle lights, pedestrian crossing lights, lane lights, direction indicator lights, flashing warning lights, and road and railway level crossing lights.

While the vehicle is driving automatically, if it encounters an intersection, it needs to identify the traffic lights at the intersection and adjust its speed according to their indications in order to pass through the intersection. This embodiment first acquires the target image at a preset time interval. The preset time interval can be every second, every 2 seconds, etc., and is set according to actual needs; this embodiment does not make specific limitations. Since an intersection is usually cross-shaped or T-shaped, the image obtained by the vehicle may contain 3, 2, etc. traffic lights facing different directions, but before the next step is executed, the image must contain at least one traffic light to be identified.
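As a minimal sketch of step 101, assuming frames arrive as (timestamp, image) pairs, sampling at a preset interval could look like the following. The names `sample_target_images` and `has_signal_light` are hypothetical and do not come from this disclosure; the predicate stands in for whatever check confirms that an image contains at least one signal light.

```python
from typing import Callable, Iterable, Iterator, Tuple

Frame = Tuple[float, object]  # (timestamp in seconds, image payload)


def sample_target_images(
    frames: Iterable[Frame],
    interval_s: float,
    has_signal_light: Callable[[object], bool],
) -> Iterator[Frame]:
    """Yield at most one frame per preset interval, keeping only frames
    that contain at least one signal light (hypothetical predicate)."""
    next_due = None
    for timestamp, image in frames:
        if next_due is not None and timestamp < next_due:
            continue  # still inside the current interval
        next_due = timestamp + interval_s
        if has_signal_light(image):
            yield timestamp, image


# Usage: a 2 Hz stream sampled once per second; strings stand in for images.
stream = [(0.0, "light"), (0.5, "light"), (1.0, "empty"), (1.5, "light"), (2.0, "light")]
sampled = list(sample_target_images(stream, 1.0, lambda img: img == "light"))
```

In this sketch a sampled frame without a signal light is simply skipped until the next interval; other policies are equally compatible with the text.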

102. Based on the target image, a preset deep neural network prediction model is used to obtain the lamp panel position, light source position, light source type and light source depth of the signal light to be identified.

From step 101, the target image can be obtained, and then the lamp panel position, light source position, light source category and light source depth of the signal light to be identified are obtained by using a preset deep neural network prediction model; wherein the preset deep neural network prediction model is pre-designed, and includes an input image module, a feature extraction module, a feature fusion module, a depth prediction module, a signal light panel frame prediction module, a ROI extraction module, a light source regression classification module, etc., which are not specifically limited in this embodiment;

The input image module obtains the target image as the input data of the preset deep neural network prediction model. The feature extraction module extracts features, and the feature fusion module then fuses the extracted features. Based on the fused information, the depth prediction module obtains the light source depth, and the signal light panel frame prediction module obtains the light panel frame information, which includes the position of the light panel. The ROI extraction module matches the light panel frame information with the features obtained by the feature extraction module and extracts ROI features. Based on the ROI features, the light source regression classification module obtains the light source position and light source category. The light source category is classified according to light source color, pattern, etc., and indicates, for example, whether the vehicle may go straight or turn. This embodiment does not make specific limitations.
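The data flow between the modules can be sketched as below. This is only a wiring illustration under stated assumptions: every module is a placeholder callable rather than a trained network, all class and field names are hypothetical, and reading one depth value per predicted panel is an assumption, since the text does not say how depth is associated with each signal light.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


@dataclass
class LightSource:
    box: Box
    category: str


@dataclass
class SignalLightPrediction:
    panel_box: Box
    depth_m: float
    sources: List[LightSource]


@dataclass
class PredictionPipeline:
    """Wires the modules together; each field is a placeholder callable."""
    extract_features: Callable[[Any], Any]
    fuse_features: Callable[[Any], Any]
    predict_depth: Callable[[Any, Box], float]
    predict_panel_boxes: Callable[[Any], List[Box]]
    extract_roi: Callable[[Any, Box], Any]
    regress_and_classify: Callable[[Any], List[LightSource]]

    def run(self, image: Any) -> List[SignalLightPrediction]:
        features = self.extract_features(image)
        fused = self.fuse_features(features)
        results = []
        for panel_box in self.predict_panel_boxes(fused):
            # ROI features are taken from the extracted (not fused) features,
            # matched against the predicted panel box.
            roi = self.extract_roi(features, panel_box)
            results.append(
                SignalLightPrediction(
                    panel_box=panel_box,
                    depth_m=self.predict_depth(fused, panel_box),  # assumption: per-panel depth
                    sources=self.regress_and_classify(roi),
                )
            )
        return results


# Usage with trivial stand-ins for the learned modules.
pipeline = PredictionPipeline(
    extract_features=lambda image: {"feat": image},
    fuse_features=lambda feats: {"fused": feats},
    predict_depth=lambda fused, box: 42.0,
    predict_panel_boxes=lambda fused: [(100.0, 50.0, 130.0, 120.0)],
    extract_roi=lambda feats, box: (feats, box),
    regress_and_classify=lambda roi: [LightSource((105.0, 55.0, 125.0, 75.0), "red")],
)
predictions = pipeline.run("frame-0")
```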

103. Based on the lamp panel position, light source position, light source type and light source depth of the signal light to be identified, a signal light queue corresponding to the signal light to be identified is formed using preset rules and added to the preset signal light queue set.

Here, the signal light queue stores, for the signal light to be identified and within a preset time range, the lamp panel position, light source position, light source type and distance information between the signal light to be identified and the vehicle. The distance information is derived from the light source depth: it can be the value of the light source depth itself, or the distance obtained by correcting the light source depth with the light source scale information.

A signal light panel corresponds to a signal light queue, that is, a signal light queue includes all the information of a signal light, such as: light panel information, light source related information (light source position, light source type and light source depth), etc.

The order in which the formed signal light queues are added to the preset signal light queue set can be set as follows: the queue formed by the signal light directly in front of the vehicle is placed first in the preset signal light queue set, followed by the queue corresponding to the signal light on the left side of the intersection and then the queue corresponding to the signal light on the right side of the intersection. Alternatively, the signal light at the intersection to be passed can first be determined according to the destination and current position of the vehicle, and the queue formed by that signal light is then placed first in the preset signal light queue set. This embodiment does not make specific limitations.
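One possible in-memory layout for a signal light queue and the queue set is sketched below. The class names, the sliding-window eviction policy, and the direction keys `"front"`, `"left"` and `"right"` are illustrative assumptions; the text only requires that each queue hold the listed fields within a preset time range and that the set follow some preset order.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


@dataclass
class Observation:
    timestamp: float
    panel_box: Box
    source_box: Box
    source_category: str
    distance_m: float


@dataclass
class SignalLightQueue:
    window_s: float  # preset time range, in seconds
    items: Deque[Observation] = field(default_factory=deque)

    def add(self, obs: Observation) -> None:
        self.items.append(obs)
        # Drop observations that have fallen outside the preset time range.
        while obs.timestamp - self.items[0].timestamp > self.window_s:
            self.items.popleft()

    def latest(self) -> Observation:
        return self.items[-1]


def order_queue_set(by_direction: Dict[str, SignalLightQueue]) -> List[SignalLightQueue]:
    """Order queues front, then left, then right (one ordering the text allows)."""
    return [by_direction[d] for d in ("front", "left", "right") if d in by_direction]


# Usage: three observations of the light ahead, with a 2-second window.
front = SignalLightQueue(window_s=2.0)
for t, dist in [(0.0, 80.0), (1.0, 70.0), (3.0, 50.0)]:
    front.add(Observation(t, (0.0, 0.0, 10.0, 30.0), (2.0, 2.0, 8.0, 8.0), "red", dist))
left = SignalLightQueue(window_s=2.0)
queue_set = order_queue_set({"left": left, "front": front})
```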

104. Based on the destination information and current location information of the vehicle, determine a signal light queue corresponding to the target signal light from a set of preset signal light queues.

Wherein, the target signal light is a signal light set at the intersection where the vehicle is to pass;

The preset signal light queue set is obtained from step 103. The intersection through which the vehicle is going to pass is determined based on the destination set in the vehicle's automatic driving system and the determined current position information. Once the intersection is determined, the light panel position of the signal light at that intersection can be obtained, and the signal light queue corresponding to that light panel position is then looked up in the preset signal light queue set. The signal light queue found is the one formed by the signal light at the intersection through which the vehicle is going to pass. That signal light is the target signal light, and that queue is the signal light queue corresponding to the target signal light.
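The lookup by light panel position might be sketched as follows. Matching by intersection-over-union (IoU) with a minimum threshold is an assumption introduced for illustration, as are the function names and the queue identifiers; the text does not specify how a queue is matched to a panel position, nor how the expected panel position is derived from the intersection.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def find_target_queue(
    latest_panel_boxes: Dict[str, Box],
    expected_panel_box: Box,
    min_iou: float = 0.3,  # hypothetical matching threshold
) -> Optional[str]:
    """Return the id of the queue whose latest panel box best overlaps the
    expected panel position, or None if nothing overlaps enough."""
    best_id, best_score = None, min_iou
    for queue_id, box in latest_panel_boxes.items():
        score = iou(box, expected_panel_box)
        if score >= best_score:
            best_id, best_score = queue_id, score
    return best_id


# Usage with hypothetical queue ids and panel boxes.
panels = {"q_front": (100.0, 50.0, 130.0, 120.0), "q_left": (300.0, 60.0, 330.0, 130.0)}
target_id = find_target_queue(panels, (102.0, 52.0, 131.0, 121.0))
no_match = find_target_queue(panels, (600.0, 400.0, 620.0, 450.0))
```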

105. Based on the distance information between the target signal light and the vehicle and the light source type stored in the signal light queue corresponding to the target signal light, the vehicle speed is adjusted using a preset speed change rule.

The adjusting of the vehicle speed is used to pass through an intersection that the vehicle is preparing to pass.

The signal light queue corresponding to the target signal light is obtained from step 104. The pre-stored distance information between the target signal light and the vehicle and the light source type are read from that queue, and the vehicle speed is adjusted using the preset speed change rule. The preset speed change rule can set different speed levels according to the light source type and the distance between the target signal light and the vehicle. For example, when the distance between the target signal light and the vehicle is in the first distance range (far away), the vehicle speed can be reduced to the first level of speed regardless of the light source type. When the distance gradually decreases and reaches the second distance range (neither far nor near), the color of the target signal light is further determined. If it is a red light, the vehicle is not far from the stop line and the target signal light may still be red when the vehicle reaches the stop line; in this case, the vehicle speed is reduced to the second level of speed so that the vehicle can subsequently stop smoothly and quickly when it reaches the stop line. If it is a green light, the vehicle is not far from the stop line and the target signal light may still be green when the vehicle reaches the stop line; in this case, the vehicle can maintain the first level of speed. It should be noted that the first level of speed is faster than the second level of speed.

Based on the implementation of the embodiment of FIG. 1 above, it can be seen that the present invention provides a signal light recognition method based on deep learning. The present invention obtains a target image at a preset time interval, wherein the target image contains at least one signal light to be recognized; based on the target image, a preset deep neural network prediction model is used to obtain the lamp panel position, light source position, light source category and light source depth of the signal light to be recognized; based on the lamp panel position, light source position, light source category and light source depth of the signal light to be recognized, a signal light queue corresponding to the signal light to be recognized is formed using a preset rule and added to a preset signal light queue set; wherein the signal light queue stores, within a preset time range, the lamp panel position, light source position, light source category and distance information between the signal light to be recognized and the vehicle; based on the destination information and current position information of the vehicle, a signal light queue corresponding to a target signal light is determined from the preset signal light queue set, wherein the target signal light is a signal light set at the intersection through which the vehicle is to pass; based on the distance information between the target signal light and the vehicle and the light source category stored in the signal light queue corresponding to the target signal light, a preset speed change rule is used to adjust the vehicle speed so as to pass through the intersection through which the vehicle is to pass. The present invention combines depth estimation with target detection to predict signal lights with depth information. Now that telephoto lenses are standard equipment, signal lights can be detected earlier and can serve as an auxiliary distance reference in scenarios where high-precision maps or GPS are unavailable, thereby enhancing the robustness of the autonomous driving system and avoiding shortcomings of 2D signal light recognition, such as the decision errors and lags caused by the lack of distance information.

Further, as a refinement and extension of the embodiment shown in FIG. 1, an embodiment of the present invention also provides another signal light recognition method based on deep learning, as shown in FIG. 2. Its specific steps are as follows:

201. Acquire a target image at a preset time interval.

This step is combined with the description of step 101 in the above method, and the same contents are not repeated here.

Wherein, the target image contains at least one signal light to be identified;

202. Based on the target image, a preset deep neural network prediction model is used to obtain the lamp panel position, light source position, light source type and light source depth of the signal light to be identified.

This step is combined with the description of step 102 in the above method, and the same contents are not repeated here.

The target image is obtained from step 201. Based on the target image, the ResNet-18 network in the preset deep neural network prediction model is used as the backbone network to perform feature extraction and obtain the target feature. Based on the target feature, the PAN-FPN structure in the preset deep neural network prediction model is used to perform feature fusion and obtain target feature fusion information. Based on the target feature fusion information, the convolution layer and the depth prediction layer in the preset deep neural network prediction model are used to obtain the light source depth of the signal light to be identified. Also based on the target feature fusion information, fully connected layers of the neural network are used to regress the target frame position of the light panel. The target frame position of the light panel is matched with the target feature, and the ROI feature is extracted according to the target frame position of the light panel. Based on the ROI feature, target frame regression and category classification are performed to obtain the light source position and the light source category of the signal light to be identified, wherein the light source category is one or more of a red light, a yellow light, a green light, a left arrow, a right arrow, a forward arrow, a backward arrow and a number. As shown in FIG. 5, the box at the traffic light contains the light panel information. The preset deep neural network prediction model first detects the box at the traffic light, and then obtains the features of the ROI area within that box to identify the internal light source, including its category and position.
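To illustrate how a light panel box can be matched to a feature map, the sketch below maps an image-space box onto a downsampled feature grid and crops the covered cells. It is a plain crop, not a specific operator such as RoI pooling or RoI align, which this disclosure does not name; the function name, the single-channel nested-list feature map, and the stride value are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


def crop_roi(feature_map: Sequence[Sequence[float]], box: Box, stride: int) -> List[List[float]]:
    """Crop the feature-map cells covered by an image-space box.

    `stride` is the downsampling factor between the image and the feature map.
    The crop is clamped to the map and always contains at least one cell.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    x1, y1, x2, y2 = box
    c1 = min(max(0, math.floor(x1 / stride)), cols - 1)
    r1 = min(max(0, math.floor(y1 / stride)), rows - 1)
    c2 = min(cols, max(c1 + 1, math.ceil(x2 / stride)))
    r2 = min(rows, max(r1 + 1, math.ceil(y2 / stride)))
    return [list(row[c1:c2]) for row in feature_map[r1:r2]]


# Usage: a 4x4 single-channel feature map for a 32x32 image (stride 8).
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]
roi = crop_roi(feature_map, (8, 8, 24, 16), stride=8)
```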

203. Based on the lamp panel position, light source position, light source type and light source depth of the signal light to be identified, a signal light queue corresponding to the signal light to be identified is formed using preset rules, and added to the preset signal light queue set.

This step is combined with the description of step 103 in the above method, and the same contents are not repeated here.

The signal light queue stores information on the lamp panel position, light source position, light source type, and distance between the signal light to be identified and the vehicle corresponding to the signal light to be identified within a preset time range.

The lamp panel position, light source position, light source type and light source depth of the signal light to be identified are obtained from step 202, and the scale information corresponding to the signal light to be identified is then obtained. The scale information is obtained by estimating an approximate distance from the image size (mainly width and height) of the target light source together with prior knowledge of the actual width and height of the target. The light source depth corresponding to the signal light to be identified and the scale information are fused to obtain the distance information between the signal light to be identified and the vehicle. The light source distance output by the model must be consistent with the relationship between the image size of the light source and the actual distance. Because the size of traffic lights is governed by national standards, the image size and the actual size are related and the error will not be large. The scale information therefore serves as prior knowledge to correct the distance between the signal light to be identified and the vehicle. Based on the lamp panel position, the light source position, the light source type and the distance information between the signal light to be identified and the vehicle, a signal light queue corresponding to the signal light to be identified is formed, and the signal light queues are added to the preset signal light queue set in the chronological order in which they are formed.
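The scale prior and its fusion with the predicted depth can be sketched as follows. The pinhole-camera relation (distance = focal length × real size / image size) is standard geometry; the weighted average used for fusion, the function names, and all numeric values are assumptions for illustration, since the text does not specify the fusion rule.

```python
def scale_distance_m(focal_px: float, real_size_m: float, image_size_px: float) -> float:
    """Pinhole-camera estimate: distance = focal length * real size / image size."""
    return focal_px * real_size_m / image_size_px


def fuse_distance(
    depth_m: float,
    source_w_px: float,
    source_h_px: float,
    focal_px: float,
    real_w_m: float,
    real_h_m: float,
    depth_weight: float = 0.5,  # hypothetical blending weight
) -> float:
    """Blend the model's depth with the scale prior from width and height.

    The weighted average is an assumed fusion rule, not one given in the text.
    """
    from_width = scale_distance_m(focal_px, real_w_m, source_w_px)
    from_height = scale_distance_m(focal_px, real_h_m, source_h_px)
    prior = (from_width + from_height) / 2.0
    return depth_weight * depth_m + (1.0 - depth_weight) * prior


# Usage with illustrative numbers: a 0.3 m light source imaged at 10 px
# with a 1000 px focal length, and a model depth of 34 m.
prior_only = scale_distance_m(1000.0, 0.3, 10.0)
fused = fuse_distance(34.0, 10.0, 10.0, 1000.0, 0.3, 0.3)
```

With these numbers the scale prior alone gives about 30 m, so an equal-weight blend pulls the model's 34 m toward roughly 32 m.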

204. Obtain lane line information and road arrow information of the lane where the vehicle is located.

The on-board camera can be used to obtain the lane line information and road arrow information of the lane where the vehicle is located, as shown in FIG. 5. This information is subsequently used to locate the current position of the vehicle.

205. Determine the current position information of the vehicle based on the lane line information and the road arrow information.

The lane line information and road arrow information are obtained in step 204; that is, an image of the road ahead is obtained which, as shown in FIG. 5, includes lane lines, road arrows, traffic lights, etc. The current position information of the vehicle is obtained by taking the center of the image as an approximation of the vehicle's center line. The current position information includes the lateral position and direction of the vehicle; its relationship to the lane lines and road arrows is shown in FIG. 5.
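A rough sketch of this localization, assuming the lane lines have already been detected as pixel x-coordinates on an image row close to the vehicle, is given below. The names, the nominal lane width of 3.5 m used to convert pixels to meters, and the use of the road arrow label as the lane direction are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class VehiclePosition:
    lateral_offset_m: float  # positive: vehicle center line is right of lane center
    lane_direction: str      # taken from the road arrow, e.g. "straight" or "left"


def estimate_position(
    image_width_px: float,
    left_line_x_px: float,
    right_line_x_px: float,
    road_arrow: str,
    lane_width_m: float = 3.5,  # assumed nominal lane width
) -> VehiclePosition:
    """Approximate the vehicle center line by the image center and measure
    its offset from the middle of the detected lane."""
    vehicle_x = image_width_px / 2.0
    lane_center_x = (left_line_x_px + right_line_x_px) / 2.0
    metres_per_px = lane_width_m / (right_line_x_px - left_line_x_px)
    return VehiclePosition((vehicle_x - lane_center_x) * metres_per_px, road_arrow)


# Usage: a 1280 px wide image with lane lines at x = 400 px and x = 800 px.
position = estimate_position(1280.0, 400.0, 800.0, "straight")
```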

206. Based on the destination information and current position information of the vehicle, determine a signal light queue corresponding to the target signal light from a set of preset signal light queues.

This step is combined with the description of step 104 in the above method, and the same contents are not repeated here.

Wherein, the target traffic light is a traffic light set at the intersection through which the vehicle is to pass.

207. Based on the distance information between the target signal light and the vehicle and the light source type stored in the signal light queue corresponding to the target signal light, the vehicle speed is adjusted using a preset speed change rule.

This step is combined with the description of step 105 in the above method, and the same contents are not repeated here.

Determine whether the distance between the target signal light and the vehicle is less than a preset threshold. If the distance is not less than the preset threshold, adjust the speed of the vehicle to a first preset speed. If the distance is less than the preset threshold, determine whether the light source category is a green light: if it is a green light, adjust the speed of the vehicle to the first preset speed; if it is not a green light, adjust the speed of the vehicle to a second preset speed and monitor whether the vehicle meets a preset parking condition. The preset parking condition is that the vehicle has reached a preset safety distance before a stop line or a stationary vehicle in front, and the second preset speed is less than the first preset speed. When the vehicle meets the preset parking condition, adjust the speed of the vehicle to 0.
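The decision logic of this step can be sketched as a single function. The function name, the parameter names, and the numeric values in the usage lines are hypothetical; the sketch assumes the second speed is lower than the first, in line with the first-level and second-level speeds described for step 105.

```python
def target_speed(
    distance_m: float,
    source_category: str,
    parking_condition_met: bool,
    threshold_m: float,
    first_speed: float,
    second_speed: float,
) -> float:
    """Return the speed to adopt under the preset speed change rule."""
    if second_speed >= first_speed:
        raise ValueError("second preset speed must be lower than the first")
    if distance_m >= threshold_m:
        return first_speed       # still far from the target signal light
    if source_category == "green":
        return first_speed       # near the light and it is green
    if parking_condition_met:
        return 0.0               # safety distance before stop line or stopped vehicle reached
    return second_speed          # near the light and it is not green


# Usage with illustrative values (threshold 100 m, speeds in km/h).
far = target_speed(150.0, "red", False, 100.0, 40.0, 20.0)
near_green = target_speed(60.0, "green", False, 100.0, 40.0, 20.0)
near_red = target_speed(60.0, "red", False, 100.0, 40.0, 20.0)
stopping = target_speed(5.0, "red", True, 100.0, 40.0, 20.0)
```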

Furthermore, in another preferred embodiment of the present invention, when it is detected that there are other vehicles in front of the lane where the vehicle is located, the vehicle distance is maintained and the vehicle in front is followed through the intersection where the target signal light is located; after the vehicle passes the intersection where the target signal light is located, the preset signal light queue set is cleared.

Based on the implementation of FIG. 2 above, it can be seen that the present invention provides a signal light recognition method based on deep learning. The present invention mainly targets the scenario in which an autonomous driving vehicle passes through an intersection with a signal light. It combines depth estimation and target detection in an end-to-end deep neural network architecture to predict signal lights with depth information, and then predicts the distance information between the signal light and the vehicle to assist the subsequent planning and decision-making of the vehicle. This avoids some shortcomings of 2D signal light recognition, such as the lack of distance information in autonomous driving scenarios, which can cause decision errors and lags. The distance between the vehicle and the signal light can also be obtained in real time through high-precision maps and GPS information, but this requires that the high-precision map be accurate and updated in a timely manner and that the GPS error be small, and both of these conditions can fail in practice. Moreover, now that telephoto lenses are standard equipment, signal lights can be found earlier, and the predicted distance can serve as an auxiliary distance reference in scenarios where high-precision maps or GPS are unavailable, thereby enhancing the robustness of the autonomous driving system.

Furthermore, as an implementation of the method shown in FIG. 1 above, an embodiment of the present invention further provides a signal light recognition device based on deep learning, which is used to implement the method shown in FIG. 1 above. This device embodiment corresponds to the aforementioned method embodiment. For ease of reading, this device embodiment will not repeat the details of the aforementioned method embodiment one by one, but it should be clear that the device in this embodiment can correspond to all the contents of the aforementioned method embodiment. As shown in Figure 3, the device includes:

A first acquisition unit 31 is used to acquire a target image at a preset time interval, wherein the target image contains at least one signal light to be identified;

A prediction unit 32, configured to obtain the lamp panel position, light source position, light source type and light source depth of the signal light to be identified by using a preset deep neural network prediction model based on the target image obtained from the first acquisition unit 31;

The forming unit 33 is used to form a signal light queue corresponding to the signal light to be identified based on the lamp panel position, light source position, light source type and light source depth of the signal light to be identified obtained from the prediction unit 32 using a preset rule, and add the signal light queue to a preset signal light queue set; wherein the signal light queue stores the lamp panel position, light source position, light source type and distance information between the signal light to be identified and the vehicle within a preset time range.

A first determining unit 34 is used to determine, based on the destination information and current position information of the vehicle, a signal light queue corresponding to a target signal light from the preset signal light queue set obtained by the forming unit 33, wherein the target signal light is a signal light set at the intersection through which the vehicle is to pass;

The adjustment unit 35 is used to adjust the speed of the vehicle using a preset speed change rule based on the distance information between the target signal light and the vehicle and the light source category stored in the signal light queue corresponding to the target signal light obtained from the first determining unit 34, so as to pass through the intersection that the vehicle is preparing to pass.

Furthermore, as an implementation of the method shown in FIG. 2 above, an embodiment of the present invention also provides another signal light recognition device based on deep learning, which is used to implement the method shown in FIG. 2 above. This device embodiment corresponds to the aforementioned method embodiment. For ease of reading, this device embodiment will no longer repeat the details of the aforementioned method embodiment one by one, but it should be clear that the device in this embodiment can correspond to all the contents of the aforementioned method embodiment. As shown in FIG. 4, the device includes:

a first determining unit 34, configured to determine, based on the destination information of the vehicle and the current position information obtained from the second determining unit 37, a signal light queue corresponding to a target signal light from the set of preset signal light queues obtained by the forming unit 33, wherein the target signal light is a signal light provided at the intersection through which the vehicle is to pass;

an adjusting unit 35, configured to adjust the speed of the vehicle by using a preset speed change rule based on the distance information between the target signal light and the vehicle and the light source type stored in the signal light queue corresponding to the target signal light obtained from the determining unit 34, so as to pass through the intersection that the vehicle is preparing to pass;

A second acquisition unit 36 is used to acquire lane line information and road arrow information of the lane where the vehicle is located;

A second determining unit 37, configured to determine the current position information of the vehicle based on the lane line information and the road arrow information obtained from the second acquiring unit 36;

The monitoring unit 38 is used to maintain the vehicle distance and follow the vehicle in front through the intersection where the target signal light is located when it is detected that there are other vehicles in front of the lane where the vehicle is located;

The deleting unit 39 is used to clear the preset signal light queue set after the vehicle, as monitored by the monitoring unit 38, passes through the intersection where the target signal light is located.

Furthermore, the prediction unit 32 includes:

A first feature extraction module 321 is used to extract features based on the target image using the resnet-18 network in the preset deep neural network prediction model as a backbone network to obtain target features;

A feature fusion module 322 is used to perform feature fusion based on the target feature obtained from the first feature extraction module 321 using the PAN-FPN structure in the preset deep neural network prediction model to obtain target feature fusion information;

A first prediction module 323, configured to obtain the light source depth of the signal light to be identified by using the convolution layer and the depth prediction layer in the preset deep neural network prediction model based on the target feature fusion information obtained from the feature fusion module 322;

A regression module 324 is used to regress the target frame position of the lamp panel using a fully connected layer of the neural network based on the target feature fusion information obtained from the feature fusion module 322;

A second feature extraction module 325 is used to match the lamp panel target frame position obtained from the regression module 324 with the target feature, and extract ROI features according to the lamp panel target frame position;

The second prediction module 326 is used to perform target frame regression and category classification based on the ROI features obtained from the second feature extraction module 325 to obtain the light source position and the light source category of the signal light to be identified, wherein the light source category is one or more of a red light, a yellow light, a green light, a left arrow, a right arrow, a forward arrow, a backward arrow and a number.
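To illustrate how the second feature extraction module 325 might match a lamp panel target frame to the target features, the following sketch crops the feature-map cells covered by the frame. It is a simplification under stated assumptions: a single-channel feature map stored as nested lists, a known downsampling stride, and a plain crop with hypothetical names. Practical detectors commonly use an ROI pooling or ROI Align operator instead of a plain crop.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


def crop_roi(feature_map: List[List[float]], box: Box, stride: int) -> List[List[float]]:
    """Return the feature-map cells covered by a lamp panel box.

    `stride` is the assumed downsampling factor between the input image
    and the feature map.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    x1, y1, x2, y2 = box
    # Project the box corners from image coordinates to feature-map cells,
    # clamping so that at least one cell inside the map is always selected.
    col0 = min(max(0, math.floor(x1 / stride)), width - 1)
    row0 = min(max(0, math.floor(y1 / stride)), height - 1)
    col1 = min(max(math.ceil(x2 / stride), col0 + 1), width)
    row1 = min(max(math.ceil(y2 / stride), row0 + 1), height)
    return [row[col0:col1] for row in feature_map[row0:row1]]
```

The cropped ROI features would then be passed to the target frame regression and category classification performed by the second prediction module 326.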

Furthermore, the forming unit 33 includes:

A first acquisition module 331 is used to acquire the scale information corresponding to the signal light to be identified;

A second acquisition module 332 is used to fuse the light source depth corresponding to the signal light to be identified and the scale information obtained from the first acquisition module 331 to obtain the distance information between the signal light to be identified and the vehicle;

A forming module 333, configured to form a signal light queue corresponding to the signal light to be identified based on the lamp panel position, the light source position, the light source type and the distance information between the signal light to be identified and the vehicle obtained from the second acquiring module 332;

The adding module 334 is used to add the signal light queue corresponding to the signal light to be identified obtained from the forming module 333 to the preset signal light queue set according to the time sequence of the formation of the signal light queue.
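The modules of the forming unit 33 can be sketched as follows. This is only an illustrative sketch: the embodiment does not state how the light source depth and the scale information are fused, so a simple product is assumed here, and the class names, fields and time-window handling are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


@dataclass
class LightObservation:
    timestamp: float   # seconds
    panel_box: Box     # lamp panel position
    source_box: Box    # light source position
    category: str      # light source type, e.g. "red", "green", "left_arrow"
    distance_m: float  # distance between the signal light and the vehicle


def fuse_distance(depth: float, scale: float) -> float:
    """Assumed fusion rule: metric distance = predicted depth * scale factor."""
    return depth * scale


class SignalLightQueue:
    """Keeps the observations of one signal light within a preset time range."""

    def __init__(self, time_range_s: float) -> None:
        self.time_range_s = time_range_s
        self.items: Deque[LightObservation] = deque()

    def push(self, obs: LightObservation) -> None:
        self.items.append(obs)
        # Drop observations older than the preset time range.
        while obs.timestamp - self.items[0].timestamp > self.time_range_s:
            self.items.popleft()

    def latest(self) -> LightObservation:
        return self.items[-1]


# The queue set is kept in the order in which the queues were formed.
queue_set: List[SignalLightQueue] = []
```

A new queue would be appended to `queue_set` when a signal light is first observed, and later observations of the same light would be pushed into its existing queue.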

Furthermore, the adjustment unit 35 includes:

A first judgment module 351 is used to judge whether the distance information between the target signal light and the vehicle is less than a preset threshold;

A first adjustment module 352, configured to adjust the speed of the vehicle to a first preset speed if the distance information between the target signal light and the vehicle obtained from the first judgment module 351 is not less than a preset threshold;

A second judgment module 353 is used to judge whether the light source type is a green light if the distance information between the target signal light and the vehicle obtained from the first judgment module 351 is less than a preset threshold;

A second adjustment module 354 is configured to adjust the speed of the vehicle to a first preset speed if the light source type obtained from the second judgment module 353 is a green light;

a third adjustment module 355, configured to adjust the speed of the vehicle to a second preset speed if the light source type obtained from the second judgment module 353 is not a green light, and monitor whether the vehicle meets a preset parking condition; wherein the preset parking condition is a preset safety distance before the vehicle reaches a stop line or a stationary vehicle ahead, and the second preset threshold is less than the first preset threshold;

When the vehicle meets the preset parking condition, the speed of the vehicle is adjusted to 0.

Furthermore, an embodiment of the present invention also provides a processor, which is used to run a program, wherein the program, when running, executes the deep learning-based traffic light recognition method described in Figures 1-2 above.

Furthermore, an embodiment of the present invention also provides a storage medium, which is used to store a computer program, wherein when the computer program is running, it controls the device where the storage medium is located to execute the deep learning-based traffic light recognition method described in Figures 1-2 above.

In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail in a certain embodiment, reference can be made to the relevant descriptions of other embodiments.

It is understandable that the related features in the above methods and devices can be referenced to each other. In addition, the "first", "second" and the like in the above embodiments are used to distinguish the embodiments, and do not represent the advantages and disadvantages of the embodiments.

Those skilled in the art can clearly understand that, for the convenience and brevity of description, the specific working processes of the systems, devices and units described above can refer to the corresponding processes in the aforementioned method embodiments and will not be repeated here.

The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the above description, the structure required to construct such a system is obvious. In addition, the present invention is not directed to any specific programming language. It should be understood that various programming languages can be utilized to implement the content of the present invention described herein, and the above description of specific languages is for the purpose of disclosing the best mode of the present invention.

In addition, the memory may include non-permanent memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory in the form of read-only memory (ROM) or flash RAM, and the memory includes at least one memory chip.

Those skilled in the art will appreciate that the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment in combination with software and hardware. Moreover, the present application may adopt the form of a computer program product implemented in one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.

The present application is described with reference to the flowchart and/or block diagram of the method, device (system) and computer program product according to the embodiment of the present application. It should be understood that each process and/or box in the flowchart and/or block diagram, and the combination of the process and/or box in the flowchart and/or block diagram can be realized by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for realizing the function specified in one process or multiple processes in the flowchart and/or one box or multiple boxes in the block diagram.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.

In a typical configuration, a computing device includes one or more processors (CPU), input/output interfaces, network interfaces, and memory.

Memory may include non-permanent storage in a computer-readable medium, random access memory (RAM) and/or non-volatile memory in the form of read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media that can be implemented by any method or technology to store information. Information can be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disk read-only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

It should also be noted that the terms "include", "comprises" or any other variations thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device including a series of elements includes not only those elements, but also other elements not explicitly listed, or also includes elements inherent to such process, method, commodity or device. In the absence of more restrictions, the elements defined by the sentence "comprises a ..." do not exclude the existence of other identical elements in the process, method, commodity or device including the elements.

Those skilled in the art will appreciate that the embodiments of the present application may be provided as methods, systems or computer program products. Therefore, the present application may adopt the form of a complete hardware embodiment, a complete software embodiment or an embodiment in combination with software and hardware. Moreover, the present application may adopt the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.

The above are only embodiments of the present application and are not intended to limit the present application. For those skilled in the art, the present application may have various modifications and variations. Any modifications, equivalent substitutions, improvements, etc. made within the scope of the present application should be included in the scope of the claims of this application.

Claims

A signal light recognition method based on deep learning, characterized in that the method comprises:

Acquire a target image at a preset time interval, wherein the target image contains at least one signal light to be identified;

Based on the target image, a preset deep neural network prediction model is used to obtain the lamp panel position, light source position, light source type and light source depth of the signal light to be identified;

Based on the lamp panel position, light source position, light source type and light source depth of the signal light to be identified, a signal light queue corresponding to the signal light to be identified is formed using preset rules and added to a preset signal light queue set; wherein the signal light queue stores the lamp panel position, light source position, light source type and distance information between the signal light to be identified and the vehicle within a preset time range corresponding to the signal light to be identified;

Based on the destination information and current location information of the vehicle, determining a signal light queue corresponding to a target signal light from the preset signal light queue set, wherein the target signal light is a signal light set at the intersection through which the vehicle is to pass;

Based on the distance information between the target signal light and the vehicle and the light source category stored in the signal light queue corresponding to the target signal light, the speed of the vehicle is adjusted using a preset speed change rule so as to pass through the intersection that the vehicle is preparing to pass.

The method according to claim 1, characterized in that the step of obtaining the lamp panel position, light source position, light source type and light source depth of the signal light to be identified based on the target image using a preset deep neural network prediction model comprises:

Based on the target image, the resnet-18 network in the preset deep neural network prediction model is used as the backbone network to perform feature extraction to obtain target features;

Based on the target features, the PAN-FPN structure in the preset deep neural network prediction model is used to perform feature fusion to obtain target feature fusion information;

Based on the target feature fusion information, the convolution layer and the depth prediction layer in the preset deep neural network prediction model are used to obtain the light source depth of the signal light to be identified.

The method according to claim 2, characterized in that the step of obtaining the lamp panel position, light source position, light source type and light source depth of the signal light to be identified based on the target image using a preset deep neural network prediction model further comprises:

Based on the target feature fusion information, a fully connected layer of the neural network is used to regress the target frame position of the lamp panel;

Matching the lamp panel target frame position with the target feature, and extracting ROI features according to the lamp panel target frame position;

Target frame regression and category classification are performed based on the ROI features to obtain the light source position and the light source category of the signal light to be identified, wherein the light source category is one or more of a red light, a yellow light, a green light, a left arrow, a right arrow, a forward arrow, a backward arrow and a number.

The method according to claim 1, characterized in that forming, based on the lamp panel position, light source position, light source type and light source depth of the signal light to be identified, a signal light queue corresponding to the signal light to be identified using a preset rule and adding it to a preset signal light queue set comprises:

Obtaining scale information corresponding to the signal light to be identified;

The light source depth and the scale information corresponding to the signal light to be identified are integrated to obtain the distance information between the signal light to be identified and the vehicle;

Forming a signal light queue corresponding to the signal light to be identified based on the light panel position, the light source position, the light source type and the distance information between the signal light to be identified and the vehicle;

The signal light queues corresponding to the signal lights to be identified are added to the preset signal light queue set according to the time sequence of the formation of the signal light queues.

The method according to claim 1, characterized in that adjusting the speed of the vehicle using a preset speed change rule, based on the distance information between the target signal light and the vehicle and the light source category stored in the signal light queue corresponding to the target signal light, so as to pass through the intersection that the vehicle is to pass, comprises:

Determining whether the distance information between the target signal light and the vehicle is less than a preset threshold;

If the distance information between the target signal light and the vehicle is not less than a preset threshold, adjusting the speed of the vehicle to a first preset speed;

If the distance information between the target signal light and the vehicle is less than a preset threshold, determining whether the light source type is a green light;

If the light source type is a green light, adjusting the speed of the vehicle to a first preset speed;

If the light source type is not a green light, the speed of the vehicle is adjusted to a second preset speed, and the vehicle is monitored to see whether it meets a preset parking condition; wherein the preset parking condition is a preset safety distance before the vehicle reaches a stop line or a stationary vehicle ahead, and the second preset threshold is less than the first preset threshold;

When the vehicle meets the preset parking condition, the speed of the vehicle is adjusted to 0.

The method according to any one of claims 1 to 5, characterized in that before determining the signal light queue corresponding to the target signal light from the preset signal light queue set based on the destination information and current position information of the vehicle, the method further comprises:

Obtaining lane line information and road arrow information of the lane where the vehicle is located;

Determining the current position information of the vehicle based on the lane line information and the road arrow information.

The method according to claim 6, characterized in that, after determining the signal light queue corresponding to the target signal light from the preset signal light queue set based on the destination information and current location information of the vehicle, the method further comprises:

When it is detected that there are other vehicles ahead of the lane where the vehicle is located, the vehicle maintains the distance between the vehicles and follows the vehicle ahead through the intersection where the target signal light is located;

After the vehicle passes the intersection where the target signal light is located, the preset signal light queue set is cleared.

A signal light recognition device based on deep learning, characterized by comprising:

A first acquisition unit, configured to acquire a target image at a preset time interval, wherein the target image includes at least one signal light to be identified;

A prediction unit, configured to obtain, based on the target image, a lamp panel position, a light source position, a light source type, and a light source depth of the signal light to be identified by using a preset deep neural network prediction model;

A forming unit is used to form a signal light queue corresponding to the signal light to be identified based on the lamp panel position, light source position, light source type and light source depth of the signal light to be identified using a preset rule, and add it to the preset signal light queue set; wherein the signal light queue stores the lamp panel position, light source position, light source type and distance information between the signal light to be identified and the vehicle within a preset time range corresponding to the signal light to be identified;

A first determining unit is used to determine a signal light queue corresponding to a target signal light from the preset signal light queue set based on the destination information and current position information of the vehicle, wherein the target signal light is a signal light set at the intersection through which the vehicle is to pass;

An adjustment unit is used to adjust the speed of the vehicle using a preset speed change rule based on the distance information between the target signal light and the vehicle and the light source category stored in the signal light queue corresponding to the target signal light, so as to pass the intersection that the vehicle is preparing to pass.

A storage medium, comprising a stored program, characterized in that when the program is running, the device where the storage medium is located is controlled to execute the signal light recognition method based on deep learning as described in any one of claims 1 to 7.

An electronic device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein when the processor executes the program, the signal light recognition method based on deep learning as described in any one of claims 1 to 7 is implemented.