
CN106934378B - Automobile high beam identification system and method based on video deep learning - Google Patents

Automobile high beam identification system and method based on video deep learning

Info

Publication number
CN106934378B
Authority
CN
China
Prior art keywords
key frame
frame
deep learning
module
video data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710156201.4A
Other languages
Chinese (zh)
Other versions
CN106934378A (en)
Inventor
李成栋
丁子祥
许福运
张桂青
郝丽丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN201710156201.4A priority Critical patent/CN106934378B/en
Publication of CN106934378A publication Critical patent/CN106934378A/en
Application granted granted Critical
Publication of CN106934378B publication Critical patent/CN106934378B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/47Detecting features for summarising video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/44Event detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract


The invention discloses an automobile high beam identification system and method based on video deep learning. The system comprises two parts. A foreground part realizes the identification and processing of high beam violations and comprises a road monitoring equipment module, a video processing and recognition module, a recognition result processing module and a database of violation results to be checked, connected in sequence. A background part processes video and realizes deep learning on it, and comprises a key frame extraction algorithm, a labeled database and a deep learning module. The labeled database is constructed by calling the key frame extraction algorithm to extract key frames from the original video data; its data are used to train the deep learning module, and the trained deep learning module, together with the key frame extraction algorithm, is called by the video processing and recognition module. The invention automatically analyzes and recognizes surveillance video, ensures the completeness of law enforcement evidence, and, like manual judgment, is intelligent.


Description

Automobile high beam identification system and method based on video deep learning
Technical Field
The invention relates to an automobile high beam identification system, in particular to an automobile high beam identification system and method based on video deep learning, and belongs to the technical field of intelligent transportation.
Background
Since the reform and opening-up, China's economy has developed continuously, steadily and rapidly, people's living standards have improved unprecedentedly, and more and more people own private vehicles. The rapid increase in the number of private cars brings convenience to travel, but at the same time the frequency of traffic accidents grows higher and higher.
There are many causes of traffic accidents, and many accidents result from improper use of high beams. At present, high beam violations are supervised mainly by traffic police, and owing to limits on police force and time, not all violations can be effectively supervised. In addition, the high beam snapshot systems developed in recent years all work on captured still pictures, which has certain limitations: 1) the number of captured high beam pictures is small and inconsistent; the pictures may be produced during a driver's normal use of the high beam and are easily misjudged as improper use, so the pictures are insufficient as law enforcement evidence; 2) to obtain the pictures, several capture devices often have to be erected at the same location, so the construction cost is high; 3) the video monitoring equipment already deployed cannot be fully utilized, which wastes resources.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an automobile high beam identification system based on video deep learning.
The invention also provides an automobile high beam identification method based on video deep learning corresponding to the system.
In order to achieve the purpose, the invention adopts the following technical scheme:
an automobile high beam identification system based on video deep learning comprises the following two parts:
the foreground part is used for realizing the identification and processing of the high beam violation behaviors and comprises a road monitoring equipment module, a video processing and identifying module, an identification result processing module and a database of the violation results to be detected, which are connected in sequence;
the background part is used for processing the video and realizing deep learning on it, and comprises a key frame extraction algorithm, a labeled database and a deep learning module, wherein the labeled database is constructed by calling the key frame extraction algorithm to extract key frames from the original video data, the data in the labeled database are used for training the deep learning module, and the trained deep learning module, together with the key frame extraction algorithm, is called by the video processing and recognition module.
As one of the preferable technical solutions, the key frame extraction algorithm is a clustering-based key frame extraction algorithm.
As one of the preferable technical solutions, the deep learning module is a deep learning module based on CNN+LSE (convolutional neural network + least squares estimation).
The system corresponds to an automobile high beam identification method based on video deep learning, and the method specifically comprises the following steps:
(1) the road monitoring equipment module acquires driving video data of the automobile and transmits the driving video data to the video processing and identifying module;
(2) the video processing and recognition module calls the key frame extraction algorithm to extract key frames from the video data and then performs a graying operation; taking the grayed key frames as input, it calls the CNN+LSE-based deep learning module trained on the labeled database to obtain the output label of each key frame, the labels comprising low beam, fog lamp or high beam, and assigns each label to the corresponding key frame image;
(3) the video data and the labeled key frames obtained in step (2) together serve as the input of the recognition result processing module, which judges whether the vehicle violates the regulations; a license plate recognition system is embedded in the recognition result processing module, and when a target vehicle exhibits high beam violation behavior, its license plate is extracted, the vehicle information is acquired, and the suspected violation video data are imported into the database of violation results to be checked. A sketch of the key frame labeling stage of step (2) is given below.
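As a minimal sketch of the labeling stage, the following assumes the HighBeamNet module sketched later in this document, 32×32 grayed key frames carrying their time index j, and illustrative names throughout; none of these names come from the patent itself:

```python
# Hypothetical step (2) glue: label each grayed key frame with the trained
# CNN+LSE module. Input format (time index, 32x32 frame) is an assumption.
import torch

LABELS = {0: -1, 1: 0, 2: 1}  # class index -> label: low beam, fog lamp, high beam

def classify_key_frames(model, key_frames):
    """key_frames: list of (j, frame) pairs, frame a 32x32 array of pixel values."""
    model.eval()
    labeled = []
    with torch.no_grad():
        for j, frame in key_frames:
            x = torch.as_tensor(frame, dtype=torch.float32).view(1, 1, 32, 32)
            k = LABELS[int(model(x).argmax(dim=1))]  # output label in {-1, 0, 1}
            labeled.append((j, k))
    return labeled
```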
In the step (2), the key frame extraction algorithm is as follows:
(2-1) take the i-th segment V_i in the original video database, extract n frames at equal time intervals, and use F_{i,j} to name the frame at the j-th moment of the i-th piece of video data, so that the key frame sequence of the corresponding video data is denoted {F_{i,1}, F_{i,2}, ..., F_{i,n}}, where F_{i,1} is the first frame and F_{i,n} is the last frame; the similarity between two adjacent frames is defined as the similarity of their histograms (i.e., the histogram feature difference), and a predefined threshold δ controls the clustering density; i, j and n are all integers;
(2-2) select the first frame F_{i,1} as the initial cluster center and calculate the similarity between a frame F_{i,j} and the initial cluster center; if the value is less than δ, the distance between the frame and the cluster center frame is judged too large, so F_{i,j} cannot be added to that cluster; if the similarity between F_{i,j} and every cluster center is less than δ, F_{i,j} forms a new cluster and becomes its center; otherwise, the frame is added to the cluster with which it has the greatest similarity, so that the distance between the frame and that cluster's center is smallest;
(2-3) repeat (2-2) until the n frames extracted from the original video data V_i are assigned to their clusters; the key frames can then be selected: from each cluster, the frame nearest to the cluster center is extracted as the representative frame of that cluster, and the representative frames of all clusters constitute the key frames of the original video data V_i. A sketch of this procedure follows.
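A minimal sketch of steps (2-1)-(2-3), assuming grayscale frames and OpenCV histogram correlation as the similarity measure; the 64-bin histogram is an arbitrary choice:

```python
import cv2
import numpy as np

def histogram_similarity(f1: np.ndarray, f2: np.ndarray) -> float:
    # Correlation of 64-bin grayscale histograms; higher means more similar.
    h1 = cv2.calcHist([f1], [0], None, [64], [0, 256])
    h2 = cv2.calcHist([f2], [0], None, [64], [0, 256])
    return cv2.compareHist(h1, h2, cv2.HISTCMP_CORREL)

def extract_key_frames(frames, delta: float):
    centers, members = [], []  # one center and one member list per cluster
    for f in frames:           # frames sampled at equal time intervals, step (2-1)
        sims = [histogram_similarity(f, c) for c in centers]
        if not sims or max(sims) < delta:   # step (2-2): too far from every center
            centers.append(f)               # f starts a new cluster and is its center
            members.append([f])
        else:
            members[int(np.argmax(sims))].append(f)  # join the most similar cluster
    # Step (2-3): the member most similar to each center represents its cluster.
    return [max(ms, key=lambda f, c=c: histogram_similarity(f, c))
            for c, ms in zip(centers, members)]
```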
In the step (2), the construction method of the database with the tags comprises the following steps:
the method comprises the steps of taking a large amount of vehicle running video data under a big data background as original video data, calling a key frame extraction algorithm based on clustering to the original video data to extract key frames, manually judging the light types of vehicles in the key frames, and adding labels to each key frame to enable the original key frames to become labeled data, wherein the label types comprise: three types of dipped headlight, fog light and high beam are respectively represented by-1, 0 and 1; storing the key frame data with the label into a labeled database, wherein the data in the labeled database are the original video data and the labeled key frame thereof, and the labeled key frame is represented as (F)i,jK), where k takes the value-1, 0 or 1.
In step (2), the CNN+LSE-based deep learning module is constructed by adopting the LeNet-5 convolutional neural network structure. The module is divided into eight layers: the first six layers form the feature extraction part and the last two layers form the classifier part, where the feature extraction layers adopt a classical convolutional neural network structure and the classifier layers adopt a fully connected structure. The module takes the data in the labeled database as training data and is trained with the combined CNN+LSE algorithm: the feature extraction part is trained with the CNN method and the classifier layers with the LSE method, so as to realize fast learning of the module parameters and enhance the generalization capability of the module.
The specific method comprises the following steps:
A video key frame from the labeled database is input to the first layer of the CNN+LSE-based deep learning module; the second layer performs convolution operations on the output of the previous layer with different convolution kernels; the third layer performs pooling (down-sampling) on the output of the previous layer; the fourth and fifth layers repeat the operations of the second and third layers; the sixth layer expands the output features of the previous layer in order and arranges them into one row; the seventh layer is fully interconnected with the output features of the previous layer; the last layer is likewise fully interconnected with the previous layer. The output of the CNN+LSE-based deep learning module falls into three cases: low beam, fog lamp and high beam, denoted -1, 0 and 1 respectively. A sketch of such a structure follows.
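A minimal PyTorch sketch of this eight-layer structure; the 1×32×32 grayscale input size, channel counts, kernel sizes and tanh activations are assumptions, since the patent fixes only the roles of the layers:

```python
import torch
import torch.nn as nn

class HighBeamNet(nn.Module):
    def __init__(self, num_classes: int = 3):
        super().__init__()
        # Layers 1-6: feature extraction part (trained by CNN backpropagation).
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),    # layer 2: convolution
            nn.Tanh(),
            nn.AvgPool2d(2),                   # layer 3: pooling (down-sampling)
            nn.Conv2d(6, 16, kernel_size=5),   # layer 4: convolution
            nn.Tanh(),
            nn.AvgPool2d(2),                   # layer 5: pooling
            nn.Flatten(),                      # layer 6: rasterize features into one row
        )
        # Layers 7-8: classifier part (trained by least squares estimation).
        self.fc = nn.Linear(16 * 5 * 5, 120)       # layer 7: fully connected
        self.out = nn.Linear(120, num_classes)     # layer 8: low beam / fog lamp / high beam

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.tanh(self.fc(self.features(x)))
        return self.out(h)
```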
The deep learning module based on CNN + LSE is trained as follows:
A sample (F_{i,j}, k) is taken from the labeled database; a graying operation is first performed on F_{i,j} to turn the key frame into a grayscale image, and the grayed key frame F'_{i,j} is then input into the module, i.e., the input data are (F'_{i,j}, k). The two parts of the deep learning module are trained with the CNN (convolutional neural network backpropagation) and LSE (least squares estimation) methods respectively. The parameter training method of the feature extraction part is as follows:
(2-A1) initializing all connection weight parameters of the feature extraction part in the deep learning module;
(2-A2) calculating the actual output label O_k corresponding to the input key frame;
(2-A3) calculating the difference between the actual output label O_k and the corresponding ideal output label k;
(2-A4) weight learning: adjusting the connection weight parameter matrix of the feature extraction part in the deep learning module by backpropagation so as to minimize the error;
(2-A5) repeating until all key frames of the video data have been traversed, at which point the parameter training is finished. A sketch of this loop follows.
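A minimal sketch of loop (2-A1)-(2-A5), assuming the HighBeamNet sketch above and a loader of (grayed key frame, label) pairs; the optimizer, learning rate and the +1 label shift are illustrative choices:

```python
import torch
import torch.nn as nn

def train_feature_extractor(model, loader, epochs: int = 10):
    criterion = nn.CrossEntropyLoss()
    # (2-A1): PyTorch randomly initializes the connection weights at construction.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(epochs):
        for frames, labels in loader:             # (2-A5): traverse all key frames
            logits = model(frames)                # (2-A2): actual output O_k
            loss = criterion(logits, labels + 1)  # (2-A3): error vs. ideal k in {-1,0,1}
            optimizer.zero_grad()
            loss.backward()                       # (2-A4): backpropagate the error
            optimizer.step()                      # (2-A4): adjust the weight matrices
```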
the parameter training method of the classifier part is as follows:
(2-B1) the connection weights and biases between the rasterized layer and the fully connected layer are generated randomly, and the output of the fully connected layer is written as the N×L matrix H whose (j,i) entry is G(a_i·x_j + b_i), where G(·) is the activation function, a_i are the connection weights, b_i are the biases, L is the number of nodes of the fully connected layer, N is the number of all key frames, x_j is a key frame, i = 1, 2, ..., L and j = 1, 2, ..., N;
(2-B2) the network output results of the corresponding key frames are written as the output vector Y = [y_1 y_2 ... y_N]^T, where y_j is the output label corresponding to the j-th key frame x_j;
(2-B3) the output weights between the fully connected layer and the output layer are calculated as the least squares solution β = P·H^T·Y, where P = (H^T·H)^{-1}. A sketch of these three steps follows.
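A minimal NumPy sketch of steps (2-B1)-(2-B3), assuming tanh as the activation G and that `feats` holds the rasterized CNN features of the N key frames; the dimensions are illustrative:

```python
import numpy as np

def train_classifier_lse(feats: np.ndarray, labels: np.ndarray, L: int = 120):
    """feats: (N, d) rasterized features; labels: (N,) values in {-1, 0, 1}."""
    N, d = feats.shape
    rng = np.random.default_rng(0)
    a = rng.standard_normal((d, L))          # (2-B1): random connection weights
    b = rng.standard_normal(L)               # (2-B1): random biases
    H = np.tanh(feats @ a + b)               # H[j, i] = G(a_i . x_j + b_i)
    Y = labels.astype(float).reshape(-1, 1)  # (2-B2): Y = [y_1 ... y_N]^T
    P = np.linalg.inv(H.T @ H)               # (2-B3): P = (H^T H)^-1
    beta = P @ H.T @ Y                       # (2-B3): beta = P H^T Y (least squares)
    return a, b, beta                        # np.linalg.pinv is safer if H^T H is ill-conditioned
```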
In step (3), the data in the database of violation results to be checked are the video data judged as violations by the recognition result processing module; these pending results should be checked manually, after which the confirmed information is imported into the violation database and the misjudged information is deleted.
In step (3), the method for judging whether a high beam violation exists is as follows: compute the time interval ΔT = j_2 − j_1 between a key frame F_{i,j_1} labeled as high beam and its next key frame F_{i,j_2}; if ΔT ≥ θ, the vehicle is using the high beam in violation of the regulations, where θ is the violation time threshold. A sketch of this test follows.
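A minimal sketch of this test, assuming key frames arrive as (time index, label) pairs in temporal order, with label 1 denoting high beam and θ expressed in the same units as the time indices:

```python
def has_high_beam_violation(key_frames, theta: float) -> bool:
    """key_frames: list of (j, label) pairs; label 1 means high beam."""
    for (j1, label), (j2, _) in zip(key_frames, key_frames[1:]):
        if label == 1 and (j2 - j1) >= theta:  # ΔT = j2 - j1 ≥ θ
            return True                        # high beam held too long: violation
    return False
```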
The invention has the beneficial effects that:
the invention automatically analyzes and identifies the monitoring video, ensures the completeness of law enforcement evidence, is similar to manual judgment, has intelligence, is simple in equipment arrangement, and can fully utilize the original monitoring equipment. The method comprises the following specific steps:
(1) by mining the video data, the sufficiency of law enforcement evidence is greatly improved while accuracy is ensured, preventing breaks in the evidence chain when a high beam violation occurs;
(2) the requirement on the number of devices at a single location is low, and the large amount of monitoring equipment already deployed can be reused directly, reducing cost and improving equipment utilization;
(3) intelligent judgment of high beam violations based on video deep learning replaces manual law enforcement, realizing true automation and improving efficiency; moreover, after deep learning, the recognition of high beam violations is expected to reach or exceed the level of manual recognition, realizing true intelligence of the recognition system;
(4) the deep learning module learns the system parameters with the CNN+LSE method, so parameter learning is faster, the generalization capability of the module is stronger, and the robustness of the system is improved.
Drawings
FIG. 1 is a schematic diagram of the system architecture of the present invention;
fig. 2 is a diagram of a CNN + LSE-based deep learning module architecture.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and examples, which are provided for the purpose of illustration only and are not intended to limit the scope of the invention.
As shown in fig. 1, an automobile high beam identification system based on video deep learning includes the following two parts:
the foreground part is used for realizing the identification and processing of the high beam violation behaviors and comprises a road monitoring equipment module, a video processing and identifying module, an identification result processing module and a database of the violation results to be detected, which are connected in sequence;
the background part is used for processing the video and realizing deep learning on it, and comprises a key frame extraction algorithm, a labeled database and a deep learning module, wherein the labeled database is constructed by calling the key frame extraction algorithm to extract key frames from the original video data, the data in the labeled database are used for training the deep learning module, and the trained deep learning module, together with the key frame extraction algorithm, is called by the video processing and recognition module.
The key frame extraction algorithm is a key frame extraction algorithm based on clustering; the deep learning module is a CNN + LSE-based deep learning module.
The system corresponds to an automobile high beam identification method based on video deep learning, and the method specifically comprises the following steps:
(1) the road monitoring equipment module obtains the driving video data of the automobile and transmits the driving video data to the video processing and identifying module.
(2) The video processing and recognition module calls the key frame extraction algorithm to extract key frames from the original video data and then performs a graying operation; taking the grayed key frames as input, it calls the CNN+LSE-based deep learning module trained on the labeled database to obtain the output label of each key frame, the labels comprising low beam, fog lamp or high beam, and assigns each label to the corresponding key frame image.
The key frame extraction algorithm is as follows:
(2-1) take the i-th segment V_i in the original video database, extract n frames at equal time intervals, and use F_{i,j} to name the frame at the j-th moment of the i-th piece of video data, so that the key frame sequence of the corresponding video data is denoted {F_{i,1}, F_{i,2}, ..., F_{i,n}}, where F_{i,1} is the first frame and F_{i,n} is the last frame; the similarity between two adjacent frames is defined as the similarity of their histograms (i.e., the histogram feature difference), and a predefined threshold δ controls the clustering density; i, j and n are all integers;
(2-2) select the first frame F_{i,1} as the initial cluster center and calculate the similarity between a frame F_{i,j} and the initial cluster center; if the value is less than δ, the distance between the frame and the cluster center frame is judged too large, so F_{i,j} cannot be added to that cluster; if the similarity between F_{i,j} and every cluster center is less than δ, F_{i,j} forms a new cluster and becomes its center; otherwise, the frame is added to the cluster with which it has the greatest similarity, so that the distance between the frame and that cluster's center is smallest;
(2-3) repeat (2-2) until the n frames extracted from the original video data V_i are assigned to their clusters; the key frames can then be selected: from each cluster, the frame nearest to the cluster center is extracted as the representative frame of that cluster, and the representative frames of all clusters constitute the key frames of the original video data V_i.
The construction method of the database with the labels comprises the following steps:
the method comprises the steps of taking a large amount of vehicle running video data under a big data background as original video data, calling a key frame extraction algorithm based on clustering to the original video data to extract key frames, manually judging the light types of vehicles in the key frames, and adding labels to each key frame to enable the original key frames to become labeled data, wherein the label types comprise: three types of dipped headlight, fog light and high beam are respectively represented by-1, 0 and 1; storing the key frame data with the label into a labeled database, wherein the data in the labeled database are the original video data and the labeled key frame thereof, and the labeled key frame is represented as (F)i,jK), where k takes the value-1, 0 or 1.
As shown in fig. 2, the CNN+LSE-based deep learning module is constructed by adopting the LeNet-5 convolutional neural network structure. The module is divided into eight layers: the first six layers form the feature extraction part and the last two layers form the classifier part, where the feature extraction layers adopt a classical convolutional neural network structure and the classifier layers adopt a fully connected structure. The module takes the data in the labeled database as training data and is trained with the combined CNN+LSE algorithm: the feature extraction part is trained with the CNN method and the classifier layers with the LSE method, so as to realize fast learning of the module parameters and enhance the generalization capability of the module. The specific method is as follows: a video key frame from the labeled database is input to the first layer of the CNN+LSE-based deep learning module; the second layer performs convolution operations on the output of the previous layer with different convolution kernels; the third layer performs pooling (down-sampling) on the output of the previous layer; the fourth and fifth layers repeat the operations of the second and third layers; the sixth layer expands the output features of the previous layer in order and arranges them into one row; the seventh layer is fully interconnected with the output features of the previous layer; the last layer is likewise fully interconnected with the previous layer. The output of the CNN+LSE-based deep learning module falls into three cases: low beam, fog lamp and high beam, denoted -1, 0 and 1 respectively.
The deep learning module based on CNN + LSE is trained as follows:
A sample (F_{i,j}, k) is taken from the labeled database; a graying operation is first performed on F_{i,j} to turn the key frame into a grayscale image, and the grayed key frame F'_{i,j} is then input into the module, i.e., the input data are (F'_{i,j}, k). The two parts of the deep learning module are trained with the CNN and LSE methods respectively. The parameter training method of the feature extraction part is as follows:
(2-A1) initializing all connection weight parameters of the feature extraction part in the deep learning module;
(2-A2) calculating the actual output label O_k corresponding to the input key frame;
(2-A3) calculating the difference between the actual output label O_k and the corresponding ideal output label k;
(2-A4) weight learning: adjusting the connection weight parameter matrix of the feature extraction part in the deep learning module by backpropagation so as to minimize the error;
(2-A5) repeating until all key frames of the video data have been traversed, at which point the parameter training is finished.
the parameter training method of the classifier part is as follows:
(2-B1) the connection weights and biases between the rasterized layer and the fully connected layer are generated randomly, and the output of the fully connected layer is written as the N×L matrix H whose (j,i) entry is G(a_i·x_j + b_i), where G(·) is the activation function, a_i are the connection weights, b_i are the biases, L is the number of nodes of the fully connected layer, N is the number of all key frames, x_j is a key frame, i = 1, 2, ..., L and j = 1, 2, ..., N;
(2-B2) the network output results of the corresponding key frames are written as the output vector Y = [y_1 y_2 ... y_N]^T, where y_j is the output label corresponding to the j-th key frame x_j;
(2-B3) the output weights between the fully connected layer and the output layer are calculated as the least squares solution β = P·H^T·Y, where P = (H^T·H)^{-1}.
(3) The original video data and the labeled key frames obtained in step (2) together serve as the input of the recognition result processing module, which judges whether the vehicle violates the regulations; a license plate recognition system is embedded in the recognition result processing module, and when a target vehicle exhibits high beam violation behavior, its license plate is extracted, the vehicle information is acquired, and the suspected violation video data are imported into the database of violation results to be checked.
The method for judging whether a high beam violation exists is as follows: compute the time interval ΔT = j_2 − j_1 between a key frame F_{i,j_1} labeled as high beam and its next key frame F_{i,j_2}; if ΔT ≥ θ, the vehicle is using the high beam in violation of the regulations, where θ is the violation time threshold.
(4) The data in the database of violation results to be checked are the video data judged as violations by the recognition result processing module; these pending results should be checked manually, after which the confirmed information is imported into the violation database and the misjudged information is deleted.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, the scope of the invention is not limited thereto; various modifications and variations that those skilled in the art can make without inventive effort fall within the scope of the invention.

Claims (2)

1. A method for identifying automobile high beams based on video deep learning, characterized in that the specific steps are as follows:

(1) a road monitoring equipment module obtains the driving video data of an automobile and transmits it to a video processing and recognition module;

(2) the video processing and recognition module calls a key frame extraction algorithm to extract key frames from the original video data and then performs a graying operation; taking the grayed key frames as input, it calls a CNN+LSE-based deep learning module trained on a labeled database to obtain the output label of each key frame, the labels comprising low beam, fog lamp or high beam, and assigns each label to the corresponding key frame image;

(3) the original video data and the labeled key frames obtained in step (2) together serve as the input of a recognition result processing module, which judges whether the vehicle violates the regulations; a license plate recognition system is embedded in the recognition result processing module, and when a target vehicle exhibits high beam violation behavior, its license plate is extracted, the vehicle information is acquired, and the suspected violation video data are imported into a database of violation results to be checked;

in step (3), the method for judging whether a high beam violation exists is: compute the time interval ΔT = j_2 − j_1 between a key frame F_{i,j_1} labeled as high beam and its next key frame F_{i,j_2}; if ΔT ≥ θ, the vehicle is using the high beam in violation of the regulations, where θ is the violation time threshold;

in step (2), the key frame extraction algorithm is as follows:

(2-1) take the i-th segment V_i in the original video database, extract n frames at equal time intervals, and use F_{i,j} to name the frame at the j-th moment of the i-th piece of video data, so that the key frame sequence of the corresponding video data is denoted {F_{i,1}, F_{i,2}, ..., F_{i,n}}, where F_{i,1} is the first frame and F_{i,n} is the last frame; the similarity between two adjacent frames is defined as the similarity of their histograms, i.e., the histogram feature difference, and a predefined threshold δ controls the clustering density; i, j and n are all integers;

(2-2) select the first frame F_{i,1} as the initial cluster center and calculate the similarity between a frame F_{i,j} and the initial cluster center; if the similarity is less than δ, the distance between the frame and the cluster center frame is judged too large, so F_{i,j} cannot be added to that cluster; if the similarity between F_{i,j} and every cluster center is less than δ, F_{i,j} forms a new cluster and becomes its center; otherwise, the frame F_{i,j} is added to the cluster with which it has the greatest similarity, so that the distance between the frame F_{i,j} and that cluster's center is smallest;

(2-3) repeat (2-2) until the n frames extracted from the original video data V_i are assigned to their clusters; the key frames can then be selected: from each cluster, the frame nearest to the cluster center is extracted as the representative frame of that cluster, and the representative frames of all clusters constitute the key frames of the original video data V_i;

in step (2), the labeled database is constructed as follows:

a large amount of vehicle driving video data under a big-data background is taken as the original video data; the clustering-based key frame extraction algorithm is called on the original video data to extract key frames; the light type of the vehicle in each key frame is judged manually, and a label is added to each key frame so that the original key frames become labeled data, the label categories comprising low beam, fog lamp and high beam, represented by -1, 0 and 1 respectively; the labeled key frame data are stored in the labeled database, whose data are the original video data and their labeled key frames, represented as (F_{i,j}, k), where k takes the value -1, 0 or 1;

in step (2), the CNN+LSE-based deep learning module is constructed by adopting the LeNet-5 convolutional neural network structure; the module is divided into eight layers, the first six layers being the feature extraction part and the last two layers the classifier part, where the feature extraction layers adopt a classical convolutional neural network structure and the classifier layers adopt a fully connected structure; the data in the labeled database serve as training data, and the deep learning module is trained with the combined CNN+LSE algorithm, the feature extraction part being trained with the CNN method and the classifier layers with the LSE method;

the training process of the CNN+LSE-based deep learning module is as follows:

a sample (F_{i,j}, k) is taken from the labeled database; a graying operation is first performed on F_{i,j} to turn it into a grayscale image, and the grayed key frame F'_{i,j} is then input into the module, i.e., the input data are (F'_{i,j}, k); the two parts of the deep learning module are trained with the CNN and LSE methods respectively; the parameter training method of the feature extraction part is as follows:

(2-A1) initialize all connection weight parameters of the feature extraction part in the deep learning module;

(2-A2) calculate the actual output label O_k corresponding to the input key frame;

(2-A3) calculate the difference between the actual output label O_k and the corresponding ideal output label k;

(2-A4) weight learning: adjust the connection weight parameter matrix of the feature extraction part in the deep learning module by backpropagation so as to minimize the error;

(2-A5) repeat until all key frames of the video data have been traversed; the parameter training is then finished;

the parameter training method of the classifier part is as follows:

(2-B1) the connection weights and biases between the rasterized layer and the fully connected layer are generated randomly, and the output of the fully connected layer is written as the N×L matrix H whose (j,i) entry is G(a_i·x_j + b_i), where G(·) is the activation function, a_i are the connection weights, b_i are the biases, L is the number of nodes of the fully connected layer, N is the number of all key frames, x_j is a key frame, i = 1, 2, ..., L and j = 1, 2, ..., N;

(2-B2) the network output results of the corresponding key frames are written as the output vector Y = [y_1 y_2 ... y_N]^T, where y_j is the output label corresponding to the j-th key frame x_j;

(2-B3) the output weights between the fully connected layer and the output layer are calculated as β = P·H^T·Y, where P = (H^T·H)^{-1}.

2. The method according to claim 1, characterized in that in step (3) the data in the database of violation results to be checked are the video data judged as violations by the recognition result processing module; these pending results should be checked manually, after which the confirmed information is imported into the violation database and the misjudged information is deleted.
CN201710156201.4A 2017-03-16 2017-03-16 Automobile high beam identification system and method based on video deep learning Active CN106934378B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710156201.4A CN106934378B (en) 2017-03-16 2017-03-16 Automobile high beam identification system and method based on video deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710156201.4A CN106934378B (en) 2017-03-16 2017-03-16 Automobile high beam identification system and method based on video deep learning

Publications (2)

Publication Number Publication Date
CN106934378A CN106934378A (en) 2017-07-07
CN106934378B (en) 2020-04-24

Family

ID=59432614

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710156201.4A Active CN106934378B (en) 2017-03-16 2017-03-16 Automobile high beam identification system and method based on video deep learning

Country Status (1)

Country Link
CN (1) CN106934378B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6729516B2 (en) * 2017-07-27 2020-07-22 トヨタ自動車株式会社 Identification device
CN108229447B (en) * 2018-02-11 2021-06-11 陕西联森电子科技有限公司 High beam light detection method based on video stream
CN108921060A (en) * 2018-06-20 2018-11-30 安徽金赛弗信息技术有限公司 Motor vehicle based on deep learning does not use according to regulations clearance lamps intelligent identification Method
CN108932853B (en) * 2018-06-22 2021-03-30 安徽科力信息产业有限责任公司 Method and device for recording illegal parking behaviors of multiple motor vehicles
CN109191419B (en) * 2018-06-25 2021-06-29 国网智能科技股份有限公司 Real-time pressing plate detection and state recognition system and method based on machine learning
CN108986476B (en) * 2018-08-07 2019-12-06 安徽金赛弗信息技术有限公司 method, system and storage medium for recognizing non-use of high beam by motor vehicle according to regulations
CN109934106A (en) * 2019-01-30 2019-06-25 长视科技股份有限公司 A kind of user behavior analysis method based on video image deep learning
CN110046547A (en) * 2019-03-06 2019-07-23 深圳市麦谷科技有限公司 Report method, system, computer equipment and storage medium violating the regulations
CN111680638B (en) * 2020-06-11 2020-12-29 深圳北斗应用技术研究院有限公司 Passenger path identification method and passenger flow clearing method based on same

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942751A (en) * 2014-04-28 2014-07-23 中央民族大学 Method for extracting video key frame
CN105590102A (en) * 2015-12-30 2016-05-18 中通服公众信息产业股份有限公司 Front car face identification method based on deep learning
CN106407931A (en) * 2016-09-19 2017-02-15 杭州电子科技大学 Novel deep convolution neural network moving vehicle detection method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9978013B2 (en) * 2014-07-16 2018-05-22 Deep Learning Analytics, LLC Systems and methods for recognizing objects in radar imagery

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942751A (en) * 2014-04-28 2014-07-23 中央民族大学 Method for extracting video key frame
CN105590102A (en) * 2015-12-30 2016-05-18 中通服公众信息产业股份有限公司 Front car face identification method based on deep learning
CN106407931A (en) * 2016-09-19 2017-02-15 杭州电子科技大学 Novel deep convolution neural network moving vehicle detection method

Also Published As

Publication number Publication date
CN106934378A (en) 2017-07-07

Similar Documents

Publication Publication Date Title
CN106934378B (en) Automobile high beam identification system and method based on video deep learning
JP6820030B2 (en) A method and device for learning using a plurality of labeled databases having different label sets, and a test method and device using this {LEARNING METHOD AND LEARNING DEVICE USING MULTIPLE LABELED DATABASES WITH DIFFERENT LAB THE SAME}
EP3289528B1 (en) Filter specificity as training criterion for neural networks
CN107729818B (en) Multi-feature fusion vehicle re-identification method based on deep learning
CN107563372B (en) License plate positioning method based on deep learning SSD frame
Butt et al. Convolutional neural network based vehicle classification in adverse illuminous conditions for intelligent transportation systems
CN111079640B (en) Vehicle type identification method and system based on automatic amplification sample
CN107316010A (en) A kind of method for recognizing preceding vehicle tail lights and judging its state
CN109993138A (en) A kind of car plate detection and recognition methods and device
CN103810505A (en) Vehicle identification method and system based on multilayer descriptors
CN108875754B (en) A vehicle re-identification method based on multi-depth feature fusion network
CN105825212A (en) Distributed license plate recognition method based on Hadoop
CN111209905B (en) Defect shielding license plate recognition method based on combination of deep learning and OCR technology
CN108960074B (en) Small-size pedestrian target detection method based on deep learning
CN112115761A (en) Countermeasure sample generation method for detecting vulnerability of visual perception system of automatic driving automobile
CN110826415A (en) Method and device for re-identifying vehicles in scene image
CN114565896A (en) A cross-layer fusion improved YOLOv4 road target recognition algorithm
Agarwal et al. Vehicle Characteristic Recognition by Appearance: Computer Vision Methods for Vehicle Make, Color, and License Plate Classification
CN115116035A (en) A road traffic light recognition system and method based on neural network
CN113361491A (en) Method for predicting pedestrian crossing intention of unmanned automobile
CN111931650A (en) Target detection model construction and red light running responsibility tracing method, system, terminal and medium
CN110555425A (en) Video stream real-time pedestrian detection method
Prawinsankar et al. Traffic Congession Detection through Modified Resnet50 and Prediction of Traffic using Clustering
CN112633163B (en) Detection method for realizing illegal operation vehicle detection based on machine learning algorithm
CN113850112A (en) Road Condition Recognition Method and System Based on Siamese Neural Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant