CN112257569A - Target detection and identification method based on real-time video stream
- Publication number
- CN112257569A (application number CN202011128268.5A)
- Authority
- CN
- China
- Prior art keywords
- target
- convolution
- video stream
- real
- detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/30—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention discloses a target detection and identification method based on a real-time video stream, which comprises the following steps: S1, making a patrol plan according to user requirements; S2, controlling the camera to rotate to a specified preset position according to the patrol plan; S3, detecting whether a target exists in the current video frame sequence of the video stream acquired by the camera; if a target exists, executing step S4; otherwise, waiting for a delay period while continuing to acquire and detect video frames, and, if no target is found when the delay ends, jumping to step S2 to control the camera to rotate to the next specified preset position; S4, when a target is found, controlling the camera to focus on the target area; S5, capturing the current area image and identifying the current target category; S6, outputting the detection and identification result; and S7, returning to step S2 to continue executing the patrol plan. The method realizes real-time monitoring and identification of moving targets with good accuracy and real-time performance.
Description
Technical Field
The invention relates to the technical field of moving target detection and identification, and in particular to a target detection and identification method based on a real-time video stream.
Background
Video-based real-time moving target detection and identification is widely applied at present. Moving target detection (motion detection) refers to the process of presenting and marking, as foreground, objects whose spatial position changes in an image sequence or video. It has long been a very active research field and is widely applied in intelligent surveillance, multimedia applications, and other fields.
Over the years, depending on the application scenario and available techniques, researchers have proposed a variety of moving object detection methods that adapt to the environment and its changes while balancing detection accuracy and real-time performance. At present, the basic methods for computer-vision detection of moving objects are frame differencing, optical flow, and background subtraction, in addition to feature matching, KNN, and variants of these (three-frame differencing, five-frame differencing). The background subtraction algorithm is widely applied because it is simple, easy to implement, and highly adaptable.
In scenes where the camera's field of view is large, the imaging area of a moving target in the picture is small and noise interference is strong, making moving targets difficult to detect; in particular, the false detection rate is high in scenes with a blurred background, where feature points are too few and the target category is difficult to identify.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a target detection and identification method based on a real-time video stream, which adopts a patrol (polling) mode to decompose a scene with a large field of view, realizing effective detection of small targets over a large area from the video stream.
In order to achieve the purpose of the invention, the invention adopts the following technical scheme:
A target detection and identification method based on a real-time video stream comprises the following steps:
S1, making a patrol plan according to user requirements;
S2, controlling the camera to rotate to a specified preset position according to the patrol plan;
S3, detecting whether a target exists in the current video frame sequence of the video stream acquired by the camera; if a target exists, executing step S4; otherwise, waiting for a delay period while continuing to acquire and detect video frames, and, if no target is found when the delay ends, jumping to step S2 to control the camera to rotate to the next specified preset position;
S4, when a target is found, controlling the camera to focus on the target area;
S5, capturing the current area image and identifying the current target category;
S6, outputting the detection and identification result;
and S7, returning to step S2 to continue executing the patrol plan.
Preferably, when step S3 is executed, after the video frame sequence is obtained from the camera, the first frame is selected as the background frame; median filtering is applied to the video frames for background modeling, and a background threshold is calculated from the background frame; finally, the frame difference method is used to quickly obtain the moving target's active region. In step S4, the azimuth and focal length of the camera are adjusted according to the target's relative position. The method combines the accuracy of background subtraction with the speed of the frame difference method; the purpose is to quickly determine whether a target exists and obtain the region where it lies.
Preferably, after the current region image is captured in step S5, it is input into a deep learning network model to identify the target category.
Preferably, identifying the target category with the deep learning network model comprises the following steps:
Sa1: inputting the image to be processed into the deep learning network model;
Sa2: the deep learning network maps the image to a high-dimensional feature space through an initialization convolution;
Sa3: obtaining the feature information of each target in the image through a feature extraction network and a feature enhancement module; the feature extraction network extracts features of different levels, where shallow features benefit small target detection and deep features benefit target identification;
Sa4: predicting the output, obtaining target category and position information through classification and regression.
Preferably, the initialization convolution in step Sa2 consists, in order, of a 3 × 3 × 1 convolution a, a 3 × 3 × 2 convolution b, batch normalization BN, and the activation function Relu; the image to be processed first has its channel count increased by convolution a, is down-sampled by the 3 × 3 × 2 convolution b to obtain a feature map, then undergoes batch normalization BN processing, and after the activation function Relu serves as the input of the next-stage network.
Preferably, the feature extraction network in step Sa3 is composed of 10-30 residual convolution modules, and every few residual convolutions are followed by a 3 × 3 × 2 convolution b for down-sampling; each residual convolution module comprises, in order from input to output, a 1 × 1 × 1 convolution c, batch normalization BN, the activation function Relu, a 3 × 3 × 1 convolution a, batch normalization BN, and the activation function Relu; convolution b changes the size of the feature map, yielding higher-level feature maps, and higher-level feature information is extracted through the cascade of residual convolutions.
Preferably, in step Sa3, the feature enhancement module passes the deep features through spatial pyramid pooling SPP and then fuses them with the shallow features through a path aggregation network PAN; its main function is to improve target detection and recognition accuracy, especially for small targets, through multi-level feature learning.
Preferably, when the output is predicted in step Sa4, the class confidence of the target and the coordinates of the target frame are obtained through the softmax function; the outputs are the offsets Δx and Δy forming the target frame, the scaling factors a and b of the anchor, the probability that a target is detected, and the confidence that the target belongs to each category; according to the coordinates of the target frame, the position of the target is marked on the original image and the predicted category confidence is displayed; the probability that a target is detected is used for preliminary screening of the target frames.
Preferably, during prediction, prediction is carried out at three scales: the feature maps of the last three scales are taken, and after SPP and PAN, the features of the three scales are input into a detection module for regression and classification to obtain the output result; the detection module consists of 3 residual modules plus a convolution c with a fixed number of channels equal to (number of classes + 5) × 3.
Preferably, when the probability that a target is detected is greater than a set threshold, where the threshold is set between 0.4 and 0.5, that is, when the probability that the current pixel belongs to a target to be detected exceeds the set value, the result is retained; the target frames are screened and de-duplicated by non-maximum suppression, the target frame with the maximum intersection-over-union IoU at each position is determined, and finally the target frame and its confidence are output and displayed as the final result.
Compared with the prior art, the invention has the following beneficial effects:
1) The method realizes detection and identification of targets over a large area based on a camera; in combination with the patrol task, it can realize real-time monitoring and identification of moving targets even when the camera's field of view is large or the target's imaging area is small, with good accuracy and real-time performance.
2) The method adopts a continuous two-stage process: motion detection is first performed on the surveillance video, and the second-stage deep learning detection and identification process is started only when a target exists in the current video, reducing hardware computation cost.
3) The motion detection result provides a high-reliability target area for subsequent target detection and identification, so that the subsequent target detection and identification can rely on a lightweight network to realize high-precision result output, and the real-time performance is further improved.
4) The method adopts a frame difference method combined with median filtering to carry out background modeling, and improves the accuracy of motion detection.
5) In the small target detection and identification problem, the method combines the path aggregation network and the spatial pyramid pooling, and reserves enough rich small target characteristic information as much as possible in the convolution process.
Drawings
FIG. 1 is a flow chart of a real-time video stream based object detection and identification method of the present invention;
FIG. 2 is a flow chart of the method for detecting and identifying targets based on real-time video streaming, in which the image input deep learning network model identifies the target category.
FIG. 3 is a block diagram of a feature extraction network in a real-time video stream based target detection and identification method of the present invention;
FIG. 4 is a block diagram of a residual convolution module in a real-time video stream based target detection and identification method according to the present invention;
FIG. 5 is a block diagram of SPPs in a real-time video stream based object detection and recognition method of the present invention;
fig. 6 is a block diagram of a PAN in a real-time video stream based object detection and recognition method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below:
As shown in fig. 1, in an embodiment of the present invention, a target detection and identification method based on a real-time video stream comprises the following steps:
S1, making a patrol plan according to user requirements;
S2, controlling the camera to rotate to a specified preset position according to the patrol plan;
S3, detecting whether a target exists in the current video frame sequence of the video stream acquired by the camera; if a target exists, executing step S4; otherwise, waiting for a delay period while continuing to acquire and detect video frames, and, if no target is found when the delay ends, jumping to step S2 to control the camera to rotate to the next specified preset position;
S4, when a target is found, controlling the camera to focus on the target area;
S5, capturing the current area image and identifying the current target category;
S6, outputting the detection and identification result;
and S7, returning to step S2 to continue executing the patrol plan.
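For illustration only, the patrol logic of steps S1-S7 can be summarized as a simple control loop. The following minimal Python sketch assumes a hypothetical PTZ camera interface; goto_preset, read_frames, focus_on, and snapshot are illustrative placeholders and not part of the invention's disclosure:

```python
import time

def run_patrol(camera, presets, detect, classify, dwell_seconds=5.0):
    """Execute the patrol plan over the camera's preset positions (S1-S7)."""
    while True:                                    # S7: keep executing the patrol plan
        for preset in presets:                     # S1: patrol plan of preset positions
            camera.goto_preset(preset)             # S2: rotate to the specified preset
            deadline = time.time() + dwell_seconds
            region = None
            while time.time() < deadline:          # S3: detect during the delay period
                region = detect(camera.read_frames())
                if region is not None:
                    break
            if region is None:
                continue                           # S3: no target, move to next preset
            camera.focus_on(region)                # S4: focus on the target area
            image = camera.snapshot(region)        # S5: capture the current area image
            print(classify(image))                 # S5/S6: identify and output the result
```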
Specifically, when step S3 is executed, after the video frame sequence is obtained from the camera, the first frame is selected as the background frame; median filtering is applied to the video frames for background modeling, and a background threshold is calculated from the background frame; finally, the frame difference method is used to quickly obtain the moving target's active region. In step S4, the azimuth and focal length of the camera are adjusted according to the target's relative position. The method combines the accuracy of background subtraction with the speed of the frame difference method; the purpose is to quickly determine whether a target exists and obtain the region where it lies.
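As an illustration only, this motion-detection stage could be sketched with OpenCV as follows; the median kernel size, difference threshold, and minimum contour area are assumed example values, not values specified by the invention:

```python
import cv2

def detect_motion(frames, diff_thresh=25, min_area=100, ksize=5):
    """Return bounding boxes of moving regions in a grayscale frame sequence."""
    # Select the first frame as the background frame, smoothed by median filtering
    background = cv2.medianBlur(frames[0], ksize)
    boxes = []
    for frame in frames[1:]:
        # Frame difference against the median-filtered background frame
        diff = cv2.absdiff(cv2.medianBlur(frame, ksize), background)
        _, mask = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        boxes += [cv2.boundingRect(c) for c in contours
                  if cv2.contourArea(c) >= min_area]
    return boxes or None
```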
Specifically, after the current area image is captured in step S5, it is input into a deep learning network model to identify the target category.
As shown in fig. 2, identifying the target category with the deep learning network model specifically comprises the following steps:
Sa1: inputting the image to be processed into the deep learning network model;
Sa2: the deep learning network maps the image to a high-dimensional feature space through an initialization convolution;
Sa3: obtaining the feature information of each target in the image through a feature extraction network and a feature enhancement module; the feature extraction network extracts features of different levels, where shallow features benefit small target detection and deep features benefit target identification;
Sa4: predicting the output, obtaining target category and position information through classification and regression.
Specifically, the initialization convolution in step Sa2 consists, in order, of a 3 × 3 × 1 convolution a, a 3 × 3 × 2 convolution b, batch normalization BN, and the activation function Relu; the image to be processed first has its channel count increased by convolution a, is down-sampled by the 3 × 3 × 2 convolution b to obtain a feature map, then undergoes batch normalization BN processing, and after the activation function Relu serves as the input of the next-stage network.
In a specific implementation, the size of the image to be processed is H × W × 3 (height, width, and three RGB channels); the initialization convolution a increases the number of channels to 32, turning the image into H × W × 32; down-sampling through the 3 × 3 × 2 convolution b yields an H/2 × W/2 × 64 feature map, which then undergoes batch normalization BN processing and, after the activation function Relu, serves as the input of the next-stage network.
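For illustration, the initialization convolution can be sketched in PyTorch as follows; the padding values and the 416 × 416 input size are assumptions chosen so that the spatial sizes match the H × W → H/2 × W/2 description above:

```python
import torch
import torch.nn as nn

init_conv = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),   # 3x3x1 convolution a: H x W x 3 -> H x W x 32
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),  # 3x3x2 convolution b: down-sample to H/2 x W/2 x 64
    nn.BatchNorm2d(64),                                     # batch normalization BN
    nn.ReLU(inplace=True),                                  # activation function Relu
)

x = torch.randn(1, 3, 416, 416)  # example input image, H = W = 416 assumed
print(init_conv(x).shape)        # torch.Size([1, 64, 208, 208])
```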
specifically, the feature extraction network in step Sa3 is composed of 10-30 residual convolution (block) modules, and each of the residual convolutions is connected with a convolution b of 3 × 3 × 2 for down-sampling (as shown in fig. 3); each residual convolution module sequentially comprises 1 multiplied by 1 convolution c, batch normalization BN, activation function Relu, 3 multiplied by 1 convolution a, batch normalization BN and activation function Relu from input to output (as shown in FIG. 4); the convolution b can change the size of the feature map, so that a feature map of a higher level is obtained, and feature information of the higher level is extracted through cascade of residual convolution.
Specifically, in step Sa3, the feature enhancement module passes the deep features through spatial pyramid pooling SPP and then fuses them with the shallow features through a path aggregation network PAN; its main function is to improve target detection and recognition accuracy, especially for small targets, through multi-level feature learning.
The structure of the SPP is shown in fig. 5. The SPP operation divides a feature map into block areas of different sizes, such as 4x4, 2x2, and 1x1 in fig. 5, and performs maximum pooling in each area. Under these three divisions, one feature map is represented by 16, 4, and 1 values respectively; after concatenating these values, a feature map can be represented as a vector of 21 values. When 256 feature maps are input, the SPP operation yields a 21 × 256-dimensional vector. SPP has two main roles: 1. inputs of different sizes yield fixed-length feature vectors, which facilitates subsequent fully-connected layer operations; 2. multi-scale pooled feature information of a feature map is fused into one feature vector, while consuming relatively little computation.
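The SPP operation described above can be sketched with adaptive max pooling; the (4, 2, 1) levels follow fig. 5, and the use of PyTorch here is an illustrative choice:

```python
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(x, levels=(4, 2, 1)):
    """x: (N, C, H, W) -> (N, C * 21) fixed-length vector for levels 4x4, 2x2, 1x1."""
    outs = []
    for k in levels:
        pooled = F.adaptive_max_pool2d(x, output_size=k)  # max pooling per block area
        outs.append(pooled.flatten(start_dim=1))          # (N, C * k * k)
    return torch.cat(outs, dim=1)

x = torch.randn(1, 256, 13, 13)       # 256 input feature maps
print(spatial_pyramid_pool(x).shape)  # torch.Size([1, 5376]), i.e. 21 x 256 values
```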
The PAN performs feature fusion between feature maps of different levels (as shown in fig. 6), with the size of the feature maps gradually reduced through a down-sampling process: the feature map N_i of the i-th layer is directly down-sampled to a feature map of the same size as the (i+1)-th layer; after feature extraction through a residual module, it is serially fused with the (i+1)-th layer feature output P_{i+1} to obtain the feature map N_{i+1} of that layer. Shallow feature information is thereby preserved in the deep feature maps as far as possible, improving the detection rate of small targets. Meanwhile, deep features are up-sampled and serially fused with the shallow feature outputs, which helps improve the classification precision of targets.
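One down-sampling fusion step of the PAN might look as follows, reusing the ResidualConv sketch above; the stride-2 convolution used for down-sampling is an assumption:

```python
import torch
import torch.nn as nn

class PanDownFusion(nn.Module):
    """One PAN step: down-sample N_i, extract features, fuse serially with P_{i+1}."""
    def __init__(self, channels):
        super().__init__()
        self.down = nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1)
        self.residual = ResidualConv(channels)   # residual module (see sketch above)

    def forward(self, n_i, p_next):
        x = self.residual(self.down(n_i))        # now the same size as the (i+1)-th layer
        return torch.cat([x, p_next], dim=1)     # serial (channel) fusion -> N_{i+1}
```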
The specific implementation process, from the initialization convolution through the feature extraction network to the feature enhancement module, is as follows:
Here conv-BN-relu constitutes one complete convolution, where conv is the discrete convolution layer (representing the convolution a, b, c operations) and is defined as

y = W * x + b

where x is the input, W is the convolution kernel, * denotes discrete convolution, and b is the bias. The BN layer (batch normalization layer) normalizes the result y to a distribution with mean 0 and variance 1, and the final output of the BN layer is then obtained through two learnable parameters γ and β:

ŷ = (y - μ) / √(σ² + ε),  BN(y) = γ · ŷ + β

where μ and σ² are the mean and variance of y over the batch and ε is a small constant for numerical stability. The Relu layer is an activation layer with Relu as the activation function, performing the activation operation of the neurons; Relu is defined as

Relu(x) = max(0, x).
specifically, when the output is specifically predicted in step Sa4, the class confidence of the target and the coordinates of the target frame are obtained through the sofmax function; outputting offsets delta x and delta y specifically formed into a target frame, scaling scales a and b of an anchor point, the probability of detecting the target and the confidence coefficient that the target belongs to each category; according to the coordinates of the target frame, marking the position of the target on the original image, and displaying the predicted category confidence; and the probability of detecting the target is used for primarily screening the target frame.
During prediction, prediction is carried out at three scales: the feature maps of the last three scales are taken, and after SPP and PAN, the features of the three scales are input into a detection module for regression and classification to obtain the output result; the detection module consists of 3 residual modules plus a convolution c with a fixed number of channels equal to (number of classes + 5) × 3.
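The detection module's fixed output width can be sketched as follows, again reusing the ResidualConv sketch above; 3 anchors per cell, each with 4 box terms, 1 objectness term, and the class scores, account for the (number of classes + 5) × 3 channels:

```python
import torch.nn as nn

def detection_head(in_channels, num_classes):
    """3 residual modules plus a 1x1 convolution c with (num_classes + 5) * 3 channels."""
    return nn.Sequential(
        ResidualConv(in_channels),               # see the residual module sketch above
        ResidualConv(in_channels),
        ResidualConv(in_channels),
        nn.Conv2d(in_channels, (num_classes + 5) * 3, kernel_size=1),  # convolution c
    )
```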
Specifically, when the probability that a target is detected is greater than a set threshold (generally set between 0.4 and 0.5), that is, when the probability that the current pixel belongs to a target to be detected exceeds the set value, the result is retained; the target frames are screened and de-duplicated by non-maximum suppression (Non-Maximum Suppression), the target frame with the maximum intersection-over-union IoU at each position is determined, and finally the target frame and its confidence are output and displayed as the final result.
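The final screening could be sketched as below; torchvision's nms is used here as an assumed stand-in for the patent's own non-maximum suppression implementation, and the 0.45 and 0.5 thresholds are example values:

```python
import torch
from torchvision.ops import nms

def filter_detections(boxes, scores, obj_thresh=0.45, iou_thresh=0.5):
    """boxes: (N, 4) in x1, y1, x2, y2 order; scores: (N,) target probabilities."""
    keep = scores > obj_thresh                # preliminary screening by target probability
    boxes, scores = boxes[keep], scores[keep]
    kept = nms(boxes, scores, iou_thresh)     # non-maximum suppression by IoU
    return boxes[kept], scores[kept]
```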
the above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art should be able to cover the technical scope of the present invention by equivalent replacement or change according to the technical solution and the modified concept of the present invention within the technical scope of the present invention.
Claims (10)
1. A target detection and identification method based on a real-time video stream, characterized in that the method comprises the following steps:
S1, making a patrol plan according to user requirements;
S2, controlling the camera to rotate to a specified preset position according to the patrol plan;
S3, detecting whether a target exists in the current video frame sequence of the video stream acquired by the camera; if a target exists, executing step S4; otherwise, waiting for a delay period while continuing to acquire and detect video frames, and, if no target is found when the delay ends, jumping to step S2 to control the camera to rotate to the next specified preset position;
S4, when a target is found, controlling the camera to focus on the target area;
S5, capturing the current area image and identifying the current target category;
S6, outputting the detection and identification result;
and S7, returning to step S2 to continue executing the patrol plan.
2. The target detection and identification method based on a real-time video stream according to claim 1, characterized in that: when step S3 is executed, after the video frame sequence is obtained from the camera, the first frame is selected as the background frame; median filtering is applied to the video frames for background modeling, and a background threshold is calculated from the background frame; finally, the frame difference method is used to quickly obtain the moving target's active region; in step S4, the azimuth and focal length of the camera are adjusted according to the target's relative position.
3. The target detection and identification method based on a real-time video stream according to claim 1, characterized in that: after the current area image is captured in step S5, it is input into a deep learning network model to identify the target category.
4. The target detection and identification method based on a real-time video stream according to claim 3, characterized in that:
identifying the target category with the deep learning network model comprises the following steps:
Sa1: inputting the image to be processed into the deep learning network model;
Sa2: the deep learning network maps the image to a high-dimensional feature space through an initialization convolution;
Sa3: obtaining the feature information of each target in the image through a feature extraction network and a feature enhancement module; the feature extraction network extracts features of different levels, where shallow features benefit small target detection and deep features benefit target identification;
Sa4: predicting the output, obtaining target category and position information through classification and regression.
5. The target detection and identification method based on a real-time video stream according to claim 4, characterized in that: the initialization convolution in step Sa2 consists, in order, of a 3 × 3 × 1 convolution a, a 3 × 3 × 2 convolution b, batch normalization BN, and the activation function Relu; the image to be processed first has its channel count increased by convolution a, is down-sampled by the 3 × 3 × 2 convolution b to obtain a feature map, then undergoes batch normalization BN processing, and after the activation function Relu serves as the input of the next-stage network.
6. The target detection and identification method based on a real-time video stream according to claim 4, characterized in that: in step Sa3, the feature extraction network is composed of 10-30 residual convolution modules, and every few residual convolutions are followed by a 3 × 3 × 2 convolution b for down-sampling; each residual convolution module comprises, in order from input to output, a 1 × 1 × 1 convolution c, batch normalization BN, the activation function Relu, a 3 × 3 × 1 convolution a, batch normalization BN, and the activation function Relu; convolution b changes the size of the feature map, yielding higher-level feature maps, and higher-level feature information is extracted through the cascade of residual convolutions.
7. The target detection and identification method based on a real-time video stream according to claim 4, characterized in that: in step Sa3, the feature enhancement module passes the deep features through spatial pyramid pooling SPP and then fuses them with the shallow features through a path aggregation network PAN; its main function is to improve target detection and recognition accuracy, especially for small targets, through multi-level feature learning.
8. The target detection and identification method based on a real-time video stream according to claim 4, characterized in that: when the output is predicted in step Sa4, the class confidence of the target and the coordinates of the target frame are obtained through the softmax function; the outputs are the offsets Δx and Δy forming the target frame, the scaling factors a and b of the anchor, the probability that a target is detected, and the confidence that the target belongs to each category; according to the coordinates of the target frame, the position of the target is marked on the original image and the predicted category confidence is displayed; the probability that a target is detected is used for preliminary screening of the target frames.
9. The target detection and identification method based on a real-time video stream according to claim 8, characterized in that: during prediction, prediction is carried out at three scales: the feature maps of the last three scales are taken, and after SPP and PAN, the features of the three scales are input into a detection module for regression and classification to obtain the output result; the detection module consists of 3 residual modules plus a convolution c with a fixed number of channels equal to (number of classes + 5) × 3.
10. The target detection and identification method based on a real-time video stream according to claim 8, characterized in that: when the probability that a target is detected is greater than a set threshold, where the threshold is set between 0.4 and 0.5, that is, when the probability that the current pixel belongs to a target to be detected exceeds the set value, the result is retained; the target frames are screened and de-duplicated by non-maximum suppression, the target frame with the maximum intersection-over-union IoU at each position is determined, and finally the target frame and its confidence are output and displayed as the final result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011128268.5A CN112257569B (en) | 2020-10-21 | 2020-10-21 | Target detection and identification method based on real-time video stream |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011128268.5A CN112257569B (en) | 2020-10-21 | 2020-10-21 | Target detection and identification method based on real-time video stream |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112257569A true CN112257569A (en) | 2021-01-22 |
CN112257569B CN112257569B (en) | 2021-11-19 |
Family
ID=74245259
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011128268.5A Expired - Fee Related CN112257569B (en) | 2020-10-21 | 2020-10-21 | Target detection and identification method based on real-time video stream |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112257569B (en) |
- 2020-10-21: CN application CN202011128268.5A filed; patent granted as CN112257569B (status: not active, Expired - Fee Related)
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105336074A (en) * | 2015-10-28 | 2016-02-17 | 小米科技有限责任公司 | Alarm method and device |
CN106060466A (en) * | 2016-06-20 | 2016-10-26 | 西安工程大学 | Video image sequence-based insulator tracking monitor method |
CN107590456A (en) * | 2017-09-06 | 2018-01-16 | 张栖瀚 | Small micro- mesh object detection method in a kind of high-altitude video monitoring |
CN108280844A (en) * | 2018-02-05 | 2018-07-13 | 厦门大学 | A kind of video object localization method based on the tracking of region candidate frame |
CN109035260A (en) * | 2018-07-27 | 2018-12-18 | 京东方科技集团股份有限公司 | A kind of sky areas dividing method, device and convolutional neural networks |
CN109271856A (en) * | 2018-08-03 | 2019-01-25 | 西安电子科技大学 | Remote sensing image object detection method based on expansion residual error convolution |
CN110062205A (en) * | 2019-03-15 | 2019-07-26 | 四川汇源光通信有限公司 | Motion estimate, tracking device and method |
CN110517288A (en) * | 2019-07-23 | 2019-11-29 | 南京莱斯电子设备有限公司 | Real-time target detecting and tracking method based on panorama multichannel 4k video image |
CN111191586A (en) * | 2019-12-30 | 2020-05-22 | 安徽小眯当家信息技术有限公司 | Method and system for inspecting wearing condition of safety helmet of personnel in construction site |
Non-Patent Citations (2)
Title |
---|
何凯华 (He Kaihua): "Traffic sign recognition based on an object detection network", 《软件工程》 (Software Engineering) *
管军霖 等 (Guan Junlin et al.): "Mask-wearing detection method based on the YOLOv4 convolutional neural network", 《现代信息科技》 (Modern Information Technology) *
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112906552A (en) * | 2021-02-07 | 2021-06-04 | 上海卓繁信息技术股份有限公司 | Inspection method and device based on computer vision and electronic equipment |
CN113177486A (en) * | 2021-04-30 | 2021-07-27 | 重庆师范大学 | Dragonfly order insect identification method based on regional suggestion network |
CN113269071A (en) * | 2021-05-18 | 2021-08-17 | 河北农业大学 | Automatic real-time sheep behavior identification method |
CN113177529A (en) * | 2021-05-27 | 2021-07-27 | 腾讯音乐娱乐科技(深圳)有限公司 | Method, device and equipment for identifying screen splash and storage medium |
CN113177529B (en) * | 2021-05-27 | 2024-04-23 | 腾讯音乐娱乐科技(深圳)有限公司 | Method, device, equipment and storage medium for identifying screen |
CN113569702A (en) * | 2021-07-23 | 2021-10-29 | 闽江学院 | Deep learning-based truck single-tire and double-tire identification method |
CN113569702B (en) * | 2021-07-23 | 2023-10-27 | 闽江学院 | Truck single-double tire identification method based on deep learning |
CN113983737A (en) * | 2021-10-18 | 2022-01-28 | 海信(山东)冰箱有限公司 | Refrigerator and food material positioning method thereof |
CN114241386A (en) * | 2021-12-21 | 2022-03-25 | 江苏翰林正川工程技术有限公司 | Method for detecting and identifying hidden danger of power transmission line based on real-time video stream |
CN114581798A (en) * | 2022-02-18 | 2022-06-03 | 广州中科云图智能科技有限公司 | Target detection method and device, flight equipment and computer readable storage medium |
CN114943986A (en) * | 2022-05-31 | 2022-08-26 | 武汉理工大学 | Regional pedestrian detection and illumination method and system based on camera picture segmentation |
CN114943986B (en) * | 2022-05-31 | 2024-09-27 | 武汉理工大学 | Method and system for detecting and illuminating pedestrian in subarea based on camera picture segmentation |
Also Published As
Publication number | Publication date |
---|---|
CN112257569B (en) | 2021-11-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112257569B (en) | Target detection and identification method based on real-time video stream | |
CN108154118B (en) | A kind of target detection system and method based on adaptive combined filter and multistage detection | |
US20220417590A1 (en) | Electronic device, contents searching system and searching method thereof | |
CN109284670B (en) | Pedestrian detection method and device based on multi-scale attention mechanism | |
US20190114804A1 (en) | Object tracking for neural network systems | |
CN111914664A (en) | Vehicle multi-target detection and track tracking method based on re-identification | |
CN110929593A (en) | Real-time significance pedestrian detection method based on detail distinguishing and distinguishing | |
KR20140095333A (en) | Method and apparratus of tracing object on image | |
CN111353496B (en) | Real-time detection method for infrared dim targets | |
CN114220126A (en) | Target detection system and acquisition method | |
CN112101113B (en) | Lightweight unmanned aerial vehicle image small target detection method | |
CN112926552A (en) | Remote sensing image vehicle target recognition model and method based on deep neural network | |
CN111260686A (en) | Target tracking method and system for anti-shielding multi-feature fusion of self-adaptive cosine window | |
CN115115973A (en) | Weak and small target detection method based on multiple receptive fields and depth characteristics | |
CN114708615A (en) | Human body detection method based on image enhancement in low-illumination environment, electronic equipment and storage medium | |
CN111914625B (en) | Multi-target vehicle tracking device based on detector and tracker data association | |
Caefer et al. | Point target detection in consecutive frame staring IR imagery with evolving cloud clutter | |
CN113052136A (en) | Pedestrian detection method based on improved Faster RCNN | |
CN111160099B (en) | Intelligent segmentation method for video image target | |
CN116917954A (en) | Image detection method and device and electronic equipment | |
CN116152699B (en) | Real-time moving target detection method for hydropower plant video monitoring system | |
CN117237844A (en) | Firework detection method based on YOLOV8 and fusing global information | |
Xie et al. | Pedestrian detection and location algorithm based on deep learning | |
CN117011655A (en) | Adaptive region selection feature fusion based method, target tracking method and system | |
CN116912763A (en) | Multi-pedestrian re-recognition method integrating gait face modes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | 
 | SE01 | Entry into force of request for substantive examination | 
 | GR01 | Patent grant | 
 | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20211119