
CN112257527B - Mobile phone detection method based on multi-target fusion and space-time video sequence - Google Patents

Mobile phone detection method based on multi-target fusion and space-time video sequence

Info

Publication number
CN112257527B
CN112257527B CN202011079614.5A
Authority
CN
China
Prior art keywords
frame, mobile phone, anchor, detection, video image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011079614.5A
Other languages
Chinese (zh)
Other versions
CN112257527A (en)
Inventor
龚勋
王琛中
王立
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Jiaotong University
Original Assignee
Southwest Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Jiaotong University filed Critical Southwest Jiaotong University
Priority to CN202011079614.5A priority Critical patent/CN112257527B/en
Publication of CN112257527A publication Critical patent/CN112257527A/en
Application granted granted Critical
Publication of CN112257527B publication Critical patent/CN112257527B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/107 Static hand or arm
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 30/00 Reducing energy consumption in communication networks
    • Y02D 30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a mobile phone detection method based on multi-target fusion and spatio-temporal video sequences. The method comprises: training an improved YOLO model to obtain a detection model, and inputting video image frames into the detection model to obtain a first-frame prediction; decoding the first-frame prediction, removing boxes whose score falls below a preset value, applying non-maximum suppression (NMS) with a DIoU threshold, and suppressing the mobile phone box whenever the decoding result of a frame contains only a mobile phone box; taking the suppressed result as the target template and the video image frame as the candidate search region, feeding both into a fully-connected Siamese network, and selecting the result with the highest score-map similarity to mark the mobile phone box in the video image frame; and, once the set number of frames has been tracked, repeating the above steps until the video input ends. The invention builds on a lightweight one-stage detection network, makes fine-grained modifications to the network structure and to the training and detection procedures, and achieves higher detection accuracy without reducing detection speed.

Description

Mobile phone detection method based on multi-target fusion and space-time video sequence
Technical Field
The invention relates to the technical field of image processing, and in particular to a mobile phone detection method based on multi-target fusion and spatio-temporal video sequences.
Background
Detection accuracy and speed are the core problems of target detection. To obtain more accurate results, a heavyweight, high-accuracy detection algorithm is usually chosen, which severely limits the inference speed of the system on mobile devices.
Chinese patent application No. 202010048048.5 discloses an intelligent monitoring method, device and readable medium for recognizing mobile phone photography in restricted areas. An intelligent monitoring system is trained by machine learning on a large corpus of mobile phone appearances; camera probes are installed where photography must be prevented and communicate with the monitoring system in real time; the cameras stream captured images to the system, which determines whether a mobile phone is present and, if so, whether it is being used to take pictures; when phone photography is detected, the system outputs an alarm in real time so that staff can intervene promptly. That approach performs initial detection with an algorithm using Darknet53 as its backbone and then combines it with skeleton generation, action recognition and similar methods; other approaches use comparable algorithms for coarse localization followed by a global-to-local search for detection. Because of this design, such detection systems are essentially not real-time on mobile devices.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a mobile phone detection method based on multi-target fusion and spatio-temporal video sequences that addresses the defects of existing detection methods.
The purpose of the invention is achieved by the following technical scheme. The mobile phone detection method based on multi-target fusion and spatio-temporal video sequences comprises the following steps:
training the improved YOLO model to obtain a detection model, and inputting video image frames into the detection model to obtain a first-frame prediction;
decoding the first-frame prediction, removing boxes whose score falls below a preset value, applying non-maximum suppression (NMS) with a DIoU threshold, and suppressing the mobile phone box when only a mobile phone box appears in the decoding result of a frame;
taking the suppressed result as the target template and the input video image frame as the candidate search region, feeding both into a fully-connected Siamese network, and selecting the result with the highest score-map similarity to mark the mobile phone box in the video image frame;
if the set number of frames has been tracked, repeating the above steps until the video image input ends.
Further, the mobile phone detection method also comprises: if the set number of frames has not been tracked, repeating the steps of taking the suppressed result as the target template, inputting the video image frame as the candidate search region, feeding both into the fully-connected Siamese network, and selecting the result with the highest score-map similarity to mark the mobile phone box in the video image frame.
Further, the mobile phone detection method also comprises acquiring a training set and a test set before training the improved YOLO model to obtain the detection model and inputting video image frames to obtain the first-frame prediction.
Further, the step of acquiring the training set and the test set comprises: splitting the recorded video into frames, annotating the resulting video pictures, extracting a subset of pictures at frame intervals to build a data set, and dividing the data set into a training set and a test set in a certain proportion.
Further, decoding the first-frame prediction, removing boxes whose score falls below a preset value, applying NMS with a DIoU threshold, and suppressing the mobile phone box when only a mobile phone box appears in the decoding result of a frame comprises:
decoding the first-frame prediction according to the formulas bx = sigmoid(t_x) + cx, by = sigmoid(t_y) + cy, bw = p_w·e^(t_w), bh = p_h·e^(t_h), conf = sigmoid(raw_conf) and prob = sigmoid(raw_prob);
removing boxes whose confidence or class probability does not meet the requirement using a score threshold of 0.4, and applying NMS with a DIoU threshold of 0.1;
and rejecting the phone-related prediction boxes in an image whose decoding result contains a mobile phone box but no human body box, hand box or camera box, thereby suppressing the mobile phone box.
Further, the improvements to the YOLO model include the following:
adding an s branch for detecting small objects to YOLOv3-tiny, improving the detection of small targets such as cameras;
on top of the preceding structure, adding SPP (Spatial Pyramid Pooling), SAM (Spatial Attention Module) and CAM (Channel Attention Module) modules with residual connections to improve the feature-extraction capability.
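As a rough illustration of the second modification, the PyTorch sketch below wires SPP, SAM and CAM modules into a residually connected block. The patent only names the modules; the kernel sizes, reduction ratio and block layout here are assumptions, not the patent's architecture.

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    """Spatial Pyramid Pooling: concatenate same-size max-pools of several kernels."""
    def __init__(self, kernels=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in kernels)

    def forward(self, x):
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)

class SAM(nn.Module):
    """Spatial Attention Module: reweight each location with a learned mask."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, 1, kernel_size=7, padding=3)

    def forward(self, x):
        return x * torch.sigmoid(self.conv(x))

class CAM(nn.Module):
    """Channel Attention Module: reweight each channel by its pooled response."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))

    def forward(self, x):
        w = torch.sigmoid(self.fc(x.mean(dim=(2, 3))))   # global average pool
        return x * w[:, :, None, None]

class AttentionResidualBlock(nn.Module):
    """SPP -> 1x1 fuse -> CAM -> SAM, added back to the input (residual)."""
    def __init__(self, channels):
        super().__init__()
        self.spp = SPP()
        self.fuse = nn.Conv2d(channels * 4, channels, kernel_size=1)
        self.cam = CAM(channels)
        self.sam = SAM(channels)

    def forward(self, x):
        return x + self.sam(self.cam(self.fuse(self.spp(x))))
```

The residual addition lets the attention-weighted features refine, rather than replace, the backbone features.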
The invention has the following advantages: the mobile phone detection method based on multi-target fusion and spatio-temporal video sequences builds on a lightweight one-stage detection network and makes fine-grained modifications to the network structure and to the training and detection procedures, obtaining high detection accuracy without reducing detection speed; it tracks the detected targets with a tracking algorithm, which handles difficult samples with heavy occlusion and steep viewing angles; and it reduces the system's resource consumption, substantially improving the overall inference speed on mobile devices.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application. The invention is further described below with reference to the accompanying drawings.
As shown in FIG. 1, the invention relates to a mobile phone detection method based on multi-target fusion and spatio-temporal video sequences, which specifically comprises the following steps:
and S1, performing framing processing on the video recorded by the camera in the actual application scene, and randomly extracting partial pictures at intervals to construct a data set. And labeling the mobile phone, the human body, the hand and the camera in each image by using LabelImg labeling software, and dividing the data set into a training set and a test set according to a certain proportion.
S2, train the detection model with the improved YOLOv3 network. The training inputs are the training-set pictures and their corresponding labels; the network outputs the predicted offsets t_x, t_y, t_w, t_h, the raw confidence and the raw class probabilities.
Further, during training, a focal loss is used for the confidence loss. Considering that the positive/negative sample imbalance of the YOLOv3 network model is much lower than that of RetinaNet, α is set to 0.4, and the confidence loss is computed as:
L_focal = -α_t · (1 - p_t)^γ · log(p_t)
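A sketch of this confidence loss, with α = 0.4 as stated in the text; the focusing exponent γ is not given, so the common default of 2 is assumed:

```python
import torch

def confidence_focal_loss(pred_logit, target, alpha=0.4, gamma=2.0, eps=1e-7):
    """Focal loss for the objectness confidence:
    L = -alpha_t * (1 - p_t)^gamma * log(p_t).
    alpha = 0.4 follows the text; gamma = 2.0 is an assumed default.
    `target` is a float tensor of 0/1 objectness labels."""
    p = torch.sigmoid(pred_logit)                     # predicted confidence
    p_t = target * p + (1 - target) * (1 - p)         # prob. of the true class
    alpha_t = target * alpha + (1 - target) * (1 - alpha)
    return -(alpha_t * (1 - p_t) ** gamma * torch.log(p_t + eps)).mean()
```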
and S3, operating the detection model to obtain a predicted value of the first frame.
S4, decode the prediction according to the following formulas, remove boxes with low confidence or class probability using a score threshold of 0.40, and apply NMS with a DIoU threshold of 0.1:
bx = sigmoid(t_x) + cx
by = sigmoid(t_y) + cy
bw = p_w · e^(t_w)
bh = p_h · e^(t_h)
conf = sigmoid(raw_conf)
prob = sigmoid(raw_prob)
where bx, by, bh and bw respectively denote the center x and y coordinates and the height and width of the prediction box; p_h and p_w respectively denote the height and width of the prior box; t_x and t_y denote the predicted offset of the object center from the top-left corner of the grid cell; t_w and t_h denote the predicted offsets of the object relative to the prior box; cx and cy denote the coordinates of the top-left corner of the grid cell; and score = conf (confidence) × prob (class probability).
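The decoding step transcribes directly into code; the sketch below follows the formulas and variable names above (NumPy, one prediction at a time for brevity):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode(t_x, t_y, t_w, t_h, raw_conf, raw_prob, cx, cy, p_w, p_h):
    """Map raw network outputs to a box, confidence and class scores,
    following the decoding formulas above."""
    bx = sigmoid(t_x) + cx        # box center x, offset from the grid corner
    by = sigmoid(t_y) + cy        # box center y
    bw = p_w * np.exp(t_w)        # box width, scaled from the prior box
    bh = p_h * np.exp(t_h)        # box height
    conf = sigmoid(raw_conf)      # objectness confidence
    prob = sigmoid(raw_prob)      # class probabilities
    score = conf * prob           # thresholded at 0.4 before DIoU-NMS
    return bx, by, bw, bh, score
```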
S5, if the decoding result of a frame contains a mobile phone box but no human body box, hand box or camera box, suppress the mobile phone box.
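A sketch of this suppression rule; the detection record layout and class names are assumptions:

```python
def suppress_lone_phone_boxes(detections):
    """Drop phone boxes when no person, hand or camera box co-occurs in
    the same frame. Each detection is assumed to be a dict with a
    'class' key; the class names are assumptions."""
    context_classes = {"person", "hand", "camera"}
    if any(d["class"] in context_classes for d in detections):
        return detections
    return [d for d in detections if d["class"] != "phone"]
```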
S6, take the suppressed result as the target template and the video image frame as the candidate search region, and feed both into the fully-connected Siamese network; template matching yields a score map of similarity measurements.
S7, select the result with the highest similarity to mark the mobile phone box in the video image frame.
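Steps S6 and S7 amount to sliding the template embedding over the search-region embedding and taking the peak of the resulting score map. The sketch below shows the idea with single-channel 2-D features; the actual network computes multi-channel embeddings with shared convolutional branches:

```python
import numpy as np

def score_map(template_feat, search_feat):
    """Cross-correlate the template embedding against every offset of the
    search-region embedding (SiamFC-style). Single-channel 2-D features
    are assumed here for brevity."""
    th, tw = template_feat.shape
    sh, sw = search_feat.shape
    out = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(template_feat * search_feat[i:i + th, j:j + tw])
    return out

# S7 (hypothetical inputs): the peak of the score map gives the new position
# scores = score_map(template_feat, search_feat)
# peak = np.unravel_index(np.argmax(scores), scores.shape)
```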
S8, judge whether the set number of frames has been tracked; if not, repeat steps S6 to S8; if so, go to step S9.
S9, repeat steps S3 to S9 until the video image input ends.
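Putting steps S3 to S9 together, the overall loop alternates one detection pass with a fixed number of tracking passes. In the sketch below, detect and track are hypothetical callables standing in for the improved YOLOv3 detector (S3 to S5) and the Siamese tracker (S6 and S7), and track_len stands for the "set number of frames" (value assumed):

```python
def run_pipeline(frames, detect, track, track_len=10):
    """Alternate one detection pass with track_len tracking passes.
    detect(frame) stands for S3-S5 (run model, decode, threshold, NMS,
    lone-phone suppression); track(template, frame) stands for S6-S7.
    Both are hypothetical placeholders; track_len is assumed."""
    results, i = [], 0
    while i < len(frames):
        template = detect(frames[i])          # S3-S5
        results.append(template)
        i += 1
        for _ in range(track_len):            # S6-S8
            if i >= len(frames):
                break
            template = track(template, frames[i])
            results.append(template)
            i += 1
    return results                            # S9: repeat until the video ends
```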
In terms of multi-target association, the contributions of the invention are as follows:
it was found that the position loss (position loss used in the present invention) based on the giou (generalized Intersection over union) may have an imbalance opposite to the position loss based on the difference, for this reason, the average label box size and the average position loss of the s, m, l branches are counted, and a negative exponential function (a · e) is used in combination with the number ratio of each branch box -b/x ) Unbalanced fitting correction is carried out on the basis function, and the problem of unbalanced position loss of the large frame and the small frame based on the GIoU is solved.
Under the premise that, when the data volume is large enough, the average position loss of each branch's boxes is nearly equal, the average label-box size and the average position loss of the s, m and l branches are estimated during the first warm-up epoch (the warm-up period, i.e. the low-learning-rate phase at the start of training). Combined with the proportion of boxes in each branch, a negative exponential function (a·e^(-b/x)) is used as the basis function for an imbalance-fitting correction that adjusts the position-loss weight of each branch in subsequent iterations, alleviating the GIoU-based position-loss imbalance between large and small boxes.
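A sketch of how such per-branch weights could be formed; a and b would be fitted to the warm-up statistics, and the placeholder values here are assumptions:

```python
import numpy as np

def branch_position_loss_weights(avg_box_size, box_share, a=1.0, b=1.0):
    """Form per-branch weights for the GIoU position loss using the
    negative exponential a * e^(-b/x) as the basis function, modulated
    by each branch's share of boxes. a and b would be fitted to the
    measured warm-up statistics; the defaults are placeholders."""
    x = np.asarray(avg_box_size, dtype=float)   # average label-box size per branch
    share = np.asarray(box_share, dtype=float)  # proportion of boxes per branch
    w = a * np.exp(-b / x) * share
    return w / w.sum()                          # normalized s/m/l weights

# e.g. weights for the s, m, l branches (illustrative numbers only):
# branch_position_loss_weights([12.0, 40.0, 110.0], [0.5, 0.3, 0.2])
```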
A label-rewriting problem in YOLO was also identified: an anchor box already assigned to one object may later be overwritten by another object, so the overwritten assignment is never trained. The specific improvement proceeds as follows (a code sketch follows the list):
if a certain anchor box has already been assigned a label by an original object, judge whether that original object has only this one box;
if the original object has only this box, judge whether the current object can be assigned another anchor; if so, cancel the current object's assignment to this anchor box; otherwise, search for the current object's next-highest-IoU anchor box and assign it;
if the original object has more than one box: when this anchor is the current object's highest-IoU anchor but not the original object's highest-IoU anchor, overwrite the original assignment; when this anchor is not the current object's highest-IoU anchor but is the original object's highest-IoU anchor, judge whether the current object can be assigned another anchor, and if so cancel the current object's claim to this anchor box, otherwise overwrite the original assignment; when this anchor is the highest-IoU anchor of neither object, overwrite the assignment with the lower IoU.
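A simplified sketch of the overwrite check above, collapsing the case analysis into one function. assign, is_best, can_reassign and iou are hypothetical helpers standing in for bookkeeping inside the label assigner, not names from the patent:

```python
def resolve_anchor_conflict(assign, anchor, new_obj, is_best, can_reassign, iou):
    """assign maps anchor id -> object; is_best(obj, a) says whether anchor a
    is obj's highest-IoU anchor; can_reassign(obj) says whether obj can be
    given another anchor; iou(obj, a) returns the overlap."""
    old_obj = assign.get(anchor)
    if old_obj is None:
        assign[anchor] = new_obj                # no conflict
        return
    if sum(o is old_obj for o in assign.values()) == 1:
        # anchor is the original object's only box: never overwrite it;
        # the new object looks for its next-highest-IoU anchor instead
        return
    new_best, old_best = is_best(new_obj, anchor), is_best(old_obj, anchor)
    if new_best and not old_best:
        assign[anchor] = new_obj                # overwrite original assignment
    elif not new_best and old_best:
        if not can_reassign(new_obj):
            assign[anchor] = new_obj            # nowhere else to go: overwrite
        # else: the new object cancels its claim and takes another anchor
    elif not new_best and not old_best:
        if iou(new_obj, anchor) > iou(old_obj, anchor):
            assign[anchor] = new_obj            # lower-IoU assignment is covered
```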
Considering that the mobile phone, as the primary target, needs to be distinguished from the other auxiliary detection targets, all losses of the mobile phone are multiplied by a priority coefficient of 1.10.
The threshold obtained by ATSS (Adaptive Training Sample Selection) is bounded from below: when it falls below a preset value, the training samples selected with it are considered low-quality, so the threshold-based selection is abandoned and only the candidate training sample with the highest IoU is kept. In the present invention, the preset value is 0.10.
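A sketch of this fallback, with the preset floor of 0.10 from the text; the ATSS threshold computation itself (statistics over the candidate IoUs) is outside the sketch:

```python
import numpy as np

def select_positives(candidate_ious, atss_threshold, floor=0.10):
    """Keep candidates at or above the ATSS threshold, unless the
    threshold falls below the preset floor (0.10 in the text), in which
    case only the single highest-IoU candidate is kept."""
    ious = np.asarray(candidate_ious)
    if atss_threshold < floor:
        return [int(np.argmax(ious))]                   # best candidate only
    return list(np.flatnonzero(ious >= atss_threshold))
```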
The multiple target objects are associated at essentially no computational cost, reducing the resource consumption of the recognition-based detection scheme.
For spatio-temporal information fusion, the invention exploits contextual information in both the temporal and spatial dimensions, markedly alleviating occlusion and drift during tracking.
The foregoing describes the preferred embodiments of the invention. It should be understood that the invention is not limited to the precise forms disclosed herein, and that various combinations, modifications and environments falling within the scope of the inventive concept, whether described above or apparent to those skilled in the relevant art, may be resorted to. Modifications and variations made by those skilled in the art without departing from the spirit and scope of the invention shall fall within the protection scope of the appended claims.

Claims (6)

1. A mobile phone detection method based on multi-target fusion and spatio-temporal video sequences, characterized in that the method comprises the following steps:
A1, training the improved YOLO model to obtain a detection model, wherein the YOLO model is specifically improved as follows:
if a certain anchor box has already been assigned a label by an original object, judging whether that original object has only this one box;
if the original object has only this box, judging whether the current object can be assigned another anchor; if so, cancelling the current object's assignment to this anchor box; otherwise, searching for the current object's next-highest-IoU anchor box and assigning it;
if the original object has more than one box: when this anchor is the current object's highest-IoU anchor but not the original object's highest-IoU anchor, overwriting the original assignment; when this anchor is not the current object's highest-IoU anchor but is the original object's highest-IoU anchor, judging whether the current object can be assigned another anchor, and if so cancelling the current object's claim to this anchor box, otherwise overwriting the original assignment; when this anchor is the highest-IoU anchor of neither object, overwriting the assignment with the lower IoU;
a2, inputting a video image frame to operate a detection model to obtain a first frame prediction value;
a3, decoding the first frame prediction value, removing a frame with a score value lower than a preset value, realizing NMS (network management system) by using a Diou threshold value, and inhibiting a mobile phone frame according to the decoding result of a certain frame image when only the mobile phone frame appears;
a4, taking the suppressed result as a target template, inputting a video image frame as a candidate frame search area, simultaneously inputting the video image frame and the candidate frame search area to a full-connection twin network, and selecting the result with the largest score map similarity to mark the mobile phone in the video image frame;
a5, if the set frame number has been tracked, repeating steps A2-A4 until the video image input is finished.
2. The mobile phone detection method based on multi-target fusion and spatio-temporal video sequences according to claim 1, characterized in that: if the set number of frames has not been tracked, the steps of taking the suppressed result as the target template, inputting the video image frame as the candidate search region, feeding both into the fully-connected Siamese network, and selecting the result with the highest score-map similarity to mark the mobile phone box in the video image frame are repeated.
3. The mobile phone detection method based on multi-target fusion and spatio-temporal video sequences according to claim 1, characterized in that: the mobile phone detection method further comprises acquiring a training set and a test set before training the improved YOLOv3 model to obtain the detection model and inputting video image frames to obtain the first-frame prediction.
4. The mobile phone detection method based on multi-target fusion and spatio-temporal video sequences according to claim 3, characterized in that: the step of acquiring the training set and the test set comprises: splitting the recorded video into frames, annotating the resulting video pictures, extracting a subset of pictures at frame intervals to build a data set, and dividing the data set into a training set and a test set in a certain proportion.
5. The mobile phone detection method based on multi-target fusion and spatio-temporal video sequences according to claim 1, characterized in that decoding the first-frame prediction, removing boxes whose score falls below a preset value, applying NMS with a DIoU threshold, and suppressing the mobile phone box when only a mobile phone box appears in the decoding result of a frame comprises the following steps:
decoding the first-frame prediction according to the formulas bx = sigmoid(t_x) + cx, by = sigmoid(t_y) + cy, bw = p_w·e^(t_w), bh = p_h·e^(t_h), conf = sigmoid(raw_conf) and prob = sigmoid(raw_prob), wherein bx, by, bh and bw respectively denote the center x and y coordinates and the height and width of the prediction box, p_h and p_w respectively denote the height and width of the prior box, t_x and t_y denote the predicted offset of the object center from the top-left corner of the grid cell, t_w and t_h denote the predicted offsets of the object relative to the prior box, cx and cy denote the coordinates of the top-left corner of the grid cell, conf is the confidence, and prob is the class probability;
removing boxes whose confidence or class probability does not meet the requirement using a score threshold of 0.4, and applying NMS with a DIoU threshold of 0.1;
and rejecting the phone-related prediction boxes in an image whose decoding result contains a mobile phone box but no human body box, hand box or camera box, thereby suppressing the mobile phone box.
6. The mobile phone detection method based on multi-target fusion and spatio-temporal video sequences according to claim 1, characterized in that the improvements to the YOLO model include the following:
adding an s branch for detecting small objects to YOLOv3-tiny, improving the detection of small targets such as cameras;
on top of the model structure of the preceding step, adding the SPP, SAM and CAM modules with residual connections to improve the feature-extraction capability.
CN202011079614.5A 2020-10-10 2020-10-10 Mobile phone detection method based on multi-target fusion and space-time video sequence Active CN112257527B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011079614.5A CN112257527B (en) 2020-10-10 2020-10-10 Mobile phone detection method based on multi-target fusion and space-time video sequence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011079614.5A CN112257527B (en) 2020-10-10 2020-10-10 Mobile phone detection method based on multi-target fusion and space-time video sequence

Publications (2)

Publication Number Publication Date
CN112257527A CN112257527A (en) 2021-01-22
CN112257527B true CN112257527B (en) 2022-09-02

Family

ID=74242754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011079614.5A Active CN112257527B (en) 2020-10-10 2020-10-10 Mobile phone detection method based on multi-target fusion and space-time video sequence

Country Status (1)

Country Link
CN (1) CN112257527B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112967289A (en) * 2021-02-08 2021-06-15 上海西井信息科技有限公司 Security check package matching method, system, equipment and storage medium
CN112733821B (en) * 2021-03-31 2021-07-02 成都西交智汇大数据科技有限公司 Target detection method fusing lightweight attention model
CN113139092B (en) * 2021-04-28 2023-11-03 北京百度网讯科技有限公司 Video searching method and device, electronic equipment and medium

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108614894A (en) * 2018-05-10 2018-10-02 西南交通大学 A kind of face recognition database's constructive method based on maximum spanning tree
WO2020047854A1 (en) * 2018-09-07 2020-03-12 Intel Corporation Detecting objects in video frames using similarity detectors
CN109508710A (en) * 2018-10-23 2019-03-22 东华大学 Based on the unmanned vehicle night-environment cognitive method for improving YOLOv3 network
CN109934121A (en) * 2019-02-21 2019-06-25 江苏大学 A kind of orchard pedestrian detection method based on YOLOv3 algorithm
CN110472467A (en) * 2019-04-08 2019-11-19 江西理工大学 The detection method for transport hub critical object based on YOLO v3
CN110443210A (en) * 2019-08-08 2019-11-12 北京百度网讯科技有限公司 A kind of pedestrian tracting method, device and terminal
CN110619309A (en) * 2019-09-19 2019-12-27 天津天地基业科技有限公司 Embedded platform face detection method based on octave convolution sum YOLOv3
CN111161311A (en) * 2019-12-09 2020-05-15 中车工业研究院有限公司 Visual multi-target tracking method and device based on deep learning
AU2020100705A4 (en) * 2020-05-05 2020-06-18 Chang, Jiaying Miss A helmet detection method with lightweight backbone based on yolov3 network
CN111753767A (en) * 2020-06-29 2020-10-09 广东小天才科技有限公司 Method and device for automatically correcting operation, electronic equipment and storage medium

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
A Temporal Sequence Dual-Branch Network for Classifying Hybrid Ultrasound Data of Breast Cancer;Ziqi Yang 等;《IEEE Access》;20200427;第8卷;82688-82699 *
Pedestrian Alignment Network for Large-scale Person Re-Identification;Zhedong Zheng 等;《IEEE Transactions on Circuits and Systems for Video Technology》;20181004;第29卷(第10期);3037-3045 *
Speed-Up of Object Detection Neural Network with GPU;Takuya Fukagai 等;《2018 25th IEEE International Conference on Image Processing (ICIP)》;20180906;301-305 *
YOLO v3-Tiny: Object Detection and Recognition using one stage improved model;Pranav Adarsh 等;《2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS)》;20200423;687-694 *
Improved Faster R-CNN algorithm based on double-threshold non-maximum suppression (in Chinese); Hou Zhiqiang et al.; Opto-Electronic Engineering; 20191231; Vol. 46, No. 12; 82-92 *
Pheasant recognition method based on an enhanced Tiny-YOLOV3 model (in Chinese); Yi Shi et al.; Transactions of the Chinese Society of Agricultural Engineering; 20200731; Vol. 36, No. 13; 141-147 *
Intelligent small-target detection method for rapid construction of digital-twin models of workshop personnel macro-behavior (in Chinese); Liu Tingyu et al.; Computer Integrated Manufacturing Systems; 20190615; Vol. 25, No. 6; 1463-1473 *

Also Published As

Publication number Publication date
CN112257527A (en) 2021-01-22

Similar Documents

Publication Publication Date Title
CN112257527B (en) Mobile phone detection method based on multi-target fusion and space-time video sequence
CN111695622B (en) Identification model training method, identification method and identification device for substation operation scene
CN109919977B (en) Video motion person tracking and identity recognition method based on time characteristics
CN109961019A (en) A kind of time-space behavior detection method
US7668338B2 (en) Person tracking method and apparatus using robot
CN113052876B (en) Video relay tracking method and system based on deep learning
CN110555420B (en) Fusion model network and method based on pedestrian regional feature extraction and re-identification
CN107256386A (en) Human behavior analysis method based on deep learning
CN110390308B (en) Video behavior identification method based on space-time confrontation generation network
CN111914761A (en) Thermal infrared face recognition method and system
CN113901911B (en) Image recognition method, image recognition device, model training method, model training device, electronic equipment and storage medium
CN115661615A (en) Training method and device of image recognition model and electronic equipment
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
CN113096159A (en) Target detection and track tracking method, model and electronic equipment thereof
CN109753901A (en) Indoor pedestrian's autonomous tracing in intelligent vehicle, device, computer equipment and storage medium based on pedestrian's identification
CN114022837A (en) Station left article detection method and device, electronic equipment and storage medium
CN116883883A (en) Marine ship target detection method based on generation of anti-shake of countermeasure network
WO2023070955A1 (en) Method and apparatus for detecting tiny target in port operation area on basis of computer vision
CN109740527B (en) Image processing method in video frame
CN111695404A (en) Pedestrian falling detection method and device, electronic equipment and storage medium
CN110443197A (en) Intelligent understanding method and system for visual scene
US12148248B2 (en) Ensemble deep learning method for identifying unsafe behaviors of operators in maritime working environment
US20230222841A1 (en) Ensemble Deep Learning Method for Identifying Unsafe Behaviors of Operators in Maritime Working Environment
CN116071656B (en) Intelligent alarm method and system for infrared image ponding detection of underground transformer substation
CN110853001B (en) Transformer substation foreign matter interference prevention image recognition method, system and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant