CN110705366A - Real-time human head detection method based on stair scene - Google Patents
- Publication number
- CN110705366A (application CN201910844880.3A)
- Authority
- CN
- China
- Prior art keywords
- human head
- stair
- fchd
- real
- scene
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Probability & Statistics with Applications (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a real-time human head detection method for stair scenes in the field of computer vision, comprising the following specific steps: S1: collect a large image data set of stair scenes; S2: annotate the data set, where each annotation box must cover the head and shoulders; S3: divide the data set into a training set, a test set and a validation set; S4: augment the training data; S5: extract the annotation-box data from the training set, run k-means clustering on the box dimensions, and select cluster categories to obtain a set of anchors; S6: build the FCHD + FPN network architecture; S7: train the FCHD + FPN model on the annotated stair head training set; S8: evaluate the accuracy of the trained model on the validation set; S9: use the resulting model to detect human heads in real stair scenes. The method selects anchors through clustering, adjusts the annotation region by incorporating shoulder information, and improves the FCHD method by fusing an FPN network, thereby raising detection accuracy.
Description
Technical Field
The invention relates to the technical field of computer vision, in particular to a real-time human head detection method based on a stair scene.
Background
Existing human head detection methods follow two ideas. The first is regression: a crowd density map is regressed from the image. This approach can only show how crowded the scene is; it cannot localize individual people, and it demands high image resolution. The second is object detection, counting people with detectors such as SSD, YOLO or the Faster R-CNN family of algorithms; these algorithms perform poorly when people occlude one another and struggle to satisfy the accuracy and speed requirements of detection at the same time.
FCHD is the most recent detection algorithm for head detection in such scenes, but FCHD selects only two anchor sizes, so it generalizes poorly in practical applications and has a high miss rate, because the apparent size of a human head depends strongly on the camera mounting position and the person's distance from the camera.
Based on this, a real-time human head detection method for stair scenes is designed: suitable anchors are selected through clustering, the annotation region is adjusted by incorporating shoulder information, and the FCHD method is improved by fusing an FPN network to raise detection accuracy, thereby solving the above problems.
Disclosure of Invention
The invention aims to provide a real-time human head detection method based on a stair scene so as to solve the problems in the background technology.
In order to achieve the above purpose, the invention provides the following technical solution: a real-time human head detection method based on a stair scene, comprising the following specific steps:
S1: acquire a large image data set of stair scenes in public places;
S2: annotate the data set, where each annotation box must cover the head and shoulders;
S3: divide the data set into a training set, a test set and a validation set;
S4: augment the training data;
S5: extract the annotation-box data from the training set, run k-means clustering on the box dimensions, and select cluster categories to obtain a set of anchors;
S6: build the FCHD + FPN network architecture;
S7: train the FCHD + FPN model on the annotated stair head training set;
S8: evaluate the accuracy of the trained model on the validation set;
S9: use the resulting model to detect human heads in real stair scenes.
Preferably, the public places of step S1 include shopping malls and subways.
Preferably, in step S4, the augmentation methods include horizontal flipping, random cropping, color jittering, scale transformation and rotation.
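A minimal torchvision sketch of such an augmentation pipeline; all parameter values here are illustrative assumptions, since the patent does not specify them:

```python
from torchvision import transforms

# Note: for detection tasks the annotation boxes must be transformed together
# with the images; plain torchvision transforms handle only the image, so this
# sketch illustrates the image side of the pipeline only.
train_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),                    # horizontal flip
    transforms.ColorJitter(brightness=0.2, contrast=0.2,
                           saturation=0.2),                    # color jitter
    transforms.RandomRotation(degrees=10),                     # small rotation
    transforms.RandomResizedCrop(size=600, scale=(0.8, 1.0)),  # crop + rescale
])
```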
Preferably, in step S6, the FCHD + FPN network architecture adds an FPN network on the basis of FCHD, adopts resnet101 in the FCHD base model, and replaces the NMS algorithm with the SOFT-NMS algorithm.
Compared with the prior art, the invention has the beneficial effects that:
1. On the basis of the common Faster R-CNN detection framework, the FCHD (fully convolutional head detector) and FPN (feature pyramid network) architectures are fused; combining the one-stage FCHD model with FPN greatly improves the speed of human head detection;
2. the resnet101 + FPN network markedly improves detection accuracy, and the candidate boxes are partially optimized, reducing the miss rate;
3. in data processing, partial body features (the shoulders) are included when annotating the data, improving the accuracy of model detection;
4. the anchor boxes obtained by clustering the training set are closer to the real scene, reducing the miss rate of human head detection.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a block diagram of the FCHD + FPN network model of the present invention;
FIG. 3 is a diagram of the final required feature map generated by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to figs. 1-2, the present invention provides a technical solution: a real-time human head detection method based on a stair scene, comprising the following specific steps:
S1: acquire a large image data set of stair scenes in public places, where the public places include shopping malls and subways;
S2: annotate the data set, where each annotation box must cover the head and shoulders;
S3: divide the data set into a training set, a test set and a validation set;
S4: augment the training data by horizontal flipping, random cropping, color jittering, scale transformation and rotation;
S5: extract the annotation-box data from the training set, run k-means clustering on the box dimensions, and select cluster categories to obtain a set of anchors;
S6: build the FCHD + FPN network architecture, adding an FPN network on the basis of FCHD, adopting resnet101 in the FCHD base model, and replacing the NMS algorithm with the SOFT-NMS algorithm (a sketch of Soft-NMS follows this list);
S7: train the FCHD + FPN model on the annotated stair head training set;
S8: evaluate the accuracy of the trained model on the validation set;
S9: use the resulting model to detect human heads in real stair scenes.
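Step S6 replaces hard NMS with Soft-NMS, which decays the confidence scores of overlapping candidate boxes instead of discarding them outright; in crowded stair scenes this retains heads that partially occlude one another. Below is a minimal NumPy sketch of the linear Soft-NMS variant; the function names and threshold values are illustrative assumptions based on the published Soft-NMS algorithm, not taken from the patent:

```python
import numpy as np

def box_iou(a, b):
    # intersection-over-union of two (x1, y1, x2, y2) boxes
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def soft_nms(boxes, scores, iou_thresh=0.3, score_thresh=0.001):
    # boxes: (N, 4) array of (x1, y1, x2, y2); scores: (N,) confidences
    scores = np.asarray(scores, dtype=float).copy()
    idxs = list(range(len(boxes)))
    keep = []
    while idxs:
        best = max(idxs, key=lambda i: scores[i])   # highest-scoring remaining box
        keep.append(best)
        idxs.remove(best)
        for i in idxs:
            iou = box_iou(boxes[best], boxes[i])
            if iou > iou_thresh:
                scores[i] *= 1.0 - iou              # decay the score instead of discarding
        idxs = [i for i in idxs if scores[i] >= score_thresh]
    return keep
```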
As an embodiment of the present invention
k-means clustering:
1. The clustering data is a detection data set containing only annotation boxes. After the data is annotated, a file containing the position and category of each annotation box is generated, where each row contains $(x_j, y_j, w_j, h_j)$, $j \in \{1, 2, \ldots, N\}$, i.e. the coordinates of the ground-truth boxes relative to the original image: $(x_j, y_j)$ is the center of the box, $(w_j, h_j)$ are its width and height, and $N$ is the total number of annotation boxes;
2. first, $k$ cluster centers $(W_i, H_i)$, $i \in \{1, 2, \ldots, k\}$, are given, where $W_i, H_i$ are the width and height of the anchor boxes; since anchor boxes have no fixed position, they have no $(x, y)$ coordinates, only width and height;
3. compute the distance between each annotation box and each cluster center as $d = 1 - \mathrm{IoU}(\text{box}, \text{center})$, where the center of each annotation box is made to coincide with the cluster center so that the IoU can be computed, i.e. $d = 1 - \mathrm{IoU}\big[(x_j, y_j, w_j, h_j),\,(x_j, y_j, W_i, H_i)\big]$, $j \in \{1, \ldots, N\}$, $i \in \{1, \ldots, k\}$; assign each annotation box to the nearest cluster center;
4. after all annotation boxes have been assigned, recompute the center of each cluster as $W_i' = \frac{1}{N_i} \sum_{j \in \text{cluster } i} w_j$, $H_i' = \frac{1}{N_i} \sum_{j \in \text{cluster } i} h_j$, where $N_i$ is the number of annotation boxes in the $i$-th cluster, i.e. the mean width and height of all annotation boxes in the cluster;
5. repeat steps 3 and 4 until the cluster centers change only negligibly.
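A minimal NumPy sketch of this clustering loop, assuming the annotation file has already been parsed into an (N, 2) array of box widths and heights; function and variable names are illustrative, not from the patent:

```python
import numpy as np

def iou_wh(boxes, centers):
    # boxes: (N, 2) of (w, h); centers: (k, 2) of (W, H); returns (N, k) IoU,
    # with every box and center aligned at the same center point
    inter_w = np.minimum(boxes[:, None, 0], centers[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], centers[None, :, 1])
    inter = inter_w * inter_h
    union = (boxes[:, 0] * boxes[:, 1])[:, None] \
          + (centers[:, 0] * centers[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(1.0 - iou_wh(boxes, centers), axis=1)  # d = 1 - IoU
        new_centers = np.array([
            boxes[assign == i].mean(axis=0) if np.any(assign == i) else centers[i]
            for i in range(k)])                       # mean (w, h) per cluster
        if np.allclose(new_centers, centers, atol=1e-4):
            break                                     # centers have stopped moving
        centers = new_centers
    return centers  # (k, 2) anchor (W, H) pairs

# usage: anchors = kmeans_anchors(box_wh, k=5)
```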
As another embodiment of the present invention
FCHD + FPN network model:
FPN module
The pre-trained model resnet101 is used as the base model for the entire framework. First, the higher-level feature map is upsampled by a factor of 2 (nearest-neighbor upsampling); a 1 × 1 convolution is applied so that the channel counts of the two maps match, and the upsampled map is merged with the corresponding feature map of the previous level by element-wise addition. This process is iterated until the finest feature map is generated. At each step of the iteration, the already-fused feature map is processed with a 3 × 3 convolution kernel (to eliminate the aliasing effect of upsampling) to generate the final required feature map, as shown in FIG. 3.
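A minimal PyTorch sketch of this top-down merging, following the standard FPN construction; the module names, channel counts and choice of three pyramid levels are illustrative assumptions rather than the patent's exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNTopDown(nn.Module):
    # in_channels follows the resnet101 stages C3-C5; out_channels is an assumption
    def __init__(self, in_channels=(512, 1024, 2048), out_channels=256):
        super().__init__()
        # 1x1 convs make the channel counts consistent before addition
        self.lateral = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels])
        # 3x3 convs smooth each merged map (reduce upsampling aliasing)
        self.smooth = nn.ModuleList(
            [nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
             for _ in in_channels])

    def forward(self, feats):  # feats = [C3, C4, C5], fine to coarse
        laterals = [conv(f) for conv, f in zip(self.lateral, feats)]
        merged = [laterals[-1]]
        for lat in reversed(laterals[:-1]):
            top = F.interpolate(merged[0], scale_factor=2, mode="nearest")
            merged.insert(0, lat + top)  # element-wise addition merge
        return [conv(m) for conv, m in zip(self.smooth, merged)]  # [P3, P4, P5]
```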
Data set preparation
Dataset | Images | Annotation boxes | Source
---|---|---|---
Brainwash (public) | 11,917 | 91,146 | store surveillance video
SCUT_HEAD (public) | 4,405 | 111,251 | classroom surveillance video and web-crawled data
Personal annotation set | 2,000 | — | subway video
Loss function
The loss function used to train the model is a multi-task loss:

$$L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)$$

where $i$ is the index of the selected anchor, $p_i$ is the predicted probability that anchor $i$ contains a head, $L_{cls}$ denotes the classification loss and $L_{reg}$ the regression loss. $L_{cls}$ is computed over all anchors, while $L_{reg}$ is computed only over the positive anchors ($p_i^* = 1$). $L_{cls}$ is the log loss over the two classes (head and background), and $L_{reg}$ is the smooth L1 loss. Both loss terms are normalized by $N_{cls}$ and $N_{reg}$, the numbers of samples used for classification and regression respectively.
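A minimal PyTorch sketch of this multi-task loss; the tensor layout, label convention and balancing weight `lam` are illustrative assumptions, not the patent's implementation:

```python
import torch
import torch.nn.functional as F

def multitask_loss(cls_logits, box_deltas, labels, box_targets, lam=1.0):
    # cls_logits:  (A, 2) head/background scores for A anchors
    # box_deltas:  (A, 4) predicted box-regression offsets
    # labels:      (A,)   int64: 1 = positive (head), 0 = negative, -1 = ignored
    # box_targets: (A, 4) regression targets for the positive anchors
    used = labels >= 0
    l_cls = F.cross_entropy(cls_logits[used], labels[used])  # log loss, mean over N_cls
    pos = labels == 1
    if pos.any():
        l_reg = F.smooth_l1_loss(box_deltas[pos], box_targets[pos])  # positives only
    else:
        l_reg = box_deltas.sum() * 0.0  # no positive anchors in this batch
    return l_cls + lam * l_reg
```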
Hyper-parametric design
The base model is initialized with the pre-trained resnet101, and all layers, including the new ones, are retrained. The new layers are initialized with random weights sampled from a standard normal distribution. Weight decay during training is 0.0005. The entire model is fine-tuned with SGD. The learning rate is set to 0.001 and the model is trained for 30 epochs, approaching 440k iterations. After 15 epochs, the learning rate is decayed by a factor of 0.1.
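A minimal PyTorch sketch of this training schedule (SGD, learning rate 0.001, weight decay 0.0005, decay by 0.1 after 15 epochs); `model`, `train_loader` and the momentum value are assumptions, since the patent does not give them:

```python
import torch

# model and train_loader are assumed to be the FCHD + FPN network and the
# training DataLoader defined elsewhere; the momentum value is an assumption.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=5e-4)
# decay the learning rate by a factor of 0.1 once, after 15 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.1)

for epoch in range(30):
    for images, targets in train_loader:
        loss = model(images, targets)  # assumed to return the multi-task loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```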
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.
Claims (4)
1. A real-time human head detection method based on a stair scene, characterized by comprising the following specific steps:
S1: acquire a large image data set of stair scenes in public places;
S2: annotate the data set, where each annotation box must cover the head and shoulders;
S3: divide the data set into a training set, a test set and a validation set;
S4: augment the training data;
S5: extract the annotation-box data from the training set, run k-means clustering on the box dimensions, and select cluster categories to obtain a set of anchors;
S6: build the FCHD + FPN network architecture;
S7: train the FCHD + FPN model on the annotated stair head training set;
S8: evaluate the accuracy of the trained model on the validation set;
S9: use the resulting model to detect human heads in real stair scenes.
2. The stair-scene-based real-time human head detection method according to claim 1, wherein the public places of step S1 include shopping malls and subways.
3. The stair-scene-based real-time human head detection method according to claim 1, wherein in step S4 the augmentation methods include horizontal flipping, random cropping, color jittering, scale transformation and rotation.
4. The stair-scene-based real-time human head detection method according to claim 1, wherein in step S6 the FCHD + FPN network architecture adds an FPN network on the basis of FCHD, adopts resnet101 in the FCHD base model, and replaces the NMS algorithm with the SOFT-NMS algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910844880.3A CN110705366A (en) | 2019-09-07 | 2019-09-07 | Real-time human head detection method based on stair scene |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910844880.3A CN110705366A (en) | 2019-09-07 | 2019-09-07 | Real-time human head detection method based on stair scene |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110705366A true CN110705366A (en) | 2020-01-17 |
Family
ID=69194806
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910844880.3A Pending CN110705366A (en) | 2019-09-07 | 2019-09-07 | Real-time human head detection method based on stair scene |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110705366A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110135243A (en) * | 2019-04-02 | 2019-08-16 | 上海交通大学 | A kind of pedestrian detection method and system based on two-stage attention mechanism |
CN110070074A (en) * | 2019-05-07 | 2019-07-30 | 安徽工业大学 | A method of building pedestrian detection model |
Non-Patent Citations (1)
Title |
---|
Aditya Vora: "FCHD: A fast and accurate head detector", arXiv |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111368749A (en) * | 2020-03-06 | 2020-07-03 | 创新奇智(广州)科技有限公司 | Automatic identification method and system for stair area |
CN111368749B (en) * | 2020-03-06 | 2023-06-13 | 创新奇智(广州)科技有限公司 | Automatic identification method and system for stair area |
CN111832465A (en) * | 2020-07-08 | 2020-10-27 | 星宏集群有限公司 | Real-time head classification detection method based on MobileNet V3 |
CN111832465B (en) * | 2020-07-08 | 2022-03-29 | 星宏集群有限公司 | Real-time head classification detection method based on MobileNet V3 |
CN111950612A (en) * | 2020-07-30 | 2020-11-17 | 中国科学院大学 | FPN-based weak and small target detection method for fusion factor |
CN113505771A (en) * | 2021-09-13 | 2021-10-15 | 华东交通大学 | Double-stage article detection method and device |
CN113505771B (en) * | 2021-09-13 | 2021-12-03 | 华东交通大学 | Double-stage article detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200117 |