CN112907616A - Pedestrian detection method based on thermal imaging background filtering - Google Patents
- Publication number
- CN112907616A CN202110460457.0A
- Authority
- CN
- China
- Prior art keywords
- thermal imaging
- image
- background
- pedestrian detection
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/40—Image enhancement or restoration using histogram techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/136—Segmentation; Edge detection involving thresholding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10048—Infrared image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
- Traffic Control Systems (AREA)
Abstract
The invention discloses a pedestrian detection method based on thermal imaging background filtering, which comprises the following steps: first, histogram equalization is applied to the original thermal imaging picture acquired by a thermal imaging infrared camera; a suitable threshold is then set for threshold segmentation to obtain a preliminary candidate region for pedestrian detection; meanwhile, the foreground and the background are separated according to the relation between preceding and following frames based on a Gaussian mixture model to obtain a background subtraction picture; and the composite picture obtained by combining the two results is sent to a subsequent improved Faster R-CNN framework to complete pedestrian detection. The invention addresses the temperature drift in the imaging result of the thermal imaging camera through normalization, filters the background by threshold segmentation and background subtraction, makes full use of the characteristics of thermal imaging pictures, and improves pedestrian detection precision in low-light and no-light environments.
Description
Technical Field
The invention relates to a pedestrian detection method based on thermal imaging background filtering, and belongs to the field of target detection of image processing.
Background
Vision is the most direct and dominant way for living organisms to obtain environmental information, and the amount of information obtained by vision is very rich, so the processing of visual information plays a crucial role in environmental information processing. Vision-based target detection is currently a research hotspot in the field of computer vision.
In recent years, with the development of fields such as artificial intelligence and deep learning, visual target detection has advanced considerably. Unlike traditional target detection methods based on hand-crafted feature extraction, deep-learning-based methods extract deep information from images through a deep neural network and are trained on massive data, which greatly improves the accuracy and speed of target detection.
In the field of target detection, pedestrian detection is an important component. Pedestrian detection uses computer technology to judge whether a pedestrian exists in a picture or a video and to mark the pedestrian's position in the picture. Pedestrian detection has important applications in fields such as autonomous driving, unmanned aerial vehicles and surveillance. The current mainstream pedestrian detection methods include global detection, local-based detection, motion-based detection and multi-camera stereo vision detection.
Target detection based on visible light images has received extensive attention and research because of its low equipment cost and wide application range. However, visible light images are very susceptible to environmental influences. Factors such as appearance change, occlusion and changes in illumination conditions can greatly affect target detection based on visible light. The emergence of infrared thermal imaging cameras offers a way to address these problems. Thermal images have a distinct advantage over visible light images: an object is represented by its temperature and radiated heat, which means that thermal images can be used both day and night. In addition, thermal images eliminate the effect of color and illumination changes on the appearance of the object. With the remarkable development of thermal sensors in recent years, much research has been conducted on pedestrian detection and tracking in thermal images.
Disclosure of Invention
The invention aims to overcome the shortcomings of visible light pedestrian detection methods under weak-light and no-light conditions, and provides a pedestrian detection method based on thermal imaging background filtering. The method uses a thermal imaging sensor to obtain a thermal imaging image of the environment, and improves pedestrian detection precision through a background filtering preprocessing method and a pedestrian detection model based on improved Faster R-CNN.
The invention adopts the following specific technical scheme:
A pedestrian detection method based on thermal imaging background filtering comprises the following steps:
S1: first, processing a thermal imaging image acquired by a thermal imaging camera with a histogram equalization method, so as to mitigate the deviation and drift of the thermal imaging image and obtain a histogram equalization enhanced image;
S2: based on a Gaussian mixture model, separating foreground and background in the histogram equalization enhanced image obtained in S1 according to the relation between preceding and following frames, to obtain a binary background subtraction image;
S3: performing double-threshold segmentation on the thermal imaging image acquired by the thermal imaging camera, using the upper and lower thresholds of pedestrian imaging in the thermal imaging image, to obtain a binary threshold segmentation image in which pedestrians and background are separated;
S4: superposing the binary background subtraction image obtained in S2 and the binary threshold segmentation image obtained in S3 to obtain a binary background filtering image that distinguishes foreground from background, and removing the background from the histogram equalization enhanced image obtained in S1 by means of the binary background filtering image, to obtain a background filtered image that contains only the foreground;
S5: inputting the background filtered image obtained in S4 into a pre-constructed and trained pedestrian detection network based on improved Faster R-CNN for human body candidate region extraction and pedestrian detection. In the pedestrian detection process, a convolutional neural network first performs feature extraction on the background filtered image to obtain a feature map; an improved RPN then extracts from the feature map target suggestion frames of three ratios, corresponding respectively to a human head, a half body and a human body; the target suggestion frames are projected onto the feature map to obtain corresponding feature matrices; each feature matrix passes in turn through an RoI pooling layer and a fully connected layer to obtain a class probability and bounding box regression parameters; and finally, taking the human head as a reference, the target suggestion frames of the three ratios are combined according to their intersection relations to obtain the final thermal imaging pedestrian detection result.
Preferably, the specific implementation method in S1 is:
converting the thermal imaging image into a thermal imaging gray image, then counting to obtain a cumulative normalized histogram, and then mapping the thermal imaging gray image pixel by pixel according to a mapping relation to form a histogram equalization enhanced image, wherein the mapping relation is as follows:
$$p'_i = \min\{x\} + s_i \cdot (\max\{x\} - \min\{x\})$$
in the formula: $p'_i$ represents the equalized gray value in the histogram equalization enhanced image obtained by mapping a pixel with gray value $i$ in the thermal imaging gray image; $s_i$ is the cumulative histogram probability of pixels with gray level $i$ in the thermal imaging gray image, obtained from the cumulative normalized histogram; $\min\{x\}$ represents the minimum gray value in the thermal imaging gray image, and $\max\{x\}$ represents the maximum gray value in the thermal imaging gray image.
Preferably, the specific implementation method of S2 is as follows:
S21: training a Gaussian mixture model with a plurality of the histogram equalization enhanced images; during training, a basic Gaussian mixture matrix is first initialized with the first frame of enhanced image, and the enhanced images are then input frame by frame; each newly added pixel is compared with the mean of the existing Gaussian mixture model, and if it lies within 3 times the variance of the mean the matrix coefficients are updated, otherwise a new Gaussian distribution is created;
S22: matching the histogram equalization enhanced image to be segmented pixel by pixel against the Gaussian mixture model obtained in S21; if a pixel value matches one of the Gaussian mixture matrices, the pixel is regarded as background, otherwise it is regarded as foreground.
Preferably, the specific implementation method of S3 is as follows:
S31: calibrating the thermal imaging camera used to acquire the thermal imaging image, and determining the upper and lower thresholds of pedestrian imaging for that camera;
S32: regarding pixels between the lower and upper thresholds in the thermal imaging image as the pedestrian area, and the remaining pixels as the background area.
Preferably, the specific implementation method of S4 is as follows:
S41: adding the binary background subtraction image obtained in S2 and the binary threshold segmentation image obtained in S3 to obtain a binary background filtering image with a foreground pixel value of 1 and a background pixel value of 0;
S42: multiplying the binary background filtering image and the histogram equalization enhanced image obtained in S1 pixel by pixel to obtain the final background filtered image.
Preferably, in S5, the pedestrian detection network based on improved Faster R-CNN includes a convolutional neural network, an improved RPN network, an RoI pooling layer and a fully connected layer, and the thermal imaging pedestrian detection result is obtained as follows:
S51: inputting the background filtered image into the convolutional neural network to obtain the corresponding feature map;
S52: feeding the feature map obtained in S51 into the improved RPN network to extract target suggestion boxes in which targets may exist; for each position in the image, 9 candidate boxes are initialized from the orthogonal combination of three sizes and three ratios; the minimum ratio corresponds to the target suggestion box of a human head, the intermediate ratio to that of a half body (an upper or lower half body), and the maximum ratio to that of a human body;
S53: projecting the target suggestion boxes obtained in S52 onto the feature map obtained in S51 to obtain corresponding feature matrices, scaling each feature matrix to 7 × 7 through the RoI pooling layer, then flattening it and sending it to the fully connected layer to obtain the final class probability and bounding box regression parameters;
S54: taking the human head target frames, which have the highest reliability, as the reference: for each human head target frame, if a half body target frame or a human body target frame intersects it, the intersecting frames are combined with it into one human body; if no other target frame intersects it, the head is regarded as belonging to a person whose body is occluded, and the human head target frame is taken as the final target frame; a human body target frame that does not intersect any human head target frame is judged to be a false detection and is discarded.
Further, the three sizes are 128 × 128, 256 × 256 and 384 × 384, respectively.
Further, the three ratios are 1:1, 1:2 and 1:3, respectively.
Further, whether target frames intersect is judged by the intersection over union (IoU) between the target frames.
Preferably, the pedestrian detection network based on improved Faster R-CNN is trained in advance using labeled thermal imaging datasets.
The invention solves the problem of temperature drift of the imaging result of the thermal imaging camera by a histogram equalization method, uses threshold segmentation and background subtraction to filter the background, can fully utilize the characteristics of a thermal imaging picture, and improves the pedestrian detection precision in low-light and no-light environments.
Drawings
FIG. 1 is an overall flow chart of a thermal imaging background filtering based pedestrian detection algorithm as disclosed in the present invention.
FIG. 2 is a diagram of the neural network architecture of the improved Faster R-CNN.
Fig. 3 is a thermal imaging diagram used as an example.
Fig. 4 is a histogram equalization enhanced image obtained after the histogram equalization method processing.
Fig. 5 is a binary background subtraction image obtained based on a gaussian mixture model.
Fig. 6 is a binarized dual-threshold-segmented image obtained by dual-threshold segmentation.
Fig. 7 is a background-filtered image obtained based on a binary background subtraction image and a binary threshold segmentation image.
Fig. 8 shows the head target suggestion box, half body target suggestion box and whole-body target suggestion box obtained by the improved RPN network.
Fig. 9 is the final target detection result.
Detailed Description
The invention will be further elucidated and described with reference to the drawings and the detailed description. The technical features of the embodiments of the present invention can be combined correspondingly without mutual conflict.
In a preferred embodiment of the present invention, the pedestrian detection method based on thermal imaging background filtering is implemented with the open-source deep learning tool PyTorch. As shown in Fig. 1, the method comprises two parts: background filtering of the thermal imaging image, and construction, training and detection of the deep learning model. The specific implementation process is as follows:
First, background filtering of the thermal imaging image
A thermal imaging camera is first used to acquire a segment of thermal imaging video, which consists of a series of successive thermal imaging frames. For the image at time t, as shown in Fig. 1, background filtering begins with histogram equalization to mitigate the deviation and drift of the thermal imaging image. The specific implementation process is as follows:
1) Perform histogram equalization on the thermal imaging image on which target detection is to be performed.
First, the original thermal imaging image is converted into a thermal imaging gray image, denoted {x}. The proportion of pixels at each gray level is counted, which gives the probability of occurrence of a pixel with gray level i in the image:
$$p_x(i) = \frac{n_i}{n}$$
in the formula: $n_i$ represents the number of pixels with gray level $i$, and $n$ represents the number of all pixels in the image;
the $p_x(i)$ obtained above is the histogram of the gray image.
Then, the cumulative normalized histogram of the thermal imaging image is obtained by accumulation:
$$s_k = \sum_{i=0}^{k} p_x(i)$$
in the formula: $s_k$ represents the cumulative histogram probability of pixels with gray level $k$;
Finally, the thermal imaging gray image is mapped pixel by pixel according to the following mapping relation to form the histogram equalization enhanced image:
$$p'_i = \min\{x\} + s_i \cdot (\max\{x\} - \min\{x\})$$
in the formula: $p'_i$ represents the equalized gray value obtained by mapping a pixel with gray value $i$ in the thermal imaging gray image, that is, the pixel value in the histogram equalization enhanced image; $s_i$ is the cumulative histogram probability of pixels with gray level $i$ in the thermal imaging gray image, obtained from the cumulative normalized histogram; $\min\{x\}$ represents the minimum gray value in the thermal imaging gray image, and $\max\{x\}$ represents the maximum gray value in the thermal imaging gray image.
Taking fig. 3 as an example, after the above operation is performed on the original image, each pixel may be mapped to a new pixel, and a histogram equalization enhanced image may be obtained, as shown in fig. 4.
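The mapping above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: it assumes the gray image is given as a list of rows of integer gray levels, and the function name `equalize_histogram`, the sample `frame` and the rounding to the nearest integer level are all illustrative choices.

```python
from collections import Counter


def equalize_histogram(gray):
    """Equalize a gray image given as a list of rows of integer gray levels."""
    pixels = [v for row in gray for v in row]
    n = len(pixels)                    # n: number of all pixels in the image
    counts = Counter(pixels)           # n_i: number of pixels at gray level i
    lo, hi = min(pixels), max(pixels)  # min{x} and max{x}

    mapping = {}
    running = 0
    for level in sorted(counts):
        running += counts[level]
        s_i = running / n              # cumulative normalized histogram value
        # p'_i = min{x} + s_i * (max{x} - min{x}), rounded (assumption)
        mapping[level] = round(lo + s_i * (hi - lo))

    return [[mapping[v] for v in row] for row in gray]


# Hypothetical 2 x 3 gray image used only to exercise the function.
frame = [
    [52, 52, 60],
    [60, 60, 200],
]
enhanced = equalize_histogram(frame)
print(enhanced)
```

In practice the same mapping would normally be applied to an array with a vectorized lookup table; plain lists are used here only to keep the sketch free of dependencies.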
2) Based on a Gaussian mixture model, the foreground and background are separated in the histogram equalization enhanced image obtained in 1) according to the relation between preceding and following frames, to obtain a background subtraction image. In this process, the first t frames of the enhanced images obtained by histogram equalization are used to train the Gaussian mixture model, and the specific value of t can be adjusted as needed. The training process of the Gaussian mixture model is as follows:
First, the first frame of enhanced image is used to initialize a basic Gaussian mixture matrix, and a Gaussian mixture model is established for each pixel on the image at time t:
$$P(X_t) = \sum_{i=1}^{k} w_{i,t}\,\eta\left(X_t, \mu_{i,t}, \sigma_{i,t}^{2}\right)$$
in the formula: $X_t$ is the pixel value of the pixel at time t, $k$ is the number of Gaussian distribution functions, $w_{i,t}$, $\mu_{i,t}$ and $\sigma_{i,t}^{2}$ respectively represent the weight coefficient, the mean and the variance of the $i$-th Gaussian model, and $\eta$ is the Gaussian density function.
Then, subsequent enhanced images are input frame by frame, and each newly added pixel is compared with the mean of the existing Gaussian mixture model; if the pixel lies within 3 times the variance of the mean, the matrix coefficients are updated, otherwise a new Gaussian distribution is created. The model update formulas are as follows:
$$w_{i,t} = (1+\alpha)\,w_{i,t-1}$$
$$\mu_{i,t} = \rho\,\mu_{i,t-1} + (1-\rho)\,X_t$$
in the formula: $\alpha$ is the model weight update coefficient and $\rho$ is the model mean update coefficient.
Finally, the Gaussian mixture model obtained in the previous step is used to perform background pixel matching on the subsequent histogram equalization enhanced image to be segmented. If a pixel value matches one of the Gaussian mixture matrices, the pixel is regarded as background and recorded as 0; otherwise it is regarded as foreground and recorded as 1. This yields the final binary background subtraction image, shown in Fig. 5.
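The training and matching procedure can be illustrated for a single pixel location with the sketch below. It is a simplified reading of the text under several assumptions that are not in the source: a match is taken to mean "within 3 standard deviations of the mean" (the usual reading of the 3-times rule), weights are renormalized after every update, the variance is kept fixed, and the least-weighted component is evicted when `max_k` is exceeded. The class name `PixelMixture` and the values of `alpha`, `rho`, `init_var` and `max_k` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    weight: float
    mean: float
    var: float


class PixelMixture:
    """Mixture of Gaussians for one pixel location (illustrative sketch)."""

    def __init__(self, first_value, alpha=0.05, rho=0.9, init_var=25.0, max_k=3):
        self.alpha, self.rho = alpha, rho
        self.init_var, self.max_k = init_var, max_k
        # The first frame initializes the basic mixture with one component.
        self.components = [Gaussian(1.0, float(first_value), init_var)]

    def _match(self, x):
        # Assumption: a component matches when x is within 3 standard deviations.
        for g in self.components:
            if abs(x - g.mean) <= 3.0 * g.var ** 0.5:
                return g
        return None

    def train(self, x):
        g = self._match(x)
        if g is not None:
            # Update rules as written in the text.
            g.weight = (1 + self.alpha) * g.weight
            g.mean = self.rho * g.mean + (1 - self.rho) * x
        else:
            # No match: create a new Gaussian, evicting the weakest if full.
            if len(self.components) >= self.max_k:
                weakest = min(self.components, key=lambda c: c.weight)
                self.components.remove(weakest)
            self.components.append(Gaussian(self.alpha, float(x), self.init_var))
        # Assumption: keep the weights summing to 1.
        total = sum(c.weight for c in self.components)
        for c in self.components:
            c.weight /= total

    def classify(self, x):
        """Return 0 (background) if x matches a component, else 1 (foreground)."""
        return 0 if self._match(x) is not None else 1


# Hypothetical gray values observed at one pixel over the training frames.
model = PixelMixture(first_value=100)
for value in [101, 99, 102, 100, 98]:
    model.train(value)

labels = [model.classify(v) for v in (100, 180)]
print(labels)
```

A full background subtractor would hold one such mixture per pixel and would typically also update the variance; library implementations of Gaussian mixture background subtraction handle this on whole frames.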
3) Double-threshold segmentation is performed on the thermal imaging image acquired by the thermal imaging camera, using the upper and lower thresholds of pedestrian imaging in the thermal imaging image, to obtain a threshold segmentation image in which pedestrians and background are separated.
Before segmentation, the thermal imaging camera needs to be calibrated to determine the upper and lower threshold boundaries of pedestrian imaging for that camera. Let the upper and lower thresholds be $T_u$ and $T_d$ respectively. Based on these two thresholds, the image can be divided into two parts: pixels greater than or equal to $T_d$ and less than or equal to $T_u$ are marked as 1, and pixels smaller than $T_d$ or greater than $T_u$ are marked as 0. The formula is as follows:
$$f(x,y) = \begin{cases} 1, & T_d \le p(x,y) \le T_u \\ 0, & p(x,y) < T_d \ \text{or}\ p(x,y) > T_u \end{cases}$$
in the formula: $p(x,y)$ represents the pixel value of point $(x,y)$ in the image, and $f(x,y)$ represents the resulting binary threshold segmentation image, shown in Fig. 6 for this embodiment.
4) The binary background subtraction image and the binary threshold segmentation image are added to obtain a binary background filtering image that distinguishes the foreground from the background. This binary image is then multiplied pixel by pixel with the histogram equalization enhanced image, which removes the background and yields a background filtered image that contains only the foreground, as shown in Fig. 7.
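The two operations in this step can be sketched as follows. Since the text states that the summed image is binary (foreground 1, background 0), this sketch assumes the addition saturates at 1, which is equivalent to a logical OR of the two masks; that reading, the function names and the sample data are assumptions rather than details given in the source.

```python
def combine_masks(subtraction_mask, threshold_mask):
    """Add two binary masks, saturating at 1 (assumed reading of 'adding')."""
    return [
        [min(a + b, 1) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(subtraction_mask, threshold_mask)
    ]


def filter_background(enhanced, mask):
    """Multiply the enhanced image by the binary mask pixel by pixel."""
    return [
        [p * m for p, m in zip(row_p, row_m)]
        for row_p, row_m in zip(enhanced, mask)
    ]


# Hypothetical masks and enhanced image of matching size.
subtraction_mask = [[0, 1, 0], [0, 0, 0]]
threshold_mask = [[0, 1, 1], [0, 0, 0]]
enhanced = [[101, 175, 180], [90, 60, 52]]

filter_mask = combine_masks(subtraction_mask, threshold_mask)
filtered = filter_background(enhanced, filter_mask)
print(filtered)
```

Pixels flagged by either cue keep their enhanced gray value, and all other pixels become 0.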
Second, construction, training and detection of the pedestrian detection network based on improved Faster R-CNN
In this part, the background filtered image obtained above is sent to the trained improved Faster R-CNN framework for inference, which performs human body candidate region extraction and pedestrian detection and finally produces a pedestrian detection result based on the thermal imaging image under weak-light or no-light conditions. The structure of the pedestrian detection network based on improved Faster R-CNN is shown in Fig. 2:
First, the background filtered image obtained in the previous step is input into a convolutional neural network to obtain the feature map of the image. In this embodiment, the convolutional neural network may be a ResNet-101 network.
Then, the feature map of the image is sent to the improved RPN (region proposal network) to extract candidate regions where targets may exist. Compared with an ordinary RPN, the improved RPN extracts from the feature map target suggestion boxes of three ratios, corresponding respectively to a human head, a half body (which can be an upper or lower half body) and a human body; the three ratios are determined according to the detection target. In this example, for each location in the image, 9 candidate boxes are initialized from the orthogonal combination of three sizes (128 × 128, 256 × 256, 384 × 384) and three ratios (1:1, 1:2, 1:3): 128 × 128 (1:1), 128 × 256 (1:2), 128 × 384 (1:3), 256 × 256 (1:1), 256 × 512 (1:2), 256 × 768 (1:3), 384 × 384 (1:1), 384 × 768 (1:2) and 384 × 1152 (1:3). The minimum ratio 1:1 corresponds to the target suggestion box for the human head, the intermediate ratio 1:2 to that for the half body, and the maximum ratio 1:3 to that for the human body; these boxes can subsequently be combined to form a complete human body, as shown in Fig. 8.
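The nine candidate boxes listed above can be enumerated with a short sketch. It assumes, following the listed dimensions, that each box keeps the base width and scales its height by the ratio; the names `BASE_WIDTHS`, `RATIOS`, `anchor_shapes` and `anchors_at`, and the choice to center boxes on a location, are illustrative.

```python
from itertools import product

BASE_WIDTHS = (128, 256, 384)
# Ratio 1:1 is associated with heads, 1:2 with half bodies, 1:3 with bodies.
RATIOS = {"head": 1, "half_body": 2, "body": 3}


def anchor_shapes():
    """Return (kind, width, height) for the 9 size/ratio combinations."""
    return [
        (kind, w, w * r)
        for w, (kind, r) in product(BASE_WIDTHS, RATIOS.items())
    ]


def anchors_at(cx, cy):
    """Return (kind, x_bl, y_bl, x_ur, y_ur) boxes centered on (cx, cy)."""
    return [
        (kind, cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
        for kind, w, h in anchor_shapes()
    ]


shapes = anchor_shapes()
anchors = anchors_at(200, 300)  # hypothetical image location
for kind, w, h in shapes:
    print(f"{kind}: {w} x {h}")
```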
Next, the target suggestion boxes are projected onto the feature map to obtain the corresponding feature matrices; each feature matrix is scaled to 7 × 7 by the RoI pooling layer, then flattened and sent to the fully connected layer to obtain the class probability and the bounding box regression parameters.
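To show how a region of arbitrary size becomes a fixed 7 × 7 matrix, the sketch below max-pools a rectangular region of a single-channel feature map into 7 × 7 bins and flattens the result. It is a simplified illustration: the bin boundaries follow a floor/ceiling split (an assumption), the region is given in feature-map indices with exclusive end coordinates, and real networks apply this per channel on tensors.

```python
def roi_max_pool(feature, roi, out_size=7):
    """Max-pool the region roi = (row0, col0, row1, col1) to out_size x out_size.

    row1 and col1 are exclusive; the region must be at least 1 x 1.
    """
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_size):
        rs = r0 + (i * h) // out_size
        re = r0 + -(-((i + 1) * h) // out_size)      # ceiling division
        row = []
        for j in range(out_size):
            cs = c0 + (j * w) // out_size
            ce = c0 + -(-((j + 1) * w) // out_size)  # ceiling division
            row.append(max(
                feature[r][c]
                for r in range(rs, re)
                for c in range(cs, ce)
            ))
        pooled.append(row)
    return pooled


# Hypothetical 20 x 20 feature map whose value encodes its own position.
feature = [[r * 20 + c for c in range(20)] for r in range(20)]
pooled = roi_max_pool(feature, roi=(2, 3, 16, 17))

# Flatten to a 49-element vector, as would be fed to a fully connected layer.
flat = [v for row in pooled for v in row]
print(len(flat))
```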
Finally, taking the human head as a reference, the target suggestion frames of the three ratios are combined according to their intersection relations to obtain the final thermal imaging pedestrian detection result. Taking the human head as a reference means that the human head target frames, which have the highest reliability, serve as the reference: for each human head target frame, if a half body target frame or a human body target frame intersects it, the intersecting frames are combined with it into one human body; if no other target frame intersects it, the head is regarded as belonging to a person whose body is occluded, and the human head target frame is taken as the final target frame. A human body target frame that does not intersect any human head target frame is judged to be a false detection and is discarded.
Specifically, whether target frames intersect is determined by the intersection over union between them. A target frame is represented by its lower-left and upper-right corner coordinates, so the human head target frame is denoted $D_{head}(x_{head\text{-}bl}, y_{head\text{-}bl}, x_{head\text{-}ur}, y_{head\text{-}ur})$, the half body target frame is denoted $D_{half}(x_{half\text{-}bl}, y_{half\text{-}bl}, x_{half\text{-}ur}, y_{half\text{-}ur})$, and the human body target frame is denoted $D_{body}(x_{body\text{-}bl}, y_{body\text{-}bl}, x_{body\text{-}ur}, y_{body\text{-}ur})$. Because the imaging characteristics of the head are the most distinct in a thermal image, the detected human head target frames are the most reliable. For each human head target frame, if a half body or human body target frame intersects it, that is, the Intersection over Union (IoU) between the target frames is greater than zero, the frames are combined into one human body; if no other detection frame intersects the human head target frame, the head is regarded as belonging to a person whose body is occluded, and the final target frame is the human head target frame. A human body target frame that intersects no human head target frame is judged to be a false detection and is discarded.
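The intersection over union is computed as:
$$IoU_{1\text{-}2} = \frac{|D_1 \cap D_2|}{|D_1 \cup D_2|}$$
in the formula: $IoU_{1\text{-}2}$ represents the intersection over union between target frames 1 and 2, and $|\cdot|$ denotes area; $D_1$ represents target frame 1, with coordinates $(x_{1\text{-}bl}, y_{1\text{-}bl}, x_{1\text{-}ur}, y_{1\text{-}ur})$; $D_2$ represents target frame 2, with coordinates $(x_{2\text{-}bl}, y_{2\text{-}bl}, x_{2\text{-}ur}, y_{2\text{-}ur})$.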
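For axis-aligned frames given by lower-left and upper-right corners, the intersection over union can be computed as in the sketch below. The function name `iou` and the sample boxes are illustrative only.

```python
def iou(box1, box2):
    """Intersection over union of two (x_bl, y_bl, x_ur, y_ur) boxes."""
    # Width and height of the overlap; negative values mean no overlap.
    iw = min(box1[2], box2[2]) - max(box1[0], box2[0])
    ih = min(box1[3], box2[3]) - max(box1[1], box2[1])
    inter = max(iw, 0) * max(ih, 0)

    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: a head inside the top of a body, and an unrelated box.
head = (40, 160, 60, 180)
body = (30, 100, 70, 180)
far = (200, 100, 240, 180)

print(iou(head, body))  # greater than zero, so the frames intersect
print(iou(head, far))   # zero, so the frames do not intersect
```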
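The combined overall pedestrian target frame $D_{people}$ has coordinates $(x_{p\text{-}bl}, y_{p\text{-}bl}, x_{p\text{-}ur}, y_{p\text{-}ur})$, where:
$$x_{p\text{-}bl} = \min(x_{1\text{-}bl}, x_{2\text{-}bl})$$
$$y_{p\text{-}bl} = \min(y_{1\text{-}bl}, y_{2\text{-}bl})$$
$$x_{p\text{-}ur} = \max(x_{1\text{-}ur}, x_{2\text{-}ur})$$
$$y_{p\text{-}ur} = \max(y_{1\text{-}ur}, y_{2\text{-}ur})$$
The final pedestrian detection result is shown in Fig. 9.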
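The head-referenced combination rule and the merged-frame coordinates can be sketched together as follows. This is one possible reading under stated assumptions: confidence scores are omitted, every half body or body frame that intersects a head is merged into that head's frame, and frames that intersect no head are simply not emitted (the text specifies this for body frames; treating half body frames the same way is an assumption). All function names and sample boxes are hypothetical.

```python
def intersects(a, b):
    """True when two (x_bl, y_bl, x_ur, y_ur) boxes overlap, i.e. IoU > 0."""
    return (min(a[2], b[2]) > max(a[0], b[0])
            and min(a[3], b[3]) > max(a[1], b[1]))


def merge(a, b):
    """Smallest box enclosing both a and b (min of bl, max of ur corners)."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def combine_detections(heads, others):
    """Build one pedestrian frame per head from intersecting frames."""
    results = []
    for head in heads:
        person = head
        for box in others:
            if intersects(head, box):
                person = merge(person, box)
        # A head with no intersecting frame is kept as an occluded person.
        results.append(person)
    return results


# Hypothetical detections: two heads, a body, a half body and a stray body.
heads = [(40, 160, 60, 180), (300, 160, 320, 180)]
others = [
    (30, 100, 70, 170),    # body under the first head
    (35, 130, 65, 175),    # half body under the first head
    (500, 100, 540, 180),  # body with no head: discarded
]
pedestrians = combine_detections(heads, others)
print(pedestrians)
```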
In addition, the pedestrian detection network based on improved Faster R-CNN needs to be trained in advance on a labeled thermal imaging dataset before being used for actual detection; the training method belongs to the prior art. In this embodiment, the training process can be implemented as follows:
1. initializing the parameters of the front-end convolutional layers with an ImageNet pre-trained classification model, and training the RPN network;
2. training the classification and bounding box regression network with the obtained target suggestion boxes;
3. fine-tuning the RPN network using the trained front-end convolutional layers;
4. fine-tuning the classification and bounding box regression network using the trained front-end convolutional layers;
5. the RPN network and the classification and bounding box regression network share the trained front-end convolutional layers to form a complete network model.
The above-described embodiments are merely preferred embodiments of the present invention, which should not be construed as limiting the invention. Various changes and modifications may be made by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present invention. Therefore, the technical scheme obtained by adopting the mode of equivalent replacement or equivalent transformation is within the protection scope of the invention.
Claims (10)
1. A pedestrian detection method based on thermal imaging background filtering, characterized by comprising the following steps:
S1: first, processing a thermal imaging image acquired by a thermal imaging camera with a histogram equalization method, so as to correct the offset and drift of the thermal imaging image and obtain a histogram equalization enhanced image;
S2: based on a Gaussian mixture model, separating foreground from background in the histogram equalization enhanced image obtained in S1 according to the relation between preceding and following frames, to obtain a binary background subtraction image;
S3: performing double-threshold segmentation on the thermal imaging image acquired by the thermal imaging camera, using the upper and lower thresholds of pedestrian imaging in the thermal imaging image, to obtain a binary threshold segmentation image in which pedestrians are separated from the background;
S4: superposing the binary background subtraction image obtained in S2 and the binary threshold segmentation image obtained in S3 to obtain a binary background filtering image that distinguishes foreground from background, and using the binary background filtering image to remove the background from the histogram equalization enhanced image obtained in S1, to obtain a background filtered image containing only the foreground;
S5: inputting the background filtered image obtained in S4 into a pre-constructed and trained pedestrian detection network based on improved Faster R-CNN for human body candidate region extraction and pedestrian detection; in the pedestrian detection process, a convolutional neural network first performs feature extraction on the background filtered image to obtain a feature map; an improved RPN then extracts from the feature map target suggestion frames of three proportions, corresponding respectively to a head, a half body and a human body; the target suggestion frames are then projected onto the feature map to obtain the corresponding feature matrices; each feature matrix is passed in turn through an RoI pooling layer and a fully connected layer to obtain a category probability and bounding box regression parameters; and finally, taking the head as reference, the target frames of the three proportions are merged according to their intersection relations to obtain the final thermal imaging pedestrian detection result.
2. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 1, wherein S1 is specifically implemented as follows:
converting the thermal imaging image into a thermal imaging grayscale image, computing its cumulative normalized histogram, and then mapping the thermal imaging grayscale image pixel by pixel to form the histogram equalization enhanced image, wherein the mapping relation is:
$$p'_i = \min\{x\} + s_i \cdot (\max\{x\} - \min\{x\})$$
in the formula: $p'_i$ represents the equalized gray value in the histogram equalization enhanced image to which a pixel with gray value $i$ in the thermal imaging grayscale image is mapped; $s_i$ is the accumulated histogram probability of pixels with gray value $i$ in the thermal imaging grayscale image, obtained from the cumulative normalized histogram; $\min\{x\}$ represents the minimum gray value in the thermal imaging grayscale image, and $\max\{x\}$ represents the maximum gray value in the thermal imaging grayscale image.
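As an illustration of this mapping, the following pure-Python sketch builds the cumulative normalized histogram and applies the mapping as a lookup table. The function name `equalize`, the 256 gray levels and the rounding to the nearest integer are assumptions made for the example; the claim itself does not specify them.

```python
from typing import List


def equalize(gray: List[List[int]], levels: int = 256) -> List[List[int]]:
    """Histogram-equalize a grayscale image given as rows of integer gray values."""
    pixels = [p for row in gray for p in row]
    n = len(pixels)

    # Histogram of gray values.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative normalized histogram: s[i] is the fraction of pixels <= i.
    s, running = [], 0
    for count in hist:
        running += count
        s.append(running / n)

    # Mapping p'_i = min{x} + s_i * (max{x} - min{x}), rounded (assumption).
    lo, hi = min(pixels), max(pixels)
    lut = [round(lo + s_i * (hi - lo)) for s_i in s]
    return [[lut[p] for p in row] for row in gray]


# Hypothetical 2x2 thermal grayscale image.
enhanced = equalize([[0, 0], [100, 200]])
print(enhanced)
```

Because the lookup table is computed once per image, the per-pixel work is a single table access.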
3. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 1, wherein S2 is specifically implemented as follows:
S21: training a Gaussian mixture model using a plurality of the histogram equalization enhanced images; during training, a basic Gaussian mixture matrix is first initialized with the first frame of the enhanced images, and the enhanced images are then input frame by frame; each new pixel is compared with the mean of the existing Gaussian mixture model, and if the new pixel lies within 3 times the variance of the mean, the matrix coefficients are updated; otherwise a new Gaussian distribution is created;
S22: matching the histogram equalization enhanced image to be segmented pixel by pixel against the Gaussian mixture model obtained in S21; if a pixel value matches one of the Gaussian mixture matrices, the pixel is regarded as background, otherwise it is regarded as foreground.
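The training and matching steps S21 and S22 can be sketched for a single pixel location as follows. This is a simplified illustration, not the claimed implementation: the class `PixelGMM` and all of its parameters (`init_var`, `min_var`, `lr`, `max_modes`, `k`) are hypothetical, and the match test is interpreted as "within `k = 3` standard deviations of the mean", which is an assumption about the claim's "3 times the variance" wording.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gaussian:
    mean: float
    var: float
    weight: float


class PixelGMM:
    """Online Gaussian mixture for ONE pixel; a full model keeps one per pixel."""

    def __init__(self, first_value: float, init_var: float = 36.0,
                 min_var: float = 4.0, lr: float = 0.05,
                 max_modes: int = 3, k: float = 3.0) -> None:
        self.init_var, self.min_var = init_var, min_var
        self.lr, self.max_modes, self.k = lr, max_modes, k
        # S21: initialize the mixture from the first frame.
        self.modes: List[Gaussian] = [Gaussian(float(first_value), init_var, 1.0)]

    def _match(self, value: float) -> Optional[Gaussian]:
        """Return the first Gaussian whose mean is within k standard deviations."""
        for g in self.modes:
            if abs(value - g.mean) <= self.k * g.var ** 0.5:
                return g
        return None

    def update(self, value: float) -> None:
        """S21: update the matched Gaussian, or create a new one if none matches."""
        g = self._match(value)
        if g is None:
            if len(self.modes) >= self.max_modes:
                # Replace the lowest-weight Gaussian when the mixture is full.
                self.modes.remove(min(self.modes, key=lambda m: m.weight))
            g = Gaussian(float(value), self.init_var, 0.0)
            self.modes.append(g)
        else:
            d = value - g.mean
            g.mean += self.lr * d
            g.var = max(self.min_var, g.var + self.lr * (d * d - g.var))
        # Move the matched weight towards 1 and the others towards 0, then normalize.
        for m in self.modes:
            m.weight += self.lr * ((1.0 if m is g else 0.0) - m.weight)
        total = sum(m.weight for m in self.modes)
        for m in self.modes:
            m.weight /= total

    def is_background(self, value: float) -> bool:
        """S22: a pixel that matches any Gaussian is background, else foreground."""
        return self._match(value) is not None


# Hypothetical gray values of one pixel over 30 training frames.
frames = [100, 101, 99, 100, 102, 98] * 5
gmm = PixelGMM(frames[0])
for v in frames[1:]:
    gmm.update(v)
print(gmm.is_background(101), gmm.is_background(180))
```

The variance floor `min_var` is a design choice of the sketch: without it, a perfectly static pixel would shrink its variance towards zero and start rejecting ordinary sensor noise.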
4. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 1, wherein S3 is specifically implemented as follows:
S31: calibrating the thermal imaging camera used to acquire the thermal imaging image, and determining the upper and lower thresholds of pedestrian imaging in the thermal imaging camera;
S32: regarding pixels in the thermal imaging image that lie between the lower threshold and the upper threshold as a pedestrian area, and regarding the remaining pixels as a background area.
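A minimal sketch of the double-threshold segmentation in S32 is given below. The function name `dual_threshold`, the inclusive bounds and the threshold values 120 and 220 are assumptions for illustration; in practice the thresholds come from the camera calibration in S31.

```python
from typing import List


def dual_threshold(thermal: List[List[int]], lower: int, upper: int) -> List[List[int]]:
    """Return a binary image: 1 where lower <= pixel <= upper (pedestrian), else 0."""
    return [[1 if lower <= p <= upper else 0 for p in row] for row in thermal]


# Hypothetical 2x3 thermal image and hypothetical calibrated thresholds.
thermal = [[30, 150, 240],
           [120, 220, 119]]
mask = dual_threshold(thermal, lower=120, upper=220)
print(mask)
```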
5. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 1, wherein S4 is specifically implemented as follows:
S41: adding the binary background subtraction image obtained in S2 and the binary threshold segmentation image obtained in S3 to obtain a binary background filtering image in which foreground pixels have the value 1 and background pixels have the value 0;
S42: multiplying the binary background filtering image and the histogram equalization enhanced image obtained in S1 pixel by pixel to obtain the final background filtered image.
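Steps S41 and S42 can be sketched as two element-wise operations. Note the assumption: adding two binary images can produce the value 2 where both mark foreground, so the sketch clips the sum to 1 (a logical OR) to keep the result binary. The claim does not spell out this clipping, and the function names `superpose` and `apply_mask` are hypothetical.

```python
from typing import List

Image = List[List[int]]


def superpose(mask_a: Image, mask_b: Image) -> Image:
    """S41: add two binary masks and clip to 1 (assumed to act as a logical OR)."""
    return [[min(1, a + b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


def apply_mask(mask: Image, image: Image) -> Image:
    """S42: pixel-wise product; background pixels (mask 0) become 0."""
    return [[m * p for m, p in zip(row_m, row_p)]
            for row_m, row_p in zip(mask, image)]


# Hypothetical inputs: background subtraction mask, threshold mask, enhanced image.
subtraction = [[0, 1], [0, 0]]
threshold = [[0, 1], [1, 0]]
enhanced = [[10, 200], [180, 30]]

filter_mask = superpose(subtraction, threshold)
filtered = apply_mask(filter_mask, enhanced)
print(filter_mask, filtered)
```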
6. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 1, wherein in S5 the pedestrian detection network based on improved Faster R-CNN comprises a convolutional neural network, an improved RPN, an RoI pooling layer and a fully connected layer, and the thermal imaging pedestrian detection result is obtained as follows:
S51: inputting the background filtered image into the convolutional neural network to obtain the corresponding feature map;
S52: feeding the feature map obtained in S51 into the improved RPN to extract target suggestion frames in which a target may exist; for each position in the image, 9 candidate frames are initialized from the orthogonal combination of three area sizes and three aspect ratios; the smallest ratio corresponds to the target suggestion frame of a human head, the middle ratio corresponds to the target suggestion frame of a half body, the half body being an upper half body or a lower half body, and the largest ratio corresponds to the target suggestion frame of a human body;
S53: projecting the target suggestion frames obtained in S52 onto the feature map obtained in S51 to obtain the corresponding feature matrices, scaling each feature matrix to 7 × 7 through the RoI pooling layer, and then flattening it and feeding it to the fully connected layer to obtain the final category probability and bounding box regression parameters;
S54: taking the human head target frames, which have the highest reliability, as reference: for each human head target frame, if a half-body target frame or a human body target frame intersects it, the frames are merged into one human body; if no other target frame intersects it, the human head target frame is regarded as the head of a person whose body is occluded and is taken as the final target frame; and a human body target frame that does not intersect any human head target frame is judged to be a false detection and is discarded.
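The scaling of a projected feature matrix to 7 × 7 in S53 can be illustrated with a plain-Python RoI max-pooling sketch. It assumes a single-channel feature map stored as a list of rows, a region given in array indices as a half-open box `(x0, y0, x1, y1)`, and max pooling within each bin; the function name `roi_max_pool` is hypothetical, and real detectors apply the same operation per channel.

```python
import math
from typing import List, Tuple


def roi_max_pool(feature: List[List[float]],
                 roi: Tuple[int, int, int, int],
                 out_size: int = 7) -> List[List[float]]:
    """Max-pool the region [y0, y1) x [x0, x1) of `feature` to out_size x out_size."""
    x0, y0, x1, y1 = roi
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_size):
        # Row range of bin i; floor/ceil guarantees every bin is non-empty.
        r0 = y0 + math.floor(i * h / out_size)
        r1 = y0 + math.ceil((i + 1) * h / out_size)
        row = []
        for j in range(out_size):
            c0 = x0 + math.floor(j * w / out_size)
            c1 = x0 + math.ceil((j + 1) * w / out_size)
            row.append(max(feature[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Hypothetical 14x14 feature map whose value encodes its position.
feature = [[r * 14 + c for c in range(14)] for r in range(14)]
pooled = roi_max_pool(feature, (0, 0, 14, 14))

# Flatten to a 49-element vector before the fully connected layer.
flat = [v for row in pooled for v in row]
print(len(flat), pooled[0][0], pooled[6][6])
```

Because every region is reduced to the same 7 × 7 size, frames of different proportions (head, half body, human body) can share one fully connected layer.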
7. The pedestrian detection method based on thermal imaging background filtering according to claim 6, wherein the three area sizes are 128 × 128, 256 × 256 and 384 × 384, respectively.
8. The pedestrian detection method based on thermal imaging background filtering according to claim 6, wherein the three aspect ratios are 1:1, 1:2 and 1:3, respectively.
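The 9 candidate frames per position that result from combining these three areas with these three ratios can be enumerated as in the sketch below. The reading of each ratio as width:height (so that 1:3 is a tall frame suited to a standing body) and the choice to keep the area constant across ratios are assumptions; the function name `make_anchors` is hypothetical.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_bl, y_bl, x_ur, y_ur)


def make_anchors(cx: float, cy: float,
                 sizes: Tuple[int, ...] = (128, 256, 384),
                 ratios: Tuple[int, ...] = (1, 2, 3)) -> List[Box]:
    """Return the candidate frames centred on (cx, cy).

    Each size s gives an area of s * s; each ratio r is read as width:height = 1:r
    (assumption), so height = r * width while the area stays s * s.
    """
    anchors = []
    for s in sizes:
        frame_area = s * s
        for r in ratios:
            w = math.sqrt(frame_area / r)
            h = w * r
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = make_anchors(0.0, 0.0)
for a in anchors:
    print([round(v, 1) for v in a])
```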
9. The pedestrian detection method based on thermal imaging background filtering as claimed in claim 6, wherein whether there is an intersection between the target frames is determined by an intersection ratio between the target frames.
10. The pedestrian detection method based on thermal imaging background filtering according to claim 1, wherein the pedestrian detection network based on improved Faster R-CNN is trained in advance using a labeled thermal imaging data set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110460457.0A CN112907616B (en) | 2021-04-27 | 2021-04-27 | Pedestrian detection method based on thermal imaging background filtering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112907616A (en) | 2021-06-04 |
CN112907616B (en) | 2022-05-03 |
Family
ID=76108934
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110460457.0A Active CN112907616B (en) | 2021-04-27 | 2021-04-27 | Pedestrian detection method based on thermal imaging background filtering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112907616B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN112907616B (en) | 2022-05-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||