
CN103605983A - Remnant detection and tracking method - Google Patents


Info

Publication number
CN103605983A
Authority
CN
China
Prior art keywords
target
mrow
frame
video image
image sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310531106.XA
Other languages
Chinese (zh)
Other versions
CN103605983B (en)
Inventor
苏育挺
刘安安
马莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201310531106.XA
Publication of CN103605983A
Application granted
Publication of CN103605983B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a remnant detection and tracking method. The method comprises the following steps: preprocessing an original surveillance video image sequence (graying, filtering and the like) to obtain an initial video image sequence; performing background modeling on the initial video image sequence collected by a camera, extracting a foreground region from the background modeling result, and denoising the foreground region to obtain foreground targets; training a support vector machine offline on positive example pictures and negative example pictures to obtain a target remnant model and a human body model, inputting each foreground target into both models for judgment, and outputting a target remnant; tracking the target remnant with the Meanshift algorithm to obtain its position coordinates in each preceding frame; and traversing the initial video image sequence in reverse, tracking the position coordinates of the target remnant in each frame before the current frame, performing statistical analysis, and outputting the images of the initial video image sequence that contain the target remnant.

Description

Method for detecting and tracking abandoned object
Technical Field
The invention relates to the field of carry-over detection, in particular to a carry-over detection and tracking method.
Background
Carryover refers to an unattended static object. In intelligent video surveillance systems, the detection of objects left behind has wide application, for example the real-time monitoring of suspicious objects or lost luggage in open places such as buildings, squares and military control areas [1]. Surveillance video is important for obtaining evidence after an abnormal event; when monitoring personnel fail to notice an abnormal event as it happens, the situation at that time can only be reviewed by replaying the video. However, a monitoring mode that relies mainly on human vigilance and on video playback afterwards not only wastes considerable manpower and material resources but also often misses abnormal events, and it cannot solve the security problems that are widespread at present.
In an intelligent video surveillance system, the detection of the carry-over is based on digital image processing, digital video processing, computer vision, pattern recognition and other technologies, and the data in the surveillance video is analyzed by means of the computer processing technology. The method has the advantages that the remnants in the public place can be automatically detected and tracked, when an emergency happens, security personnel can carry out targeted inspection according to alarm information triggered by the intelligent video monitoring system in real time, and accordingly the security work can achieve accurate target positioning, high speed and strong pertinence.
Most of the existing detection methods for the left objects determine that the objects with long enough stationary time in the foreground area are left objects, and do not consider whether the foreground area is a stationary pedestrian or an object; meanwhile, the relationship between people and objects cannot be considered, and false detection is easily caused.
Therefore, distinguishing whether a foreground region is a human body or an object, judging the relationship between the human body and the object, and thereby reducing the false detection rate of remnants are urgent problems to be solved in remnant detection.
Disclosure of Invention
The invention provides a method for detecting and tracking a remnant, which reduces missed reports of abnormal events and lowers the false detection rate of remnants, as described in detail below:
a carryover detection and tracking method, the method comprising the steps of:
(1) carrying out preprocessing such as graying, filtering and the like on an original monitoring video image sequence to obtain an initial video image sequence;
(2) carrying out background modeling on an initial video image sequence acquired by a camera, extracting a foreground region according to the obtained background modeling result, and carrying out denoising processing on the foreground region to obtain foreground targets $F_q$;
(3) performing off-line training on a support vector machine with positive example pictures and negative example pictures to obtain a target left-behind object model M and a human body model N respectively, inputting each foreground target $F_q$ obtained in step (2) into the two models for judgment, and outputting the target legacy object P;
(4) tracking the target legacy object P by adopting a Meanshift algorithm, and acquiring the position coordinate of the target legacy object P in each previous frame;
(5) traversing the initial video image sequence in reverse, tracking the position coordinates of the target legacy object P in each frame before the current frame, performing statistical analysis, and outputting the images in the initial video image sequence that contain the target legacy object P.
The operation of inputting each foreground target $F_q$ obtained in step (2) into the two models for judgment and outputting the target legacy object P is specifically as follows:
1) if the output of the human body model N is 1 and the output of the target object model M is 1, determining that the target type object in the scene is watched by people and is not a left object, and not executing the step (4);
2) if the output of the human body model N is 1 and the output of the target object model M is 0, determining that no target type object needs to be found in the scene, and not executing the step (4);
3) if the output of the human body model N is 0 and the output of the target object model M is 0, determining that no target type object needs to be found in the scene, and not executing the step (4);
4) if the output of the human body model N is 0 and the output of the target object model M is 1, determining that the target type object in the scene is unattended and is the target legacy object P, and proceeding to step (4).
The operation of tracking the target legacy object P with the Meanshift algorithm and acquiring the position coordinates of P in each preceding frame is specifically as follows:
1) color space conversion;
2) sampling the hue value of each pixel in the circumscribed rectangle of the target legacy object P to obtain a color histogram of the hue value;
according to the formula
$$t(n) = \frac{h(n)}{\sum_{i=0}^{255} h(i)}, \quad n = 0, 1, \ldots, 255$$
the back projection t(n) of the circumscribed rectangle is obtained; n is the horizontal-axis coordinate of the color histogram, that is, a pixel value occurring in the image area; h(n) is the vertical-axis value of the color histogram, that is, the number of pixels whose value is n;
3) calculating the position coordinates of the target legacy object P in the previous frame;
in the k-th frame, the coordinates of all pixels within the circumscribed rectangle are multiplied by the pixel values of the corresponding points of the back projection $t_k$ and summed to obtain the coordinates $(x_0, y_0)$ of the target legacy object P in the previous frame;
[equation image BDA0000405381830000031: expression for $(x_0, y_0)$, not reproduced in this text]
where x and y are the x-axis and y-axis coordinate values of the pixels in the circumscribed rectangle in HSV space, and $t_k(n_{x,y})$ is the pixel value of the back projection $t_k$ at coordinates (x, y);
calculating the distance d between the results of two successive iterations
[equation image BDA0000405381830000032: expression for d, not reproduced in this text]
if $d \le T_2$, the iteration ends, and the position coordinates of the target legacy object P in the previous frame are $(x_0, y_0)$.
The backward traversal of the initial video image sequence, tracking the position coordinates of the target object P in each frame before the current frame, performing statistical analysis, and outputting the image containing the target object P in the initial video image sequence specifically comprises:
starting from the moment the target legacy object P is found (the k-th frame), traversing the initial video image sequence in reverse order, and tracking and recording the position coordinates of P;
1) if, at the m-th frame (1 ≤ m ≤ k), the condition
$p_m(x) \le T_3$ or $X - p_m(x) \le T_3$
is satisfied, take $m' = m_{\min}$, where m' is the smallest frame number satisfying the condition; the target legacy object P is then considered to enter the scene for the first time at the m'-th frame; $p_m(x)$ is the x coordinate of the target legacy object P in the m-th frame, and X is the width of the initial video image sequence in pixels;
starting from the m'-th frame, one image frame containing the target legacy object P is output every A frames of the initial video image sequence, namely the set O of image frames is output (the expression defining O is an image that is not reproduced in this text), in which the term shown in
[equation image BDA0000405381830000035, not reproduced in this text]
is rounded up;
2) otherwise, one image frame containing the target legacy object P is output every A frames starting from the 1st frame of the initial video image sequence, namely the set O of output image frames is:
[equation image BDA0000405381830000036: definition of the set O, not reproduced in this text]
the technical scheme provided by the invention has the beneficial effects that: the method comprises the steps of obtaining a foreground target through a background difference method, judging a target object to be left by using an offline-trained support vector machine model, obtaining position coordinates of each frame before a current frame, finding the moment when the target object to be left enters a scene through statistical analysis, and outputting an image frame to remind security personnel of paying attention; the method reduces the missing report of the abnormal events, reduces the false detection rate of the remnants, and improves the working efficiency of the monitoring equipment and the security personnel.
Drawings
FIG. 1 is a flow chart of a sequential logic based carryover detection method.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
In order to reduce the computation complexity and the computation amount, implement real-time detection of the carry-over, and reduce the false detection rate of the carry-over, an embodiment of the present invention provides a carry-over detection method based on sequential logic, and refer to fig. 1, which is described in detail below:
101: carrying out preprocessing such as graying, filtering and the like on an original monitoring video image sequence to obtain an initial video image sequence;
the embodiment of the invention firstly carries out graying processing on an original monitoring video image sequence, and then adopts a Gaussian filtering method in reference document [2] for further processing to obtain an initial video image sequence.
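Step 101 can be illustrated with a minimal sketch. The text does not specify the graying weights or the filter size, so the BT.601 luma weights and the 3 × 3 binomial kernel below are assumptions, and the function names `to_gray` and `gaussian_smooth` are hypothetical; the spread-parameter selection of reference [2] is not reproduced.

```python
# 3x3 binomial kernel, a common small approximation of a Gaussian (assumed size).
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
KERNEL_SUM = 16.0


def to_gray(pixel):
    """Convert one (R, G, B) pixel to a gray value using BT.601 luma weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def gaussian_smooth(gray):
    """Smooth a 2-D list of gray values; border pixels are copied unchanged."""
    height, width = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += KERNEL[dy + 1][dx + 1] * gray[y + dy][x + dx]
            out[y][x] = acc / KERNEL_SUM
    return out


impulse = [[0, 0, 0], [0, 16, 0], [0, 0, 0]]
smoothed = gaussian_smooth(impulse)
print(smoothed[1][1])
```

A real implementation would typically operate on whole frames with an image library; the nested lists here only keep the sketch free of dependencies.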
102: carrying out background modeling on an initial video image sequence acquired by a camera, extracting a foreground region according to the obtained background modeling result, and carrying out denoising processing on the foreground region to obtain foreground targets $F_q$;
Since HOG (histogram of oriented gradients) detection is performed by traversing each frame of the image, a foreground region is first extracted from each frame of the surveillance video to limit this traversal. Background modeling and foreground extraction can use common methods such as the background difference method, the optical flow method and the inter-frame difference method. The inter-frame difference method [3] generally cannot extract all relevant pixels, and holes easily appear inside moving objects. The optical flow method [4] is computationally complex and has poor noise resistance. Because the scenes studied here are close to ideal and the algorithm places low demands on background updating (the background only needs to be updated at long intervals), the foreground target is extracted with the background difference method. This method is simple to implement, fast, and suited to a static camera, but it requires a static background image of the current scene. The specific steps are as follows:
1) obtaining a static background image b which does not contain a target object in a current scene;
2) a difference operation is performed between the current frame (i.e. the k-th frame) image $f_k(x, y)$ and the background image b to obtain the difference image $D_k(x, y)$:
$$D_k(x, y) = |f_k(x, y) - b|$$
3) The difference image $D_k(x, y)$ is binarized to obtain the binarized image $R_k(x, y)$:
[equation image BDA0000405381830000041: definition of $R_k(x, y)$ in terms of $D_k(x, y)$ and $T_1$, not reproduced in this text]
where the threshold $T_1$ can be set according to actual conditions; $T_1 = 25$ in the experiment. The embodiment of the present invention does not limit this value.
4) To remove the holes and burrs that appear in the foreground region, the morphological method proposed in reference [5] is applied to the binarized image $R_k(x, y)$ to eliminate noise interference.
That is, morphological filtering is applied to the binarized image $R_k(x, y)$ to eliminate isolated noise points and repair holes in the target region. Finally, the foreground targets $F_q$ (q = 1, 2, ..., Q, where Q is the total number of segmented foreground targets) are detected and segmented through connectivity analysis, the circumscribed rectangle $U_q$ of each foreground target $F_q$ is extracted, and each $U_q$ is resized to 64 × 128 pixels so that feature vectors can be extracted later.
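The difference and binarization steps above can be sketched as follows. The binarization rule itself is an image missing from this text, so the strict comparison `> t1` and the output values 1 and 0 are assumptions; the function name `binarize_difference` is hypothetical, and the morphological filtering and connectivity analysis are not shown.

```python
def binarize_difference(frame, background, t1=25):
    """Return a mask with 1 where |frame - background| exceeds t1, else 0.

    frame and background are 2-D lists of gray values of equal size.
    """
    mask = []
    for frame_row, background_row in zip(frame, background):
        mask.append(
            [1 if abs(f - b) > t1 else 0 for f, b in zip(frame_row, background_row)]
        )
    return mask


background = [[10, 10, 10], [10, 10, 10]]
frame = [[12, 90, 10], [10, 36, 35]]
mask = binarize_difference(frame, background)
print(mask)
```

With the threshold of 25 used in the experiment, a difference of exactly 25 is treated as background in this sketch; the opposite convention would be equally compatible with the surviving text.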
103: performing off-line training on a support vector machine with positive example pictures and negative example pictures to obtain a target left-behind object model M and a human body model N respectively, inputting each foreground target $F_q$ obtained in step 102 into the two models for judgment, and outputting the target legacy object P;
To determine whether a foreground target $F_q$ extracted in step 102 is a human body or a target-type left-behind object (in this method the target object is exemplified by a suitcase, travel bag, backpack or case; other object types can be set in a specific implementation, which the embodiment of the invention does not limit), $N_1$ package pictures are selected from web pictures and personally shot pictures as positive example pictures, and $N_2$ regions of the surveillance video that contain no package are selected as negative example pictures; similarly, $M_1$ human body pictures are selected from web pictures and personally shot pictures as positive example pictures, and $M_2$ regions of the surveillance video that contain no human body are selected as the other set of negative example pictures. All pictures are resized to 64 × 128 pixels, and the HOG features of the positive and negative example pictures are extracted.
The HOG feature is a local region descriptor, is formed by calculating a gradient direction histogram on a local region, is insensitive to illumination change and small offset, and is extracted by the following specific method:
1) gradient calculation:
assuming that the gray value at pixel (x, y) of the input image is I(x, y), the gradients in the x and y directions are given by:
$$G_x(x, y) = |I(x+1, y) - I(x-1, y)|$$
$$G_y(x, y) = |I(x, y+1) - I(x, y-1)|$$
where $G_x(x, y)$ and $G_y(x, y)$ are the horizontal and vertical gradients at pixel (x, y) of the input image, and I(x+1, y), I(x-1, y), I(x, y+1) and I(x, y-1) are the gray values at the pixels (x+1, y), (x-1, y), (x, y+1) and (x, y-1).
The gradient magnitude G(x, y) and the gradient direction α(x, y) at pixel (x, y) are given by:
$$G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}$$
$$\alpha(x, y) = \tan^{-1}\left[\frac{G_y(x, y)}{G_x(x, y)}\right]$$
2) The gradient-direction interval (given in equation image BDA0000405381830000053, not reproduced in this text) is evenly divided into 9 sub-intervals, and a histogram channel is established for each; the input image is divided into cells of 8 × 8 pixels, and the magnitudes of all pixels in a cell are accumulated into the histogram channels corresponding to their gradient directions, giving a 9-dimensional feature vector per cell;
3) every 4 adjacent cells are combined into a block, and the feature vectors of the 4 cells are concatenated into the 36-dimensional feature vector of the block; normalization is performed within each block according to the rule:
[equation image BDA0000405381830000061: normalization rule, not reproduced in this text]
where v is the vector to be normalized, $\|v\|_1$ is the 1-norm of v, and ε is a small constant; ε = 0.04 in this experiment. The value of ε can be chosen according to actual conditions and is not limited by the embodiment of the present invention.
4) The image is scanned in blocks with a scanning step of one cell. And finally, connecting the normalized features of all the blocks in series to obtain an HOG feature vector of the input image, and obtaining a target left object model M and a human body model N.
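The gradient and cell-histogram steps above can be sketched in a few lines. The direction interval is an image missing from this text; because both gradients are absolute values, the sketch assumes the interval [0, π/2]. The 2 × 2 arrangement of the 4 cells in a block is also an assumption, block normalization is omitted because its formula is not recoverable, and the function names are hypothetical.

```python
import math


def gradients(img, x, y):
    """Gradient magnitude and direction at interior pixel (x, y); img[y][x] is gray."""
    gx = abs(img[y][x + 1] - img[y][x - 1])
    gy = abs(img[y + 1][x] - img[y - 1][x])
    magnitude = math.sqrt(gx * gx + gy * gy)
    direction = math.atan2(gy, gx)  # equals atan(gy / gx) for gx > 0, pi/2 for gx == 0
    return magnitude, direction


def cell_histogram(img, x0, y0, cell=8, bins=9, max_angle=math.pi / 2):
    """Accumulate gradient magnitudes of one cell into direction bins."""
    hist = [0.0] * bins
    for y in range(y0, y0 + cell):
        for x in range(x0, x0 + cell):
            magnitude, direction = gradients(img, x, y)
            index = min(int(direction / max_angle * bins), bins - 1)
            hist[index] += magnitude
    return hist


def hog_length(width=64, height=128, cell=8, block_cells=2, bins=9):
    """Feature length when blocks of block_cells x block_cells cells slide by one cell."""
    blocks_x = width // cell - block_cells + 1
    blocks_y = height // cell - block_cells + 1
    return blocks_x * blocks_y * block_cells * block_cells * bins


# A horizontal ramp has a constant horizontal gradient and no vertical gradient.
ramp = [[2 * x for x in range(10)] for _ in range(10)]
hist = cell_histogram(ramp, 1, 1)
print(hist[0], hog_length())
```

Under these assumptions a 64 × 128 picture gives 7 × 15 blocks of 36 values each, which is the feature vector passed to the support vector machine.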
The target left-behind object model M and the human body model N are trained offline by adopting a support vector machine proposed in reference [6], namely the HOG feature vector of each positive example picture is marked as 1, the HOG feature vector of each negative example picture is marked as 0, and the HOG feature vectors are input into the support vector machine for training to obtain the target left-behind object model M and the human body model N.
The number of positive example pictures and negative example pictures is set according to the requirements in practical application, when the number is large, the trained support vector machine has high precision but consumes long time, and in the specific implementation, the embodiment of the invention does not limit the precision. In the experiment, N is selected1=M1=500,N2=M2=800。
The circumscribed rectangle $U_q$ of each foreground target $F_q$ is input in turn into the target left-behind object model M and the human body model N for judgment (the experiment considers only the case of one target object in the scene; if there are several, the following judgment is applied to each in turn).
1) If the output of the human body model N is 1 and the output of the target object model M is 1, the target type object in the scene is determined to be watched by people and not to be a left object, and the step 104 is not executed any more.
2) If the output of the human body model N is 1 and the output of the target object model M is 0, it is determined that there is no target type object to be found in the scene, and the step 104 is not executed.
3) If the output of the human body model N is 0 and the output of the target object model M is 0, it is determined that there is no object of the target type to be found in the scene, and the step 104 is not executed.
4) If the output of the human body model N is 0 and the output of the target object model M is 1, the target type object in the scene is determined to be unattended and is the target object P, and the next step 104 is carried out.
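The four cases above reduce to a single condition: tracking starts only when the human body model reports no person and the object model reports a target-type object. A minimal sketch, with the hypothetical function name `is_abandoned` and the 0/1 outputs taken from the text:

```python
def is_abandoned(human_output, object_output):
    """Return True only for case 4: N outputs 0 and M outputs 1."""
    return human_output == 0 and object_output == 1


# Enumerate the four cases in the order listed in the text.
for n_out, m_out in ((1, 1), (1, 0), (0, 0), (0, 1)):
    print(n_out, m_out, is_abandoned(n_out, m_out))
```

Only the last combination proceeds to step 104; the other three end processing for that foreground target.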
104: tracking the target remaining object P obtained in the step 103 by using a Meanshift algorithm, and acquiring the position coordinates of the target remaining object P in each previous frame;
The Meanshift algorithm is a nonparametric estimation method based on the density gradient and works as an iterative process. After the target legacy object P is detected in step 103, its position coordinates in the previous image frame are computed iteratively with the Meanshift algorithm until the distance between the coordinates from two successive iterations falls below a threshold or the number of iterations exceeds a limit. The specific steps are as follows:
1) color space conversion:
Coordinates (x', y') are set to store the result of the previous iteration and are initialized to the position coordinates of the target legacy object P; the iteration count is initialized to L = 0, and the maximum value of L is set to 8 in this method, which the embodiment of the invention does not limit in a specific implementation.
In order to eliminate the influence of illumination variation and shadow, the input video image in RGB color space is converted into HSV color space, wherein hue H contains the most essential information of color and is independent of brightness.
The formula for converting the RGB color space image into the HSV color space image is as follows:
$$V = \max(R, G, B)$$
[equation image BDA0000405381830000071: definition of S, not reproduced in this text]
$$H = \begin{cases} (G - B) \times 60 / S, & \text{if } V = R \\ 120 + (B - R) \times 60 / S, & \text{if } V = G \\ 240 + (R - G) \times 60 / S, & \text{if } V = B \end{cases}$$
$$H = H + 360, \quad \text{if } H < 0$$
where H is the hue, S the saturation and V the brightness (value); R, G and B are the red, green and blue components of a pixel. Each pixel in the circumscribed rectangle of the target legacy object P is converted from the RGB color space to the HSV color space with these formulas.
2) The hue value of each pixel in the circumscribed rectangle of the target legacy object P is sampled to obtain its color histogram. According to the formula
$$t(n) = \frac{h(n)}{\sum_{i=0}^{255} h(i)}, \quad n = 0, 1, \ldots, 255$$
the back projection t(n) of the circumscribed rectangle is obtained, where n is the horizontal-axis coordinate of the color histogram, that is, a pixel value occurring in the image area, and h(n) is the vertical-axis value of the color histogram, that is, the number of pixels whose value is n.
3) Calculating the position coordinates of the target legacy object P in the previous frame:
in the k-th frame, the coordinates of all pixels in the circumscribed rectangle are multiplied by the pixel values of the corresponding points of the back projection $t_k$ and summed to obtain the coordinates $(x_0, y_0)$ of the target legacy object P in the previous frame.
[equation image BDA0000405381830000081: expression for $(x_0, y_0)$, not reproduced in this text]
Here x and y are the x-axis and y-axis coordinate values of the pixels in the circumscribed rectangle in HSV space, and $t_k(n_{x,y})$ is the pixel value of the back projection $t_k$ at coordinates (x, y).
The distance d between the results of two successive iterations is calculated:
[equation image BDA0000405381830000082: expression for d, not reproduced in this text]
If $d \le T_2$, the iteration ends (the threshold $T_2$ can be chosen according to the actual situation, for example $T_2 = 3$; the embodiment of the present invention does not limit it), and the position coordinates of the target legacy object P in the previous frame are $(x_0, y_0)$. Otherwise (i.e. $d > T_2$), the coordinates (x', y') and the center of the circumscribed rectangle are updated to $(x_0, y_0)$, giving a new circumscribed rectangle; the iteration count is increased, L = L + 1, and the procedure returns to step 2) to recompute the back projection and then repeats step 3). The method also terminates when the iteration count reaches 8: once L ≥ 8, the loop is exited and the process ends.
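Steps 1) to 3) can be sketched together as follows. The expressions for $(x_0, y_0)$ and d are images missing from this text, so the sketch assumes a centroid normalized by the sum of the weights and a Euclidean distance. It also uses the conventional RGB to HSV conversion of Python's standard `colorsys` module rather than the formula above, quantizes hue to 256 bins to match n = 0, ..., 255, keeps the back projection fixed instead of recomputing it in each iteration, and uses hypothetical function names and window sizes.

```python
import colorsys
import math


def hue_bin(r, g, b, bins=256):
    """Quantize the hue of an 8-bit RGB pixel to an integer in 0..bins-1."""
    h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return min(int(h * bins), bins - 1)


def back_projection_table(hue_bins, bins=256):
    """t(n) = h(n) / sum_i h(i), where h(n) counts target pixels with hue bin n."""
    hist = [0] * bins
    for n in hue_bins:
        hist[n] += 1
    total = sum(hist)
    return [count / total if total else 0.0 for count in hist]


def mean_shift(weights, center, half_w, half_h, t2=3.0, max_iter=8):
    """Shift a window over weights[y][x] = t_k(n_{x,y}) until it moves at most t2."""
    cx, cy = center
    height, width = len(weights), len(weights[0])
    for _ in range(max_iter):
        x_lo, x_hi = max(0, int(cx - half_w)), min(width - 1, int(cx + half_w))
        y_lo, y_hi = max(0, int(cy - half_h)), min(height - 1, int(cy + half_h))
        total = sum_x = sum_y = 0.0
        for y in range(y_lo, y_hi + 1):
            for x in range(x_lo, x_hi + 1):
                w = weights[y][x]
                total += w
                sum_x += x * w
                sum_y += y * w
        if total == 0.0:
            break  # no target-colored pixel inside the window
        new_cx, new_cy = sum_x / total, sum_y / total
        d = math.hypot(new_cx - cx, new_cy - cy)  # assumed Euclidean distance
        cx, cy = new_cx, new_cy
        if d <= t2:
            break
    return cx, cy


# A green target on a gray frame; the window starts 4 pixels left of the target.
table = back_projection_table([hue_bin(0, 255, 0)] * 4)
frame = [[(128, 128, 128)] * 20 for _ in range(20)]
frame[10][12] = (0, 255, 0)
weights = [[table[hue_bin(*px)] for px in row] for row in frame]
position = mean_shift(weights, center=(8, 10), half_w=5, half_h=5)
print(position)
```

Gray pixels have undefined hue and `colorsys` maps them to hue 0, so a target whose hue is close to 0 would need an additional saturation check that the text does not describe.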
105: traversing the initial video image sequence in reverse, tracking the position coordinates of the target legacy object P in each frame before the current frame, performing statistical analysis, outputting the images in the initial video image sequence that contain the target legacy object P, and reminding security personnel to pay attention.
Starting from the moment the target legacy object P is found (the k-th frame), the initial video image sequence is traversed in reverse order, and the position coordinates of P are tracked and recorded.
1) If the condition
$p_m(x) \le T_3$ or $X - p_m(x) \le T_3$
is satisfied at the m-th frame (1 ≤ m ≤ k), take $m' = m_{\min}$, where m' is the smallest frame number satisfying the condition. The target legacy object P is considered to enter the scene for the first time at the m'-th frame. Here $p_m(x)$ is the x coordinate of the target legacy object P in the m-th frame and X is the width of the initial video image sequence in pixels. $T_3 = 10$ in this experiment; the threshold $T_3$ can be chosen according to actual conditions and is not limited by the embodiment of the present invention.
Starting from the m'-th frame, one image frame containing the target legacy object P is output every A frames of the initial video image sequence. That is, the set O of output image frames is:
[equation image BDA0000405381830000083: definition of the set O, not reproduced in this text]
in which the rounded term (an inline image not reproduced in this text) is rounded up. A = 15 in this experiment; the value of A can be chosen according to actual conditions and is not limited by the embodiment of the present invention.
2) Otherwise, one image frame containing the target legacy object P is output every A frames starting from the 1st frame of the initial video image sequence. That is, the set O of output image frames is:
[equation image BDA0000405381830000091: definition of the set O, not reproduced in this text]
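The selection of output frames in step 105 can be sketched as follows. The expressions defining the set O are images missing from this text, so the sketch assumes that frames are taken every A frames from the start frame up to frame k; the function names and the sample positions are hypothetical.

```python
def first_entry_frame(x_positions, width, t3=10):
    """Return m', the smallest frame number whose x position is within t3 of an edge.

    x_positions[m - 1] is p_m(x) for frames m = 1..k; returns None if no frame qualifies.
    """
    for m, px in enumerate(x_positions, start=1):
        if px <= t3 or width - px <= t3:
            return m
    return None


def output_frames(x_positions, width, t3=10, a=15):
    """Frame numbers to output: every a frames from m', or from frame 1 (case 2)."""
    k = len(x_positions)
    start = first_entry_frame(x_positions, width, t3)
    if start is None:
        start = 1
    return list(range(start, k + 1, a))


# 40 tracked frames in a 640-pixel-wide video; the object is near the left edge at frame 5.
positions = [200] * 4 + [5] + [100] * 35
selected = output_frames(positions, width=640)
print(selected)
```

With $T_3 = 10$ and A = 15 as in the experiment, this example selects frames 5, 20 and 35; a sequence in which the object never comes near an edge falls back to frames 1, 16 and 31.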
References
[1] K. Smith, P. Quelhas, D. Gatica-Perez. Detecting abandoned luggage items in a public space [C]. Proceedings of the 9th IEEE International Workshop on Performance Evaluation in Tracking and Surveillance (PETS'06), 2006: 75-82.
[2] Lin, H. C., Wang, L. L., & Yang, S. N. (1996). Automatic determination of the spread parameter in Gaussian smoothing. Pattern Recognition Letters, 17(12), 1247-1252.
[3] Abdi J, Nekoui M A. Determined prediction of nonlinear time series via emotional temporal difference learning [C]. Control and Decision Conference, 2008.
[4] Ahmad M, Taslima T, Lata L, et al. A combined local-global optical flow approach for cranial ultrasonogram image sequence analysis. 11th International Conference on Computer and Information Technology, 2008.
[5] Comer, Mary L., and Edward J. Delp. Morphological operations for color image processing. Journal of Electronic Imaging 8.3 (1999): 279-289.
[6] Zhang, Hao, et al. SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on. Vol. 2. IEEE, 2006.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above-described embodiments of the present invention are merely provided for description and do not represent the merits of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (4)

1. A remnant detection and tracking method, the method comprising the steps of:
(1) preprocessing an original surveillance video image sequence, including graying and filtering, to obtain an initial video image sequence;
(2) performing background modeling on the initial video image sequence acquired by a camera, extracting a foreground region according to the background modeling result, and denoising the foreground region to obtain foreground targets F_q;
(3) training a support vector machine offline on positive example pictures and negative example pictures to obtain a target remnant model M and a human body model N respectively, inputting each foreground target F_q obtained in step (2) into the two models for judgment, and outputting a target remnant P;
(4) tracking the target remnant P with the Meanshift algorithm and obtaining its position coordinates in each preceding frame;
(5) traversing the initial video image sequence in reverse, tracking the position coordinates of the target remnant P in each frame before the current frame, performing statistical analysis, and outputting the images of the initial video image sequence that contain the target remnant P.
2. The remnant detection and tracking method according to claim 1, wherein the operation of inputting each foreground target F_q obtained in step (2) into the two models for judgment and outputting the target remnant P is specifically:
1) if the output of the human body model N is 1 and the output of the target remnant model M is 1, an object of the target type is present in the scene but is attended by a person and is therefore not a remnant, and step (4) is not executed;
2) if the output of the human body model N is 1 and the output of the target remnant model M is 0, no object of the target type is present in the scene, and step (4) is not executed;
3) if the output of the human body model N is 0 and the output of the target remnant model M is 0, no object of the target type is present in the scene, and step (4) is not executed;
4) if the output of the human body model N is 0 and the output of the target remnant model M is 1, the object of the target type in the scene is unattended; it is determined to be the target remnant P, and step (4) is executed.
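The four cases above reduce to a single rule: tracking starts only when the human body model outputs 0 and the object model outputs 1. A minimal Python sketch of that rule follows; the function names `is_unattended_remnant` and `describe_case` are hypothetical and are not taken from the patent, and the two model outputs are assumed to be already available as the integers 0 or 1.

```python
def is_unattended_remnant(human_output: int, object_output: int) -> bool:
    """Return True only for case 4): object model fires, human model does not."""
    if human_output not in (0, 1) or object_output not in (0, 1):
        raise ValueError("model outputs are expected to be 0 or 1")
    return human_output == 0 and object_output == 1


def describe_case(human_output: int, object_output: int) -> str:
    """Map the (N, M) output pair to the decision listed in the claim."""
    if is_unattended_remnant(human_output, object_output):
        return "unattended target object: track it (step 4)"
    if object_output == 1:
        # N = 1 and M = 1: the object is present but a person is with it.
        return "attended target object: not a remnant, skip step 4"
    # M = 0, whatever N is: nothing of the target type is in the scene.
    return "no target object in the scene: skip step 4"
```

Only the pair (N, M) = (0, 1) leads to tracking; the other three pairs all skip step (4).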
3. The method according to claim 1, wherein the operation of tracking the target remnant P with the Meanshift algorithm and obtaining its position coordinates in each preceding frame is specifically:
1) color space conversion;
2) sampling the hue value of each pixel in the bounding rectangle of the target remnant P to obtain a color histogram of the hue values;
according to the formula
$$t(n) = \frac{h(n)}{\sum_{i=0}^{255} h(i)}, \qquad n = 0, 1, \ldots, 255$$
the back projection t(n) of the bounding rectangle is obtained, where n is the horizontal-axis coordinate of the color histogram, i.e., a pixel value n in the image region, and h(n) is the vertical-axis value of the color histogram, i.e., the number of pixels whose value is n;
3) calculating the position coordinates of the target remnant P in the preceding frame;
in the k-th frame, the coordinates of all pixels within the bounding rectangle are multiplied by the pixel values of the corresponding points in the back projection t_k and summed to obtain the coordinates (x_0, y_0) of the target remnant P in the preceding frame:
Figure FDA0000405381820000021
where x and y are the x-axis and y-axis coordinates of the pixels within the bounding rectangle in HSV space, and t_k(n_{x,y}) is the pixel value of the back projection t_k at coordinates (x, y);
calculating the distance d between the results of two successive iterations:
Figure FDA0000405381820000022
if d ≤ T_2, the iteration ends, and the position coordinates of the target remnant P in the preceding frame are (x_0, y_0).
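The hue histogram, back projection, and iterative position update described in this claim can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation. The names `back_projection_table` and `mean_shift` are hypothetical, the hue image is assumed to be a list of rows of integers in 0..255, and the window is assumed to be given as (x_min, y_min, width, height). The weighted coordinate sum is divided by the total weight, as in the standard mean-shift centroid; the exact form of the patent's update is contained in an equation image that is not reproduced here, so that normalization is an assumption.

```python
import math


def back_projection_table(hues, bins=256):
    """t(n) = h(n) / sum_i h(i), built from the hue values sampled in the rectangle."""
    counts = [0] * bins
    for value in hues:
        counts[value] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no pixels were sampled")
    return [c / total for c in counts]


def mean_shift(hue_image, table, window, t2=0.5, max_iter=20):
    """Iterate the weighted centroid until two successive estimates differ by <= t2."""
    height, width = len(hue_image), len(hue_image[0])
    x_min, y_min, w, h = window
    prev = (x_min + (w - 1) / 2, y_min + (h - 1) / 2)  # start at the window centre
    for _ in range(max_iter):
        total = sum_x = sum_y = 0.0
        for y in range(y_min, y_min + h):
            for x in range(x_min, x_min + w):
                weight = table[hue_image[y][x]]  # back-projection value at (x, y)
                total += weight
                sum_x += x * weight
                sum_y += y * weight
        if total == 0:
            break  # no pixel in the window matches the object's hues
        cur = (sum_x / total, sum_y / total)
        d = math.hypot(cur[0] - prev[0], cur[1] - prev[1])
        prev = cur
        if d <= t2:
            break
        # Re-centre the window on the new estimate, clamped to the image bounds.
        x_min = min(max(int(round(cur[0] - (w - 1) / 2)), 0), width - w)
        y_min = min(max(int(round(cur[1] - (h - 1) / 2)), 0), height - h)
    return prev
```

In use, the table would be built once from the hue values inside the bounding rectangle of P in the frame where it was detected, and `mean_shift` would then be called on each earlier frame with the window found in the frame after it.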
4. The method according to claim 1, wherein the operation of traversing the initial video image sequence in reverse, tracking the position coordinates of the target remnant P in each frame before the current frame, performing statistical analysis, and outputting the images of the initial video image sequence that contain the target remnant P is specifically:
traversing the initial video image sequence backwards from the moment at which the target remnant P is found, and tracking and recording its position coordinates;
1) if, in the m-th frame (1 ≤ m ≤ k), the condition
p_m(x) ≤ T_3 or X − p_m(x) ≤ T_3
is satisfied, take m' = m_min, where m' is the smallest frame number satisfying the condition, and the target remnant P is considered to have first entered the scene at the m'-th frame; p_m(x) denotes the x-coordinate of the target remnant P in the m-th frame, and X denotes the width in pixels of the initial video image sequence;
outputting an image frame of the initial video image sequence containing the target remnant P every A frames, starting from the m'-th frame; that is, the set O of output image frames is:
Figure FDA0000405381820000023
where
Figure FDA0000405381820000024
denotes rounding
Figure FDA0000405381820000025
up to the nearest integer;
2) otherwise, an image frame containing the target remnant P is output every A frames, starting from the 1st frame of the initial video image sequence; that is, the set O of output image frames is:
Figure FDA0000405381820000026
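The frame selection in this claim can be sketched in Python as follows. This is a hedged illustration rather than the patent's exact formula: the set O is defined by equation images that are not reproduced here, so the sketch assumes that O consists of the frames m', m' + A, m' + 2A, ... that do not exceed the detection frame k. The names `first_entry_frame` and `frames_to_output` are hypothetical, and the tracked x-coordinates are assumed to be stored in a dictionary keyed by 1-based frame number.

```python
def first_entry_frame(x_by_frame, frame_width, t3, k=None):
    """Smallest frame m with p_m(x) <= T3 or X - p_m(x) <= T3, or None if no frame qualifies."""
    candidates = [
        m for m, x in x_by_frame.items()
        if m >= 1 and (k is None or m <= k)
        and (x <= t3 or frame_width - x <= t3)
    ]
    return min(candidates) if candidates else None


def frames_to_output(x_by_frame, k, frame_width, t3, step):
    """Frames to output: every `step` frames from m' (case 1) or from frame 1 (case 2)."""
    if step < 1:
        raise ValueError("step (A) must be a positive integer")
    start = first_entry_frame(x_by_frame, frame_width, t3, k)
    if start is None:
        start = 1  # case 2): the object never appears near the left or right edge
    return list(range(start, k + 1, step))
```

The edge test uses only the x-coordinate, so under this rule an object is treated as entering the scene only when it is within T_3 pixels of the left or right image border.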
CN201310531106.XA 2013-10-30 2013-10-30 Remnant detection and tracking method Active CN103605983B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310531106.XA CN103605983B (en) 2013-10-30 2013-10-30 Remnant detection and tracking method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310531106.XA CN103605983B (en) 2013-10-30 2013-10-30 Remnant detection and tracking method

Publications (2)

Publication Number Publication Date
CN103605983A true CN103605983A (en) 2014-02-26
CN103605983B CN103605983B (en) 2017-01-25

Family

ID=50124203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310531106.XA Active CN103605983B (en) 2013-10-30 2013-10-30 Remnant detection and tracking method

Country Status (1)

Country Link
CN (1) CN103605983B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106128023A (en) * 2016-07-18 2016-11-16 四川君逸数码科技股份有限公司 A kind of wisdom gold eyeball identification foreign body leaves over alarm method and device
CN106204650A (en) * 2016-07-11 2016-12-07 北京航空航天大学 A kind of vehicle target tracking based on vacant lot video corresponding technology
CN106650638A (en) * 2016-12-05 2017-05-10 成都通甲优博科技有限责任公司 Abandoned object detection method
CN108476304A (en) * 2016-01-25 2018-08-31 松下知识产权经营株式会社 It abandons object monitoring device and has the discarding article surveillance system of the discarding object monitoring device and abandon article surveillance method
CN109636795A (en) * 2018-12-19 2019-04-16 安徽大学 Monitor video remnant object detection method without tracking in real time
CN110321808A (en) * 2019-06-13 2019-10-11 浙江大华技术股份有限公司 Residue and robber move object detecting method, equipment and storage medium
CN110648352A (en) * 2018-06-26 2020-01-03 杭州海康威视数字技术股份有限公司 Abnormal event detection method and device and electronic equipment
CN110751107A (en) * 2019-10-23 2020-02-04 北京精英系统科技有限公司 Method for detecting event of discarding articles by personnel
CN111415347A (en) * 2020-03-25 2020-07-14 上海商汤临港智能科技有限公司 Legacy object detection method and device and vehicle
CN113780231A (en) * 2021-09-22 2021-12-10 国网内蒙古东部电力有限公司信息通信分公司 Legacy tool detection method and device, electronic equipment and storage medium
CN114639113A (en) * 2020-11-30 2022-06-17 风林科技(深圳)有限公司 Data processing method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101231696A (en) * 2008-01-30 2008-07-30 安防科技(中国)有限公司 Method and system for detection of hangover
CN101552910A (en) * 2009-03-30 2009-10-07 浙江工业大学 Lave detection device based on comprehensive computer vision
US20100128930A1 (en) * 2008-11-24 2010-05-27 Canon Kabushiki Kaisha Detection of abandoned and vanished objects
CN103226701A (en) * 2013-04-24 2013-07-31 天津大学 Modeling method of video semantic event
CN103324906A (en) * 2012-03-21 2013-09-25 日电(中国)有限公司 Method and equipment for detecting abandoned object

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101231696A (en) * 2008-01-30 2008-07-30 安防科技(中国)有限公司 Method and system for detection of hangover
US20100128930A1 (en) * 2008-11-24 2010-05-27 Canon Kabushiki Kaisha Detection of abandoned and vanished objects
CN101552910A (en) * 2009-03-30 2009-10-07 浙江工业大学 Lave detection device based on comprehensive computer vision
CN103324906A (en) * 2012-03-21 2013-09-25 日电(中国)有限公司 Method and equipment for detecting abandoned object
CN103226701A (en) * 2013-04-24 2013-07-31 天津大学 Modeling method of video semantic event

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
富吉勇: "基于全方位视觉的遗留物及其放置者检测的研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108476304B (en) * 2016-01-25 2020-08-11 松下知识产权经营株式会社 Discarded object monitoring device, discarded object monitoring system provided with same, and discarded object monitoring method
CN108476304A (en) * 2016-01-25 2018-08-31 松下知识产权经营株式会社 It abandons object monitoring device and has the discarding article surveillance system of the discarding object monitoring device and abandon article surveillance method
CN106204650A (en) * 2016-07-11 2016-12-07 北京航空航天大学 A kind of vehicle target tracking based on vacant lot video corresponding technology
CN106128023A (en) * 2016-07-18 2016-11-16 四川君逸数码科技股份有限公司 A kind of wisdom gold eyeball identification foreign body leaves over alarm method and device
CN106650638A (en) * 2016-12-05 2017-05-10 成都通甲优博科技有限责任公司 Abandoned object detection method
CN110648352A (en) * 2018-06-26 2020-01-03 杭州海康威视数字技术股份有限公司 Abnormal event detection method and device and electronic equipment
CN109636795A (en) * 2018-12-19 2019-04-16 安徽大学 Monitor video remnant object detection method without tracking in real time
CN109636795B (en) * 2018-12-19 2022-12-09 安徽大学 Real-time non-tracking monitoring video remnant detection method
CN110321808B (en) * 2019-06-13 2021-09-14 浙江大华技术股份有限公司 Method, apparatus and storage medium for detecting carry-over and stolen object
CN110321808A (en) * 2019-06-13 2019-10-11 浙江大华技术股份有限公司 Residue and robber move object detecting method, equipment and storage medium
CN110751107A (en) * 2019-10-23 2020-02-04 北京精英系统科技有限公司 Method for detecting event of discarding articles by personnel
CN111415347A (en) * 2020-03-25 2020-07-14 上海商汤临港智能科技有限公司 Legacy object detection method and device and vehicle
CN111415347B (en) * 2020-03-25 2024-04-16 上海商汤临港智能科技有限公司 Method and device for detecting legacy object and vehicle
CN114639113A (en) * 2020-11-30 2022-06-17 风林科技(深圳)有限公司 Data processing method and device, electronic equipment and storage medium
CN113780231A (en) * 2021-09-22 2021-12-10 国网内蒙古东部电力有限公司信息通信分公司 Legacy tool detection method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN103605983B (en) 2017-01-25

Similar Documents

Publication Publication Date Title
CN103605983B (en) Remnant detection and tracking method
Sengar et al. Moving object detection based on frame difference and W4
CN107330372B (en) Analysis method of video-based crowd density and abnormal behavior detection system
Braham et al. Deep background subtraction with scene-specific convolutional neural networks
Tian et al. Robust and efficient foreground analysis for real-time video surveillance
CN109918971B (en) Method and device for detecting number of people in monitoring video
CN110232330B (en) Pedestrian re-identification method based on video detection
CN107301376B (en) Pedestrian detection method based on deep learning multi-layer stimulation
Patil et al. Motion saliency based generative adversarial network for underwater moving object segmentation
Karpagavalli et al. Estimating the density of the people and counting the number of people in a crowd environment for human safety
Abbas et al. Realization of multiple human head detection and direction movement using Raspberry Pi
Su et al. A new local-main-gradient-orientation HOG and contour differences based algorithm for object classification
Ansari et al. A fusion of dolphin swarm optimization and improved sine cosine algorithm for automatic detection and classification of objects from surveillance videos
Roy et al. A comprehensive survey on computer vision based approaches for moving object detection
Lee et al. Estimation and analysis of urban traffic flow
Moutakki et al. Real-time video surveillance system for traffic management with background subtraction using codebook model and occlusion handling
Moayed et al. Traffic intersection monitoring using fusion of GMM-based deep learning classification and geometric warping
Yang et al. A modified method of vehicle extraction based on background subtraction
Jehad et al. Developing and validating a real time video based traffic counting and classification
Yang et al. Dual frame differences based background extraction algorithm
Delibaşoğlu Vehicle Detection from Aerial Images with Object and Motion Detection
Nagaraju et al. Foreground Moving Object Detection using Support Vector Machine (SVM)
Basalamah et al. Pedestrian crowd detection and segmentation using multi-source feature descriptors
KR101203050B1 (en) Background Modeling Device and Method Using Bernoulli Distribution
Ramamoorthy et al. Intelligent video surveillance system using background subtraction technique and its analysis

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant