
CN103985114A - Surveillance video person foreground segmentation and classification method - Google Patents


Info

  • Publication number: CN103985114A
  • Application number: CN201410108137.9A
  • Authority: CN (China)
  • Prior art keywords: matrix, foreground, pixel, image
  • Legal status: Granted; Expired - Fee Related (the legal status is an assumption, not a legal conclusion)
  • Other versions: CN103985114B (granted publication)
  • Other languages: Chinese (zh)
  • Inventors: 郭延文, 缪丽姬, 夏元轶
  • Assignee (current and original): Nanjing University
  • Application filed by Nanjing University; priority to CN201410108137.9A

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a person foreground segmentation and classification method for surveillance video. The method comprises the following steps: 1) person foreground extraction: the foreground and background of the surveillance video are separated with a Gaussian mixture model, and each foreground person is enclosed by a bounding box to form a short foreground-person video; 2) foreground feature extraction: key frames are selected from each short video, choosing frames in which the foreground occupies a moderate share of the image and its shape and color are complete, and several features are extracted from these key frames; 3) feature fusion and classification: non-person foregrounds such as cars are first classified out using the area and speed features, then a subspace is learned for the person features by the canonical-correlation-coefficient feature fusion method, the features are projected into this more discriminative subspace, and different clustering methods are applied to the projected features so that foreground persons of similar shape and color fall into the same class.

Description

A method for person foreground segmentation and classification in surveillance video
Technical field
The present invention relates to a method for person foreground segmentation and classification in surveillance video, and belongs to the fields of computer video processing and machine learning.
Background technology
While bringing convenience to modern life, technology has also brought safety hazards, and many measures have been taken to eliminate them; surveillance cameras in every corner are one such measure. When an unsafe incident occurs, however, security personnel facing a huge volume of surveillance video often need a long time to find the dangerous target, which lowers the efficiency of handling the incident. Some existing techniques condense surveillance video along the temporal and spatial axes to reduce the proportion of meaningless footage; although this effectively cuts the time spent browsing irrelevant video, the target must still be picked out from multiple videos. Moreover, most research on foreground classification concerns the kind of foreground, e.g. dividing foregrounds into vehicles and persons, or into plants, animals and buildings, while hazards are usually caused by people; studies that classify person foregrounds themselves are rare. Research on foreground-kind classification mainly adopts supervised methods, which usually require collecting and training on new data whenever the scene changes, at considerable cost. Unsupervised person-foreground classification would effectively narrow the search range, greatly reduce the time spent inspecting surveillance video and increase work efficiency, so the classification of foreground persons in surveillance video has become a relevant research question.
Traditional surveillance-video foreground classification mainly uses supervised learning to distinguish the kind of foreground and suits larger-scale monitoring applications, but small-scale scenes whose foregrounds are mainly people have seen little study, and supervised learning requires extensive preparation. The present invention therefore segments the surveillance-video foreground from the background, extracts several features from key frames, fuses them with unsupervised canonical correlation coefficients, and applies lowest-rank clustering to produce the classification of the video.
Summary of the invention
Goal of the invention: the technical problem to be solved by this invention is, in view of the deficiencies of existing research, to provide a segmentation and classification method for person foregrounds in surveillance video, thereby improving the efficiency of inspecting surveillance video.
Technical scheme: the invention discloses a method for person foreground segmentation and classification in surveillance video, characterized in that videos of the same person in different scenes can be browsed in a short time. It comprises the following steps:
1. Separate the foreground and background of the surveillance video: in general the foreground of a surveillance video refers to moving things, usually people or cars, while the background refers to the static scenery in the video. A Gaussian mixture model separates the foreground from the background, and each foreground is enclosed by the smallest bounding box that contains it completely, forming an independent short foreground video. The detailed steps of the foreground/background segmentation of step 1 are as follows:
Step 1-1, initialize the Gaussian models: read the first frame of the video and build, for each pixel of the image, a mixture of K Gaussians (K in the range 3~5). The probability P(x_j) that pixel j takes value x_j at time t is represented by the K Gaussians:

$$P(x_j) = \sum_{i=1}^{K} \omega_{j,t}^{i}\, N(x_j, u_{j,t}^{i}, \Sigma_{j,t}^{i}),$$

where $\omega_{j,t}^{i}$ is the weight of the i-th Gaussian component of pixel j at time t, satisfying $\sum_{i=1}^{K}\omega_{j,t}^{i}=1$; $u_{j,t}^{i}$ and $\Sigma_{j,t}^{i}$ are the mean and covariance of that component, and N is the Gaussian probability density:

$$N(x_j, u_{j,t}^{i}, \Sigma_{j,t}^{i}) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_{j,t}^{i}|^{1/2}} \exp\!\Big[-\tfrac{1}{2}(x_j-u_{j,t}^{i})^{T}(\Sigma_{j,t}^{i})^{-1}(x_j-u_{j,t}^{i})\Big],$$

where d is the dimension of x_j; in RGB color space each pixel has 3 channels, so x_j is a three-dimensional vector. The covariance matrix is $\Sigma_{j,t}^{i} = (\sigma_{j,t}^{i})^{2} I$, where $(\sigma_{j,t}^{i})^{2}$ is the variance of the i-th Gaussian of pixel j at time t and I is the identity matrix. In the initialization phase each Gaussian gets weight $\omega_{init} = 1/K$ and variance $\sigma_{init}^{2} = 900$;
Step 1-2, update the Gaussian models: continue reading the surveillance video; every time a new frame is read, the mixture models are updated. Sort the components of each mixture in descending order. If the pixel value x_{j,t+1} of the newly read frame matches the i-th Gaussian of the mixture, i.e.

$$|x_{j,t+1} - u_{j,t}^{i}| \le \delta\, \sigma_{j,t}^{i},$$

then update the i-th component, leave the remaining components unchanged, and judge pixel x_{j,t+1} to be a background pixel of the current frame. The parameter δ is the matching threshold, with range 1~2. The i-th component is updated as follows:

$$\omega_{j,t+1}^{i} = (1-\alpha)\,\omega_{j,t}^{i} + \alpha$$
$$u_{j,t+1}^{i} = (1-\rho)\,u_{j,t}^{i} + \rho\, x_j$$
$$(\sigma_{j,t+1}^{i})^{2} = (1-\rho)(\sigma_{j,t}^{i})^{2} + \rho\,(x_j-u_{j,t}^{i})^{T}(x_j-u_{j,t}^{i})$$
$$\rho = \alpha / \omega_{j,t}^{i}$$

where α is the learning rate of the Gaussian mixture model, with range 0~1, and ρ is the learning rate of the parameter α. If pixel x_{j,t+1} matches none of the K components, it is judged to be a foreground pixel of the current frame; a new Gaussian component is constructed to replace the last component in the sorted order, its mean is set to the value of x_{j,t+1}, and its standard deviation and weight are set to σ_init and ω_init respectively. The means and variances of the retained components stay unchanged, and their weights are updated as

$$\omega_{j,t+1}^{i} = (1-\alpha)\,\omega_{j,t}^{i};$$
Step 1-3, complete the foreground/background segmentation: after the K Gaussian components of pixel x_{j,t+1} have been updated, normalize their weights, and repeat steps 1-1 and 1-2, retaining the foreground pixels of every frame, until the surveillance video has been read to the end; this yields a video, at the same resolution as the original, that displays only the foreground and no background;
Step 1-4, extract the smallest bounding box enclosing the foreground person: read the video obtained in step 1-3 and first apply dilation and erosion to every frame to remove noise. Then scan each image line by line and record the length l and width w of the rectangle formed by its non-zero pixels; since the background pixels produced in step 1-2 have value 0, any pixel with a non-zero value belongs to the foreground. The bounding box of each frame of the same person's foreground has its own l and w; select the largest l and w over all frames as the bounding box of this person's foreground, thereby obtaining a short video enclosing the person's foreground.
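As an illustration of step 1, the sketch below substitutes OpenCV's built-in mixture-of-Gaussians background subtractor for the hand-written update rules of steps 1-1~1-3; the input file name and the history/threshold settings are illustrative assumptions, not values fixed by the method.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("surveillance.avi")   # hypothetical input video
mog = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                         detectShadows=False)
mog.setNMixtures(3)                          # K = 3 Gaussians per pixel

boxes = []                                   # per-frame foreground bounding boxes
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = mog.apply(frame)                  # non-zero pixels are foreground
    mask = cv2.erode(mask, None, iterations=1)   # step 1-4: suppress noise
    mask = cv2.dilate(mask, None, iterations=2)
    ys, xs = np.nonzero(mask)
    if xs.size:
        boxes.append((xs.min(), ys.min(), xs.max(), ys.max()))
cap.release()

# Step 1-4: the person's box uses the largest width and height over all frames.
if boxes:
    l = max(x1 - x0 for x0, y0, x1, y1 in boxes)
    w = max(y1 - y0 for x0, y0, x1, y1 in boxes)
    print("bounding box:", l, "x", w)
```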
2. Extract the features of the short foreground videos: a group of key frames is extracted from each short foreground video. Since the moving foregrounds in surveillance video mainly comprise people and vehicles, and the main purpose of the invention is person classification, the area and moving-speed features of each foreground are recorded so that car foregrounds can be classified out before person classification. A person's silhouette and color information are important for telling different persons apart, so after dilation and erosion, three shape- and color-related features are extracted from each person-foreground key frame: a color histogram feature, a local binary pattern feature and a bag-of-words feature. The detailed steps of the foreground-person feature extraction of step 2 are as follows:
Step 2-1, extract the key frames of the person foreground: take the middle F frames f_1, f_2, ..., f_F of the person's video as key frames, with F around 20~40. The middle F frames are chosen because, compared with the starting and ending frames, the frames in the middle of the short video represent the person's silhouette and color more completely, and the person foreground occupies a moderate share of the frame;
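A minimal sketch of this selection rule (the frame list and the value of F are illustrative):

```python
def key_frames(frames, F=20):
    # Step 2-1: take the F frames in the middle of the person's short video.
    start = max((len(frames) - F) // 2, 0)
    return frames[start:start + F]
```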
Step 2-2, extract the color histogram information: extract a color histogram from the person region of each of the F frames f_1, f_2, ..., f_F. Let the histogram have m_c bins in total. For each pixel p of image f_i (i = 1~F) with channel values R (red), G (green) and B (blue), the bin index id is computed as

$$id = \Big\lfloor \frac{R\, m_c^{1/3}}{256} \Big\rfloor m_c^{2/3} + \Big\lfloor \frac{G\, m_c^{1/3}}{256} \Big\rfloor m_c^{1/3} + \Big\lfloor \frac{B\, m_c^{1/3}}{256} \Big\rfloor,$$

i.e. each channel is quantized into $m_c^{1/3}$ levels. Counting the pixels falling into each bin id gives the color histogram of f_i, finally expressed as a vector υ_c of length m_c. Repeating this step for all key frames yields the m_c × F matrix M_1;
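A sketch of step 2-2 under the bin-index formula above; the random frames merely stand in for real key frames, and skipping all-black pixels reflects the zero-valued background left by step 1.

```python
import numpy as np

def color_histogram(img, mc=64):
    """Quantize each RGB channel into mc**(1/3) levels and histogram the
    joint bin index id of every foreground (non-black) pixel."""
    q = round(mc ** (1 / 3))                     # 4 levels per channel for mc = 64
    r, g, b = (img[..., c].astype(int) for c in range(3))
    ids = (r * q // 256) * q * q + (g * q // 256) * q + (b * q // 256)
    fg = img.sum(axis=2) > 0                     # background pixels are 0
    return np.bincount(ids[fg], minlength=mc).astype(float)

# Column-stacking the F key-frame histograms gives the mc x F matrix M1.
frames = [np.random.randint(0, 256, (64, 32, 3), dtype=np.uint8) for _ in range(20)]
M1 = np.stack([color_histogram(f) for f in frames], axis=1)   # shape (64, 20)
```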
Step 2-3, extract the Local Binary Pattern (LBP) feature: compute the local binary feature of each of the F frames f_1, f_2, ..., f_F. First convert image f_i to grayscale. Let the radius of the LBP operator be r (r = 3, 4 or 5) and slide an r×r window over the image; at every pixel position, compute the LBP value of the window center p_center as follows: compare each of the r×r−1 pixels adjacent to p_center with the value of p_center; a neighboring pixel greater than p_center is marked 1, otherwise 0, giving r×r−1 bits. When the window has reached the last center position, the LBP feature of the whole image has been obtained; the image's LBP feature is then represented as a histogram. Let the LBP histogram have m_l bins; concatenating the heights of its components gives the final local binary feature, a vector υ_l of length m_l. Repeating this step for all key frames yields the m_l × F matrix M_2;
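A simplified sketch of step 2-3; how the r×r−1-bit codes are folded into the m_l histogram bins is my assumption, since the text fixes only the number of bins.

```python
import numpy as np

def lbp_histogram(gray, r=3, ml=64):
    """Slide an r x r window, threshold the r*r - 1 neighbours against the
    centre pixel, read the bits as an integer code, and histogram the codes."""
    h, w = gray.shape
    half = r // 2
    codes = []
    for y in range(half, h - half):
        for x in range(half, w - half):
            win = gray[y - half:y + half + 1, x - half:x + half + 1].ravel()
            bits = np.delete(win, win.size // 2) > gray[y, x]
            code = int("".join("1" if b else "0" for b in bits), 2)
            codes.append(code * ml // (1 << bits.size))  # fold into ml bins (assumption)
    return np.bincount(np.array(codes, dtype=int), minlength=ml).astype(float)

# Column-stacking the F key-frame histograms gives the ml x F matrix M2.
```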
Step 2-4, extract the bag-of-words (BOW) feature: first compute the scale-invariant SIFT feature points of the F frames f_1, f_2, ..., f_F (reference: Object Recognition from Local Scale-Invariant Features). Let the word-list length of the bag-of-words model be m_b. K-means clustering (with 64 cluster centers) merges SIFT feature points of similar meaning into m_b classes; the class centers form the word list of the BOW. Each SIFT feature point of each frame is replaced by its cluster center, and the number of SIFT feature points corresponding to each word is counted, finally giving the word-frequency vector υ_b of image f_i, of length m_b. Repeating this step for all key frames yields the m_b × F matrix M_3;
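A sketch of step 2-4 using OpenCV SIFT and scikit-learn K-means as stand-ins for whatever implementations the authors used; it assumes every frame yields at least one descriptor.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def bow_matrix(frames_gray, mb=64):
    """Cluster all SIFT descriptors into an mb-word vocabulary, then
    describe each frame by its word-frequency vector (columns of M3)."""
    sift = cv2.SIFT_create()
    per_frame = [sift.detectAndCompute(g, None)[1] for g in frames_gray]
    vocab = KMeans(n_clusters=mb, n_init=10).fit(np.vstack(per_frame))
    cols = [np.bincount(vocab.predict(d), minlength=mb).astype(float)
            for d in per_frame]
    return np.stack(cols, axis=1)                 # mb x F matrix M3
```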
Step 2-5, extract the area and speed features: compute the foreground area s_1, s_2, ..., s_F and the speed υ_1, υ_2, ..., υ_{F−1} of each of the F frames. The area of a foreground is the number of its non-zero pixels; the mean of the F foreground areas is taken as the area value s of this foreground. The foreground speed is determined by the displacement of the center of the foreground's bounding rectangle in the original surveillance video; the F frames yield F−1 speeds, and their median is taken as the speed v of this foreground.
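A minimal sketch of step 2-5, assuming the foreground masks and the bounding-box centres (in original-video coordinates) are already available:

```python
import numpy as np

def area_and_speed(masks, centres):
    # Mean non-zero pixel count over the F key frames, and the median
    # displacement of the box centre between consecutive frames.
    areas = [np.count_nonzero(m) for m in masks]
    speeds = [np.hypot(c1[0] - c0[0], c1[1] - c0[1])
              for c0, c1 in zip(centres, centres[1:])]
    return float(np.mean(areas)), float(np.median(speeds))
```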
3. Feature fusion and classification: first separate the vehicles from the foregrounds; the foregrounds appearing in a surveillance video generally fall into two broad classes, cars and persons. The frames on the middle section of the time axis of a foreground video recorded by the same camera are robust to the perspective effect of the lens; the person-foreground area of the middle frames is usually far smaller than the car-foreground area of the middle frames of a car video, and the speed of a person foreground is generally much smaller than the moving speed of a car foreground, so each foreground is assigned to the car class according to thresholds on its area and speed. Then, from the person foregrounds left by this classification, extract the color histogram, local binary and bag-of-words features, and fuse them without supervision by the method of canonical correlation coefficients, obtaining a space T that separates the different classes. Project the three feature matrices onto T; apply lowest-rank subspace clustering to the projected color feature and K-means clustering to the projected LBP and BOW features, and classify the short foreground-person videos according to the clustering results. The detailed foreground-person classification steps of step 3 are as follows:
Step 3-1, set the thresholds on foreground area and speed: generally a car's speed and area are numerically larger than the corresponding speed and area of a person foreground, and the trajectory of a foreground under a fixed camera either approaches from afar or recedes, so the middle frames are less affected by perspective. Here the area threshold is area_thresh = 800 pixels and the speed threshold is speed_thresh = 25 pixels/frame. A foreground whose area exceeds the area threshold is assigned to the vehicle class; if the area does not exceed the threshold but the speed feature exceeds the speed threshold, the foreground is likewise assigned to the vehicle class, otherwise to the person class;
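Step 3-1 amounts to the two-threshold rule sketched below:

```python
def coarse_class(area, speed, area_thresh=800.0, speed_thresh=25.0):
    # Step 3-1: large or fast foregrounds are vehicles; the rest are persons.
    return "vehicle" if area > area_thresh or speed > speed_thresh else "person"
```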
Step 3-2, unify the data dimensions: to the color histogram matrix (m_c × F), the LBP feature matrix (m_l × F) and the BOW feature matrix (m_b × F) obtained in step 2, apply Principal Component Analysis (reference: On Lines and Planes of Closest Fit to Systems of Points in Space) to reduce them to a unified dimension m, so that all feature matrices become m × F;
Step 3-3, feature fusion: suppose there is a matrix T of dimension m × n (n is determined by the matrix A below) such that, when the three feature matrices M_1, M_2, M_3 are projected onto the space of T, projections of vectors of the same class are close in T while projections of vectors of different classes are far apart. Initialize T with unit vectors and update it iteratively as follows:

3-3-1. QR-decompose each matrix and update M_i: $T^{T} M_i = \varphi_i \Delta_i$, $M_i' = M_i \Delta_i^{-1}$, for i = 1~3;

3-3-2. Apply singular value decomposition to every pair $M_i'$, $M_j'$: $M_i'^{T} M_j' = Q_{ij} \Lambda Q_{ji}^{T}$;

3-3-3. Solve for T: compute the matrix

$$A = \sum_{k_1=1}^{3}\sum_{k_2=1}^{3} (M_{k_1}' Q_{k_1 k_2} - M_{k_2}' Q_{k_2 k_1})(M_{k_1}' Q_{k_1 k_2} - M_{k_2}' Q_{k_2 k_1})^{T},$$

compute the eigenvectors t_i of A with eigenvalues λ_i, sort them by descending eigenvalue and form T = {t_1, t_2, ..., t_n}, where the number of distinct eigenvectors of A determines n.

Repeat steps 3-3-1~3-3-3 until T converges; repeating them 3~5 times suffices. Here i ranges over 1~3, T^T denotes the transpose of T, M_i' denotes the matrix M_i normalized by $\Delta_i^{-1}$, φ_i the orthogonal matrix of the QR decomposition, Δ_i its upper triangular matrix, $\Delta_i^{-1}$ the inverse of Δ_i, and Q_{ij} the unitary matrices of the singular value decomposition;
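The sketch below gives one NumPy reading of the step 3-3 iteration; the pseudo-inverse guard and the fixed iteration count are my assumptions.

```python
import numpy as np

def cca_fusion(mats, n_iter=5):
    """Sketch of step 3-3 as I read it: QR-normalize each feature matrix
    against the current T, align every pair of normalized matrices by SVD,
    and reset T to the eigenvectors of the accumulated residual matrix A,
    sorted by descending eigenvalue."""
    m = mats[0].shape[0]
    T = np.eye(m)                                   # initialized to unit vectors
    for _ in range(n_iter):                         # 3-5 iterations usually converge
        primed = []
        for M in mats:
            _, delta = np.linalg.qr(T.T @ M)        # 3-3-1: T^T M = phi * Delta
            primed.append(M @ np.linalg.pinv(delta))
        A = np.zeros((m, m))
        for i in range(len(primed)):                # 3-3-2 / 3-3-3
            for j in range(i + 1, len(primed)):
                U, _, Vt = np.linalg.svd(primed[i].T @ primed[j])
                d = primed[i] @ U - primed[j] @ Vt.T
                A += d @ d.T
        lam, vecs = np.linalg.eigh(A)
        T = vecs[:, np.argsort(lam)[::-1]]          # columns t_1 ... t_n
    return T

# Step 3-4: project each feature matrix into the fused space, Mi = T.T @ Mi.
```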
Step 3-4, foreground video classification: project the feature matrices M_1, M_2, M_3 into the space of T, i.e. M_i ← T^T M_i for i = 1~3, obtaining new feature matrices M_1, M_2, M_3;
Step 3-5, color histogram feature clustering: apply the lowest-rank subspace clustering method to the color matrix M_1. The color histograms of different foregrounds often lie along different data dimensions, and the distance used by the K-means method (reference: A K-means Clustering Algorithm) is usually the Euclidean distance, which is not suitable for color space, so subspace clustering achieves a better class division. The lowest-rank method computes the similarity w between every pair of frames; a graph is built with all foreground images as nodes and the similarities w between images as edge weights, and the spectral clustering Ncut method (reference: Normalized Cuts and Image Segmentation) partitions the graph, completing the classification of the images. The similarity w is computed as follows:

3-5-1. Initialize the parameters: λ_0; the correlation matrix Z and its equivalent matrix J = 0, Z = J; the noise correction matrix E = 0; the Lagrangian matrices Y_1 = 0, Y_2 = 0; the Lagrange penalty parameter μ = 10^{-6}; the maximum Lagrange penalty parameter max_μ = 10^{10}; the penalty multiplier ρ_0 = 1.1; the constant ε = 10^{-8};

3-5-2. Compute the equivalent matrix J of the correlation matrix of each column of M_1: fixing the other matrices, update J:

$$J = \arg\min \frac{1}{\mu}\|J\|_{*} + \frac{1}{2}\big\|J - (Z + Y_2/\mu)\big\|_F^2;$$

3-5-3. Compute the correlation matrix Z of each column of M_1: fixing the other matrices, update Z:

$$Z = (I + M_1^{T} M_1)^{-1}\big(M_1^{T} M_1 - M_1^{T} E + J + (M_1^{T} Y_1 - Y_2)/\mu\big);$$

3-5-4. Compute the noise correction matrix E: fixing the other matrices, update E:

$$E = \arg\min \frac{\lambda_0}{\mu}\|E\|_{2,1} + \frac{1}{2}\big\|E - (M_1 - M_1 Z + Y_1/\mu)\big\|_F^2;$$

3-5-5. Update the Lagrangian matrices: Y_1 = Y_1 + μ(M_1 − M_1 Z − E), Y_2 = Y_2 + μ(Z − J);

3-5-6. Update the Lagrange penalty parameter: μ = min(ρ_0 μ, max_μ);

3-5-7. Judge whether the iteration is finished: check whether ||M_1 − M_1 Z − E||_∞ < ε and ||Z − J||_∞ < ε hold; if so, stop, otherwise continue iterating.

Here ||·||_* denotes the nuclear norm, ||·||_F the Frobenius norm and ||·||_∞ the maximum norm, and min(A, B) returns the smaller of A and B. The iteration yields the matrix Z; the sum of the elements Z_{i,j} and Z_{j,i} represents the similarity between images i and j. An undirected graph is built whose nodes represent the images, with the similarity between images i and j as the weight of the edge between nodes i and j; applying the spectral clustering Ncut method to partition this graph then classifies the foreground key frames.
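A sketch of the step 3-5 iteration as inexact ALM for low-rank representation; the value of λ_0 is not fixed by the text, so lam below is an assumed placeholder. The returned w can then be handed to an Ncut-style spectral clustering (for instance scikit-learn's SpectralClustering with a precomputed affinity).

```python
import numpy as np

def lrr_similarity(M1, lam=0.1, max_iter=500):
    """Steps 3-5-1 ~ 3-5-7: low-rank representation of the columns of M1,
    returning w[i, j] = |Z[i, j]| + |Z[j, i]| as the frame similarity."""
    m, n = M1.shape
    Z = np.zeros((n, n)); J = np.zeros((n, n)); E = np.zeros((m, n))
    Y1 = np.zeros((m, n)); Y2 = np.zeros((n, n))
    mu, max_mu, rho0, eps = 1e-6, 1e10, 1.1, 1e-8
    MtM = M1.T @ M1
    inv = np.linalg.inv(np.eye(n) + MtM)          # constant factor of step 3-5-3
    for _ in range(max_iter):
        # 3-5-2: singular-value thresholding solves the nuclear-norm problem
        U, s, Vt = np.linalg.svd(Z + Y2 / mu, full_matrices=False)
        J = U @ np.diag(np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # 3-5-3: closed-form update of the correlation matrix Z
        Z = inv @ (MtM - M1.T @ E + J + (M1.T @ Y1 - Y2) / mu)
        # 3-5-4: column-wise shrinkage solves the l2,1-norm problem
        G = M1 - M1 @ Z + Y1 / mu
        col = np.linalg.norm(G, axis=0)
        E = G * np.maximum(col - lam / mu, 0.0) / np.where(col > 0, col, 1.0)
        # 3-5-5 / 3-5-6: dual ascent and penalty growth
        R1, R2 = M1 - M1 @ Z - E, Z - J
        Y1 += mu * R1
        Y2 += mu * R2
        mu = min(rho0 * mu, max_mu)
        if max(np.abs(R1).max(), np.abs(R2).max()) < eps:  # 3-5-7
            break
    return np.abs(Z) + np.abs(Z).T
```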
Step 3-6, LBP and BOW feature clustering: the LBP and BOW features vary little across spatial dimensions, so applying the K-means method directly already gives good results; the matrices M_2, M_3 are clustered with K-means, and the person foregrounds in this method generally form 2~3 classes;
Step 3-7, ensemble learning over the results: steps 3-5 and 3-6 give the class C_i of each frame under each of the three features; the class of each frame is decided by voting over the three class labels. For example, if C_1, C_2, C_3 for image f_i are 0, 0, 1, then the most frequent label 0 becomes the class of image f_i; in this way every key frame f_i obtains its class information. The similarity between person-foreground videos υ_i and υ_j is then computed; if the similarity between υ_i and υ_j is higher than υ_i's similarity with any other video, υ_i and υ_j are placed in the same class.
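The vote of step 3-7 reduces to a per-frame majority over the three feature-wise labels, as in this sketch:

```python
import numpy as np

def vote(labels):
    # labels: shape (3, F); one row of class labels per feature.
    return np.array([np.bincount(col).argmax() for col in np.asarray(labels).T])

print(vote([[0, 1], [0, 1], [1, 2]]))   # -> [0 1]
```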
Brief description of the drawings
Fig. 1 is the basic flowchart of the method of the invention.
Fig. 2 shows part of original surveillance video 1.
Fig. 3 shows part of original surveillance video 2.
Fig. 4 shows part of original surveillance video 3.
Fig. 5 shows part of the foreground of surveillance video 1.
Fig. 6 shows part of the foreground of surveillance video 2.
Fig. 7 shows part of the foreground of surveillance video 3.
Fig. 8 is a schematic diagram of LBP feature extraction.
Fig. 9 is a schematic diagram of the BOW clustering method.
Fig. 10 is a schematic diagram of the subspaces occupied by two object foregrounds in surveillance video 1.
Fig. 11 is a schematic diagram of the principle of canonical-correlation feature fusion.
Fig. 12 compares the person classification accuracy of this method with that of person clustering without fusion.
Embodiment
The present invention is further described below in conjunction with the drawings and specific embodiments.
The flowchart of the method, shown in Fig. 1, divides into three main stages. First, a Gaussian mixture model separates each surveillance video into foreground and background, yielding the foreground video. Next, the person foregrounds are separated and a group of key frames is extracted from each person video for feature extraction. Then a color histogram, a local binary feature and a bag-of-words feature are extracted from each foreground person's key frames, and the three features are fused with canonical correlation coefficients to construct a new space that better separates the different classes; the three feature matrices are projected into this more discriminative space and clustered, with lowest-rank subspace clustering applied to the projected color matrix and K-means clustering applied to the local binary and bag-of-words features; finally an ensemble learning step yields the similarities between foregrounds.
Specifically, as shown in Fig. 1, the invention discloses a foreground-person segmentation and classification method for surveillance video, mainly comprising the following steps:
Step 1, foreground and background segmentation of the surveillance video: in general the foreground of a surveillance video refers to moving things, usually people or cars, while the background refers to the static scenery. A Gaussian mixture model separates the foreground from the background, and each foreground person is enclosed by the smallest bounding box that contains it completely, forming independent short foreground-person videos;
Step 2, extract the features of the short foreground-person videos: a group of key frames is extracted from each short video. Since the moving foregrounds mainly comprise people and vehicles and the main purpose of the invention is person classification, the area and speed of each foreground are recorded and the car foregrounds are classified out before person classification. Because a person's silhouette and color are important for telling persons apart, after dilation and erosion three shape- and color-related features are extracted from each person's key frames: a color histogram feature, a local binary feature and a bag-of-words feature;
Step 3, feature fusion and classification: first separate the vehicles from the foregrounds; the foregrounds in a surveillance video generally fall into two broad classes, cars and persons. The middle frames of a foreground video shot by the same camera are robust to the perspective effect of the lens; the person area of the middle frames is usually far smaller than the car area of the middle frames of a car foreground, and a person foreground's speed is generally much smaller than a car foreground's, so each foreground is assigned to the car class according to thresholds on its area and speed. From the remaining person foregrounds the color histogram, local binary and bag-of-words features are extracted and fused without supervision by the method of canonical correlation coefficients, giving a space T that separates the different classes; the three features are projected onto T, lowest-rank subspace clustering is applied to the projected color feature, K-means clustering to the projected LBP and BOW features, and the short foreground-person videos are classified according to the clustering results.
The detailed steps of the foreground and background segmentation of step 1 are as follows:
Step 1-1, initialize the Gaussian models: read the first frame of the video and build for each pixel a mixture containing K Gaussians; in this embodiment K = 3. The K Gaussians represent the value x_j of each pixel j at time t in every frame, and the probability P(x_j) is

$$P(x_j) = \sum_{i=1}^{K} \omega_{j,t}^{i}\, N(x_j, u_{j,t}^{i}, \Sigma_{j,t}^{i}),$$

where $\omega_{j,t}^{i}$ is the weight of the i-th Gaussian component of pixel j at time t, satisfying $\sum_{i=1}^{K}\omega_{j,t}^{i}=1$; $u_{j,t}^{i}$ and $\Sigma_{j,t}^{i}$ are the mean and covariance of that component, and N is the Gaussian probability density:

$$N(x_j, u_{j,t}^{i}, \Sigma_{j,t}^{i}) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_{j,t}^{i}|^{1/2}} \exp\!\Big[-\tfrac{1}{2}(x_j-u_{j,t}^{i})^{T}(\Sigma_{j,t}^{i})^{-1}(x_j-u_{j,t}^{i})\Big],$$

where d is the dimension of x_j; in RGB color space each pixel has 3 channels, so x_j is a three-dimensional vector. The covariance matrix is $\Sigma_{j,t}^{i} = (\sigma_{j,t}^{i})^{2} I$, where $(\sigma_{j,t}^{i})^{2}$ is the variance of the i-th Gaussian of pixel j at time t and I is the identity matrix. In the initialization phase the variance of each Gaussian is $\sigma_{init}^{2} = 900$ and the weight of each Gaussian is ω_init = 1/K, taken as 0.3 in this embodiment;
Step 1-2, update the Gaussian models: continue reading the surveillance video; every time a new frame is read, the mixture models are updated. Sort the components of each mixture in descending order. If the pixel value x_{j,t+1} of the newly read frame matches the i-th Gaussian of the mixture, i.e.

$$|x_{j,t+1} - u_{j,t}^{i}| \le \delta\, \sigma_{j,t}^{i},$$

then update the i-th component, leave the remaining components unchanged, and judge pixel x_{j,t+1} to be a background pixel of the current frame. The parameter δ is the matching threshold, with range 1~2; in this embodiment δ = 1.5. The i-th component is updated as follows:

$$\omega_{j,t+1}^{i} = (1-\alpha)\,\omega_{j,t}^{i} + \alpha$$
$$u_{j,t+1}^{i} = (1-\rho)\,u_{j,t}^{i} + \rho\, x_j$$
$$(\sigma_{j,t+1}^{i})^{2} = (1-\rho)(\sigma_{j,t}^{i})^{2} + \rho\,(x_j-u_{j,t}^{i})^{T}(x_j-u_{j,t}^{i})$$
$$\rho = \alpha / \omega_{j,t}^{i}$$

where α is the learning rate of the mixture model, with range 0~1 (set to 1 in this embodiment), and ρ is the learning rate of the parameter α. If pixel x_{j,t+1} matches none of the K components, it is judged to be a foreground pixel of the current frame; a new Gaussian component is constructed to replace the last component in the sorted order, its mean is set to x_{j,t+1}, and its standard deviation and weight are set to σ_init and ω_init respectively. The means and variances of the retained components stay unchanged, and their weights are updated as

$$\omega_{j,t+1}^{i} = (1-\alpha)\,\omega_{j,t}^{i};$$
Step 1-3, complete the foreground and background segmentation: after the K Gaussian components of pixel x_{j,t+1} have been updated, normalize their weights, and repeat steps 1-1 and 1-2, retaining the foreground pixels of every frame, until the surveillance video has been read to the end, obtaining a video at the same resolution as the original that displays only the foreground;
Step 1-4, extract the smallest bounding box enclosing the foreground person: read the video obtained in step 1-3, first apply dilation and erosion to every frame to remove noise, then scan each image line by line and record the length l and width w of the rectangle formed by its non-zero pixels; since the background pixels produced in step 1-2 have value 0, any non-zero pixel belongs to the foreground. The bounding box of each frame of the same person's foreground has its own l and w; the largest l and w over all frames are selected as the bounding box of this person's foreground, yielding a short video enclosing the person's foreground. Figs. 2~4 show original surveillance videos, and Figs. 5~7 show the corresponding videos after foreground extraction with the Gaussian mixture model.
The detailed steps of the foreground-person feature extraction of step 2 are as follows:
Step 2-1, extract the key frames of the person foreground: take the middle F frames f_1, f_2, ..., f_F of the person's video as key frames, F in 20~40; in this embodiment F = 20. The middle F frames are chosen because, compared with the starting and ending frames, the frames in the middle of the short video represent the person's silhouette and color more completely, and the person foreground occupies a moderate share of the frame;
Step 2-2, extract the color histogram information: extract a color histogram from the person region of each of the F frames f_1, f_2, ..., f_F. Let the histogram have m_c bins; in this embodiment m_c = 64. For each pixel p of image f_i (i = 1~F) with channel values R (red), G (green) and B (blue), the bin index id is

$$id = \Big\lfloor \frac{R\, m_c^{1/3}}{256} \Big\rfloor m_c^{2/3} + \Big\lfloor \frac{G\, m_c^{1/3}}{256} \Big\rfloor m_c^{1/3} + \Big\lfloor \frac{B\, m_c^{1/3}}{256} \Big\rfloor.$$

Counting the pixels falling into each bin id gives the color histogram of f_i, finally expressed as a vector υ_c of length m_c; repeating this step for all key frames yields the m_c × F matrix M_1;
Step 2-3, extract the Local Binary Pattern (LBP) feature: compute the local binary feature of each of the F frames f_1, f_2, ..., f_F. First convert image f_i to grayscale; let the radius of the LBP operator be r (r = 3 in this embodiment) and slide an r×r window over the image. At every pixel position, the LBP value of the window center p_center is computed by comparing each of the r×r−1 pixels adjacent to p_center with the value of p_center: a neighboring pixel greater than p_center is marked 1, otherwise 0, as shown in Fig. 8, giving r×r−1 bits. When the window reaches the last center position, the LBP feature of the whole image has been obtained and is represented as a histogram. Let the LBP histogram have m_l bins (m_l = 64 in this embodiment); concatenating the heights of its components gives the final local binary feature, a vector υ_l of length m_l. Repeating this step for all key frames yields the m_l × F matrix M_2;
Step 2-4, extract the bag-of-words (BOW) feature: first compute the scale- and rotation-invariant SIFT feature points of the F frames f_1, f_2, ..., f_F. Let the word-list length of the BOW model be m_b; in this embodiment m_b = 64. K-means clustering with 64 centers merges SIFT feature points of similar meaning into m_b classes, and the class centers form the word list of the BOW. Each SIFT feature point of each frame is then replaced by a word of the word list; for example, with m_b = 3 as in Fig. 9, K-means clustering yields 3 cluster centers, and feature point sift_1 is closest to class m_1, so the center of m_1 represents sift_1. Counting the number of SIFT feature points corresponding to each word of the list gives the word-frequency vector υ_b of image f_i, of length m_b; repeating this operation for all key frames yields the m_b × F matrix M_3;
Step 2-5, extract the area and speed features: compute the foreground area s_1, s_2, ..., s_F and the speed υ_1, υ_2, ..., υ_{F−1} of each of the F frames. The area of a foreground is the number of its non-zero pixels; the mean of the F foreground areas is taken as the area value s of this foreground. The foreground speed is determined by the displacement of the center of the foreground's bounding rectangle in the original surveillance video; the F frames yield F−1 speeds, and their median is taken as the speed v of this foreground.
The detailed foreground-person classification steps of step 3 are as follows:
Step 3-1, set thresholds on foreground area and speed to classify out the cars: generally a car's speed and area are numerically larger than the corresponding speed and area of a person foreground, and the trajectory of a foreground under a fixed camera either approaches from afar or recedes, so the middle frames are less affected by perspective. Here the area threshold is area_thresh = 800 pixels and the speed threshold is speed_thresh = 25 pixels/frame. A foreground whose area exceeds the area threshold is assigned to the vehicle class; if the area does not exceed the threshold but the speed feature exceeds the speed threshold, the foreground is likewise assigned to the vehicle class, otherwise to the person class;
Step 3-2, unify the data dimensions: to the color histogram matrix (m_c × F), the LBP feature matrix (m_l × F) and the BOW feature matrix (m_b × F) obtained in step 2, each 64 × 20 in this embodiment, apply principal component analysis to reduce them to a unified dimension m; in this embodiment the maximum number of retained principal components is 64 and m = 64, so all feature matrices become m × F;
Step 3-3, feature fusion: suppose there is a matrix T of dimension m × n (64 × 64 in this embodiment) such that, when the three feature matrices M_1, M_2, M_3 are projected onto the space of T, projections of same-class vectors such as P_1 and P_2 in Fig. 11 are close in T, while projections of different-class vectors such as P_1 and P_3 are far apart. Initialize T with unit vectors and update it iteratively as follows:

3-3-1. QR-decompose each matrix and update M_i: $T^{T} M_i = \varphi_i \Delta_i$, $M_i' = M_i \Delta_i^{-1}$, i = 1~3;

3-3-2. Apply singular value decomposition to every pair $M_i'$, $M_j'$: $M_i'^{T} M_j' = Q_{ij} \Lambda Q_{ji}^{T}$, i, j = 1~3;

3-3-3. Solve for T: compute the matrix

$$A = \sum_{k_1=1}^{3}\sum_{k_2=1}^{3} (M_{k_1}' Q_{k_1 k_2} - M_{k_2}' Q_{k_2 k_1})(M_{k_1}' Q_{k_1 k_2} - M_{k_2}' Q_{k_2 k_1})^{T},$$

compute the eigenvectors t_i of A with eigenvalues λ_i, sort them by descending eigenvalue and form T = {t_1, t_2, ..., t_n}, where the number of distinct eigenvectors of A determines n.

Repeat steps 3-3-1~3-3-3 until T converges; repeating them 3~5 times suffices. Here T^T denotes the transpose of T, M_i' the matrix M_i normalized by $\Delta_i^{-1}$, φ_i the orthogonal matrix of the QR decomposition, Δ_i its upper triangular matrix, $\Delta_i^{-1}$ the inverse of Δ_i, and Q_{ij} the unitary matrices of the singular value decomposition. The singular value decomposition, QR decomposition, matrix inversion and matrix transposition are computed in the MATLAB environment with the svd function, the qr function, the inv function and the transpose operator ' respectively;
Step 3-4, foreground video classification: project the feature matrices M_1, M_2, M_3 into the space of T, i.e. M_i ← T^T M_i for i = 1~3, obtaining new feature matrices M_1, M_2, M_3;
Step 3-5, color histogram feature clustering: apply the lowest-rank subspace clustering method to the color matrix M_1. The color histograms of different foregrounds often lie along different data dimensions, as shown in Fig. 10, while the distance used by K-means is usually the Euclidean distance and is not suitable for color space, so subspace clustering achieves a better class division. The lowest-rank method computes the similarity w between every pair of frames; the two data sets in Fig. 10 belong to different subspaces, which the lowest-rank method can tell apart. A graph is built with all foreground images as nodes and the similarities w as edge weights, and the spectral clustering Ncut method partitions the graph, completing the classification of the images. The similarity w is computed as follows:

3-5-1. Initialize the parameters: λ_0; the correlation matrix Z and its equivalent matrix J = 0, Z = J; the noise correction matrix E = 0; the Lagrangian matrices Y_1 = 0, Y_2 = 0; the Lagrange penalty parameter μ = 10^{-6}; the maximum penalty max_μ = 10^{10}; the penalty multiplier ρ_0 = 1.1; the constant ε = 10^{-8};

3-5-2. Compute the equivalent matrix J of the correlation matrix of each column of M_1: fixing the other matrices, update J:

$$J = \arg\min \frac{1}{\mu}\|J\|_{*} + \frac{1}{2}\big\|J - (Z + Y_2/\mu)\big\|_F^2;$$

3-5-3. Compute the correlation matrix Z of each column of M_1: fixing the other matrices, update Z:

$$Z = (I + M_1^{T} M_1)^{-1}\big(M_1^{T} M_1 - M_1^{T} E + J + (M_1^{T} Y_1 - Y_2)/\mu\big);$$

3-5-4. Compute the noise correction matrix E: fixing the other matrices, update E:

$$E = \arg\min \frac{\lambda_0}{\mu}\|E\|_{2,1} + \frac{1}{2}\big\|E - (M_1 - M_1 Z + Y_1/\mu)\big\|_F^2;$$

3-5-5. Update the Lagrangian matrices: Y_1 = Y_1 + μ(M_1 − M_1 Z − E), Y_2 = Y_2 + μ(Z − J);

3-5-6. Update the Lagrange penalty parameter: μ = min(ρ_0 μ, max_μ);

3-5-7. Judge whether the iteration is finished: check whether ||M_1 − M_1 Z − E||_∞ < ε and ||Z − J||_∞ < ε hold; if so, stop, otherwise continue iterating.

Here ||·||_* denotes the nuclear norm, ||·||_F the Frobenius norm and ||·||_∞ the maximum norm, and min(A, B) returns the smaller of A and B. The iteration yields the matrix Z; the sum of the elements Z_{i,j} and Z_{j,i} represents the similarity between images i and j. An undirected graph is built whose nodes represent the images, with the similarity between images i and j as the weight between nodes i and j, and the spectral clustering Ncut method partitions the graph to classify the foreground key frames; in this embodiment the number of spectral clustering centers is set to the number of distinct foreground persons.
Step 3-6, LBP and BOW feature clustering: the LBP and BOW features vary little across spatial dimensions, so K-means directly gives good results; the matrices M_2, M_3 are clustered with K-means, with the number of cluster centers set to the number of foreground persons;
Step 3-7, ensemble learning over the results: steps 3-5 and 3-6 give the class C_i of each frame under each of the three features, and the class of each frame is decided by voting over the three labels; for example, if C_1, C_2, C_3 for image f_i are 0, 0, 1, the most frequent label 0 becomes the class of f_i, so every key frame obtains its class information. The similarity between foreground videos υ_i and υ_j is then computed; if the similarity between υ_i and υ_j is higher than υ_i's similarity with any other video, υ_i and υ_j are placed in the same class. The comparison in Fig. 12 shows that the method of the invention improves accuracy over directly clustering the unfused features.
Embodiment
The experimental hardware environment of this embodiment is an Intel Core i3-2100 CPU at 3.1 GHz with 4 GB of memory; the programming environment is Visual Studio 2010, OpenCV 2.3 and MATLAB R2012a. The test surveillance videos mainly come from the monitoring videos of a campus monitoring system.
In extracting the foreground with the Gaussian mixture model, the number of Gaussians is K = 3, the matching threshold is δ = 1.5, the initial variance is σ_init² = 30², the initial weight ω_init is 0.3, and the learning rate is α = 1; the key-frame count is F = 20; the area threshold is area_thresh = 800 pixels and the speed threshold is speed_thresh = 25 pixels/frame. In the feature extraction the color histogram parameter is m_c = 64, the LBP radius is r = 3 with histogram parameter m_l = 64, and the BOW word count is m_b = 64; after PCA dimension reduction the feature length is m = 64, and the matrix T in the fusion stage has n = 64 columns.
Setting the histogram parameter to 64 keeps the amount of computation down: values above 64 tend to scatter the clustering results and add cumbersome computation, while values below 64 risk merging several classes into one, so 64 is chosen as the number of histogram bins. In the experiments, to reduce the influence of different ambient lighting on person foregrounds in different scenes, the illumination-robust SIFT feature is used, which improves classification accuracy.
In summary, the invention uses unsupervised canonical-coefficient fusion of multiple features to improve classification, and uses several clustering methods to group person foregrounds of similar silhouette and color into one class, improving the efficiency of consulting surveillance video. The invention features high classification accuracy, a high ratio of effective information, and no need for manual labeling.

Claims (4)

1. A method for person foreground segmentation and classification in surveillance video, characterized by comprising the following steps:
Step 1, separate the foreground and background of the surveillance video: use a Gaussian mixture model to separate the foreground and background, and enclose each foreground with the smallest bounding box that contains it completely, forming independent short foreground videos;
Step 2, extract the features of the short foreground videos: extract a group of key frames from each short foreground video, record the area and moving-speed features of each foreground, classify the car foregrounds out before person classification, and, after applying dilation and erosion to the person-foreground key frames, extract from each key frame a color histogram, a local binary feature and a bag-of-words feature;
Step 3, feature fusion and classification: set thresholds on the area and speed of each foreground to classify out the cars and obtain the person foregrounds; extract the color histogram, local binary and bag-of-words features of the person foregrounds, fuse the three features without supervision by the method of canonical correlation coefficients to obtain a space T that separates the different classes, project the three features onto the space T, apply lowest-rank subspace clustering to the projected color histogram feature and K-means clustering to the projected LBP and BOW features, and classify the short foreground-person videos according to the clustering results.
2. The method for person foreground segmentation and classification in surveillance video according to claim 1, characterized in that step 1 comprises the following steps:
Step 1-1, initialize the Gaussian models: read the first frame of the surveillance video and build, for each pixel of the image, a mixture containing K Gaussians, K in the range 3~5; the K Gaussians represent the value x_j of each pixel j at time t in every frame, and the probability P(x_j) is determined by

$$P(x_j) = \sum_{i=1}^{K} \omega_{j,t}^{i}\, N(x_j, u_{j,t}^{i}, \Sigma_{j,t}^{i}),$$

where $\omega_{j,t}^{i}$ is the weight of the i-th Gaussian component of pixel j at time t, satisfying $\sum_{i=1}^{K}\omega_{j,t}^{i}=1$; $u_{j,t}^{i}$ and $\Sigma_{j,t}^{i}$ are respectively the mean and covariance of that component, and N is the Gaussian probability density:

$$N(x_j, u_{j,t}^{i}, \Sigma_{j,t}^{i}) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_{j,t}^{i}|^{1/2}} \exp\!\Big[-\tfrac{1}{2}(x_j-u_{j,t}^{i})^{T}(\Sigma_{j,t}^{i})^{-1}(x_j-u_{j,t}^{i})\Big],$$

where d is the dimension of x_j; in RGB color space each pixel has 3 channels, so x_j is a three-dimensional vector; the covariance matrix is $\Sigma_{j,t}^{i} = (\sigma_{j,t}^{i})^{2} I$, where $(\sigma_{j,t}^{i})^{2}$ is the variance of the i-th Gaussian of pixel j at time t and I is the identity matrix; in the initialization phase the weight of each Gaussian is ω_init = 1/K and the variance is $\sigma_{init}^{2} = 900$;
Step 1-2, update the Gaussian models: continue reading the surveillance video; every time a new frame is read, the mixture models are updated; sort the components of each mixture in descending order; if the pixel value x_{j,t+1} of the newly read frame matches the i-th Gaussian of the mixture, i.e.

$$|x_{j,t+1} - u_{j,t}^{i}| \le \delta\, \sigma_{j,t}^{i},$$

then update the i-th component, leave the remaining components unchanged, and consider pixel x_{j,t+1} a background pixel of the current frame; the parameter δ is the matching threshold, with range 1~2; the i-th component is updated as follows:

$$\omega_{j,t+1}^{i} = (1-\alpha)\,\omega_{j,t}^{i} + \alpha,\quad u_{j,t+1}^{i} = (1-\rho)\,u_{j,t}^{i} + \rho\, x_j,$$
$$(\sigma_{j,t+1}^{i})^{2} = (1-\rho)(\sigma_{j,t}^{i})^{2} + \rho\,(x_j-u_{j,t}^{i})^{T}(x_j-u_{j,t}^{i}),\quad \rho = \alpha / \omega_{j,t}^{i},$$

where α is the learning rate of the mixture model, with range 0~1, and ρ is the learning rate of the parameter α; if pixel x_{j,t+1} matches none of the K components, it is judged to be a foreground pixel of the current frame, a new Gaussian component is constructed to replace the last component in the sorted order, its mean is set to the value of x_{j,t+1}, its standard deviation and weight are set to σ_init and ω_init respectively, the means and variances of the retained components stay unchanged, and their weights are updated as

$$\omega_{j,t+1}^{i} = (1-\alpha)\,\omega_{j,t}^{i};$$

Step 1-3, complete the foreground and background segmentation: after the K Gaussian components of pixel x_{j,t+1} have been updated, normalize their weights, and repeat steps 1-1 and 1-2, retaining the foreground pixels of every frame, until the surveillance video has been read to the end, obtaining a video at the same resolution as the original that displays only the foreground;
Step 1-4, extract the smallest bounding box enclosing the foreground person: read the video obtained in step 1-3, first apply dilation and erosion to every frame, then scan each image line by line and record the length l and width w of the rectangle formed by its non-zero pixels; for the same person's foreground, the bounding box of each frame has its own l and w; select the largest l and w over all frames as the bounding box of this person's foreground, thereby obtaining a short video enclosing the person's foreground.
3. The method for person foreground segmentation and classification in surveillance video according to claim 2, characterized in that step 2 comprises the following steps:
Step 2-1, extract the key frames of the person foreground: take the middle F frames f_1, f_2, ..., f_F of the person's video as key frames, F in 20~40;
Step 2-2, extract the color histogram information: extract a color histogram from the person region of each of the F frames; let the histogram have m_c bins; for each pixel p of image f_i (i = 1~F) with channel values R (red), G (green) and B (blue), the bin index id is

$$id = \Big\lfloor \frac{R\, m_c^{1/3}}{256} \Big\rfloor m_c^{2/3} + \Big\lfloor \frac{G\, m_c^{1/3}}{256} \Big\rfloor m_c^{1/3} + \Big\lfloor \frac{B\, m_c^{1/3}}{256} \Big\rfloor;$$

counting the pixels falling into each bin id gives the color histogram of image f_i, finally expressed as a vector υ_c of length m_c; this step is repeated for all key frames until the m_c × F matrix M_1 is obtained;
Step 2-3, extract the local binary feature: compute the local binary feature of each of the F frames; first convert image f_i to grayscale; let the radius of the local binary pattern (LBP) operator be r, with r = 3, 4 or 5, and slide an r×r window over the image; at every pixel position the LBP value of the window center p_center is computed by comparing each of the r×r−1 pixels adjacent to p_center with the value of p_center: a neighboring pixel greater than p_center is marked 1, otherwise 0, giving r×r−1 bits; when the window reaches the last center position, the LBP feature of the whole image has been obtained and is represented as a histogram; let the LBP histogram have m_l bins; concatenating the heights of its components gives the final local binary feature, a vector υ_l of length m_l; this step is repeated for all key frames until the m_l × F matrix M_2 is obtained;
Step 2-4, extract the bag-of-words feature: first compute the scale- and rotation-invariant SIFT feature points of the F frames; let the word-list length of the bag-of-words model be m_b; K-means clustering merges SIFT feature points of similar meaning into m_b classes, whose centers form the word list of the bag of words; each SIFT feature point of each frame is replaced by a word of the word list, and the number of SIFT feature points corresponding to each word is counted, finally giving the word-frequency vector υ_b of image f_i, of length m_b; this step is repeated for all key frames until the m_b × F matrix M_3 is obtained;
Step 2-5, extract the area and speed features: compute the foreground area s_1, s_2, ..., s_F and the speed υ_1, υ_2, ..., υ_{F−1} of each of the F frames; the area of a foreground is the number of its non-zero pixels, and the mean of the F foreground areas is taken as the area value s of this foreground; the foreground speed is determined by the displacement of the center of the foreground's bounding rectangle in the original surveillance video, one displacement per pair of consecutive frames, so the F frames yield F−1 speeds, whose median is taken as the speed v of this foreground.
4. the method for a kind of monitor video personage foreground segmentation as claimed in claim 3 and classification, is characterized in that, step 3 comprises the following step:
Step 3-1, arranges the threshold value of foreground area and speed, area threshold area thersh=800pixel, threshold speed speed thersh=25pixel/image, pixel represents pixel, image presentation video, the foreground partition that area features surpasses area threshold is class of vehicles, when area features does not surpass area threshold, the threshold value if prospect velocity characteristic outpaces, foreground partition is class of vehicles, otherwise foreground partition is personage's classification;
Step 3-2, uniform data dimension: the color histogram matrix m for F image that step 2 is obtained c* F, local two-value eigenmatrix m l* F and word bag eigenmatrix m b* F, calls principal component analysis (PCA) PCA method, is reduced to unified dimension m, and all eigenvectors matrixs become m * F;
Step 3-3, feature fusion: suppose there is a matrix T of dimension m * n, and project the three feature matrices M1, M2, M3 into the space of T so that vectors of the same class are close to each other after projection while vectors of different classes are far apart. Initialize T as a unit-vector matrix and update its content iteratively as follows:
3-3-1, apply the QR (orthogonal-triangular) decomposition to the matrices M1, M2, M3 and update each Mi;
3-3-2, apply the singular value decomposition to each pair of matrices M'i, M'j;
3-3-3, solve for T: compute the matrix A and its eigenvectors, where λ is the eigenvalue of eigenvector t_i of A; arrange the t_i in descending order of eigenvalue to form T, i.e. T = {t1, t2, ..., tn}, where the number of distinct eigenvectors of A determines the size of n;
Repeat steps 3-3-1 to 3-3-3 until T converges. Here i ranges over 1 to 3, T^T denotes the transpose of T, M'i denotes the matrix obtained from Mi in step 3-3-1, φ denotes the orthogonal matrix of the QR decomposition, Δi the upper triangular matrix of the QR decomposition, Δi^-1 the inverse of Δi, and Qij the unitary matrices of the singular value decomposition; the column dimension n of T is determined by the matrix A;
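The update formulas of steps 3-3-1 to 3-3-3 appear as equation images in the original patent, so the sketch below follows the standard QR/SVD route to canonical correlation (the Björck-Golub formulation) aggregated over the three feature pairs; the construction of the aggregation matrix A here is an assumption, not the patent's exact definition.

```python
import numpy as np

def fuse_subspace(mats, n=None):
    # 3-3-1: QR ("orthogonal-triangular") decomposition of each M_i^T,
    # M_i^T = phi_i Delta_i with phi_i orthonormal, Delta_i upper triangular.
    phis, deltas = zip(*(np.linalg.qr(M.T) for M in mats))
    m = mats[0].shape[0]
    A = np.zeros((m, m))
    for i in range(len(mats)):
        for j in range(len(mats)):
            if i == j:
                continue
            # 3-3-2: SVD of each pair of orthogonal factors.
            Qij, _, _ = np.linalg.svd(phis[i].T @ phis[j])
            W = np.linalg.solve(deltas[i], Qij)  # canonical directions
            A += W @ W.T
    # 3-3-3: eigenvectors of A sorted by decreasing eigenvalue form T.
    lam, vecs = np.linalg.eigh(A)
    order = np.argsort(lam)[::-1]
    return vecs[:, order[:n]] if n else vecs[:, order]

# T = fuse_subspace([M1, M2, M3])
# M1, M2, M3 = (T.T @ M for M in (M1, M2, M3))   # the projection of step 3-4
```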
Step 3-4, project the foreground features: project the feature matrices M1, M2, M3 into the space of T, i.e. Mi = T^T Mi for i = 1 to 3, obtaining new feature matrices M1, M2, M3;
Step 3-5, color histogram feature clustering: apply the lowest-rank subspace clustering method to the color matrix M1 to compute the similarity w between every pair of frames, and construct a graph that takes all foreground images as nodes and the similarities w between images as edge weights; then apply the spectral clustering method Ncut to partition the graph, thereby classifying the images. The similarity w is computed as follows:
3-5-1, initialize the parameters: λ0; the correlation matrix Z and its equivalent matrix J, with J = 0 and Z = J; the noise correction matrix E = 0; the Lagrangian matrices Y1 = 0, Y2 = 0; the Lagrangian penalty parameter μ = 10^-6; the maximum Lagrangian penalty parameter maxμ = 10^10; the Lagrangian penalty multiplier ρ0 = 1.1; the constant ε = 10^-8;
3-5-2, compute the equivalent matrix J of the correlation matrix of the columns of M1: fix the other matrices and update the matrix J;
3-5-3, compute the correlation matrix Z of the columns of M1: fix the other matrices and update the matrix Z, Z = (I + M1^T M1)^-1 (M1^T M1 - M1^T E + J + (M1^T Y1 - Y2)/μ);
3-5-4, compute the noise correction matrix E: fix the other matrices and update the matrix E;
3-5-5, compute the Lagrangian matrices Y1, Y2: update Y1 = Y1 + μ(M1 - M1 Z - E) and Y2 = Y2 + μ(Z - J);
3-5-6, update the Lagrangian penalty parameter μ: μ = min(ρ0 μ, maxμ);
3-5-7, judge whether the iteration is finished: check whether ||M1 - M1 Z - E||_∞ < ε and ||Z - J||_∞ < ε hold; if both hold, the iteration ends, otherwise continue iterating;
where ||·||_* denotes the nuclear norm, ||·||_F the Frobenius norm, ||·||_∞ the max norm, and min(A, B) returns the smaller of A and B. The above iteration yields the matrix Z, in which the sum of the elements Z_ij and Z_ji represents the similarity between images i and j. Build an undirected graph whose nodes are the images, with the similarity between images i and j as the weight between nodes i and j, and apply the spectral clustering method Ncut to partition the graph, thereby classifying the foreground key frames;
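The J- and E-updates of steps 3-5-2 and 3-5-4 are likewise shown as images in the original, so the sketch below fills them with the standard low-rank representation forms (singular value thresholding for J, l2,1 column shrinkage for E) and uses scikit-learn's SpectralClustering as a stand-in for Ncut; λ0 = 0.1, the iteration cap and the cluster count are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def svt(X, tau):
    # Singular value thresholding: proximal step for the nuclear norm.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def l21_shrink(X, tau):
    # Column-wise shrinkage: proximal step for the l2,1 norm.
    norms = np.maximum(np.linalg.norm(X, axis=0), 1e-12)
    return X * np.maximum(norms - tau, 0.0) / norms

def lrr_similarity(M, lam=0.1, eps=1e-8, mu=1e-6, max_mu=1e10, rho=1.1,
                   max_iter=500):
    d, F = M.shape
    Z = np.zeros((F, F)); J = Z.copy(); E = np.zeros((d, F))
    Y1 = np.zeros((d, F)); Y2 = np.zeros((F, F))
    MtM = M.T @ M
    inv = np.linalg.inv(np.eye(F) + MtM)        # (I + M^T M)^-1, reused
    for _ in range(max_iter):
        J = svt(Z + Y2 / mu, 1.0 / mu)                              # 3-5-2
        Z = inv @ (MtM - M.T @ E + J + (M.T @ Y1 - Y2) / mu)        # 3-5-3
        E = l21_shrink(M - M @ Z + Y1 / mu, lam / mu)               # 3-5-4
        R1, R2 = M - M @ Z - E, Z - J
        Y1, Y2 = Y1 + mu * R1, Y2 + mu * R2                         # 3-5-5
        mu = min(rho * mu, max_mu)                                  # 3-5-6
        if max(np.abs(R1).max(), np.abs(R2).max()) < eps:           # 3-5-7
            break
    return np.abs(Z) + np.abs(Z.T)   # similarity w(i, j) from Z_ij and Z_ji

# W = lrr_similarity(M1)
# labels_color = SpectralClustering(n_clusters=5,
#                                   affinity="precomputed").fit_predict(W)
```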
Step 3-6, local binary feature and bag-of-words feature clustering: apply the K-means method to cluster the matrices M2 and M3, as sketched below;
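Step 3-6 is a single scikit-learn call per feature matrix; the cluster count k = 5 is an assumption.

```python
from sklearn.cluster import KMeans

k = 5
labels_lbp = KMeans(n_clusters=k).fit_predict(M2.T)   # one row per key frame
labels_bow = KMeans(n_clusters=k).fit_predict(M3.T)
```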
Step 3-7, integrate the results: steps 3-5 and 3-6 give the class C_i of each frame under each of the three features; a vote over the three class labels determines the class of each frame, so the class information of every key frame F_i is determined. Then compute the similarity between the foreground videos v_i and v_j;
For a foreground video v_i, if its similarity with v_j is higher than its similarity with any other video, v_i and v_j are assigned to the same class.
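A sketch of the voting and grouping of step 3-7. The three clusterings produce arbitrary label IDs, so the sketch assumes they have first been aligned to a common indexing (e.g. by Hungarian matching); the inter-video similarity formula appears as an image in the original patent, so the fraction of key frames with matching classes is used here as a stand-in.

```python
import numpy as np
from scipy import stats

def vote(labels_color, labels_lbp, labels_bow):
    # Majority vote over the three aligned per-frame labels.
    stacked = np.vstack([labels_color, labels_lbp, labels_bow])
    return stats.mode(stacked, axis=0, keepdims=False).mode

def video_similarity(classes_i, classes_j):
    # Stand-in similarity: fraction of corresponding key frames whose
    # voted classes agree.
    n = min(len(classes_i), len(classes_j))
    a, b = np.asarray(classes_i[:n]), np.asarray(classes_j[:n])
    return float(np.mean(a == b))
```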
CN201410108137.9A 2014-03-21 2014-03-21 Surveillance video person foreground segmentation and classification method Expired - Fee Related CN103985114B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410108137.9A CN103985114B (en) 2014-03-21 2014-03-21 Surveillance video person foreground segmentation and classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410108137.9A CN103985114B (en) 2014-03-21 2014-03-21 Surveillance video person foreground segmentation and classification method

Publications (2)

Publication Number Publication Date
CN103985114A true CN103985114A (en) 2014-08-13
CN103985114B CN103985114B (en) 2016-08-24

Family

ID=51277072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410108137.9A Expired - Fee Related CN103985114B (en) 2014-03-21 2014-03-21 A kind of monitor video personage's foreground segmentation and the method for classification

Country Status (1)

Country Link
CN (1) CN103985114B (en)

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104573671A (en) * 2014-10-28 2015-04-29 清华大学 Method for discovering subject targets from video sequence
CN104881880A (en) * 2015-06-18 2015-09-02 福建师范大学 Shot segmentation method based on sequence characteristic and subspace clustering
CN105208398A (en) * 2015-09-22 2015-12-30 西南交通大学 Method for acquiring real-time background image of road
CN105373768A (en) * 2014-08-14 2016-03-02 三星电子株式会社 Method and apparatus for providing image contents
CN106056573A (en) * 2016-04-26 2016-10-26 武汉科技大学 Method for optimizing energy function in active contour model and application thereof
CN106649505A (en) * 2016-10-12 2017-05-10 厦门美图之家科技有限公司 Video matching method and application and computing equipment
CN107220982A (en) * 2017-04-02 2017-09-29 南京大学 Ship saliency video detection method that suppresses stern trailing lines
CN108022429A (en) * 2016-11-04 2018-05-11 浙江大华技术股份有限公司 Vehicle detection method and device
CN108229290A (en) * 2017-07-26 2018-06-29 北京市商汤科技开发有限公司 Video object segmentation method and device, electronic equipment, storage medium and program
CN108418998A (en) * 2017-02-10 2018-08-17 佳能株式会社 System and method for generating a virtual viewpoint image, and storage medium
CN108596944A (en) * 2018-04-25 2018-09-28 普联技术有限公司 Method, apparatus and terminal device for extracting a moving target
CN108961304A (en) * 2017-05-23 2018-12-07 阿里巴巴集团控股有限公司 Method for identifying moving foreground in video and method for determining target position in video
CN108960290A (en) * 2018-06-08 2018-12-07 Oppo广东移动通信有限公司 Image processing method, device, computer readable storage medium and electronic equipment
CN108986061A (en) * 2018-06-28 2018-12-11 百度在线网络技术(北京)有限公司 Three-dimensional point cloud road data fusion method, device and storage medium
CN109223178A (en) * 2018-08-29 2019-01-18 合肥工业大学 Endoscope intelligent edge computing system with target positioning function
CN109389582A (en) * 2018-09-11 2019-02-26 广东智媒云图科技股份有限公司 Method and device for identifying image subject brightness
CN109670486A (en) * 2019-01-30 2019-04-23 深圳前海达闼云端智能科技有限公司 Video-based face recognition method, device and computing device
CN110120012A (en) * 2019-05-13 2019-08-13 广西师范大学 Video stitching method with synchronous key frame extraction based on binocular camera
CN110147824A (en) * 2019-04-18 2019-08-20 微梦创科网络科技(中国)有限公司 Automatic image classification method and device
CN110472569A (en) * 2019-08-14 2019-11-19 旭辉卓越健康信息科技有限公司 Parallel processing method for person detection and recognition based on video streams
CN111105350A (en) * 2019-11-25 2020-05-05 南京大学 Real-time video splicing method based on self homography transformation under large parallax scene
CN111292333A (en) * 2018-12-07 2020-06-16 北京京东尚科信息技术有限公司 Method and apparatus for segmenting an image
CN111739084A (en) * 2019-03-25 2020-10-02 上海幻电信息科技有限公司 Picture processing method, atlas processing method, computer device, and storage medium
CN112634273A (en) * 2021-03-10 2021-04-09 四川大学 Brain metastasis segmentation system based on deep neural network and construction method thereof
WO2021068330A1 (en) * 2019-10-12 2021-04-15 平安科技(深圳)有限公司 Intelligent image segmentation and classification method and device and computer readable storage medium
CN112861572A (en) * 2019-11-27 2021-05-28 杭州萤石软件有限公司 Pedestrian detection method, computer-readable storage medium and electronic device
US11503228B2 (en) 2017-09-11 2022-11-15 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Image processing method, image processing apparatus and computer readable storage medium
US20230052101A1 (en) * 2020-01-30 2023-02-16 Nec Corporation Learning apparatus, learning method, and recording medium
CN116564460A (en) * 2023-07-06 2023-08-08 四川省医学科学院·四川省人民医院 Health behavior monitoring method and system for leukemia child patient
TWI816072B (en) * 2020-12-10 2023-09-21 晶睿通訊股份有限公司 Object identification method and related monitoring system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101976258B * 2010-11-03 2013-07-10 上海交通大学 Video semantic extraction method by combining object segmentation and feature weighting
CN102982519B (en) * 2012-11-23 2015-04-01 南京邮电大学 Extracting and splicing method of video images

Cited By (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105373768A (en) * 2014-08-14 2016-03-02 三星电子株式会社 Method and apparatus for providing image contents
CN105373768B (en) * 2014-08-14 2020-09-22 三星电子株式会社 Method and apparatus for providing image content
CN104573671B (en) * 2014-10-28 2018-02-02 清华大学 Method for discovering subject targets from video sequence
CN104573671A (en) * 2014-10-28 2015-04-29 清华大学 Method for discovering subject targets from video sequence
CN104881880A (en) * 2015-06-18 2015-09-02 福建师范大学 Shot segmentation method based on sequence characteristic and subspace clustering
CN104881880B (en) * 2015-06-18 2017-10-10 福建师范大学 Shot segmentation method based on sequential features and subspace clustering
CN105208398A (en) * 2015-09-22 2015-12-30 西南交通大学 Method for acquiring real-time background image of road
CN105208398B (en) * 2015-09-22 2018-06-19 西南交通大学 Method for acquiring the real-time background image of a road
CN106056573A (en) * 2016-04-26 2016-10-26 武汉科技大学 Method for optimizing energy function in active contour model and application thereof
CN106649505A (en) * 2016-10-12 2017-05-10 厦门美图之家科技有限公司 Video matching method and application and computing equipment
CN106649505B (en) * 2016-10-12 2020-04-07 厦门美图之家科技有限公司 Method, application and computing device for matching videos
CN108022429A (en) * 2016-11-04 2018-05-11 浙江大华技术股份有限公司 Vehicle detection method and device
CN108022429B (en) * 2016-11-04 2021-08-27 浙江大华技术股份有限公司 Vehicle detection method and device
CN108418998A (en) * 2017-02-10 2018-08-17 佳能株式会社 System and method for generating a virtual viewpoint image, and storage medium
US10699473B2 (en) 2017-02-10 2020-06-30 Canon Kabushiki Kaisha System and method for generating a virtual viewpoint apparatus
CN107220982A (en) * 2017-04-02 2017-09-29 南京大学 Ship saliency video detection method that suppresses stern trailing lines
CN108961304A (en) * 2017-05-23 2018-12-07 阿里巴巴集团控股有限公司 Method for identifying moving foreground in video and method for determining target position in video
US11222211B2 (en) 2017-07-26 2022-01-11 Beijing Sensetime Technology Development Co., Ltd Method and apparatus for segmenting video object, electronic device, and storage medium
CN108229290B (en) * 2017-07-26 2021-03-02 北京市商汤科技开发有限公司 Video object segmentation method and device, electronic equipment and storage medium
CN108229290A (en) * 2017-07-26 2018-06-29 北京市商汤科技开发有限公司 Video object segmentation method and device, electronic equipment, storage medium and program
US11503228B2 (en) 2017-09-11 2022-11-15 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Image processing method, image processing apparatus and computer readable storage medium
US11516412B2 (en) 2017-09-11 2022-11-29 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Image processing method, image processing apparatus and electronic device
CN108596944A (en) * 2018-04-25 2018-09-28 普联技术有限公司 Method, apparatus and terminal device for extracting a moving target
WO2019233266A1 (en) * 2018-06-08 2019-12-12 Oppo广东移动通信有限公司 Image processing method, computer readable storage medium and electronic device
CN108960290A (en) * 2018-06-08 2018-12-07 Oppo广东移动通信有限公司 Image processing method, device, computer readable storage medium and electronic equipment
CN108986061B (en) * 2018-06-28 2019-09-20 百度在线网络技术(北京)有限公司 Three-dimensional point cloud road data fusion method, device and storage medium
CN108986061A (en) * 2018-06-28 2018-12-11 百度在线网络技术(北京)有限公司 Three-dimensional point cloud road data fusion method, device and storage medium
CN109223178A (en) * 2018-08-29 2019-01-18 合肥工业大学 Endoscope intelligent edge computing system with target positioning function
CN109389582A (en) * 2018-09-11 2019-02-26 广东智媒云图科技股份有限公司 Method and device for identifying image subject brightness
CN109389582B (en) * 2018-09-11 2020-06-26 广东智媒云图科技股份有限公司 Method and device for identifying brightness of image main body
CN111292333A (en) * 2018-12-07 2020-06-16 北京京东尚科信息技术有限公司 Method and apparatus for segmenting an image
CN111292333B (en) * 2018-12-07 2024-05-17 北京京东尚科信息技术有限公司 Method and apparatus for segmenting an image
CN109670486A (en) * 2019-01-30 2019-04-23 深圳前海达闼云端智能科技有限公司 Video-based face recognition method, device and computing device
US12045946B2 (en) 2019-03-25 2024-07-23 Shanghai Hode Information Technology Co., Ltd. Rotating and cropping images for various forms of media including animation, comics, film, or television
CN111739084A (en) * 2019-03-25 2020-10-02 上海幻电信息科技有限公司 Picture processing method, atlas processing method, computer device, and storage medium
CN111739084B (en) * 2019-03-25 2023-12-05 上海幻电信息科技有限公司 Picture processing method, atlas processing method, computer device, and storage medium
CN110147824A (en) * 2019-04-18 2019-08-20 微梦创科网络科技(中国)有限公司 Automatic image classification method and device
CN110120012B (en) * 2019-05-13 2022-07-08 广西师范大学 Video stitching method for synchronous key frame extraction based on binocular camera
CN110120012A (en) * 2019-05-13 2019-08-13 广西师范大学 Video stitching method with synchronous key frame extraction based on binocular camera
CN110472569A (en) * 2019-08-14 2019-11-19 旭辉卓越健康信息科技有限公司 Parallel processing method for person detection and recognition based on video streams
WO2021068330A1 (en) * 2019-10-12 2021-04-15 平安科技(深圳)有限公司 Intelligent image segmentation and classification method and device and computer readable storage medium
CN111105350A (en) * 2019-11-25 2020-05-05 南京大学 Real-time video splicing method based on self homography transformation under large parallax scene
CN111105350B (en) * 2019-11-25 2022-03-15 南京大学 Real-time video splicing method based on self homography transformation under large parallax scene
CN112861572A (en) * 2019-11-27 2021-05-28 杭州萤石软件有限公司 Pedestrian detection method, computer-readable storage medium and electronic device
CN112861572B (en) * 2019-11-27 2024-05-28 杭州萤石软件有限公司 Pedestrian detection method, computer-readable storage medium, and electronic device
US20230052101A1 (en) * 2020-01-30 2023-02-16 Nec Corporation Learning apparatus, learning method, and recording medium
TWI816072B (en) * 2020-12-10 2023-09-21 晶睿通訊股份有限公司 Object identification method and related monitoring system
CN112634273A (en) * 2021-03-10 2021-04-09 四川大学 Brain metastasis segmentation system based on deep neural network and construction method thereof
CN116564460A (en) * 2023-07-06 2023-08-08 四川省医学科学院·四川省人民医院 Health behavior monitoring method and system for leukemia child patient
CN116564460B (en) * 2023-07-06 2023-09-12 四川省医学科学院·四川省人民医院 Health behavior monitoring method and system for leukemia child patient

Also Published As

Publication number Publication date
CN103985114B (en) 2016-08-24

Similar Documents

Publication Publication Date Title
CN103985114A (en) Surveillance video person foreground segmentation and classification method
CN109670446B (en) Abnormal behavior detection method based on linear dynamic system and deep network
CN104867161B (en) A kind of method for processing video frequency and device
Narihira et al. Learning lightness from human judgement on relative reflectance
CN111368712A (en) Hyperspectral image disguised target detection method based on deep learning
Azim et al. Layer-based supervised classification of moving objects in outdoor dynamic environment using 3D laser scanner
EP3048561A1 (en) Method and system to perform text-to-image queries with wildcards
CN112016605B (en) Target detection method based on corner alignment and boundary matching of bounding box
CN108765279A (en) 2018-11-06 Pedestrian face super-resolution reconstruction method for surveillance scenes
CN104408745A (en) Real-time smog scene detection method based on video image
Li et al. A generative/discriminative learning algorithm for image classification
CN105528794A (en) Moving object detection method based on Gaussian mixture model and superpixel segmentation
CN103871077B (en) A kind of extraction method of key frame in road vehicles monitoring video
CN103617414B (en) The fire disaster flame of a kind of fire color model based on maximum margin criterion and smog recognition methods
CN112990282B (en) Classification method and device for fine-granularity small sample images
CN112215190A (en) Illegal building detection method based on YOLOV4 model
CN103020265A (en) Image retrieval method and system
CN111275058B (en) Safety helmet wearing and color identification method and device based on pedestrian re-identification
CN112364791B (en) Pedestrian re-identification method and system based on generation of confrontation network
CN106033548B (en) Crowd abnormity detection method based on improved dictionary learning
CN111274964B (en) Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle
JP2015149030A (en) Video content violence degree evaluation device, video content violence degree evaluation method, and video content violence degree evaluation program
CN110263731B (en) Single step human face detection system
Liu et al. A novel SVM network using HOG feature for prohibition traffic sign recognition
CN107679467B (en) Pedestrian re-identification algorithm implementation method based on HSV and SDALF

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160824

CF01 Termination of patent right due to non-payment of annual fee