CN110097579A

CN110097579A - Multiple dimensioned wireless vehicle tracking and device based on pavement texture contextual information

Info

Publication number: CN110097579A
Application number: CN201910514897.2A
Authority: CN
Inventors: 孔斌; 赵富强; 杨静; 王灿
Original assignee: Hefei Institutes of Physical Science of CAS
Current assignee: Hefei Institutes of Physical Science of CAS
Priority date: 2019-06-14
Filing date: 2019-06-14
Publication date: 2019-08-06
Anticipated expiration: 2039-06-14
Also published as: CN110097579B

Abstract

The present invention relates to a kind of multiple dimensioned wireless vehicle trackings based on pavement texture contextual information, include: S1, for linear space road texture the case where, obtain the center of target vehicle, S2, for non-linear space road texture the case where, in dual spaces obtain target vehicle center；Behind S3, acquisition target vehicle center, and in conjunction with the image optimal scale of present frame, more accurate target vehicle position is obtained；The invention also discloses a kind of multiple dimensioned car followers based on pavement texture contextual information.The pavement texture region of combining target vehicle bottom of the present invention, in target vehicle motion process, the relative position on target vehicle and road surface will not vary widely, pavement texture is more stable, in combination with pavement texture information, i.e. according to the relativeness of target vehicle and road surface region, it is accurately positioned target vehicle, prevents target frame from drifting about.

Description

Multi-scale vehicle tracking method and device based on pavement texture context information

Technical Field

The invention relates to the field of computer vision, in particular to a multi-scale vehicle tracking method and device based on road texture context information.

Background

Machine Learning (ML) is a multi-domain cross subject, and relates to multi-domain subjects such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and the like, and target tracking belongs to important application in the field of Machine Learning.

The invention patent with publication number "CN 108776974B" (application date 2018.05.24) discloses "a real-time target tracking method suitable for public traffic scenes, which comprises the following steps: step 1, acquiring an initial position P (i) of a tracked target on the current ith frame by a detector; step 2, training a relevant filtering tracker by using P (i); step 3, acquiring an image of the target in the (i +1) th frame; step 4, carrying out correlation calculation by using a correlation filter and the (i +1) th frame to obtain a target prediction position P' (i + 1); and 5, evaluating the scale change rate, and judging whether the target predicted value needs to be corrected or not according to a threshold value. And 6, correcting the predicted value by using Kalman filtering to obtain the position P (i +1) of the target in the (i +1) th frame. According to the method, the size change rate is evaluated, the tracking accuracy and the real-time performance are improved, meanwhile, the target predicted value is corrected through Kalman filtering, and the influence of scale change is minimized. "

However, the method only utilizes the target area to realize tracking, and in a traffic scene, when a target vehicle is shielded, the tracking accuracy is low, the average overlapping rate of the output result rectangular frame and the real target rectangular frame is low, and thus the target cannot be effectively tracked.

The existing tracking method only singly utilizes the characteristics of the target vehicle, when the target vehicle is shielded or motion-blurred, the frame selection area formed around the target vehicle to be positioned may deviate, so that the accuracy of predicting the position of the target vehicle is low.

Disclosure of Invention

The technical problem to be solved by the present invention is to provide a multi-scale vehicle tracking method and device based on the contextual information of the road texture, so as to solve the problem of low accuracy of the position of the target vehicle in the background art.

In order to solve the above problems, the present invention provides the following technical solutions:

a multi-scale vehicle tracking method based on contextual road texture information comprises the following steps:

s1, acquiring the central position of the target vehicle under the condition of linear space road texture;

s2, for the condition of the nonlinear space road texture, acquiring the central position of the target vehicle in the dual space;

and S3, obtaining the central position of the target vehicle, and combining the central position with the optimal scale of the image of the current frame to obtain a more accurate position of the target vehicle.

As a further scheme of the invention: the step S1 includes:

processing a plurality of background areas of the target vehicle by using a least square method to obtain a ridge regression formula, and performing cyclic shift merging simplification on ridge regression formula parts corresponding to the plurality of background areas; calculating the simplified ridge regression formula to obtain a first weight parameter matrix in the frequency domainUsing a first weight parameter matrixObtaining a first response matrix R, and carrying out Fourier transform on the first response matrix R to obtain the first response matrix in the frequency domainCalculating a first response in the frequency domainMatrix arrayThe index corresponding to the maximum response value, namely the center position of the corresponding target vehicle;

wherein, the plurality of background areas comprise road surface texture areas.

As a further scheme of the invention: the step S1 further includes:

the obtained ridge regression formula is shown as the formula (1):

wherein, it is calledRepresents a two-norm, A₀A feature matrix representing samples after cyclic shift of the target vehicle; a. the₁A feature matrix representing samples after cyclic displacement of a road surface area under a target vehicle; a. the_iA feature matrix representing samples after cyclic shift of a background area of a left area or an upper end area or a right area of the target vehicle;

λ₁representing the proportion of the information of the texture area of the road surface in the training process; lambda [ alpha ]₂Represents A in the training process_iProportion of corresponding noise area, lambda₃Representing a regularization parameter and controlling the complexity of the first weight parameter matrix; y is₀A two-dimensional Gaussian matrix label representing a target vehicle; w represents a first weight parameter matrix needing regression; k is a radical of₁Representing the number of background areas around the target vehicle;

the formula (1) is divided into the following parts: the first part isRepresenting training the target area as a positive sample; the second part isThe road area under the target is used as a positive sample for training, and the parameter lambda is₁Controlling the degree of contribution to the loss; the third part isRepresenting the sum of the noise training with the left, upper and right regions of the target vehicle as the parameter lambda₂Controlling the degree of contribution to the loss; the fourth part isRepresents the complexity of the control weight parameter through training by regularization, which is represented by a parameter lambda₃Controlling the degree of contribution to the loss;

then, combining the areas corresponding to the formula (1) to obtain a simplified formula (2):

wherein B andis represented by formula (3):

b is a cyclic matrix, and B is a cyclic matrix,a label matrix, which represents label values corresponding to samples in the sample space of each part;

by makingGet the first rightThe weight parameter matrix w is as shown in equation (4):

wherein, B^TWhich is the transpose of B, I denotes the identity matrix,expressing that equation (2) derives w to be equal to zero;

fourier transformation is respectively carried out on two sides of the formula (4) to obtain a first weight parameter matrix in the frequency domainThe following were used:

wherein, a_i⊙ represents the dot product operation;is a vector a_iAfter fourier transform processing, in the frequency domain;to representThe first response matrix in the frequency domainThe acquisition method comprises the following steps:

respectively expanding outwards by taking the position of the target vehicle in the previous frame as the center by taking the width N times and the height N times of the target vehicle, and taking the expanded region as a search region;

according to the property of the cyclic matrix, in the search area, cyclic shift is carried out in the horizontal direction and the vertical direction by taking pixels as units, a sample to be detected is obtained by each cyclic shift, the sample to be detected forms a sample space to be detected, and a sample characteristic matrix formed by characteristic values of the sample space to be detected uses Z₁Representing, sample feature matrix Z₁Performing matrix operation with the first weight parameter matrix w to obtain a first response matrix R, as shown in formula (6)

R＝Z₁w (6)

And performing fourier transform on the first response matrix R to obtain equation (7), where equation (7) is as follows:

wherein,representing the form in the frequency domain after fourier transformation;representing the form in the frequency domain after fourier transformation,representing the form of the response matrix R in the frequency domain after Fourier transform processing to obtainNamely, a first response matrix R is obtained, and the center position of the target vehicle is determined.

As a further scheme of the invention: the step S2 includes: by kernel functionNon-linear samplesMapping the samples in the space to a linearly separable dual space, and combining the eigenvectors of all the samples in the dual space to obtain a dual space weight parameter matrix w_DualTraining a dual spatial weight parameter matrix w_DualObtaining a second weight parameter matrix in the frequency domainSolving a second response matrix in the frequency domainAnd the index corresponding to the maximum response value is the center position of the target vehicle.

As a further scheme of the invention:

1. the step S2 further includes:

several regions of the target vehicle pass through the kernel function in dual spaceAfter mapping, the corresponding sample space is as shown in equation (8):

wherein i represents an integer of zero or more and less than the number of target vehicle regions, a_imFeature vector, A, of the m-th sample representing the i-th region_iRepresenting a feature matrix formed by samples of the ith area;is a feature matrix A_iA representation mapped into a linearly separable dual space,mapping the circulant matrix B to a representation in a linearly separable dual space;mapping the feature vector of the mth sample of the ith area to a representation form in a linearly separable dual space;

calculating a dual spatial weight parameter matrix w using equation (9)_DualThe formula (9) is as follows:

wherein,to representα denotes a second weight parameter matrix, α_iRepresenting the ith column vector in the second weight parameter matrix, b_iA feature vector representing the ith sample,the eigenvector representing the ith sample is mapped to the eigenvector in the high-dimensional linear space, and 5m represents the dimension of the second weight parameter matrix;

obtaining a second weight parameter matrix in the frequency domainThe method comprises the following steps:

solving the second weight parameter matrix α in the frequency domain in conjunction with the loss function J (α) the loss function J (α) is shown in equation (10):

order toThe second weight parameter matrix α in the frequency domain is obtained by solving, as shown in equation (11)

Wherein,K₂is a matrix; by selecting kernel functionsSo that the matrix K₂The medium elements are changed in sequence, and the guarantee matrix K₂Still a circulant matrix;

then will beSubstituted typeObtaining a second weight parameter matrix in the frequency domain of the second weight parameter matrix α in dual spaceThe formula is shown in formula (12):

wherein, Delta_ij＝diag(m_ij+λ₃),{(i,j)∈{(0,0)}}

Δ_ij＝λ₁diag(m_ij+λ₃),{(i,j)∈{(1,1)}}

Δ_ij＝λ₂diag(m_ij+λ₃),{(i,j)∈{(2,2),(3,3),(4,4)}}

diag(m_ij) The diagonal matrix corresponding to the ith row and the jth column of the block diagonal matrix is shown as formula (13)

Wherein, a_i0A 0 th sample feature vector representing an ith region; a is_j0A 0 th sample feature vector representing a jth region; a is_jmAn m-th sample feature vector representing a j-th region; kappa (a)_i0,a_j0) Representing a kernel function between the 0 th sample feature vector of the ith region and the 0 th sample feature vector of the jth region; k (a)_i0m，a_jm) Representing a kernel function between the m-th sample feature vector of the i-th region and the m-th sample feature vector of the j-th region,

finally solving a second response matrix in the frequency domainObtaining a second response matrix in the frequency domainThe maximum response value is the center position of the target vehicle.

2. As a further scheme of the invention: solving a second response matrix in the frequency domainThe calculation formula is as (14):

Z₂represents: taking the position of the target vehicle in the previous frame as a center, respectively expanding outwards by N times of the width and the height of the target vehicle, taking the area after the outward expansion as a search area, in the search area, taking pixels as units, and circularly shifting transversely and longitudinally to obtain a sample space to be detected, wherein a sample characteristic matrix of the sample space uses Z₂；Is Z₂Representation in the frequency domain after fourier transform;and representing the parameter weight vector corresponding to the ith area in a frequency domain after Fourier transform.

As a further scheme of the invention: the step S3 includes: extracting different scale images of a plurality of target vehicles as training samples, training the scale sequence of the training samples by a least square method to obtain loss functions J (H), and processing the loss functions J (H) by Fourier transform to obtain J (H)^*) To J (H)^*) Calculating to obtain a third weight parameter matrix H in the frequency domain; then obtaining a scale response matrix R_sCalculating a scale response matrix R_sThe scale corresponding to the index corresponding to the medium maximum value is the optimal image scale of the current frame, and a more accurate target vehicle position is obtained, namely J (H) is obtained^*) Calculating to obtain a third weight parameter matrix H in the frequency domain; then obtaining a scale response matrix R_sCalculating a scale response matrix R_sMedian maximum value pairThe scale corresponding to the index is the optimal image scale of the current frame, and the more accurate position of the target vehicle is obtained.

As a further scheme of the invention:

the step S3 further includes:

setting the size of an image of a target vehicle in a current frame as W multiplied by H, extracting a plurality of images with different scales as training samples and recording the training samples as S, wherein the scale sequence of the training samples is shown as a formula (15):

wherein b represents a scale factor; w represents the width of the rectangular frame area of the target vehicle; h represents the height of the rectangular frame area of the target vehicle; s represents the number of samples in the sample space;is a rounded-down symbol;

the loss function obtained by calculating equation (15) by the least square method is shown in equation (16):

wherein y' represents a label vector generated by a one-dimensional Gaussian function, h represents a scale estimation weight parameter matrix, and f represents a sample set characteristic matrix extracted at different scales;

fourier transform is carried out on J (H) to obtain J (H)^*) As shown in equation (17):

wherein, Y'_iFourier transform for label vector yThe value of the ith element of the corresponding vector in the frequency domain; h^*Performing Fourier transform on the scale estimation weight parameter matrix h to represent a conjugate transpose in a frequency domain; f_iA representation form of Fourier transform of an ith sample feature vector in a sample feature matrix f in a frequency domain; sigma is a summation symbol, and the value of n is the same as that of S;

solving for J (H)^*) The function obtains a third weight parameter matrix H, and a scale response matrix R is calculated by using a formula (19)_sThe scale corresponding to the index corresponding to the medium maximum value is the optimal scale of the current frame, and equation (19) is as follows:

R_s＝F·H (19)

f is a representation form of a sample characteristic matrix F in a frequency domain after Fourier transformation;is composed ofThe conjugate transpose of (1); r_sA scale response matrix R representing a calculated target vehicle_sThe scale response matrix R_sAnd the scale corresponding to the index corresponding to the medium maximum value is the optimal scale of the current frame, so that the more accurate position of the target vehicle is obtained.

As a further scheme of the invention: the solution J (H)^*) The method of the function is as follows:

order toAnd solving to obtain a third weight parameter matrix H in the frequency domain, as shown in formula (18):

as a further scheme of the invention: a multi-scale vehicle tracking device based on contextual road texture information, comprising:

the first acquisition module is used for acquiring the central position of a target vehicle under the condition of linear space road texture;

the second acquisition module is used for acquiring a central position dual space module of the target vehicle in a dual space under the condition of the road texture in the nonlinear space;

and the scale module is used for obtaining the central position of the target vehicle and combining the central position with the optimal scale of the image of the current frame to obtain a more accurate position of the target vehicle.

Compared with the prior art, the invention has the beneficial effects that:

1. according to the method, the relative position of the target vehicle and the road surface is not changed greatly in the moving process of the target vehicle by combining the road surface texture area at the bottom of the target vehicle, and the road surface texture is stable in the moving process, so that the target vehicle is accurately positioned according to the relative relation between the target vehicle and the road surface area by combining the road surface texture information, the target frame is prevented from drifting, the average overlapping rate of the output result rectangular frame and the real target rectangular frame is the maximum, the average tracking failure frequency is relatively small, the average expected overlapping rate is the maximum, and the accuracy is high.

2. The existing target tracking method can not effectively and quickly combine the context information around the target with the scale prediction of the target, so that when the scale of the target changes, the existing tracking method can not accurately position the target boundary, and the weight parameters in the subsequent frames can not effectively learn the complete characteristics of the target, thereby causing tracking failure.

3. According to the method, the upper, left and right environmental information of the target vehicle is extracted and used as noise samples to be restrained in the training process, the road texture information of the area below the target vehicle is used as a positive sample for auxiliary positioning and is trained through a ridge regression algorithm to obtain a first weight parameter matrix w, a second weight parameter matrix α and a scale estimation weight parameter matrix h, and the center position of the target vehicle in a subsequent frame is determined by utilizing the first weight parameter matrix w and the second weight parameter matrix α.

4. Multiple experiments show that the specific gravity lambda of the contextual information of the texture of the road surface₁When the average tracking failure frequency is 0.6, compared with the traditional tracking algorithm based on the target, the average overlapping rate of the output result rectangular frame of the algorithm and the real target rectangular frame is maximum, the average tracking failure frequency is relatively small, the average expected overlapping rate is maximum, and the performance of the method is better under the condition that the target is shielded; the contextual information of the pavement texture area is combined with a multi-scale method, and the tracking method can be more accurately adapted to the scale change of the target, so that the failure times of the tracking method are effectively reduced.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention.

Fig. 1 is a flow chart of a multi-scale vehicle tracking method based on contextual road texture information according to embodiment 1 of the present invention.

Fig. 2 is a schematic diagram illustrating a comparison of areas of target vehicles in a multi-scale vehicle tracking method based on contextual information of road surface texture according to embodiment 1 of the present invention.

Fig. 3 is a schematic structural diagram of a multi-scale vehicle tracking device based on contextual road texture information according to embodiment 2 of the present invention.

Detailed Description

In order to make the technical problems, technical solutions and advantageous effects to be solved by the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Example 1

A multi-scale vehicle tracking method based on contextual road texture information comprises the following steps: the method for tracking the spatial road texture in the context correlation filtering mode comprises the following specific steps:

processing a plurality of background areas of the target vehicle by using a least square method to obtain a ridge regression formula, wherein the least square method is a common algorithm in machine learning and is not described here; wherein, the plurality of background areas comprise road texture areas;

taking into account information of the road surface area under the target vehicle and other directions around the target vehicle, as shown in fig. 1, fig. 1 is a schematic diagram of a relative relationship between the target vehicle area and the surrounding area, where a₀The corresponding zone represents the central zone of the target vehicle, A₁The corresponding areas represent road surface texture areas, A₂The corresponding region represents the noise region on the left side of the target vehicle, A₃The corresponding region represents a noise region of the upper end of the target vehicle, A₄The corresponding region represents a noise region on the right side of the target vehicle;

the ridge regression formula is shown in formula (1):

formula (1) is divided into four parts:

the first part isRepresenting training the target area as a positive sample; the second part isThe road area under the target is used as a positive sample for training, and the parameter lambda is₁Controlling the degree of contribution to the loss; the third part isRepresenting the sum of the noise training with the left, upper and right regions of the target vehicle as the parameter lambda₂Controlling the degree of contribution to the loss; the fourth part isRepresents the complexity of the control weight parameter through training by regularization, which is represented by a parameter lambda₃Controlling the degree of contribution to the loss;

wherein,represents a two-norm, A₀A feature matrix representing samples after cyclic shift of the target vehicle; a. the₁A feature matrix representing samples after cyclic displacement of a road surface area under a target vehicle; a. the_iA feature matrix representing samples after cyclic shifting of background areas on the left side, the upper end and the right side of the target vehicle; lambda [ alpha ]₁Represents the proportion of the information of the texture area of the road surface in the training process, lambda₂A representing the surroundings of the target vehicle during training₂Proportion of corresponding noise area (A)₃、A₄Respectively corresponding noise regionsSpecific gravity of and A₂The proportion of the corresponding noise areas is the same); lambda [ alpha ]₃Representing a regularization parameter and controlling the complexity of the first weight parameter matrix;

y₀a two-dimensional Gaussian matrix label representing a target vehicle; w represents a first weight parameter matrix needing regression; k is a radical of₁Indicates the number of background regions around the target vehicle, k in the present embodiment₁The value is four; a. the₀、A₁、A₂、A₃、A₄The corresponding five regions can be independently calculated, and the parts corresponding to the five regions are combined to obtain a simplified formula (2):

wherein B is a circulant matrix, in A₀、A₁、A₂、A₃、A₄In the corresponding five regions, circularly moving a position in the horizontal direction and the longitudinal direction by unit pixels to obtain a training sample space, and then calculating HOG characteristics of the samples to form a sample matrix, namely a circular matrix B; the HOG (Histogram of Oriented gradients) feature is a feature description used for object detection in computer vision and image processing, and is formed by calculating and counting the Histogram of Gradient orientation of local areas of an image; y is a label matrix and represents label values corresponding to samples in the sample space of each part; b andis represented by formula (3):

equation (2) is a convex function, where the optimal solution exists in the real number domain, by lettingThereby obtaining a first weight parameter matrix w, as shown in equation (4):

B^Twhich is the transpose of B, I denotes the identity matrix,a function representing the composition of w and B differentiates w to be zero; at the same time, A₀、A₁、A₂、A₃、A₄The corresponding five regions are mutually independent, and the feature matrix of each region can be calculated in parallel according to the property of the cyclic matrix, so that the real-time performance of the algorithm can be improved, and the calculation efficiency is accelerated;

after Fourier transformation is respectively carried out on two sides of the formula (4), the formula (5) is obtained:

wherein, a_iA vector formed by a first row of a sample characteristic matrix in a sample space corresponding to the i-th area;is a vector a_iAfter fourier transform processing, in the frequency domain;to representThe conjugate transpose of (c).

Acquiring a first response matrix R, wherein an index corresponding to the maximum response value in the first response matrix R is the position of the center of the target vehicle of the current frame image; the position of the target vehicle in the previous frame is used as a center, and the target vehicle expands outwards along the periphery by N times of width and N times of height of the target vehicle respectively to be used as a search area;

in this embodiment, the value of N is 2.5 times, a position is cyclically moved in the horizontal and vertical directions in a search area by taking a pixel as a unit to obtain a sample space to be detected, and a sample feature matrix to be detected is represented by Z₁Representing, sample feature matrix Z₁Performing matrix operation with the first weight parameter matrix w to obtain a response matrix R;

R＝Z₁w (6)

as shown in formula (6), the index corresponding to the maximum response value in the first response matrix R is the position of the center of the target vehicle in the current frame image, and the target vehicle can be tracked; fourier transform is carried out on the first response matrix R, and the formula (6) is equal to the formula (7) in horizontal direction; the Fourier transform converts the time domain signal (waveform) which is difficult to process originally into a frequency domain signal (frequency spectrum of the signal) which is easy to analyze, thereby facilitating analysis and calculation; the formula (7) is as follows:

representing the form in the frequency domain after fourier transformation;representing the form in the frequency domain after fourier transformation,representing the form of the first response matrix R in the frequency domain after fourier transform processing, ⊙ representing a dot product operation.

the classification regression problem is generally classified into a linear problem and a nonlinear problem, and for the linear problem, the solution is directly performed through a linear function, while for the nonlinear problem, the sample space needs to be converted into a new linear space, i.e., a dual space, so that the samples can be linearly divided in the dual space, and the problem can be converted into a linear divisible problem.

By kernel functionAnd mapping the samples in the nonlinear sample space to a linearly separable dual space, wherein the sample space corresponding to the five regions in the dual space is shown as the formula (8):

wherein i is 0, 1, 2, 3, 4, a_imFeature vector, A, of the m-th sample representing the i-th region_iA feature matrix formed by samples representing the ith region,is a feature matrix A_iA representation mapped into a linearly separable dual space,mapping the circulant matrix B to a representation in a linearly separable dual space;mapping the feature vector of the mth sample of the ith area to a representation form in a linearly separable dual space;

in dual space, a dual space weight parameter matrix w_DualMatrix w of dual spatial weight parameters_DualIs represented by a linear combination of the feature vectors of all samples, as shown in equation (9):

wherein,to representα denotes a second weight parameter matrix, α_iRepresenting the ith column vector in the second weight parameter matrix, b_iA feature vector representing the ith sample,the feature vector representing the ith sample is mapped to a feature vector in a high-dimensional linear space,alpha expression matrix5m represents the dimension of the second weight parameter matrix;

in dual space, the second weight parameter matrix α is solved using a loss function J (α), the loss function J (α) is shown as equation (10):

by makingThat is, the derivative of the second weight parameter matrix α of the function composed of the second weight parameter matrix α and B is equal to zero, so as to solve the second weight parameter matrix α, as shown in equation (11)

Wherein, the matrixK₂Is a matrix;

by selecting appropriate kernel functionsSo that the matrix K₂The medium elements are changed in sequence, the calculation result of the kernel function is not influenced, and the matrix K is ensured₂Still a circulant matrix, several classes of kernel functions satisfy this property:

polynomial kernel function: k (x, y) ═ f (x)^Ty)；

Will be provided withSubstituted typeMeanwhile, according to the properties of the block circulant matrix and the Fourier diagonalization properties of the circulant matrix, a second weight parameter matrix α in the frequency domain in the dual space is finally obtainedThe formula is shown in formula (12):

wherein, Delta_ij＝diag(m_ij+λ₃),{(i,j)∈{(0,0)}}

Δ_ij＝λ₁diag(m_ij+λ₃),{(i,j)∈{(1,1)}}

Δ_ij＝λ₂diag(m_ij+λ₃),{(i,j)∈{(2,2),(3,3),(4,4)}}

Wherein, a_i0A 0 th sample feature vector representing an ith region; a is_j0A 0 th sample feature vector representing a jth region; a is_jmAn m-th sample feature vector representing a j-th region; kappa (a)_i0,a_j0) Representing a kernel function between the 0 th sample feature vector of the ith region and the 0 th sample feature vector of the jth region;representing a kernel function between an m-th sample feature vector of the i-th region and an m-th sample feature vector of the j-th region;

using the second weightParameter matrix α and frequency domainSolving for a second response matrix in the frequency domainAs shown in formula (14);

the target position of the frame is used as the center, the frame is respectively outwards expanded by N times of the width and the height of a target vehicle and is used as a search area, N in the embodiment is 2.5, in the search area, pixels are used as units, the frame is circularly moved by one position in the transverse direction and the longitudinal direction to obtain a sample space to be detected, and a sample characteristic matrix to be detected uses Z₂Representing, by a sample feature matrix Z₂Is brought into (14) with a second weight parameter matrix α, and a second response matrix in the frequency domain is solved in the frequency domainSecond matrixThe index corresponding to the medium and maximum response value is the position of the center of the target vehicle of the current frame image;

wherein Z is₂A feature matrix of a sample space that is a search area;is Z₂Representation in the frequency domain after fourier transform;representing the parameter weight vector corresponding to the ith area in a frequency domain after Fourier transform;

s3, introducing an adaptive scale model;

in the vehicle tracking process of the road environment, the scale of the front vehicle collected by the vehicle-mounted camera is easy to change due to the change of the relative speed of the front vehicle and the current vehicle, and the traditional nuclear correlation filter cannot adapt to the condition that the scale of the target vehicle is changed greatly. Therefore, an adaptive scale model is introduced on the basis of the traditional kernel correlation filtering.

The problem related to the scale is solved by adopting an image pyramid or filter pyramid model, and the scale estimation is carried out by sampling images with different scales at certain intervals, but the problem of sharply increased calculated amount is caused, and the real-time performance of a tracking algorithm is influenced. Therefore, according to the characteristics of target tracking in a road environment, a dimension processing scale problem is added, and the steps are as follows:

extracting a plurality of images with different scales as training samples; calculating a scale sequence of the training sample by a least square method;

the size of the target vehicle image in the current frame is W multiplied by H, and a plurality of samples of the sample space are extracted and recorded as S;

preferably, the number of samples in this embodiment is selected to be 33, that is, 33 images with different scales are extracted as training samples, and the scale sequence size of the training samples is as shown in equation (15):

wherein: b represents a scale factor, wherein the experimental value in the embodiment is 1.05, W represents the width of the rectangular frame area of the target vehicle, H represents the height of the rectangular frame area of the target vehicle, and S represents the number of samples in the sample space;is a rounded-down symbol;

according to the formula (15), in the peripheral area of the center position of the target vehicle, the loss function of the scale training is obtained through the least square method calculation, and the formula (16) is shown as follows:

wherein: y' represents a label vector generated by a one-dimensional Gaussian function, h represents a scale estimation weight parameter matrix, and f is a sample set characteristic matrix extracted at different scales;

processing the formula (16) by using Fourier transform, and converting the processed formula (16) into a frequency domain for optimization, as shown in a formula (17);

wherein: y'_iThe value of the ith element of the vector corresponding to the label vector y' in the frequency domain through Fourier transform; h^*Performing Fourier transform on the scale estimation weight parameter matrix h to represent a conjugate transpose in a frequency domain; f_iA representation form of Fourier transform of an ith sample feature vector in a sample feature matrix f in a frequency domain; Σ is the summation sign, and n has the same value as S, where n has a value of 33.

Order toThe derivation H of the formula (17)^*Equal to zero, and then solved to obtain a third weight parameter matrix H in the frequency domain, as shown in equation (18):

the scale response matrix R is calculated using equation (19)_sThe scale corresponding to the index corresponding to the medium maximum value is the optimal scale of the current frame, and equation (19) is as follows:

R_s＝F·H (19)

taking the central position of a target vehicle in a current frame image as a reference coordinate, taking a target scale in a previous frame image as an initial scale, sampling samples under different scales according to a formula (15), calculating a sample characteristic matrix, and expressing the sample characteristic matrix by using F, wherein F is a representation form of the sample characteristic matrix F in a frequency domain after Fourier transformation;is composed ofThe conjugate transpose of (1); r_sA scale response matrix R representing a calculated target vehicle_sThe scale response matrix R_sThe scale corresponding to the index corresponding to the medium maximum value is the optimal scale of the current frame, and the obtained optimal scale is combined with the central position of the target vehicle, so that the boundary of the target vehicle is more accurately positioned, and the aim of tracking the vehicle is fulfilled.

The embodiment adopts the tracking algorithm based on the road texture information to locate the most probable central position of the target vehicle in the image; then, sampling a plurality of image areas with different scales, generating a label vector through a one-dimensional Gaussian function, still adopting a kernel correlation filtering algorithm to perform scale learning again to obtain a one-dimensional filtering vector h, sampling the positioned images with different scales in a tracking stage, and calculating the obtained characteristic vector with the h, wherein in the final response vector, the scale corresponding to the position with the maximum response value is the ideal target scale.

Multiple experiments show that the specific gravity lambda of the contextual information of the texture of the road surface₁When the value is 0.6, compared with the traditional tracking algorithm based on the target, the average overlapping rate of the output result rectangular frame of the algorithm and the real target rectangular frame is the maximum, the average tracking failure times are relatively small, the average expected overlapping rate is the maximum, and under the condition that the target is shielded, the target tracking effect is good, the accuracy is high, and the contextual information of the road texture area is highAnd the tracking method can be more accurately adapted to the scale change of the target by combining with a multi-scale method, so that the failure times of the tracking method are effectively reduced.

A multi-scale vehicle tracking device based on contextual road texture information, comprising:

a first obtaining module 301, configured to obtain a center position of a target vehicle in a linear space road texture condition;

the first obtaining module 301 further comprises processing a plurality of background areas of the target vehicle by using a least square method to obtain a ridge regression formula, and performing cyclic shift merging simplification on ridge regression formula parts corresponding to the plurality of background areas; calculating the simplified ridge regression formula to obtain a first weight parameter matrix in the frequency domainUsing a first weight parameter matrixObtaining a first response matrix R, and carrying out Fourier transform on the first response matrix R to obtain the first response matrix in the frequency domainCalculating to obtain a first response matrix in the frequency domainThe index corresponding to the maximum response value, namely the center position of the corresponding target vehicle;

the obtained ridge regression formula is shown as the formula (1):

wherein B andis represented by formula (3):

by makingObtaining a first weight parameter matrix w as formula (4):

R＝Z₁w (6)

A second obtaining module 302, configured to obtain a dual-space module of a center position of the target vehicle in a dual space in a case of a road texture in a nonlinear space;

the second obtaining module 302 further includes: by kernel functionMapping the samples in the nonlinear sample space to a linearly separable dual space, and combining the eigenvectors of all the samples in the dual space to obtain a dual space weight parameter matrix w_DualTraining a dual spatial weight parameter matrix w_DualObtaining a second weight parameter matrix in the frequency domainSolving a second response matrix in the frequency domainThe index corresponding to the maximum response value is the central position of the target vehicle;

of target vehicles in dual spaceSeveral regions pass through kernel functionAfter mapping, the corresponding sample space is as shown in equation (8):

wherein, Delta_ij＝diag(m_ij+λ₃),{(i,j)∈{(0,0)}}

Δ_ij＝λ₁diag(m_ij+λ₃),{(i,j)∈{(1,1)}}

Δ_ij＝λ₂diag(m_ij+λ₃),{(i,j)∈{(2,2),(3,3),(4,4)}}

Wherein, a_i0A 0 th sample feature vector representing an ith region; a is_j0A 0 th sample feature vector representing a jth region; a is_jmAn m-th sample feature vector representing a j-th region; kappa (a)_i0,a_j0) Representing a kernel function between the 0 th sample feature vector of the ith region and the 0 th sample feature vector of the jth region;representing a kernel function between the m-th sample feature vector of the i-th region and the m-th sample feature vector of the j-th region,

solving a second response matrix in the frequency domainThe calculation formula is as (14):

Z₂represents: taking the position of the target vehicle in the previous frame as a center, respectively expanding outwards by N times of the width and the height of the target vehicle, taking the area after the outward expansion as a search area, in the search area, taking pixels as units, and circularly shifting transversely and longitudinally to obtain a sample space to be detected, wherein a sample characteristic matrix of the sample space uses Z₂；Is Z₂Representation in the frequency domain after fourier transform;obtaining a second response matrix in the frequency domain for the representation form of the parameter weight vector corresponding to the ith region in the frequency domain after Fourier transformThe maximum response value is the central position of the target vehicle;

the scale module 303 is configured to obtain a central position of the target vehicle, and combine the central position with the optimal scale of the image of the current frame to obtain a more accurate position of the target vehicle;

the scale module further comprises 303: extracting different scale images of a plurality of target vehicles as training samples, training the scale sequence of the training samples by a least square method to obtain loss functions J (H), and processing the loss functions J (H) by Fourier transform to obtain J (H)^*) To J (H)^*) Calculating to obtain a third weight parameter matrix H in the frequency domain; then obtaining a scale response matrix R_sCalculating a scale response matrix R_sThe scale corresponding to the index corresponding to the medium maximum value is the optimal image scale of the current frame, and a more accurate target vehicle position is obtained, namely J (H) is obtained^*) Calculating to obtain a third weight parameter matrix H in the frequency domain; then obtaining a scale response matrix R_sCalculating a scale response matrix R_sThe scale corresponding to the index corresponding to the medium maximum value is the optimal image scale of the current frame, and a more accurate target vehicle position is obtained;

wherein, Y'_iThe value of the ith element of the vector corresponding to the label vector y' in the frequency domain through Fourier transform; h^*Performing Fourier transform on the scale estimation weight parameter matrix h to represent a conjugate transpose in a frequency domain; f_iA representation form of Fourier transform of an ith sample feature vector in a sample feature matrix f in a frequency domain; sigma is a summation symbol, and the value of n is the same as that of S;

R_s＝F·H(19)

In the description of the present invention, unless otherwise expressly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims

1. A multi-scale vehicle tracking method based on contextual road texture information is characterized by comprising the following steps:

2. The method for multi-scale vehicle tracking based on contextual road texture information according to claim 1, wherein said step S1 comprises: processing a plurality of background areas of the target vehicle by using a least square method to obtain a ridge regression formula, and performing cyclic shift merging simplification on ridge regression formula parts corresponding to the plurality of background areas; calculating the simplified ridge regression formula to obtain a first weight parameter matrix in the frequency domainUsing a first weight parameter matrixObtaining a first response matrix R, and carrying out Fourier transform on the first response matrix R to obtain the first response matrix in the frequency domainCalculating to obtain a first response matrix in the frequency domainThe index corresponding to the maximum response value, namely the center position of the corresponding target vehicle;

wherein, the plurality of background areas comprise road surface texture areas.

3. The method for multi-scale vehicle tracking based on contextual road texture information according to claim 2, wherein said step S1 further comprises:

the obtained ridge regression formula is shown as the formula (1):

wherein B andis represented by formula (3):

by makingObtaining a first weight parameter matrix w as formula (4):

R＝Z₁w (6)

wherein,represents passing throughThe form in the frequency domain after fourier transform;representing the form in the frequency domain after fourier transformation,representing the form of the response matrix R in the frequency domain after Fourier transform processing to obtainNamely, a first response matrix R is obtained, and the center position of the target vehicle is determined.

4. The method for multi-scale vehicle tracking based on contextual road texture information according to claim 1, wherein said step S2 comprises: by kernel functionMapping the samples in the nonlinear sample space to a linearly separable dual space, and combining the eigenvectors of all the samples in the dual space to obtain a dual space weight parameter matrix w_DualTraining a dual spatial weight parameter matrix w_DualObtaining a second weight parameter matrix in the frequency domainSolving a second response matrix in the frequency domainAnd the index corresponding to the maximum response value is the center position of the target vehicle.

5. The method for multi-scale vehicle tracking based on contextual road surface texture information according to claim 4, wherein said step S2 further comprises:

dual spaceIn the method, a plurality of regions of the target vehicle pass through the kernel functionAfter mapping, the corresponding sample space is as shown in equation (8):

wherein, Delta_ij＝diag(m_ij+λ₃),{(i,j)∈{(0,0)}}

Δ_ij＝λ₁diag(m_ij+λ₃),{(i,j)∈{(1,1)}}

Δ_ij＝λ₂diag(m_ij+λ₃),{(i,j)∈{(2,2),(3,3),(4,4)}}

Wherein, a_i0A 0 th sample feature vector representing an ith region; a is_j0Is shown asThe 0 th sample feature vector of j regions; a is_jmAn m-th sample feature vector representing a j-th region; kappa (a)_i0,a_j0) Representing a kernel function between the 0 th sample feature vector of the ith region and the 0 th sample feature vector of the jth region; k (a)_i0m，a_jm) Representing a kernel function between the m-th sample feature vector of the i-th region and the m-th sample feature vector of the j-th region,

6. The method of claim 5, wherein the solving of the second response matrix in the frequency domain is performed by using a multi-scale vehicle tracking method based on contextual information of road textureThe calculation formula is as (14):

7. The method for multi-scale vehicle tracking based on contextual road texture information according to claim 1, wherein said step S3 comprises: extracting different scale images of a plurality of target vehicles as training samples, training the scale sequence of the training samples by a least square method to obtain loss functions J (H), and processing the loss functions J (H) by Fourier transform to obtain J (H)^*) To J (H)^*) Calculating to obtain a third weight parameter matrix H in the frequency domain; then obtaining a scale response matrix R_sCalculating a scale response matrix R_sThe scale corresponding to the index corresponding to the medium maximum value is the optimal image scale of the current frame, and a more accurate target vehicle position is obtained, namely J (H) is obtained^*) Calculating to obtain a third weight parameter matrix H in the frequency domain; then obtaining a scale response matrix R_sCalculating a scale response matrix R_sAnd the scale corresponding to the index corresponding to the medium maximum value is the optimal image scale of the current frame, so that the more accurate target vehicle position is obtained.

8. The method for multi-scale vehicle tracking based on contextual road surface texture information according to claim 7, wherein said step S3 further comprises:

wherein b represents a scale factor; w represents the width of the rectangular frame area of the target vehicle; h represents the height of the rectangular frame area of the target vehicle; s represents a sample spaceThe number of samples;is a rounded-down symbol;

R_s＝F·H (19)

f is a representation form of a sample characteristic matrix F in a frequency domain after Fourier transformation;is composed ofThe conjugate transpose of (1); r_sA scale response matrix R representing a calculated target vehicle_sThe scale response matrix R_sAnd the scale corresponding to the index corresponding to the medium maximum value is the optimal scale of the current frame, so that the more accurate position of the target vehicle can be obtained.

9. The method of claim 8, wherein the solution J (H) is a solution for multi-scale vehicle tracking based on contextual road texture information^*) The method of the function is as follows:

10. a multi-scale vehicle tracking device based on contextual road texture information, comprising:

a first acquisition module (301) for acquiring a center position of a target vehicle in the case of a linear space road texture;

a second obtaining module (302) for obtaining a center position dual space module of the target vehicle in the dual space in case of the road texture in the nonlinear space;

and the scale module (303) is used for obtaining the central position of the target vehicle and combining the central position with the optimal scale of the image of the current frame to obtain a more accurate position of the target vehicle.