CN110097579B - Multi-scale vehicle tracking method and device based on pavement texture context information - Google Patents
- Publication number: CN110097579B
- Application number: CN201910514897.2A
- Authority: CN (China)
- Prior art keywords: matrix, target vehicle, representing, sample, frequency domain
- Legal status: Active
Classifications
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/262—Analysis of motion using transform domain methods, e.g. Fourier domain methods
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
- G06T7/41—Analysis of texture based on statistical description of texture
- G06T2207/10004—Still image; Photographic image
Abstract
The invention relates to a multi-scale vehicle tracking method based on contextual information of pavement texture, which comprises the following steps: S1, for the linear-space road texture condition, acquiring the central position of the target vehicle; S2, for the nonlinear-space road texture condition, acquiring the central position of the target vehicle in the dual space; S3, combining the obtained central position with the optimal scale of the current frame image to obtain a more accurate position of the target vehicle. The invention also discloses a multi-scale vehicle tracking device based on the contextual information of the road texture. By combining the road texture area at the bottom of the target vehicle, whose position relative to the vehicle changes little while the vehicle moves and whose texture is comparatively stable, the target vehicle can be accurately positioned from the relative relation between the vehicle and the road surface area, and drift of the target frame is avoided.
Description
Technical Field
The invention relates to the field of computer vision, in particular to a multi-scale vehicle tracking method and device based on road texture context information.
Background
Machine learning (ML) is a multi-disciplinary subject that draws on probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other fields; target tracking is an important application in the field of machine learning.
The invention patent with publication number CN108776974B (filing date 2018.05.24) discloses "a real-time target tracking method suitable for public traffic scenes, which comprises the following steps: step 1, acquiring an initial position P(i) of the tracked target in the current i-th frame with a detector; step 2, training a correlation-filter tracker with P(i); step 3, acquiring the image of the target in the (i+1)-th frame; step 4, performing correlation calculation between the correlation filter and the (i+1)-th frame to obtain the predicted target position P'(i+1); step 5, evaluating the scale change rate and judging, against a threshold, whether the predicted target position needs to be corrected; and step 6, correcting the predicted position with Kalman filtering to obtain the position P(i+1) of the target in the (i+1)-th frame. By evaluating the scale change rate, the method improves tracking accuracy and real-time performance, and by correcting the predicted position with Kalman filtering it minimizes the influence of scale change."
However, that method only uses the target area itself for tracking; in a traffic scene, when the target vehicle is occluded, tracking accuracy is low and the average overlap between the output rectangular frame and the ground-truth target rectangle is low, so the target cannot be tracked effectively.
Existing tracking methods rely solely on the features of the target vehicle; when the target vehicle is occluded or motion-blurred, the bounding region framed around the vehicle to be positioned may drift, so the accuracy of the predicted target vehicle position is low.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a multi-scale vehicle tracking method and device based on the contextual information of the road texture, so as to solve the problem, described in the background art, of low accuracy of the predicted target vehicle position.
In order to solve the above problems, the present invention provides the following technical solutions:
a multi-scale vehicle tracking method based on contextual road texture information comprises the following steps:
s1, acquiring the central position of the target vehicle under the condition of linear space road texture;
s2, for the condition of the nonlinear space road texture, acquiring the central position of the target vehicle in the dual space;
and S3, obtaining the central position of the target vehicle, and combining the central position with the optimal scale of the image of the current frame to obtain a more accurate position of the target vehicle.
As a further scheme of the invention: the step S1 includes:
processing a plurality of background areas of the target vehicle by the least square method to obtain a ridge regression formula, and merging and simplifying the ridge regression terms corresponding to the plurality of background areas by cyclic shift; solving the simplified ridge regression formula to obtain a first weight parameter matrix ŵ in the frequency domain; using the first weight parameter matrix to obtain a first response matrix R, and performing Fourier transform on R to obtain the first response matrix R̂ in the frequency domain; the index corresponding to the maximum response value in R̂ is the center position of the target vehicle;
wherein, the plurality of background areas comprise road surface texture areas.
As a further scheme of the invention: the step S1 further includes:
The obtained ridge regression formula is shown as formula (1):

J(w) = ||A₀w − y₀||² + λ₁||A₁w − y₀||² + λ₂·Σ(i=2..k₁) ||Aᵢw||² + λ₃||w||²  (1)

wherein ||·||² denotes the squared two-norm; A₀ is the feature matrix of the samples obtained by cyclic shift of the target vehicle region; A₁ is the feature matrix of the samples obtained by cyclic shift of the road surface area under the target vehicle; Aᵢ is the feature matrix of the samples obtained by cyclic shift of the background area to the left of, above, or to the right of the target vehicle;

λ₁ represents the proportion of the road surface texture information in the training process; λ₂ represents the proportion of the noise areas corresponding to Aᵢ in the training process; λ₃ is a regularization parameter that controls the complexity of the first weight parameter matrix; y₀ is the two-dimensional Gaussian label matrix of the target vehicle; w is the first weight parameter matrix to be regressed; k₁ is the number of background areas around the target vehicle;

formula (1) is divided into the following parts: the first part, ||A₀w − y₀||², trains the target area as a positive sample; the second part, λ₁||A₁w − y₀||², trains the road area under the target as a positive sample, with the parameter λ₁ controlling its contribution to the loss; the third part, λ₂·Σ||Aᵢw||², trains the areas to the left of, above and to the right of the target vehicle as noise, with the parameter λ₂ controlling its contribution to the loss; the fourth part, λ₃||w||², controls the complexity of the weight parameters through regularization, with the parameter λ₃ controlling its contribution to the loss;

the areas corresponding to formula (1) are then merged to obtain the simplified formula (2):

J(w) = ||Bw − ȳ||² + λ₃||w||²  (2)

wherein B is the circulant matrix and ȳ is the label matrix stacking the label values corresponding to the samples in the sample space of each part;

setting the derivative of formula (2) with respect to w to zero, ∂J(w)/∂w = 0, gives the first weight parameter matrix as formula (4):

w = (BᵀB + λ₃I)⁻¹Bᵀȳ  (4)

wherein Bᵀ is the transpose of B and I denotes the identity matrix;

Fourier transform is applied to both sides of formula (4) to obtain the first weight parameter matrix ŵ in the frequency domain, as shown in formula (5), wherein aᵢ is the vector formed by the first row of the sample feature matrix in the sample space corresponding to the i-th area; ⊙ denotes the dot (element-wise) product; âᵢ is the representation of the vector aᵢ in the frequency domain after Fourier transform; âᵢ* is its conjugate transpose; and ŷ is the representation of the label matrix in the frequency domain. The first response matrix R̂ in the frequency domain is acquired as follows:

taking the position of the target vehicle in the previous frame as the center, the region is expanded outwards to N times the width and N times the height of the target vehicle, and the expanded region is taken as the search region;

according to the property of the circulant matrix, cyclic shifts are performed in the search region in the horizontal and vertical directions in units of pixels; each cyclic shift yields one sample to be detected, the samples to be detected form the sample space to be detected, and the sample feature matrix formed by the feature values of this sample space is denoted Z₁; the sample feature matrix Z₁ is multiplied by the first weight parameter matrix w to obtain the first response matrix R, as shown in formula (6):

R = Z₁w  (6)

Fourier transform is then performed on the first response matrix R to obtain formula (7), which is as follows:

R̂ = Ẑ₁ ⊙ ŵ  (7)

wherein Ẑ₁ is the form of the sample feature matrix Z₁ in the frequency domain after Fourier transform; ŵ is the form of the first weight parameter matrix w in the frequency domain after Fourier transform; R̂ is the form of the response matrix R in the frequency domain after Fourier transform; obtaining R̂ is equivalent to obtaining the first response matrix R, from which the center position of the target vehicle is determined.
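To make the frequency-domain step S1 concrete, the following is a minimal NumPy sketch under stated assumptions: each region i is represented by a single 2-D feature patch a_i (its cyclic shifts are handled implicitly by the FFT), and the closed-form frequency-domain solution of the context-aware ridge regression is assumed to take the usual form, with weight 1 for the target patch, λ₁ for the road patch and λ₂ for the three noise patches in the denominator. The function names, the example weights and the exact form of the numerator are illustrative assumptions, not a transcription of formula (5).

```python
import numpy as np

def train_linear_filter(patches, y0, lam1=0.6, lam2=0.25, lam3=1e-3):
    """patches: list [a0, a1, a2, a3, a4] of equally sized 2-D feature patches.
    Assumed closed form: w_hat = (conj(a0_hat) + lam1*conj(a1_hat)) * y0_hat /
                                 (sum_i mu_i * conj(ai_hat) * ai_hat + lam3)."""
    weights = [1.0, lam1, lam2, lam2, lam2]
    F = [np.fft.fft2(p) for p in patches]
    y0_hat = np.fft.fft2(y0)
    denom = sum(mu * np.conj(f) * f for mu, f in zip(weights, F)) + lam3
    # only the target patch and the road-texture patch regress toward the Gaussian label
    numer = np.conj(F[0]) * y0_hat + lam1 * np.conj(F[1]) * y0_hat
    return numer / denom                       # analogue of formula (5)

def detect(w_hat, z):
    """z: feature patch of the search region; returns the index of the maximum response."""
    R_hat = np.fft.fft2(z) * w_hat             # analogue of formula (7)
    R = np.real(np.fft.ifft2(R_hat))
    return np.unravel_index(np.argmax(R), R.shape)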
As a further scheme of the invention: the step S2 includes: by kernel functionMapping the samples in the nonlinear sample space to a linearly separable dual space, and combining the eigenvectors of all the samples in the dual space to obtain a dual space weight parameter matrix wDualTraining a dual spatial weight parameter matrix wDualObtaining a second weight parameter matrix in the frequency domainSolving a second response matrix in the frequency domainAnd the index corresponding to the maximum response value is the center position of the target vehicle.
As a further scheme of the invention:
the step S2 further includes:
After the several regions of the target vehicle are mapped into the dual space through the kernel mapping function φ(·), the corresponding sample space is as shown in formula (8):

wherein i denotes an integer greater than or equal to zero and less than the number of target vehicle regions; a_im is the feature vector of the m-th sample of the i-th region; Aᵢ is the feature matrix formed by the samples of the i-th region; φ(Aᵢ) is the representation of the feature matrix Aᵢ mapped into the linearly separable dual space; φ(B) is the representation of the circulant matrix B mapped into the linearly separable dual space; φ(a_im) is the representation of the feature vector of the m-th sample of the i-th region mapped into the linearly separable dual space;

the dual-space weight parameter matrix w_Dual is calculated using formula (9):

w_Dual = Σ(i=0..5m−1) αᵢ·φ(bᵢ)  (9)

wherein α denotes the second weight parameter matrix; αᵢ denotes the i-th column vector in the second weight parameter matrix; bᵢ denotes the feature vector of the i-th sample; φ(bᵢ) denotes the feature vector of the i-th sample mapped into the high-dimensional linear space; 5m denotes the dimension of the second weight parameter matrix;

the second weight parameter matrix α̂ in the frequency domain is obtained as follows:

the second weight parameter matrix α is solved in conjunction with the loss function J(α), which is shown in formula (10);

wherein K₂ is the kernel matrix; the kernel function κ(·,·) is chosen so that the elements of the matrix K₂ change only in their order, which guarantees that K₂ remains a circulant matrix;

the solution is then substituted, and the frequency-domain representation α̂ of the second weight parameter matrix α in the dual space is obtained, as shown in formula (12):

wherein Δᵢⱼ = diag(mᵢⱼ + λ₃), {(i,j) ∈ {(0,0)}}

Δᵢⱼ = λ₁·diag(mᵢⱼ + λ₃), {(i,j) ∈ {(1,1)}}

Δᵢⱼ = λ₂·diag(mᵢⱼ + λ₃), {(i,j) ∈ {(2,2),(3,3),(4,4)}}

and diag(mᵢⱼ) is the diagonal matrix corresponding to the i-th row and j-th column block of the block-diagonal matrix, as shown in formula (13);

wherein a_i0 is the 0-th sample feature vector of the i-th region; a_j0 is the 0-th sample feature vector of the j-th region; a_jm is the m-th sample feature vector of the j-th region; κ(a_i0, a_j0) is the kernel function between the 0-th sample feature vector of the i-th region and the 0-th sample feature vector of the j-th region; κ(a_im, a_jm) is the kernel function between the m-th sample feature vector of the i-th region and the m-th sample feature vector of the j-th region;

finally, the second response matrix in the frequency domain is solved; the index of its maximum response value is the center position of the target vehicle.

As a further scheme of the invention: the second response matrix in the frequency domain is solved by the calculation formula (14):

wherein Z₂ denotes the following sample feature matrix: taking the position of the target vehicle in the previous frame as the center, the region is expanded outwards to N times the width and N times the height of the target vehicle, and the expanded region is taken as the search region; in the search region, cyclic shifts are performed horizontally and vertically in units of pixels to obtain the sample space to be detected, whose sample feature matrix is denoted Z₂; Ẑ₂ is the representation of Z₂ in the frequency domain after Fourier transform; α̂ᵢ is the representation, in the frequency domain after Fourier transform, of the parameter weight vector corresponding to the i-th area.
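For intuition, the following is a simplified, single-region KCF-style sketch of the dual-space step S2 under the assumption of a Gaussian (RBF) kernel; the patent's multi-region block-circulant solution for α̂ is reduced here to the classical one-region case α̂ = ŷ / (k̂ + λ). Function names and the regularization value are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel_correlation(x, z, sigma=0.5):
    """Kernel correlation of two 2-D patches, computed through the frequency domain."""
    c = np.real(np.fft.ifft2(np.conj(np.fft.fft2(x)) * np.fft.fft2(z)))
    d2 = np.sum(x**2) + np.sum(z**2) - 2.0 * c
    return np.exp(-np.maximum(d2, 0) / (sigma**2 * x.size))

def train_dual(x, y, sigma=0.5, lam=1e-3):
    k = gaussian_kernel_correlation(x, x, sigma)
    return np.fft.fft2(y) / (np.fft.fft2(k) + lam)            # alpha_hat

def detect_dual(alpha_hat, x_model, z, sigma=0.5):
    k = gaussian_kernel_correlation(x_model, z, sigma)
    R = np.real(np.fft.ifft2(alpha_hat * np.fft.fft2(k)))     # response matrix
    return np.unravel_index(np.argmax(R), R.shape)
```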
As a further scheme of the invention: the step S3 includes: extracting images of the target vehicle at a plurality of different scales as training samples; training on the scale sequence of the training samples by the least square method to obtain a loss function J(h); processing the loss function J(h) by Fourier transform to obtain J(H*); solving J(H*) to obtain a third weight parameter matrix H in the frequency domain; then obtaining a scale response matrix Rs; the scale corresponding to the index of the maximum value in Rs is the optimal image scale of the current frame, from which the more accurate target vehicle position is obtained.
As a further scheme of the invention:
the step S3 further includes:
Let the size of the image of the target vehicle in the current frame be W × H₁; a plurality of images at different scales are extracted as training samples, whose number is recorded as S; the scale sequence of the training samples is shown in formula (15):

{bⁿW × bⁿH₁ | n ∈ {−⌊(S−1)/2⌋, …, ⌊(S−1)/2⌋}}  (15)

wherein b represents the scale factor; W represents the width of the rectangular frame region of the target vehicle; H₁ represents the height of the rectangular frame region of the target vehicle; S represents the number of samples in the sample space; ⌊·⌋ is the round-down symbol;

the loss function obtained from formula (15) by the least square method is shown in formula (16):

wherein y' represents the label vector generated by a one-dimensional Gaussian function, h represents the scale estimation weight parameter matrix, and f represents the sample set feature matrix extracted at the different scales;

Fourier transform is applied to J(h) to obtain J(H*), as shown in formula (17):

wherein Yᵢ is the value of the i-th element of the vector corresponding to the label vector y' in the frequency domain after Fourier transform; H* is the conjugate transpose, in the frequency domain, of the Fourier transform of the scale estimation weight parameter matrix h; Fᵢ is the representation, in the frequency domain, of the Fourier transform of the i-th sample feature vector in the sample feature matrix f; Σ is the summation symbol, and the value of N is the same as that of S;

solving the function J(H*) yields the third weight parameter matrix H, and the scale response matrix Rs is calculated using formula (19); the scale corresponding to the index of the maximum value in Rs is the optimal scale of the current frame; formula (19) is as follows:

Rs = F·H  (19)

wherein F is the representation, in the frequency domain after Fourier transform, of the sample feature matrix f; F* is its conjugate transpose; Rs is the calculated scale response matrix of the target vehicle; the scale corresponding to the index of the maximum value in Rs is the optimal scale of the current frame, from which the more accurate position of the target vehicle is obtained.

As a further scheme of the invention: the method of solving J(H*) is as follows:

setting ∂J(H*)/∂H* = 0 and solving yields the third weight parameter matrix H in the frequency domain, as shown in formula (18).
as a further scheme of the invention: a multi-scale vehicle tracking device based on contextual road texture information, comprising:
the first acquisition module is used for acquiring the central position of a target vehicle under the condition of linear space road texture;
the second acquisition module is used for acquiring the central position of the target vehicle in the dual space under the condition of nonlinear-space road texture;
and the scale module is used for obtaining the central position of the target vehicle and combining it with the optimal scale of the current frame image to obtain a more accurate position of the target vehicle (a minimal structural sketch of these modules follows below).
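The following is a minimal structural sketch of such a device: three modules, selected per frame according to whether the road texture is treated as linearly separable. The class and parameter names are assumptions; the module internals stand in for the steps S1 to S3 described above.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class MultiScaleVehicleTracker:
    first_acquisition: Callable[..., Tuple[int, int]]        # S1: linear-space center position
    second_acquisition: Callable[..., Tuple[int, int]]       # S2: dual-space center position
    scale_module: Callable[..., Tuple[int, int, int, int]]   # S3: optimal scale -> refined box

    def locate(self, frame, prev_box, linear=True):
        # pick the linear or dual-space position module, then refine with the scale module
        acquire = self.first_acquisition if linear else self.second_acquisition
        center = acquire(frame, prev_box)
        return self.scale_module(frame, center, prev_box)
```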
Compared with the prior art, the invention has the beneficial effects that:
1. By combining the road surface texture area at the bottom of the target vehicle, whose position relative to the vehicle changes little while the vehicle moves and whose texture is stable during motion, the method accurately positions the target vehicle from the relative relation between the vehicle and the road surface area; drift of the target frame is avoided, the average overlap between the output rectangular frame and the ground-truth target rectangle is the largest, the average number of tracking failures is comparatively small, the average expected overlap is the largest, and accuracy is high.
2. Existing target tracking methods cannot effectively and quickly combine the contextual information around the target with the prediction of the target scale; when the target scale changes, they cannot accurately locate the target boundary, the weight parameters in subsequent frames cannot effectively learn the complete features of the target, and tracking fails.
3. The method extracts the environmental information above, to the left of and to the right of the target vehicle as noise samples and suppresses it during training; the road texture information of the area below the target vehicle is used as a positive sample to assist positioning; the first weight parameter matrix w, the second weight parameter matrix α and the scale estimation weight parameter matrix h are obtained by training with the ridge regression algorithm and are used to determine the center position of the target vehicle in subsequent frames; the method adds a multi-scale tracking function and can accurately predict the size of the target vehicle.
4. Multiple experiments show that when the proportion λ₁ of the road surface texture context information is 0.6, compared with traditional target-only tracking algorithms, the average overlap between the output rectangular frame of the algorithm and the ground-truth target rectangle is the largest, the average number of tracking failures is comparatively small, the average expected overlap is the largest, and the method performs better when the target is occluded; combining the contextual information of the road surface texture area with the multi-scale method allows the tracking method to adapt more accurately to changes in the target scale, effectively reducing the number of tracking failures.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention.
Fig. 1 is a flow chart of a multi-scale vehicle tracking method based on contextual road texture information according to embodiment 1 of the present invention.
Fig. 2 is a schematic diagram illustrating a comparison of areas of target vehicles in a multi-scale vehicle tracking method based on contextual information of road surface texture according to embodiment 1 of the present invention.
Fig. 3 is a schematic structural diagram of a multi-scale vehicle tracking device based on contextual road texture information according to embodiment 2 of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects to be solved by the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Example 1
A multi-scale vehicle tracking method based on contextual road texture information, which tracks the spatial road texture by context-aware correlation filtering, comprises the following specific steps:
s1, acquiring the central position of the target vehicle under the condition of linear space road texture;
A plurality of background areas of the target vehicle are processed by the least square method to obtain a ridge regression formula; the least square method is a common algorithm in machine learning and is not described here; the plurality of background areas include the road texture area.
Information of the road surface area under the target vehicle and of the other directions around it is taken into account, as shown in fig. 2, which is a schematic diagram of the relative relationship between the target vehicle area and the surrounding areas: the area corresponding to A₀ is the central area of the target vehicle; the area corresponding to A₁ is the road surface texture area; the area corresponding to A₂ is the noise area on the left side of the target vehicle; the area corresponding to A₃ is the noise area above the target vehicle; the area corresponding to A₄ is the noise area on the right side of the target vehicle.
The ridge regression formula is shown in formula (1):

J(w) = ||A₀w − y₀||² + λ₁||A₁w − y₀||² + λ₂·Σ(i=2..k₁) ||Aᵢw||² + λ₃||w||²  (1)

Formula (1) is divided into four parts: the first part, ||A₀w − y₀||², trains the target area as a positive sample; the second part, λ₁||A₁w − y₀||², trains the road area under the target as a positive sample, with the parameter λ₁ controlling its contribution to the loss; the third part, λ₂·Σ||Aᵢw||², trains the areas to the left of, above and to the right of the target vehicle as noise, with the parameter λ₂ controlling its contribution to the loss; the fourth part, λ₃||w||², controls the complexity of the weight parameters through regularization, with the parameter λ₃ controlling its contribution to the loss;

wherein ||·||² denotes the squared two-norm; A₀ is the feature matrix of the samples obtained by cyclic shift of the target vehicle region; A₁ is the feature matrix of the samples obtained by cyclic shift of the road surface area under the target vehicle; Aᵢ is the feature matrix of the samples obtained by cyclic shift of the background areas on the left side, above and on the right side of the target vehicle; λ₁ represents the proportion of the road surface texture information in the training process; λ₂ represents the proportion, in the training process, of the noise area corresponding to A₂ around the target vehicle (the proportions of the noise areas A₃ and A₄ are the same as that of A₂); λ₃ is a regularization parameter controlling the complexity of the first weight parameter matrix;

y₀ is the two-dimensional Gaussian label matrix of the target vehicle; w is the first weight parameter matrix to be regressed; k₁ is the number of background areas around the target vehicle, and in this embodiment k₁ is four; the five areas corresponding to A₀, A₁, A₂, A₃ and A₄ can be computed independently, and the parts corresponding to these five areas are merged to obtain the simplified formula (2):

J(w) = ||Bw − ȳ||² + λ₃||w||²  (2)

wherein B is a circulant matrix: in each of the five areas corresponding to A₀, A₁, A₂, A₃ and A₄, one position is cyclically shifted in the horizontal and vertical directions in units of pixels to obtain the training sample space, and the HOG features of the samples are then computed to form the sample matrix, namely the circulant matrix B; the HOG (Histogram of Oriented Gradients) feature is a feature descriptor used for object detection in computer vision and image processing, formed by computing and accumulating histograms of gradient orientations over local areas of an image; ȳ is the label matrix, which stacks the label values corresponding to the samples in the sample space of each part; B and ȳ are represented by formula (3) (a small sketch of the cyclic-shift sample space behind B follows below).
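The sketch below, an illustrative assumption rather than part of the patent, shows how the cyclic-shift sample space behind a circulant block can be materialised for a 1-D base feature vector; in practice the shifts are never built explicitly, because circulant matrices are diagonalised by the Fourier transform.

```python
import numpy as np

def circulant_samples(a):
    """Rows are all cyclic shifts of the base sample a (the first row of the block)."""
    n = a.shape[0]
    return np.stack([np.roll(a, k) for k in range(n)], axis=0)

# e.g. circulant_samples(np.array([1., 2., 3.])) ->
# [[1. 2. 3.]
#  [3. 1. 2.]
#  [2. 3. 1.]]
```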
Formula (2) is a convex function whose optimal solution exists in the real number domain; by setting ∂J(w)/∂w = 0, the first weight parameter matrix w is obtained, as shown in formula (4):

w = (BᵀB + λ₃I)⁻¹Bᵀȳ  (4)

wherein Bᵀ is the transpose of B and I denotes the identity matrix; ∂J(w)/∂w = 0 means that the derivative, with respect to w, of the function formed by w and B is set to zero; at the same time, the five areas corresponding to A₀, A₁, A₂, A₃ and A₄ are mutually independent, and by the property of the circulant matrix the feature matrix of each area can be computed in parallel, which improves the real-time performance of the algorithm and accelerates the calculation;

after Fourier transform is applied to both sides of formula (4), formula (5) is obtained:

wherein aᵢ is the vector formed by the first row of the sample feature matrix in the sample space corresponding to the i-th area; âᵢ is the representation of the vector aᵢ in the frequency domain after Fourier transform; âᵢ* is its conjugate transpose; and ŷ is the representation of the label matrix in the frequency domain.
Acquiring a first response matrix R, wherein an index corresponding to the maximum response value in the first response matrix R is the position of the center of the target vehicle of the current frame image; the position of the target vehicle in the previous frame is used as a center, and the target vehicle expands outwards along the periphery by N times of width and N times of height of the target vehicle respectively to be used as a search area;
in this embodiment, the value of N is 2.5 times, a position is cyclically moved in the horizontal and vertical directions in a search area by taking a pixel as a unit to obtain a sample space to be detected, and a sample feature matrix to be detected is represented by Z1To express, sample characteristicsSign matrix Z1Performing matrix operation with the first weight parameter matrix w to obtain a response matrix R;
R=Z1w (6)
as shown in formula (6), the index corresponding to the maximum response value in the first response matrix R is the position of the center of the target vehicle in the current frame image, and the target vehicle can be tracked; fourier transform is carried out on the first response matrix R, and the formula (6) is equal to the formula (7) in horizontal direction; the Fourier transform converts the time domain signal (waveform) which is difficult to process originally into a frequency domain signal (frequency spectrum of the signal) which is easy to analyze, thereby facilitating analysis and calculation; the formula (7) is as follows:
representing a sample feature matrix Z1The form in the frequency domain after fourier transform;representing the form of the first weight parameter matrix w in the frequency domain after fourier transformation,representing the form of the first response matrix R in the frequency domain after Fourier transform processing;indicating a dot product operation.
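The following is a hedged sketch of the search-region step just described, under stated assumptions: the window is the previous target box enlarged by N = 2.5 in width and height, and the argmax of a circular-correlation response map is converted back into a displacement of the target center. Function names, the (height, width) conventions and the grayscale image assumption are illustrative.

```python
import numpy as np

def crop_search_window(image, center, target_size, N=2.5):
    """center = (cy, cx); target_size = (height, width) of the target rectangle."""
    cy, cx = center
    h = int(round(target_size[0] * N))
    w = int(round(target_size[1] * N))
    ys = np.arange(cy - h // 2, cy - h // 2 + h).clip(0, image.shape[0] - 1)
    xs = np.arange(cx - w // 2, cx - w // 2 + w).clip(0, image.shape[1] - 1)
    return image[np.ix_(ys, xs)]

def response_to_center(R, prev_center):
    dy, dx = np.unravel_index(np.argmax(R), R.shape)
    # responses from circular correlation wrap around: shifts past half the window
    # size correspond to negative displacements
    if dy > R.shape[0] // 2:
        dy -= R.shape[0]
    if dx > R.shape[1] // 2:
        dx -= R.shape[1]
    return prev_center[0] + dy, prev_center[1] + dx
```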
S2, for the condition of the nonlinear space road texture, acquiring the central position of the target vehicle in the dual space;
the classification regression problem is generally classified into a linear problem and a nonlinear problem, and for the linear problem, the solution is directly performed through a linear function, while for the nonlinear problem, the sample space needs to be converted into a new linear space, i.e., a dual space, so that the samples can be linearly divided in the dual space, and the problem can be converted into a linear divisible problem.
The samples in the nonlinear sample space are mapped to a linearly separable dual space through the kernel mapping function φ(·); the sample spaces corresponding to the five regions in the dual space are shown in formula (8):

wherein i = 0, 1, 2, 3, 4; a_im is the feature vector of the m-th sample of the i-th region; Aᵢ is the feature matrix formed by the samples of the i-th region; φ(Aᵢ) is the representation of the feature matrix Aᵢ mapped into the linearly separable dual space; φ(B) is the representation of the circulant matrix B mapped into the linearly separable dual space; φ(a_im) is the representation of the feature vector of the m-th sample of the i-th region mapped into the linearly separable dual space;

in the dual space, the dual-space weight parameter matrix w_Dual is represented by a linear combination of the feature vectors of all samples, as shown in formula (9):

w_Dual = Σ(i=0..5m−1) αᵢ·φ(bᵢ)  (9)

wherein α denotes the second weight parameter matrix; αᵢ denotes the i-th column vector in the second weight parameter matrix; bᵢ denotes the feature vector of the i-th sample; φ(bᵢ) denotes the feature vector of the i-th sample mapped into the high-dimensional linear space; φ(B) is the matrix that is linearly combined with the second weight parameter matrix α; 5m denotes the dimension of the second weight parameter matrix;

in the dual space, the second weight parameter matrix α is solved using the loss function J(α), which is shown in formula (10):

by setting ∂J(α)/∂α = 0, that is, by setting to zero the derivative, with respect to α, of the function composed of the second weight parameter matrix α and B, the second weight parameter matrix α is solved, as shown in formula (11);

by selecting an appropriate kernel function κ(·,·), the elements of the matrix K₂ change only in their order, the calculation result of the kernel function is not affected, and the matrix K₂ remains a circulant matrix; several classes of kernel functions satisfy this property (a sketch of one of them, computed in the frequency domain, follows below):

polynomial kernel function: κ(x, y) = f(xᵀy);

RBF kernel function: κ(x, y) = f(||x − y||²);
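As an illustration of such a kernel, the following sketch computes a polynomial kernel correlation κ(x, z) = (xᵀz / n + c)ᵈ through the frequency domain; the constants c and d are illustrative hyper-parameters, not values from the patent.

```python
import numpy as np

def polynomial_kernel_correlation(x, z, c=1.0, d=2):
    # cross-correlation of x and z for all cyclic shifts, evaluated via the FFT
    xz = np.real(np.fft.ifft2(np.conj(np.fft.fft2(x)) * np.fft.fft2(z)))
    return (xz / x.size + c) ** d
```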
The solution of formula (11) is then substituted and, according to the property of the block-circulant matrix and the Fourier diagonalization property of the circulant matrix, the frequency-domain representation α̂ of the second weight parameter matrix α in the dual space is finally obtained, as shown in formula (12):

wherein Δᵢⱼ = diag(mᵢⱼ + λ₃), {(i,j) ∈ {(0,0)}}

Δᵢⱼ = λ₁·diag(mᵢⱼ + λ₃), {(i,j) ∈ {(1,1)}}

Δᵢⱼ = λ₂·diag(mᵢⱼ + λ₃), {(i,j) ∈ {(2,2),(3,3),(4,4)}}

and diag(mᵢⱼ) is the diagonal matrix corresponding to the i-th row and j-th column block of the block-diagonal matrix, as shown in formula (13);

wherein a_i0 is the 0-th sample feature vector of the i-th region; a_j0 is the 0-th sample feature vector of the j-th region; a_jm is the m-th sample feature vector of the j-th region; κ(a_i0, a_j0) is the kernel function between the 0-th sample feature vector of the i-th region and the 0-th sample feature vector of the j-th region; κ(a_im, a_jm) is the kernel function between the m-th sample feature vector of the i-th region and the m-th sample feature vector of the j-th region;

using the second weight parameter matrix α in its frequency-domain form, the second response matrix in the frequency domain is solved, as shown in formula (14).

The target position of the previous frame is taken as the center, and the region is expanded outwards to N times the width and the height of the target vehicle to form the search area; N in this embodiment is 2.5; in the search area, one position is cyclically shifted in the horizontal and vertical directions in units of pixels to obtain the sample space to be detected, whose sample feature matrix is denoted Z₂; the sample feature matrix Z₂ and the second weight parameter matrix α are combined as in formula (14) to solve the second response matrix in the frequency domain; the index corresponding to the maximum response value in the second response matrix is the position of the center of the target vehicle in the current frame image;

wherein Z₂ is the feature matrix of the sample space of the search area; Ẑ₂ is the representation of Z₂ in the frequency domain after Fourier transform; α̂ᵢ is the representation, in the frequency domain after Fourier transform, of the parameter weight vector corresponding to the i-th area.
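The following is a hedged reading of the detection step around formula (14): the responses contributed by each of the five regions are summed in the frequency domain. The helper kernel_corr and the per-region models (alpha_hat_i and a stored appearance x_i) are assumptions standing in for the training step described above; any region weighting is omitted.

```python
import numpy as np

def detect_multi_region(z2, models, kernel_corr):
    """z2: search-region features; models: list of (alpha_hat_i, x_i) per region."""
    R_hat = np.zeros_like(np.fft.fft2(z2))
    for alpha_hat_i, x_i in models:
        k_i = kernel_corr(x_i, z2)                 # kernel correlation with region model
        R_hat = R_hat + alpha_hat_i * np.fft.fft2(k_i)
    R = np.real(np.fft.ifft2(R_hat))
    return np.unravel_index(np.argmax(R), R.shape)
```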
S3, introducing an adaptive scale model;
during vehicle tracking in a road environment, the scale of the preceding vehicle captured by the vehicle-mounted camera changes easily as the relative speed between the preceding vehicle and the current vehicle changes, and the traditional kernel correlation filter cannot adapt to large changes in the scale of the target vehicle; therefore, an adaptive scale model is introduced on the basis of traditional kernel correlation filtering.
Scale is usually handled with an image pyramid or a filter pyramid model, sampling images at different scales at certain intervals for scale estimation, but this sharply increases the amount of calculation and affects the real-time performance of the tracking algorithm; therefore, according to the characteristics of target tracking in a road environment, an additional dimension is added to handle the scale problem, with the following steps:
A plurality of images at different scales are extracted as training samples, and the scale sequence of the training samples is processed by the least square method;

the size of the target vehicle image in the current frame is W × H₁, and the number of samples extracted for the sample space is recorded as S;

preferably, the number of samples in this embodiment is 33, that is, 33 images at different scales are extracted as training samples; the scale sequence of the training samples is as shown in formula (15):

{bⁿW × bⁿH₁ | n ∈ {−⌊(S−1)/2⌋, …, ⌊(S−1)/2⌋}}  (15)

wherein b represents the scale factor, whose empirical value in this embodiment is 1.05; W represents the width of the rectangular frame region of the target vehicle; H₁ represents the height of the rectangular frame region of the target vehicle; S represents the number of samples in the sample space; ⌊·⌋ is the round-down symbol.
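The small sketch below generates the scale sequence of formula (15) with the embodiment's values S = 33 samples and scale factor b = 1.05: the n-th sample has size (bⁿW) × (bⁿH₁) for n from −⌊(S−1)/2⌋ to ⌊(S−1)/2⌋. The function name is illustrative.

```python
import numpy as np

def scale_sequence(W, H1, S=33, b=1.05):
    half = (S - 1) // 2
    exponents = np.arange(-half, S - half)            # -16 ... +16 when S = 33
    return [(b**n * W, b**n * H1) for n in exponents]

# e.g. scale_sequence(64, 48) yields 33 (width, height) pairs from the smallest
# to the largest sampled scale around the current target size 64 x 48
```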
According to formula (15), in the area around the center position of the target vehicle, the loss function for scale training is obtained by the least square method, as shown in formula (16):

wherein y' represents the label vector generated by a one-dimensional Gaussian function, h represents the scale estimation weight parameter matrix, and f is the sample set feature matrix extracted at the different scales;

formula (16) is processed with the Fourier transform and converted into the frequency domain for optimization, as shown in formula (17);

wherein Yᵢ is the value of the i-th element of the vector corresponding to the label vector y' in the frequency domain after Fourier transform; H* is the conjugate transpose, in the frequency domain, of the Fourier transform of the scale estimation weight parameter matrix h; Fᵢ is the representation, in the frequency domain, of the Fourier transform of the i-th sample feature vector in the sample feature matrix f; Σ is the summation sign, and the value of N is the same as the value of S, here N = 33.

Setting the derivative of formula (17) with respect to H* equal to zero and solving yields the third weight parameter matrix H in the frequency domain, as shown in formula (18).

The scale response matrix Rs is calculated using formula (19); the scale corresponding to the index of the maximum value in Rs is the optimal scale of the current frame; formula (19) is as follows:

Rs = F·H  (19)

Taking the center position of the target vehicle in the current frame image as the reference coordinate and the target scale of the previous frame image as the initial scale, samples at different scales are drawn according to formula (15) and the sample feature matrix is calculated and denoted f; F is the representation of the sample feature matrix f in the frequency domain after Fourier transform; F* is its conjugate transpose; Rs is the calculated scale response matrix of the target vehicle; the scale corresponding to the index of the maximum value in Rs is the optimal scale of the current frame, and combining the obtained optimal scale with the center position of the target vehicle positions the boundary of the target vehicle more accurately, thereby achieving vehicle tracking.
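To make the scale step concrete, the following is a hedged sketch of a one-dimensional scale filter in the spirit of formulas (16) to (19), assuming the usual DSST-style closed form H = conj(Y)·F / (Σₖ conj(Fₖ)·Fₖ + λ) per feature dimension; f is an (S × d) matrix whose rows are features extracted at the S scales, y_prime is the one-dimensional Gaussian label vector of length S, and the regularization value is illustrative.

```python
import numpy as np

def train_scale_filter(f, y_prime, lam=1e-2):
    F = np.fft.fft(f, axis=0)                    # per-dimension FFT over the scale axis
    Y = np.fft.fft(y_prime)[:, None]
    num = np.conj(Y) * F
    den = np.sum(np.conj(F) * F, axis=1, keepdims=True) + lam
    return num / den                             # third weight parameter matrix H

def best_scale(H, f_test, scales):
    F = np.fft.fft(f_test, axis=0)
    Rs = np.real(np.fft.ifft(np.sum(H * F, axis=1)))   # analogue of formula (19)
    return scales[int(np.argmax(Rs))]
```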
This embodiment uses the tracking algorithm based on road texture information to locate the most probable center position of the target vehicle in the image; then a plurality of image areas at different scales are sampled, a label vector is generated with a one-dimensional Gaussian function, and the kernel correlation filtering algorithm is applied again for scale learning to obtain the one-dimensional filter vector h; in the tracking stage, images at different scales around the located position are sampled, and the resulting feature vectors are evaluated against h; in the final response vector, the scale corresponding to the position with the maximum response value is the ideal target scale.
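Putting the pieces together, the following is a high-level, non-normative sketch of the per-frame loop just described; the callables passed in (init_models, locate_center, estimate_scale, update_models) are assumptions standing in for the training and detection steps detailed in this embodiment.

```python
def track(frames, init_box, init_models, locate_center, estimate_scale, update_models):
    models = None
    box = init_box
    results = []
    for t, frame in enumerate(frames):
        if t == 0:
            models = init_models(frame, box)            # train w / alpha / H on the first frame
        else:
            center = locate_center(models, frame, box)   # road-texture-aware position step
            box = estimate_scale(models, frame, center)  # multi-scale step, formula (19) analogue
            models = update_models(models, frame, box)   # optional online model update
        results.append(box)
    return results
```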
Multiple experiments show that when the proportion λ₁ of the road surface texture context information takes the value 0.6, compared with traditional target-only tracking algorithms, the average overlap between the output rectangular frame of the algorithm and the ground-truth target rectangle is the largest, the average number of tracking failures is comparatively small, and the average expected overlap is the largest.
A multi-scale vehicle tracking device based on road texture context information comprises:
a first obtaining module 301, configured to obtain a center position of a target vehicle in a linear space road texture condition;
The first obtaining module 301 is further configured to: process a plurality of background areas of the target vehicle by the least square method to obtain a ridge regression formula, and merge and simplify the ridge regression terms corresponding to the plurality of background areas by cyclic shift; solve the simplified ridge regression formula to obtain the first weight parameter matrix ŵ in the frequency domain; use the first weight parameter matrix to obtain a first response matrix R, and perform Fourier transform on R to obtain the first response matrix R̂ in the frequency domain; the index corresponding to the maximum response value in R̂ is the center position of the target vehicle;
the obtained ridge regression formula is shown as formula (1):

J(w) = ||A₀w − y₀||² + λ₁||A₁w − y₀||² + λ₂·Σ(i=2..k₁) ||Aᵢw||² + λ₃||w||²  (1)

wherein ||·||² denotes the squared two-norm; A₀ is the feature matrix of the samples obtained by cyclic shift of the target vehicle region; A₁ is the feature matrix of the samples obtained by cyclic shift of the road surface area under the target vehicle; Aᵢ is the feature matrix of the samples obtained by cyclic shift of the background area to the left of, above, or to the right of the target vehicle;

λ₁ represents the proportion of the road surface texture information in the training process; λ₂ represents the proportion of the noise areas corresponding to Aᵢ in the training process; λ₃ is a regularization parameter that controls the complexity of the first weight parameter matrix; y₀ is the two-dimensional Gaussian label matrix of the target vehicle; w is the first weight parameter matrix to be regressed; k₁ is the number of background areas around the target vehicle;

formula (1) is divided into the following parts: the first part, ||A₀w − y₀||², trains the target area as a positive sample; the second part, λ₁||A₁w − y₀||², trains the road area under the target as a positive sample, with the parameter λ₁ controlling its contribution to the loss; the third part, λ₂·Σ||Aᵢw||², trains the areas to the left of, above and to the right of the target vehicle as noise, with the parameter λ₂ controlling its contribution to the loss; the fourth part, λ₃||w||², controls the complexity of the weight parameters through regularization, with the parameter λ₃ controlling its contribution to the loss;

the areas corresponding to formula (1) are then merged to obtain the simplified formula (2):

J(w) = ||Bw − ȳ||² + λ₃||w||²  (2)

wherein B is the circulant matrix and ȳ is the label matrix stacking the label values corresponding to the samples in the sample space of each part;

setting the derivative of formula (2) with respect to w to zero gives the first weight parameter matrix as formula (4):

w = (BᵀB + λ₃I)⁻¹Bᵀȳ  (4)

wherein Bᵀ is the transpose of B and I denotes the identity matrix;

Fourier transform is applied to both sides of formula (4) to obtain the first weight parameter matrix ŵ in the frequency domain, as shown in formula (5), wherein aᵢ is the vector formed by the first row of the sample feature matrix in the sample space corresponding to the i-th area; ⊙ denotes the dot (element-wise) product; âᵢ is the representation of the vector aᵢ in the frequency domain after Fourier transform; âᵢ* is its conjugate transpose; and ŷ is the representation of the label matrix in the frequency domain. The first response matrix R̂ in the frequency domain is acquired as follows:

taking the position of the target vehicle in the previous frame as the center, the region is expanded outwards to N times the width and N times the height of the target vehicle, and the expanded region is taken as the search region;

according to the property of the circulant matrix, cyclic shifts are performed in the search region in the horizontal and vertical directions in units of pixels; each cyclic shift yields one sample to be detected, the samples to be detected form the sample space to be detected, and the sample feature matrix formed by the feature values of this sample space is denoted Z₁; the sample feature matrix Z₁ is multiplied by the first weight parameter matrix w to obtain the first response matrix R, as shown in formula (6):

R = Z₁w  (6)

Fourier transform is then performed on the first response matrix R to obtain formula (7), which is as follows:

R̂ = Ẑ₁ ⊙ ŵ  (7)

wherein Ẑ₁ is the form of the sample feature matrix Z₁ in the frequency domain after Fourier transform; ŵ is the form of the first weight parameter matrix w in the frequency domain after Fourier transform; R̂ is the form of the response matrix R in the frequency domain after Fourier transform; obtaining R̂ is equivalent to obtaining the first response matrix R, from which the center position of the target vehicle is determined.
A second obtaining module 302, configured to obtain the center position of the target vehicle in the dual space in the case of nonlinear-space road texture;
The second obtaining module 302 is further configured to: map the samples in the nonlinear sample space to a linearly separable dual space through the kernel mapping function φ(·), and combine the feature vectors of all samples in the dual space to obtain a dual-space weight parameter matrix w_Dual; train the dual-space weight parameter matrix w_Dual to obtain a second weight parameter matrix α̂ in the frequency domain; solve the second response matrix in the frequency domain; the index corresponding to the maximum response value is the center position of the target vehicle;

after the several regions of the target vehicle are mapped into the dual space through the kernel mapping function φ(·), the corresponding sample space is as shown in formula (8):

wherein i denotes an integer greater than or equal to zero and less than the number of target vehicle regions; a_im is the feature vector of the m-th sample of the i-th region; Aᵢ is the feature matrix formed by the samples of the i-th region; φ(Aᵢ) is the representation of the feature matrix Aᵢ mapped into the linearly separable dual space; φ(B) is the representation of the circulant matrix B mapped into the linearly separable dual space; φ(a_im) is the representation of the feature vector of the m-th sample of the i-th region mapped into the linearly separable dual space;

the dual-space weight parameter matrix w_Dual is calculated using formula (9):

w_Dual = Σ(i=0..5m−1) αᵢ·φ(bᵢ)  (9)

wherein α denotes the second weight parameter matrix; αᵢ denotes the i-th column vector in the second weight parameter matrix; bᵢ denotes the feature vector of the i-th sample; φ(bᵢ) denotes the feature vector of the i-th sample mapped into the high-dimensional linear space; 5m denotes the dimension of the second weight parameter matrix;

the second weight parameter matrix α̂ in the frequency domain is obtained as follows:

the second weight parameter matrix α is solved in conjunction with the loss function J(α), which is shown in formula (10);

wherein K₂ is the kernel matrix; the kernel function κ(·,·) is chosen so that the elements of the matrix K₂ change only in their order, which guarantees that K₂ remains a circulant matrix;

the solution is then substituted, and the frequency-domain representation α̂ of the second weight parameter matrix α in the dual space is obtained, as shown in formula (12):

wherein Δᵢⱼ = diag(mᵢⱼ + λ₃), {(i,j) ∈ {(0,0)}}

Δᵢⱼ = λ₁·diag(mᵢⱼ + λ₃), {(i,j) ∈ {(1,1)}}

Δᵢⱼ = λ₂·diag(mᵢⱼ + λ₃), {(i,j) ∈ {(2,2),(3,3),(4,4)}}

and diag(mᵢⱼ) is the diagonal matrix corresponding to the i-th row and j-th column block of the block-diagonal matrix, as shown in formula (13);

wherein a_i0 is the 0-th sample feature vector of the i-th region; a_j0 is the 0-th sample feature vector of the j-th region; a_jm is the m-th sample feature vector of the j-th region; κ(a_i0, a_j0) is the kernel function between the 0-th sample feature vector of the i-th region and the 0-th sample feature vector of the j-th region; κ(a_im, a_jm) is the kernel function between the m-th sample feature vector of the i-th region and the m-th sample feature vector of the j-th region;

Z₂ denotes the following sample feature matrix: taking the position of the target vehicle in the previous frame as the center, the region is expanded outwards to N times the width and N times the height of the target vehicle, and the expanded region is taken as the search region; in the search region, cyclic shifts are performed horizontally and vertically in units of pixels to obtain the sample space to be detected, whose sample feature matrix is denoted Z₂; Ẑ₂ is the representation of Z₂ in the frequency domain after Fourier transform; α̂ᵢ is the representation, in the frequency domain after Fourier transform, of the parameter weight vector corresponding to the i-th area; the second response matrix in the frequency domain is thus obtained, and the index of its maximum response value is the center position of the target vehicle;
the scale module 303 is configured to obtain a central position of the target vehicle, and combine the central position with the optimal scale of the image of the current frame to obtain a more accurate position of the target vehicle;
The scale module 303 is further configured to: extract images of the target vehicle at a plurality of different scales as training samples; train on the scale sequence of the training samples by the least square method to obtain a loss function J(h); process the loss function J(h) by Fourier transform to obtain J(H*); solve J(H*) to obtain a third weight parameter matrix H in the frequency domain; then obtain the scale response matrix Rs; the scale corresponding to the index of the maximum value in Rs is the optimal image scale of the current frame, from which the more accurate target vehicle position is obtained;
let the size of the image of the target vehicle in the current frame be W × H₁; a plurality of images at different scales are extracted as training samples, whose number is recorded as S; the scale sequence of the training samples is shown in formula (15):

{bⁿW × bⁿH₁ | n ∈ {−⌊(S−1)/2⌋, …, ⌊(S−1)/2⌋}}  (15)

wherein b represents the scale factor; W represents the width of the rectangular frame region of the target vehicle; H₁ represents the height of the rectangular frame region of the target vehicle; S represents the number of samples in the sample space; ⌊·⌋ is the round-down symbol;

the loss function obtained from formula (15) by the least square method is shown in formula (16):

wherein y' represents the label vector generated by a one-dimensional Gaussian function, h represents the scale estimation weight parameter matrix, and f represents the sample set feature matrix extracted at the different scales;

Fourier transform is applied to J(h) to obtain J(H*), as shown in formula (17):

wherein Yᵢ is the value of the i-th element of the vector corresponding to the label vector y' in the frequency domain after Fourier transform; H* is the conjugate transpose, in the frequency domain, of the Fourier transform of the scale estimation weight parameter matrix h; Fᵢ is the representation, in the frequency domain, of the Fourier transform of the i-th sample feature vector in the sample feature matrix f; Σ is the summation symbol, and the value of N is the same as that of S;

setting ∂J(H*)/∂H* = 0 and solving yields the third weight parameter matrix H in the frequency domain, as shown in formula (18);

the scale response matrix Rs is calculated using formula (19); the scale corresponding to the index of the maximum value in Rs is the optimal scale of the current frame; formula (19) is as follows:

Rs = F·H  (19)

wherein F is the representation of the sample feature matrix f in the frequency domain after Fourier transform; F* is its conjugate transpose; Rs is the calculated scale response matrix of the target vehicle; the scale corresponding to the index of the maximum value in Rs is the optimal scale of the current frame, from which the more accurate position of the target vehicle is obtained.
In the description of the present invention, unless otherwise expressly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.
Claims (8)
1. A multi-scale vehicle tracking method based on contextual road texture information is characterized by comprising the following steps:
S1, acquiring the central position of the target vehicle under the condition of linear-space road texture;
the step S1 includes: processing a plurality of background areas of the target vehicle by the least-squares method to obtain a ridge regression formula, and performing cyclic-shift merging and simplification on the ridge regression formula parts corresponding to the plurality of background areas; calculating the simplified ridge regression formula to obtain a first weight parameter matrix in the frequency domain; using the first weight parameter matrix to obtain a first response matrix R; carrying out Fourier transform on the first response matrix R to obtain the first response matrix in the frequency domain; and calculating the index corresponding to the maximum response value in the frequency-domain first response matrix, namely the center position of the target vehicle;
wherein, the plurality of background areas comprise road texture areas;
the step S1 further includes:
the obtained ridge regression formula is shown as the formula (1):
wherein ‖·‖ denotes the two-norm; A0 is the feature matrix of the samples obtained by cyclic shift of the target vehicle; A1 is the feature matrix of the samples obtained by cyclic shift of the road-surface area under the target vehicle; and Ai is the feature matrix of the samples obtained by cyclic shift of a background area to the left of, above, or to the right of the target vehicle;
λ1 represents the proportion of the road-surface texture area information in the training process; λ2 represents the proportion, in the training process, of the noise area corresponding to Ai; λ3 represents a regularization parameter controlling the complexity of the first weight parameter matrix; y0 represents the two-dimensional Gaussian matrix label of the target vehicle; w represents the first weight parameter matrix to be regressed; and k1 represents the number of background areas around the target vehicle;
the formula (1) is divided into the following parts: the first part represents training with the target area as a positive sample; the second part represents training with the road area under the target as a positive sample, with the parameter λ1 controlling its degree of contribution to the loss; the third part represents training with the left, upper and right regions of the target vehicle as noise, with the parameter λ2 controlling its degree of contribution to the loss; and the fourth part is a regularization term that controls the complexity of the weight parameter through training, with the parameter λ3 controlling its degree of contribution to the loss;
then, combining the areas corresponding to the formula (1) to obtain a simplified formula (2):
wherein B is a circulant matrix, and the label matrix represents the label values corresponding to the samples in the sample space of each part;
wherein BT is the transpose of B, I denotes the identity matrix, and the derivative of equation (2) with respect to w is set equal to zero;
Fourier transformation is carried out on both sides of equation (4) to obtain the first weight parameter matrix in the frequency domain, as follows:
wherein ai is the vector formed by the first row of the sample feature matrix in the sample space corresponding to the i-th area; the product symbol denotes a dot-product (element-wise) operation; the remaining terms are, respectively, the frequency-domain form of the vector ai after Fourier transform processing, its conjugate transpose, and the frequency-domain representation of the label matrix; the acquisition method of the first response matrix in the frequency domain comprises the following steps:
taking the position of the target vehicle in the previous frame as the center, expanding outwards by N times the width and N times the height of the target vehicle, and taking the expanded region as the search region;
according to the properties of the circulant matrix, cyclic shifts are carried out within the search area in the horizontal and vertical directions in units of pixels; each cyclic shift yields a sample to be detected, the samples to be detected form a sample space to be detected, and the sample feature matrix formed by the feature values of the sample space to be detected is denoted by Z1; the sample feature matrix Z1 is subjected to a matrix operation with the first weight parameter matrix w to obtain the first response matrix R, as shown in equation (6)
R=Z1w (6)
and performing Fourier transform on the first response matrix R to obtain equation (7), which is as follows:
wherein the terms in equation (7) are, respectively, the frequency-domain form of the sample feature matrix Z1 after Fourier transform, the frequency-domain form of the first weight parameter matrix w after Fourier transform, and the frequency-domain form of the response matrix R after Fourier transform processing; the first response matrix R is thereby obtained and the central position of the target vehicle is determined;
S2, for the condition of nonlinear-space road texture, acquiring the central position of the target vehicle in the dual space;
and S3, obtaining the central position of the target vehicle, and combining the central position with the optimal scale of the image of the current frame to obtain a more accurate position of the target vehicle.
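As a rough illustration of the step S1 recited in claim 1, the sketch below trains a single-channel, frequency-domain ridge-regression filter that treats the target patch and the road patch below it as positive samples and the left/top/right patches as suppressed noise, then locates the target as the argmax of the response map of equations (6)-(7). Regressing the road patch to the same Gaussian label, the weights lam1/lam2/lam3, and the use of raw grayscale patches as features are assumptions of this sketch, not requirements of the claim.

```python
import numpy as np

def gaussian_label(h, w, sigma=2.0):
    # Two-dimensional Gaussian matrix label y0, peaked at the patch center.
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - h / 2) ** 2 + (xs - w / 2) ** 2) / (2 * sigma ** 2))

def train_context_filter(target, road, noise_patches, y0,
                         lam1=0.5, lam2=0.25, lam3=1e-3):
    # Per-element closed-form solution in the Fourier domain, in the spirit of
    # equations (1)-(5): the target and road patches act as positive samples,
    # while the left/top/right patches only enlarge the denominator (noise).
    A0, A1, Y0 = np.fft.fft2(target), np.fft.fft2(road), np.fft.fft2(y0)
    num = np.conj(A0) * Y0 + lam1 * np.conj(A1) * Y0
    den = np.conj(A0) * A0 + lam1 * np.conj(A1) * A1 + lam3
    for p in noise_patches:
        Ai = np.fft.fft2(p)
        den = den + lam2 * np.conj(Ai) * Ai
    return num / den          # first weight parameter matrix, frequency domain

def detect(search_patch, w_hat):
    # Equations (6)-(7): response R = ifft(Z_hat * w_hat); the argmax of R
    # gives the target center inside the search region.
    R = np.fft.ifft2(np.fft.fft2(search_patch) * w_hat).real
    return R, np.unravel_index(np.argmax(R), R.shape)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    target = rng.standard_normal((64, 64))              # placeholder patches
    road = rng.standard_normal((64, 64))
    noise = [rng.standard_normal((64, 64)) for _ in range(3)]
    y0 = gaussian_label(64, 64)
    w_hat = train_context_filter(target, road, noise, y0)
    R, center = detect(target, w_hat)
    print("response peak at", center)                   # near the label peak
```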
2. The method for multi-scale vehicle tracking based on contextual road texture information according to claim 1, wherein said step S2 comprises: mapping the samples in the nonlinear sample space into a linearly separable dual space through a kernel function; combining the feature vectors of all samples in the dual space to obtain a dual-space weight parameter matrix wDual; training the dual-space weight parameter matrix wDual to obtain a second weight parameter matrix in the frequency domain; and solving a second response matrix in the frequency domain, wherein the index corresponding to the maximum response value is the center position of the target vehicle.
3. The method for multi-scale vehicle tracking based on contextual road texture information according to claim 2, wherein said step S2 further comprises:
several regions of the target vehicle, after being mapped into the dual space through the kernel function, correspond to the sample space shown in equation (8):
wherein i represents an integer greater than or equal to zero and less than the number of target vehicle regions; aim is the feature vector of the m-th sample of the i-th region; Ai is the feature matrix formed by the samples of the i-th region; the remaining terms are, respectively, the representation of the feature matrix Ai mapped into the linearly separable dual space, the representation of the circulant matrix B mapped into the linearly separable dual space, and the representation of the feature vector of the m-th sample of the i-th region mapped into the linearly separable dual space;
the dual-space weight parameter matrix wDual is calculated using equation (9), which is as follows:
wherein the first term in equation (9) denotes a transpose; α represents the second weight parameter matrix; αi represents the i-th column vector in the second weight parameter matrix; bi represents the feature vector of the i-th sample; its mapped form represents the feature vector of the i-th sample mapped into the high-dimensional linear space; and 5m represents the dimension of the second weight parameter matrix;
the method for obtaining the second weight parameter matrix in the frequency domain comprises the following steps:
the second weight parameter matrix α is solved in conjunction with the loss function J(α), which is shown in equation (10):
wherein K2 is a matrix; the kernel function is selected so that the elements of the matrix K2 vary in a cyclic order, ensuring that the matrix K2 remains a circulant matrix;
the result is then substituted into the corresponding expression to obtain the frequency-domain representation of the second weight parameter matrix α in the dual space, as shown in equation (12):
wherein Δij = diag(mij + λ3) for (i,j) ∈ {(0,0)};
Δij = λ1·diag(mij + λ3) for (i,j) ∈ {(1,1)};
Δij = λ2·diag(mij + λ3) for (i,j) ∈ {(2,2),(3,3),(4,4)};
diag(mij) is the diagonal matrix corresponding to the i-th row and j-th column block of the block-diagonal matrix, as shown in equation (13):
wherein ai0 is the 0-th sample feature vector of the i-th region; aj0 is the 0-th sample feature vector of the j-th region; ajm is the m-th sample feature vector of the j-th region; k(ai0, aj0) represents the kernel function between the 0-th sample feature vector of the i-th region and the 0-th sample feature vector of the j-th region; and k(aim, ajm) represents the kernel function between the m-th sample feature vector of the i-th region and the m-th sample feature vector of the j-th region.
4. The multi-scale vehicle tracking method based on contextual road texture information according to claim 3, wherein the second response matrix in the frequency domain is solved using the calculation formula in equation (14):
wherein Z2 represents the sample feature matrix of the sample space to be detected, which is obtained by taking the position of the target vehicle in the previous frame as the center, expanding outwards by N times the width and N times the height of the target vehicle, taking the expanded area as the search region, and performing horizontal and vertical cyclic shifts in units of pixels within the search region; the remaining terms are the frequency-domain representation of Z2 after Fourier transform and the frequency-domain representation, after Fourier transform, of the parameter weight vector corresponding to the i-th region.
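Claims 2-4 move the same regression into a kernel-induced dual space so that a nonlinear mapping can be used while the circulant structure is preserved. The sketch below is a single-region, KCF-style stand-in under stated assumptions: a Gaussian kernel with an assumed bandwidth sigma, a single training patch instead of the five-region block construction of equations (8)-(13), and the response computed in the manner of equation (14).

```python
import numpy as np

def gaussian_kernel_correlation(x, z, sigma=0.5):
    # Kernel correlation between all cyclic shifts of x and z, evaluated in the
    # Fourier domain; a shift-invariant kernel keeps the kernel matrix circulant.
    c = np.fft.ifft2(np.fft.fft2(x) * np.conj(np.fft.fft2(z))).real
    d = (np.sum(x ** 2) + np.sum(z ** 2) - 2.0 * c) / x.size
    return np.exp(-np.maximum(d, 0.0) / (sigma ** 2))

def train_dual(x, y, lam=1e-3, sigma=0.5):
    # Frequency-domain dual coefficients (second weight parameter matrix):
    # alpha_hat = y_hat / (k_hat(x, x) + lambda), a single-region stand-in for
    # the block solution of equations (10)-(12).
    kxx = gaussian_kernel_correlation(x, x, sigma)
    return np.fft.fft2(y) / (np.fft.fft2(kxx) + lam)

def detect_dual(alpha_hat, x, z, sigma=0.5):
    # Second response matrix in the spirit of equation (14):
    # R = ifft(k_hat(x, z) * alpha_hat); its argmax is the target center.
    kxz = gaussian_kernel_correlation(x, z, sigma)
    R = np.fft.ifft2(np.fft.fft2(kxz) * alpha_hat).real
    return R, np.unravel_index(np.argmax(R), R.shape)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    x = rng.standard_normal((64, 64))                      # training patch
    ys, xs = np.mgrid[0:64, 0:64]
    y = np.exp(-((ys - 32) ** 2 + (xs - 32) ** 2) / 8.0)   # 2-D Gaussian label
    alpha_hat = train_dual(x, y)
    R, center = detect_dual(alpha_hat, x, x)
    print("response peak at", center)
```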
5. The method for multi-scale vehicle tracking based on contextual road texture information according to claim 1, wherein said step S3 comprises: extracting a plurality of images of the target vehicle at different scales as training samples; training the scale sequence of the training samples by the least-squares method to obtain a loss function J(H); processing the loss function J(H) by Fourier transform to obtain J(H*); calculating J(H*) to obtain a third weight parameter matrix H in the frequency domain; and then obtaining a scale response matrix Rs, wherein the scale corresponding to the index of the maximum value in the scale response matrix Rs is the optimal image scale of the current frame, so that the more accurate target vehicle position is obtained.
6. The method for multi-scale vehicle tracking based on contextual road texture information according to claim 5, wherein said step S3 further comprises:
let the size of the image of the target vehicle in the current frame be W × H1; a plurality of images at different scales are extracted as training samples and denoted S, and the scale sequence of the training samples is shown in equation (15):
wherein b represents a scale factor; W represents the width of the rectangular frame area of the target vehicle; H1 represents the height of the rectangular frame area of the target vehicle; S represents the number of samples in the sample space; and ⌊·⌋ is the round-down (floor) symbol;
the loss function obtained by calculating equation (15) by the least square method is shown in equation (16):
wherein y' represents a label vector generated by a one-dimensional Gaussian function, h represents a scale estimation weight parameter matrix, and f represents a sample set characteristic matrix extracted at different scales;
a Fourier transform is carried out on J(H) to obtain J(H*), as shown in equation (17):
wherein Yi is the value of the i-th element of the vector corresponding to the label vector y' in the frequency domain after Fourier transform; H* is the conjugate transpose, in the frequency domain, of the scale-estimation weight parameter matrix h after Fourier transform; Fi is the representation form, in the frequency domain after Fourier transform, of the i-th sample feature vector in the sample feature matrix f; Σ is the summation symbol, and the value of N is the same as that of S;
solving the function J(H*) yields the third weight parameter matrix H; the scale response matrix Rs is calculated using equation (19), and the scale corresponding to the index of the maximum value in Rs is the optimal scale of the current frame; equation (19) is as follows:
Rs=F·H (19)
wherein F is the representation form, in the frequency domain after Fourier transform, of the sample feature matrix f; H is the third weight parameter matrix in the frequency domain (in its conjugate-transpose form); Rs represents the calculated scale response matrix of the target vehicle, and the scale corresponding to the index of the maximum value in the scale response matrix Rs is the optimal scale of the current frame, so that the more accurate position of the target vehicle can be obtained.
8. A multi-scale vehicle tracking device based on contextual road texture information, comprising:
a first acquisition module (301) for acquiring a center position of a target vehicle in the case of a linear space road texture;
the first acquisition module (301) comprises: processing a plurality of background areas of the target vehicle by the least-squares method to obtain a ridge regression formula, and performing cyclic-shift merging and simplification on the ridge regression formula parts corresponding to the plurality of background areas; calculating the simplified ridge regression formula to obtain a first weight parameter matrix in the frequency domain; using the first weight parameter matrix to obtain a first response matrix R; carrying out Fourier transform on the first response matrix R to obtain the first response matrix in the frequency domain; and calculating the index corresponding to the maximum response value in the frequency-domain first response matrix, namely the center position of the target vehicle;
wherein, the plurality of background areas comprise road texture areas;
the first acquisition module (301) further comprises:
the obtained ridge regression formula is shown as the formula (1):
wherein ‖·‖ denotes the two-norm; A0 is the feature matrix of the samples obtained by cyclic shift of the target vehicle; A1 is the feature matrix of the samples obtained by cyclic shift of the road-surface area under the target vehicle; and Ai is the feature matrix of the samples obtained by cyclic shift of a background area to the left of, above, or to the right of the target vehicle;
λ1 represents the proportion of the road-surface texture area information in the training process; λ2 represents the proportion, in the training process, of the noise area corresponding to Ai; λ3 represents a regularization parameter controlling the complexity of the first weight parameter matrix; y0 represents the two-dimensional Gaussian matrix label of the target vehicle; w represents the first weight parameter matrix to be regressed; and k1 represents the number of background areas around the target vehicle;
the formula (1) is divided into the following parts: the first part represents training with the target area as a positive sample; the second part represents training with the road area under the target as a positive sample, with the parameter λ1 controlling its degree of contribution to the loss; the third part represents training with the left, upper and right regions of the target vehicle as noise, with the parameter λ2 controlling its degree of contribution to the loss; and the fourth part is a regularization term that controls the complexity of the weight parameter through training, with the parameter λ3 controlling its degree of contribution to the loss;
then, combining the areas corresponding to the formula (1) to obtain a simplified formula (2):
wherein B is a circulant matrix, and the label matrix represents the label values corresponding to the samples in the sample space of each part;
wherein BT is the transpose of B, I denotes the identity matrix, and the derivative of equation (2) with respect to w is set equal to zero;
Fourier transformation is carried out on both sides of equation (4) to obtain the first weight parameter matrix in the frequency domain, as follows:
wherein ai is the vector formed by the first row of the sample feature matrix in the sample space corresponding to the i-th area; the product symbol denotes a dot-product (element-wise) operation; the remaining terms are, respectively, the frequency-domain form of the vector ai after Fourier transform processing, its conjugate transpose, and the frequency-domain representation of the label matrix; the acquisition method of the first response matrix in the frequency domain comprises the following steps:
taking the position of the target vehicle in the previous frame as the center, expanding outwards by N times the width and N times the height of the target vehicle, and taking the expanded region as the search region;
according to the properties of the circulant matrix, cyclic shifts are carried out within the search area in the horizontal and vertical directions in units of pixels; each cyclic shift yields a sample to be detected, the samples to be detected form a sample space to be detected, and the sample feature matrix formed by the feature values of the sample space to be detected is denoted by Z1; the sample feature matrix Z1 is subjected to a matrix operation with the first weight parameter matrix w to obtain the first response matrix R, as shown in equation (6)
R=Z1w (6)
and performing Fourier transform on the first response matrix R to obtain equation (7), which is as follows:
wherein the terms in equation (7) are, respectively, the frequency-domain form of the sample feature matrix Z1 after Fourier transform, the frequency-domain form of the first weight parameter matrix w after Fourier transform, and the frequency-domain form of the response matrix R after Fourier transform processing; the first response matrix R is thereby obtained and the central position of the target vehicle is determined;
a second acquisition module (302) for acquiring the central position of the target vehicle in the dual space in the case of nonlinear-space road texture;
and the scale module (303) is used for obtaining the central position of the target vehicle and combining the central position with the optimal scale of the image of the current frame to obtain a more accurate position of the target vehicle.
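Putting the three modules of the device claim together, a per-frame update can be organised as sketched below. This is a structural sketch only: position_module stands for the first acquisition module (301) or, for nonlinear road texture, the second acquisition module (302); scale_module stands for the scale module (303); and the search-region factor n = 2.5 and all function names are assumptions made here for illustration.

```python
import numpy as np

def track_frame(frame, prev_center, prev_size, position_module, scale_module, n=2.5):
    # One tracking step: crop a search region n times the target size around the
    # previous center, let the position module return a response map whose argmax
    # is the new center, then let the scale module refine the box size there.
    (cy, cx), (w, h) = prev_center, prev_size
    sh, sw = int(n * h), int(n * w)
    y0 = min(max(cy - sh // 2, 0), frame.shape[0] - sh)
    x0 = min(max(cx - sw // 2, 0), frame.shape[1] - sw)
    search = frame[y0:y0 + sh, x0:x0 + sw]

    response = position_module(search)                      # module 301 or 302
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    new_center = (y0 + dy, x0 + dx)

    new_size = scale_module(frame, new_center, prev_size)   # module 303
    return new_center, new_size

if __name__ == "__main__":
    frame = np.zeros((240, 320))
    frame[100, 150] = 1.0                                    # toy response peak
    center, size = track_frame(
        frame, prev_center=(96, 144), prev_size=(40, 24),
        position_module=lambda patch: patch,                 # identity stand-ins
        scale_module=lambda f, c, s: s)
    print(center, size)
```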
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910514897.2A CN110097579B (en) | 2019-06-14 | 2019-06-14 | Multi-scale vehicle tracking method and device based on pavement texture context information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110097579A CN110097579A (en) | 2019-08-06 |
CN110097579B true CN110097579B (en) | 2021-08-13 |
Family
ID=67450839
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910514897.2A Active CN110097579B (en) | 2019-06-14 | 2019-06-14 | Multi-scale vehicle tracking method and device based on pavement texture context information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110097579B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112150549B (en) * | 2020-09-11 | 2023-12-01 | 珠海一微半导体股份有限公司 | Visual positioning method based on ground texture, chip and mobile robot |
CN112233143B (en) * | 2020-12-14 | 2021-05-11 | 浙江大华技术股份有限公司 | Target tracking method, device and computer readable storage medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101294801A (en) * | 2007-07-13 | 2008-10-29 | 东南大学 | Vehicle distance measuring method based on binocular vision |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10319094B1 (en) * | 2016-05-20 | 2019-06-11 | Ccc Information Services Inc. | Technology for capturing, transmitting, and analyzing images of objects |
CN108288020A (en) * | 2017-12-11 | 2018-07-17 | 上海交通大学 | Video shelter detecting system based on contextual information and method |
CN108288062A (en) * | 2017-12-29 | 2018-07-17 | 中国电子科技集团公司第二十七研究所 | A kind of method for tracking target based on core correlation filtering |
CN108510521A (en) * | 2018-02-27 | 2018-09-07 | 南京邮电大学 | A kind of dimension self-adaption method for tracking target of multiple features fusion |
CN108765458A (en) * | 2018-04-16 | 2018-11-06 | 上海大学 | High sea situation unmanned boat sea-surface target dimension self-adaption tracking based on correlation filtering |
CN108776974A (en) * | 2018-05-24 | 2018-11-09 | 南京行者易智能交通科技有限公司 | A kind of real-time modeling method method suitable for public transport scene |
CN109034193A (en) * | 2018-06-20 | 2018-12-18 | 上海理工大学 | Multiple features fusion and dimension self-adaption nuclear phase close filter tracking method |
CN109360224A (en) * | 2018-09-29 | 2019-02-19 | 吉林大学 | A kind of anti-shelter target tracking merging KCF and particle filter |
CN109584267A (en) * | 2018-11-05 | 2019-04-05 | 重庆邮电大学 | A kind of dimension self-adaption correlation filtering tracking of combination background information |
CN109685073A (en) * | 2018-12-28 | 2019-04-26 | 南京工程学院 | A kind of dimension self-adaption target tracking algorism based on core correlation filtering |
Non-Patent Citations (3)
Title |
---|
Matthias Mueller. Context-Aware Correlation Filter Tracking. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, 1387-1395. * |
Matthias Mueller. Learning Spatially Regularized Correlation Filters for Visual Tracking. 2015 IEEE International Conference on Computer Vision (ICCV), 2016, 4311-4318. * |
Lin Qing. A scale-adaptive kernel correlation filter tracking algorithm. Wanfang Data Knowledge Service Platform, 2018, 146-150. * |
Also Published As
Publication number | Publication date |
---|---|
CN110097579A (en) | 2019-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109285179B (en) | Moving target tracking method based on multi-feature fusion | |
CN111080675B (en) | Target tracking method based on space-time constraint correlation filtering | |
CN107358623B (en) | Relevant filtering tracking method based on significance detection and robustness scale estimation | |
Wang et al. | SSRNet: In-field counting wheat ears using multi-stage convolutional neural network | |
CN107341820B (en) | A kind of fusion Cuckoo search and the mutation movement method for tracking target of KCF | |
CN109685073A (en) | A kind of dimension self-adaption target tracking algorism based on core correlation filtering | |
CN107680116B (en) | Method for monitoring moving target in video image | |
CN107633226B (en) | Human body motion tracking feature processing method | |
CN110175649B (en) | Rapid multi-scale estimation target tracking method for re-detection | |
CN107016689A (en) | A kind of correlation filtering of dimension self-adaption liquidates method for tracking target | |
CN111563915A (en) | KCF target tracking method integrating motion information detection and Radon transformation | |
CN111260738A (en) | Multi-scale target tracking method based on relevant filtering and self-adaptive feature fusion | |
CN114627447A (en) | Road vehicle tracking method and system based on attention mechanism and multi-target tracking | |
CN110097579B (en) | Multi-scale vehicle tracking method and device based on pavement texture context information | |
CN109685830A (en) | Method for tracking target, device and equipment and computer storage medium | |
CN112785622A (en) | Long-time tracking method and device for unmanned ship on water surface and storage medium | |
CN115359407A (en) | Multi-vehicle tracking method in video | |
CN111931722B (en) | Correlated filtering tracking method combining color ratio characteristics | |
CN109584267A (en) | A kind of dimension self-adaption correlation filtering tracking of combination background information | |
CN110751670B (en) | Target tracking method based on fusion | |
CN112991394B (en) | KCF target tracking method based on cubic spline interpolation and Markov chain | |
CN110827327B (en) | Fusion-based long-term target tracking method | |
Bourennane et al. | An enhanced visual object tracking approach based on combined features of neural networks, wavelet transforms, and histogram of oriented gradients | |
CN113537240B (en) | Deformation zone intelligent extraction method and system based on radar sequence image | |
CN116664628A (en) | Target tracking method and device based on feature fusion and loss judgment mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |