CN117576380A - Target autonomous detection tracking method and system - Google Patents
Target autonomous detection tracking method and system
- Publication number
- CN117576380A (application CN202410057608.1A)
- Authority
- CN
- China
- Prior art keywords
- target
- tracking
- detection
- response
- image
- Prior art date
- 2024-01-16
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention relates to the technical field of target tracking and discloses a target autonomous detection and tracking method and system, wherein the method comprises the following steps: S1, target detection: cyclically detecting the image until a target is detected; S2, tracker template initialization: initializing a tracker template; S3, target tracking: tracking the target. The invention solves the problems of tracking failure and tracking drift in the prior art.
Description
Technical Field
The invention relates to the technical field of target tracking, in particular to a target autonomous detection tracking method and system.
Background
Target tracking has long been an important and challenging task in computer vision research, drawing on computer science, statistical learning, pattern recognition, machine learning, image processing, and other disciplines. With the development of computer vision and deep learning, target tracking is mainly used to track the state of a target and accurately locate its position, and has wide application in security monitoring, intelligent transportation, human-computer interaction, aerospace, military reconnaissance, and other fields.
Object tracking is a fundamental problem in computer vision: determining the continuous position of an object of interest in a video sequence, i.e., acquiring parameters of the moving object such as position, size, speed, acceleration, and motion trajectory, so that its behavior can be further processed, analyzed, and understood, and higher-level tasks can be completed.
Target tracking can be classified by the number of tracked targets into single-target tracking, multi-target tracking, and so on. Single-target tracking means determining the target to be tracked (e.g., marking it in the first frame) and then locating it in subsequent frames; it is a spatio-temporal association problem and therefore faces challenges such as target disappearance, target appearance change, background interference, and target movement. Target tracking algorithms can be divided into generative and discriminative algorithms by modeling approach, and into short-term and long-term tracking algorithms by the length of the tracking sequence.
Classical generative tracking algorithms build a target model by extracting target features, perform template matching in subsequent frames, and locate the target by iterative refinement. Discriminative tracking algorithms are the mainstream at the current stage: tracking is treated as a classification problem, and a correlation filtering framework is usually adopted. Target features and background features are extracted and modeled separately as positive and negative samples to train a classifier; in subsequent frames, candidate samples are fed to the classifier, and the sample with the maximum probability is taken as the tracked target.
However, because real scenes are complex and changeable, factors such as occlusion, target disappearance, morphological change, leaving and re-entering the field of view, rapid movement, motion blur, scale change, and illumination all affect tracking accuracy. Tracking therefore still faces many challenges in practical applications; designing a fast and robust tracking algorithm remains very difficult and is still one of the most active research areas in computer vision.
Existing target tracking schemes are prone to tracking failure when the target is occluded, deformed, blurred, or affected by illumination. Generative single-target tracking algorithms, such as the optical flow method, the MeanShift algorithm, particle filtering, and the CamShift tracking algorithm, do not consider background information, so tracking failure easily occurs under occlusion, deformation, blur, or illumination change, and such generative algorithms are also inefficient. Compared with generative algorithms, discriminative tracking algorithms based on correlation filtering, such as the CSK, KCF, and DSST algorithms, locate the target in the current image more accurately, have high recognition precision and high tracking speed, and have been widely studied. With the development of deep learning, convolutional neural networks (CNNs) have been adopted to extract image features, and neural networks have been used directly to build end-to-end Siamese-network target tracking models.
Although target tracking has gradually developed and matured, many challenges remain in real scenes. In particular, when the target is partially or completely occluded, or a moving target leaves the field of view, the tracked target is lost; the filter template then easily learns the features of the occluding obstacle and becomes contaminated, causing tracking failure or drift. In the prior art, a re-detection scheme is adopted to constrain the tracker so as to achieve long-term or stable tracking of the target.
Disadvantages of the prior art:
1. In existing correlation-filter trackers, occlusion, posture change, motion blur, or the target leaving the field of view causes model drift and target loss, and subsequent tracking and target recovery cannot be completed;
2. A standalone target tracker requires the target to be set manually in the initial frame and cannot automatically detect the target to start tracking.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a target autonomous detection and tracking method and system, which solve the problems of tracking failure and tracking drift in the prior art.
The invention solves the problems by adopting the following technical scheme:
A target autonomous detection and tracking method comprises the following steps:
S1, target detection: cyclically detecting the image until a target is detected;
S2, tracker template initialization: initializing a tracker template;
S3, target tracking: tracking the target.
As a preferred technical solution, in step S1, positioning frame information of the area where the target is located is obtained, and the center point (x, y) and width and height (w, h) of the target's positioning frame in the image are output, where x represents the x-axis coordinate of the center point, y represents the y-axis coordinate of the center point, w represents the width of the positioning frame, and h represents the height of the positioning frame.
As a preferred technical solution, step S2 includes the following steps:
s21, determining the size of a target search area by using the (x, y, w, h) parameters of the detected positioning frame and the set filling parameters, and creating an initialized cosine window and a Gaussian ideal response label according to the size of the target search area;
s22, extracting gray scale features of the region where the target is located, and multiplying a cosine window by the extracted feature map to solve the edge effect;
s23, the obtained result of multiplying the gray features by the cosine window is converted into a frequency domain through Fourier transformation and multiplied by the Gaussian ideal response label after Fourier transformation, and therefore initialization of the tracker template is completed.
As a preferred technical solution, step S3 includes the following steps:
s31, obtaining frequency domain characteristics of a target search area;
s32, obtaining the positioning point coordinates and the response values of the targets in the current frame image;
s33, after the target position of the current frame is obtained, gray features of a target search area are extracted by taking the target position as a center, and a template is updated after weighted average is carried out on the tracker template to be used as a tracking filtering model of the next frame.
As a preferred technical solution, in step S31, image data of the current frame is read in real time; an image the size of the target search area from the initialization stage is cropped, centered on the target center point predicted by tracking in the previous frame image; grayscale features of the target area are extracted; and Fourier transformation is performed after multiplication with the cosine window to obtain the frequency-domain features of the target search area.
As a preferred technical solution, in step S32, a matching calculation is performed with the tracking filter model; the calculation result is subjected to inverse Fourier transformation to obtain a response map of the search area; the coordinates of the maximum peak in the response map are the center coordinates of the target's predicted position; and these coordinates are converted back to the original image to obtain the positioning point coordinates of the target in the current frame image and its response value.
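The localization and update steps S31 to S33 can be sketched as follows, continuing the naming of the initialization sketch above. This uses a simplified template-correlation formulation rather than the full kernelized KCF solution, and the learning rate lr is an illustrative assumption.

```python
import numpy as np

def track_step(gray_patch, model_f, cos_window, lr=0.02):
    """Sketch of S31-S33: locate the peak of the response map, then update."""
    # S31: windowed grayscale features of the search area, in the frequency domain
    feat_f = np.fft.fft2(gray_patch * cos_window)

    # S32: matching calculation with the filter model; inverse FFT gives the
    # response map, whose maximum peak is the predicted target center
    response = np.real(np.fft.ifft2(model_f * np.conj(feat_f)))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    peak_value = response[dy, dx]        # the tracking response value

    # S33: weighted-average template update for the next frame's filter model
    new_model_f = (1.0 - lr) * model_f + lr * feat_f
    return (dx, dy), peak_value, new_model_f
```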
As a preferred technical solution, step S3 further includes the following steps:
s34, carrying out tracking quality evaluation on the result of each tracking prediction by using a tracking quality evaluation module, and judging whether to start a target detection module to carry out global re-detection according to the evaluated result: if the detection is judged to be needed to be re-detected, starting a target detection module to detect the target in the full-frame image range, selecting a target with the detection target output confidence higher than a set threshold as a retrieved tracking target, outputting positioning frame information of the target, and re-initializing a tracker model; otherwise, continuing the tracker to track.
As a preferred technical solution, in step S34, the tracking quality evaluation module uses the following evaluation strategy: a comprehensive evaluation judgment is made using the response value of the tracking prediction output and the average peak correlation energy.
As a preferred technical solution, in step S34, the average peak correlation energy is calculated by:

$$\mathrm{APCE} = \frac{\left|F_{\max}-F_{\min}\right|^{2}}{\mathrm{mean}\left(\sum_{w,h}\left(F_{w,h}-F_{\min}\right)^{2}\right)}$$

where $\mathrm{APCE}$ represents the average peak correlation energy, $F_{\max}$ represents the maximum value in the response map of the tracking prediction result, $F_{\min}$ represents the minimum value in the tracking prediction response map, $F_{w,h}$ represents the response value at position (w, h) in the response map, and $\mathrm{mean}(\cdot)$ represents the averaging function.
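A direct transcription of this formula into Python/NumPy might look as follows; the small epsilon guarding against division by zero is an added safeguard, not part of the patent's formula.

```python
import numpy as np

def apce(response):
    """Average peak correlation energy of a 2-D response map."""
    f_max = response.max()
    f_min = response.min()
    energy = np.mean((response - f_min) ** 2)   # mean over all (w, h) positions
    return abs(f_max - f_min) ** 2 / (energy + 1e-12)
```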
The target autonomous detection tracking system is used for realizing the target autonomous detection tracking method and comprises the following modules connected in sequence:
the target detection module: used for cyclically detecting the image until a target is detected;
the tracker template initialization module: used for initializing the tracker template;
the target tracking module: used for tracking the target.
Compared with the prior art, the invention has the following beneficial effects:
(1) Through the detection module, the invention can automatically detect the target, complete initialization of the tracker template, and automatically start tracking the target;
(2) The re-detection mechanism can perform global detection to recover the target when tracking drift or target loss occurs, ensuring the persistence and robustness of tracking;
(3) The invention combines the speed of correlation filtering tracking with the accuracy of a deep-learning-based detection algorithm, enabling continuous long-term tracking while balancing speed and accuracy.
Drawings
FIG. 1 is a flow chart of a target autonomous detection tracking method according to the present invention;
FIG. 2 is a schematic diagram of the network structure of Yolov3;
FIG. 3 is a flowchart of the KCF single-target tracking algorithm.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Example 1
As shown in FIG. 1 to FIG. 3, the target autonomous detection and tracking method addresses two shortcomings of KCF-based correlation filtering tracking: the target cannot be determined automatically to start tracking, and the target is lost during tracking because of rapid movement, occlusion, leaving the field of view, and similar causes, so continuous tracking cannot be achieved. Acquisition of the video sequence starts, and the target detection module cyclically detects over the whole image until the detection algorithm detects a target, acquires the rectangular frame information of the target area, and outputs the center point (x, y) and width and height (w, h) of the target's positioning frame in the image. The tracker template is then initialized: first, the size of the target search area is determined from the (x, y, w, h) parameters of the detected positioning frame and the set padding parameter, and an initialized cosine window and Gaussian ideal response label are created according to that size; then the grayscale (Gray) features of the target area are extracted, and the cosine window is multiplied with the extracted feature map to suppress the edge effect; finally, the resulting two-dimensional data (the product of the grayscale features and the cosine window) is converted to the frequency domain by Fourier transformation and multiplied by the Fourier-transformed Gaussian ideal response label, completing initialization of the tracker template. While the program runs, it checks whether the video has ended; if so, the whole program ends directly; if not, target tracking is performed.
In the tracking stage, image data of the current frame is read in real time. An image the size of the target search area from the initialization stage is cropped, centered on the target center point predicted by tracking in the previous frame; grayscale features of the target area are extracted and multiplied with the cosine window, and Fourier transformation then yields the frequency-domain features of the target search area. Next, a matching calculation is performed with the tracking filter model; inverse Fourier transformation of the calculation result gives the response map of the search area; the coordinates of the maximum peak in the response map are the center coordinates of the target's predicted position, and converting these coordinates back to the original image gives the positioning point coordinates and response value of the target in the current frame image. After the target position of the current frame is acquired, grayscale features of the target search area are extracted centered on that position, and the tracker template is updated by weighted averaging to serve as the tracking filter model of the next frame.
Meanwhile, in the tracking stage, the tracking quality evaluation module evaluates the result of each tracking prediction, and whether to start the target detection module for global re-detection is judged from the evaluation result. If re-detection is judged necessary, the target detection module is started to perform target detection over the full image; a detected target whose output confidence is higher than the threshold is selected as the recovered tracking target, its rectangular positioning frame information is output, and the tracker model is re-initialized; otherwise the tracker continues tracking. The tracking quality evaluation module's strategy is to make a comprehensive evaluation judgment using the response value of the tracking prediction output and the Average Peak Correlation Energy (APCE). When the target is occluded or interfered with, the output response map exhibits severe multi-peak oscillation and the average peak energy drops significantly. APCE is calculated as:

$$\mathrm{APCE} = \frac{\left|F_{\max}-F_{\min}\right|^{2}}{\mathrm{mean}\left(\sum_{w,h}\left(F_{w,h}-F_{\min}\right)^{2}\right)}$$

where $F_{\max}$ and $F_{\min}$ represent the maximum and minimum values in the response map of the tracking prediction result, and $F_{w,h}$ represents the response value at position (w, h) in the response map.
The specific evaluation strategy is as follows. During real-time tracking, the response value and APCE value of the tracked target are continuously recorded for the 6 frames of history preceding the current frame. When the absolute value of the predicted target response value is smaller than a specific threshold, the detection module is started for re-detection; when the APCE value of the current frame is smaller than a specific threshold, the detection module is started for re-detection; when the mean of the latest three recorded APCE values is smaller than a specific ratio threshold of the mean of the 3 frames before them, the detection module is started for re-detection; and when the recorded response value has fallen 6 times in a row, the response value of the current frame is smaller than a specific ratio threshold of the preceding 6 frames, and the APCE has likewise fallen 6 times in a row, the detection module is started for re-detection.
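The four trigger rules above can be sketched as follows; every threshold and ratio here is an illustrative placeholder, since the patent specifies only "a certain specific threshold".

```python
from collections import deque

class RedetectionJudge:
    """Sketch of the evaluation strategy; thresholds are assumed values."""

    def __init__(self, resp_thresh=0.25, apce_thresh=10.0, ratio=0.5):
        self.resp_hist = deque(maxlen=6)   # response values, previous 6 frames
        self.apce_hist = deque(maxlen=6)   # APCE values, previous 6 frames
        self.resp_thresh = resp_thresh
        self.apce_thresh = apce_thresh
        self.ratio = ratio

    def need_redetect(self, resp, apce_val):
        def falling(seq):                  # strictly decreasing sequence?
            return all(b < a for a, b in zip(seq, seq[1:]))

        hist_r, hist_a = list(self.resp_hist), list(self.apce_hist)
        trigger = (
            # Rule 1: absolute response value below a fixed threshold
            abs(resp) < self.resp_thresh
            # Rule 2: APCE of the current frame below a fixed threshold
            or apce_val < self.apce_thresh
            # Rule 3: mean APCE of the latest 3 frames below a ratio of the
            # mean of the 3 frames before them
            or (len(hist_a) == 6
                and sum(hist_a[3:]) / 3 < self.ratio * sum(hist_a[:3]) / 3)
            # Rule 4: response and APCE both fell 6 times in a row, and the
            # current response is below a ratio of the 6-frame mean
            or (len(hist_r) == 6
                and falling(hist_r + [resp]) and falling(hist_a + [apce_val])
                and resp < self.ratio * sum(hist_r) / 6)
        )
        self.resp_hist.append(resp)
        self.apce_hist.append(apce_val)
        return trigger
```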
The Yolov3 network consists mainly of three parts: the backbone network (Backbone), the enhanced feature extraction layer (Neck), and the network prediction head (Head). Backbone network: a Darknet-53 feature extraction network built from large stacks of residual blocks; no pooling layer is used in the network, and downsampling is instead performed by convolutions with a stride of 2. The Darknet-53 network produces feature maps downsampled by 32x, 16x, and 8x, giving different receptive fields. Enhanced feature extraction layer: feature maps of different scales are taken from the backbone network, and multi-scale feature fusion is performed through a bottom-up path and lateral connections, so that both the high-level semantic information of the top layer and the high-resolution information of the bottom layer can be used. The network prediction head (Head) uses multi-scale outputs from feature layers of different scales to detect small-, medium-, and large-scale targets.
The network structure of Yolov3 is shown in FIG. 2.
The input of the Yolov3 target detection model is a 416x416 image. After the backbone feature extraction network, three feature maps y1, y2, and y3 of different sizes are obtained, with output dimensions 13x13xC, 26x26xC, and 52x52xC, where C is 3x(4+1+L): 3 indicates that the anchor boxes of each grid cell predict 3 candidate boxes, and the prediction parameters of each candidate box (tx, ty, tw, th, tcof, L) are the target's localization center (tx, ty), width and height (tw, th), confidence tcof, and class information L. The Yolov3 target detection algorithm produces multiple candidate bounding boxes; finally, a non-maximum suppression algorithm removes redundant prediction boxes, and only the prediction box with the highest confidence is kept as the target detection output.
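As a sketch of the head dimensions and of the final non-maximum suppression step, assuming the common 80-class COCO setting for L (so C = 255); the NMS shown is the standard greedy variant, which the patent names but does not spell out.

```python
import numpy as np

def head_channels(num_classes, anchors_per_cell=3):
    """C = 3 x (4 + 1 + L): 3 anchors, each with 4 box offsets,
    1 confidence score, and L class scores."""
    return anchors_per_cell * (4 + 1 + num_classes)

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy non-maximum suppression over (x1, y1, x2, y2) boxes."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # intersection of the top-scoring box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-12)
        order = rest[iou < iou_thresh]   # drop boxes overlapping the kept one
    return keep

# Output shapes of the three heads for a 416x416 input (80-class assumption):
for stride in (32, 16, 8):
    s = 416 // stride
    print(f"{s}x{s}x{head_channels(80)}")   # 13x13x255, 26x26x255, 52x52x255
```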
The backbone network Darknet-53 of Yolov3 is composed of 5 large ResX components, where X is 1, 2, 8, 8, and 4 respectively, indicating the number of times the residual unit is repeated. ResX is composed of a DBL module and X Resunit modules; the DBL module is composed of a two-dimensional convolution Conv_2D, batch normalization BN (Batch Normalization), and the LeakyReLU activation function, and a residual unit is composed of two DBL modules and a residual edge (shortcut). If the input image size is 416x416, three feature maps of different sizes are obtained after the backbone feature extraction network: a 13x13 feature map, a 26x26 feature map, and a 52x52 feature map. In the enhanced feature extraction part of the network, the 13x13 feature map first passes through a DBLx5 module (a stacked combination of 5 DBL modules), is upsampled, and is concatenated with the 26x26 feature map; the result then passes through another DBLx5 module, is upsampled, and is concatenated with the 52x52 feature map, finally producing the three-layer feature output.
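A minimal PyTorch sketch of the DBL module and one residual unit as described above; the 1x1-bottleneck channel halving follows the usual Darknet-53 convention, an assumption consistent with but not spelled out in this passage.

```python
import torch.nn as nn

class DBL(nn.Module):
    """Conv_2D + Batch Normalization + LeakyReLU, the basic Darknet-53 block."""
    def __init__(self, c_in, c_out, k=3, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_out, k, stride, padding=k // 2, bias=False),
            nn.BatchNorm2d(c_out),
            nn.LeakyReLU(0.1),
        )

    def forward(self, x):
        return self.block(x)

class ResUnit(nn.Module):
    """Residual unit: two DBL modules plus a residual edge (shortcut)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = DBL(channels, channels // 2, k=1)   # 1x1 bottleneck
        self.conv2 = DBL(channels // 2, channels, k=3)

    def forward(self, x):
        return x + self.conv2(self.conv1(x))
```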
The prediction head (Head) in the prediction output section is composed of a DBL module followed by a 1x1 convolution (Conv_1x1) that outputs the prediction parameters. The prediction head obtained by 32x downsampling has dimension 13x13xC; its high downsampling factor gives it a large receptive field on the original image, so it detects large targets. The 16x-downsampled prediction head fuses the 32x-downsampled features; its output dimension is 26x26xC, and it is responsible for predicting medium-scale targets. Finally, the 8x downsampling fuses the 32x and 16x downsampled features; this prediction head's output dimension is 52x52xC, its receptive field on the original image is smaller, and it is mainly responsible for predicting small-scale targets.
The key point is that target detection is combined with the tracker to realize automatic detection and tracking of the tracked target. When the target is lost because tracking quality declines, the target is occluded, or it moves beyond the field of view, the target detection algorithm is automatically started, according to the tracking quality evaluation, to re-detect and recover the target for continuous tracking, ensuring the sustainability and robustness of target tracking. The tracking quality evaluation module comprehensively considers the response value and the APCE value to judge whether to start target detection, and effectively fuses the detection precision of deep-learning-based target detection with the speed of correlation filtering template tracking, thereby realizing continuous and stable tracking.
In short, combining target detection with the tracker realizes automatic detection and tracking of the tracked target; when the target is lost due to reduced tracking quality, occlusion, or leaving the field of view, re-detection is performed according to the tracking quality evaluation, whose strategy comprehensively considers the response value and APCE value of the tracked target, so that detection precision and correlation filtering tracking speed are both accommodated.
The Yolov3 detection algorithm in the target detection module can be replaced by other target detection algorithms, such as SSD and Fast R-CNN, which can equally realize the target detection function.
The KCF target tracking algorithm in the tracker template can likewise be replaced by other correlation filtering tracking algorithms, such as the MOSSE algorithm and the CSK algorithm, which can equally realize the target tracking function.
The flow of the correlation-filtering-based KCF single-target tracking algorithm is shown in FIG. 3.
KCF first extracts the grayscale feature (Gray) of the image block of the target region in the first frame, applies the cosine window (cos_window), converts the result to the frequency domain by fast Fourier transform, and trains the initial filter. In the detection stage, the correlation response is calculated with the filter model, and the maximum response peak is selected as the target's position; image features are then extracted from the image area centered on the corresponding peak, the filter model is updated by learning toward the expected output, and these steps are executed cyclically so that the updated filter tracks the target's position in the next frame of the image.
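Pairing with the initialization sketch earlier, the closed-form filter training can be sketched as below for the simplified linear-kernel (MOSSE-like) case; the full kernelized KCF solution, which divides the transformed label by a kernel correlation plus the regularizer, is omitted for brevity.

```python
import numpy as np

def train_filter(feat_f, label_f, lam=1e-4):
    """Frequency-domain ridge-regression filter, linear-kernel simplification:
    H* = (G . conj(F)) / (F . conj(F) + lambda), elementwise."""
    return (label_f * np.conj(feat_f)) / (feat_f * np.conj(feat_f) + lam)
```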
As described above, the present invention can be preferably implemented.
All features disclosed in all embodiments of this specification, and all steps in any method or process disclosed herein, whether explicitly or implicitly, may be combined, expanded, and substituted in any way, except for mutually exclusive features and/or steps.
The foregoing description of the preferred embodiment of the invention is not intended to limit the invention in any way, but rather to cover all modifications, equivalents, improvements and alternatives falling within the spirit and principles of the invention.
Claims (10)
1. A target autonomous detection and tracking method, characterized by comprising the following steps:
S1, target detection: cyclically detecting the image until a target is detected;
S2, tracker template initialization: initializing a tracker template;
S3, target tracking: tracking the target.
2. The target autonomous detection and tracking method according to claim 1, wherein in step S1, positioning frame information of the area where the target is located is obtained, and the center point (x, y) and width and height (w, h) of the target's positioning frame in the image are output, where x represents the x-axis coordinate of the center point, y represents the y-axis coordinate of the center point, w represents the width of the positioning frame, and h represents the height of the positioning frame.
3. The target autonomous detection and tracking method according to claim 1, wherein step S2 comprises the following steps:
s21, determining the size of a target search area by using the (x, y, w, h) parameters of the detected positioning frame and the set filling parameters, and creating an initialized cosine window and a Gaussian ideal response label according to the size of the target search area;
s22, extracting gray scale features of the region where the target is located, and multiplying a cosine window by the extracted feature map to solve the edge effect;
s23, the obtained result of multiplying the gray features by the cosine window is converted into a frequency domain through Fourier transformation and multiplied by the Gaussian ideal response label after Fourier transformation, and therefore initialization of the tracker template is completed.
4. The target autonomous detection and tracking method according to claim 1, wherein step S3 comprises the following steps:
s31, obtaining frequency domain characteristics of a target search area;
s32, obtaining the positioning point coordinates and the response values of the targets in the current frame image;
s33, after the target position of the current frame is obtained, gray features of a target search area are extracted by taking the target position as a center, and a template is updated after weighted average is carried out on the tracker template to be used as a tracking filtering model of the next frame.
5. The target autonomous detection and tracking method according to claim 4, wherein in step S31, image data of the current frame is read in real time; an image the size of the target search area from the initialization stage is cropped, centered on the target center point predicted by tracking in the previous frame image; grayscale features of the target area are extracted; and Fourier transformation is performed after multiplication with the cosine window to obtain the frequency-domain features of the target search area.
6. The target autonomous detection and tracking method according to claim 4, wherein in step S32, a matching calculation is performed with the tracking filter model; the calculation result is subjected to inverse Fourier transformation to obtain a response map of the search area; the coordinates of the maximum peak in the response map are the center coordinates of the target's predicted position; and these coordinates are converted back to the original image to obtain the positioning point coordinates of the target in the current frame image and its response value.
7. The target autonomous detection and tracking method according to any one of claims 1 to 6, wherein step S3 further comprises the following steps:
s34, carrying out tracking quality evaluation on the result of each tracking prediction by using a tracking quality evaluation module, and judging whether to start a target detection module to carry out global re-detection according to the evaluated result: if the detection is judged to be needed to be re-detected, starting a target detection module to detect the target in the full-frame image range, selecting a target with the detection target output confidence higher than a set threshold as a retrieved tracking target, outputting positioning frame information of the target, and re-initializing a tracker model; otherwise, continuing the tracker to track.
8. The target autonomous detection and tracking method according to claim 7, wherein in step S34, the tracking quality evaluation module uses the following evaluation strategy: a comprehensive evaluation judgment is made using the response value of the tracking prediction output and the average peak correlation energy.
9. The target autonomous detection and tracking method according to claim 8, wherein in step S34, the average peak correlation energy is calculated by:
$$\mathrm{APCE} = \frac{\left|F_{\max}-F_{\min}\right|^{2}}{\mathrm{mean}\left(\sum_{w,h}\left(F_{w,h}-F_{\min}\right)^{2}\right)}$$

where $\mathrm{APCE}$ represents the average peak correlation energy, $F_{\max}$ represents the maximum value in the response map of the tracking prediction result, $F_{\min}$ represents the minimum value in the tracking prediction response map, $F_{w,h}$ represents the response value at position (w, h) in the response map, and $\mathrm{mean}(\cdot)$ represents the averaging function.
10. A target autonomous detection and tracking system, characterized by being used for realizing the target autonomous detection and tracking method according to any one of claims 1 to 9, and comprising the following modules connected in sequence:
the target detection module: used for cyclically detecting the image until a target is detected;
the tracker template initialization module: used for initializing the tracker template;
the target tracking module: used for tracking the target.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410057608.1A CN117576380A (en) | 2024-01-16 | 2024-01-16 | Target autonomous detection tracking method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117576380A true CN117576380A (en) | 2024-02-20 |
Family
ID=89864724
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410057608.1A Pending CN117576380A (en) | 2024-01-16 | 2024-01-16 | Target autonomous detection tracking method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117576380A (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107680119A (en) * | 2017-09-05 | 2018-02-09 | 燕山大学 | A kind of track algorithm based on space-time context fusion multiple features and scale filter |
CN110414439A (en) * | 2019-07-30 | 2019-11-05 | 武汉理工大学 | Anti- based on multi-peak detection blocks pedestrian tracting method |
CN110569723A (en) * | 2019-08-02 | 2019-12-13 | 西安工业大学 | Target tracking method combining feature fusion and model updating |
CN110599519A (en) * | 2019-08-27 | 2019-12-20 | 上海交通大学 | Anti-occlusion related filtering tracking method based on domain search strategy |
CN111582062A (en) * | 2020-04-21 | 2020-08-25 | 电子科技大学 | Re-detection method in target tracking based on YOLOv3 |
CN114897932A (en) * | 2022-03-31 | 2022-08-12 | 北京航天飞腾装备技术有限责任公司 | Infrared target tracking implementation method based on feature and gray level fusion |
CN116342653A (en) * | 2023-03-21 | 2023-06-27 | 西安交通大学 | Target tracking method, system, equipment and medium based on correlation filter |
Non-Patent Citations (1)
Title |
---|
王涵靓: "Target Re-detection and Tracking Algorithm Based on Sparse Representation and Correlation Filtering" (基于稀疏表示和相关滤波的目标重新检测和跟踪算法), China Masters' Theses Full-text Database, Information Science and Technology, no. 2021, 15 June 2021 (2021-06-15), pages 135-198 *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117911724A (en) * | 2024-03-20 | 2024-04-19 | 江西软件职业技术大学 | Target tracking method |
CN117911724B (en) * | 2024-03-20 | 2024-06-04 | 江西软件职业技术大学 | Target tracking method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||