CN110428447B - Target tracking method and system based on strategy gradient - Google Patents
Target tracking method and system based on strategy gradient
- Publication number
- CN110428447B CN110428447B CN201910638477.5A CN201910638477A CN110428447B CN 110428447 B CN110428447 B CN 110428447B CN 201910638477 A CN201910638477 A CN 201910638477A CN 110428447 B CN110428447 B CN 110428447B
- Authority
- CN
- China
- Prior art keywords
- target
- response
- tracking
- template
- action
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/269—Analysis of motion using gradient-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a target tracking method and system based on a strategy gradient, belonging to the field of computer vision. The method comprises the following steps: (1) inputting the target image into a convolutional neural network to obtain a target appearance template Z; (2) inputting the search image into the convolutional neural network to obtain a search area feature map; (3) computing a response map ht from the template Z and the search area feature map through a similarity measurement function f; (4) inputting the response map ht obtained in step (3) together with a historical response map hi into a policy network, and adding the action with the highest score to the set Ct (i = 1 to N); (5) repeating (4) until every historical response map in the response map template pool has been traversed, and finally executing the action that occurs most often in the set Ct (i = 1 to N). The system includes a tracker and a decision maker. Wrong template updates are avoided, and a lost target can be detected again in time.
Description
Technical Field
The invention relates to the field of computer vision, in particular to a target tracking method and system based on a strategy gradient.
Background
Visual Object Tracking (VOT) is one of the most challenging problems in the field of computer vision. It is widely applied in video surveillance, human-computer interaction and autonomous driving. Despite significant advances in VOT technology over the last several decades, it still faces significant challenges: severe occlusion, drastic illumination changes, deformation and the like may cause tracking failures.
Visual target tracking algorithms can be roughly divided into two categories: generative methods and discriminative methods. Generative methods typically build a model from the target region of the current frame and then search for the region most similar to that model in the next frame; typical examples include Kalman filtering, particle filtering and mean shift. Discriminative methods, also known as tracking-by-detection, learn a discriminative model to distinguish the target region from the surrounding background. The key difference between the two is that tracking-by-detection trains a classifier with machine learning and uses background information during training. The classifier can therefore focus on separating foreground from background, so discriminative methods generally perform better than generative methods.
Among discriminative methods, those based on the Discriminative Correlation Filter (DCF) are known for their high efficiency and accuracy. By exploiting the discrete Fourier transform and cyclic shifts of the training samples, the DCF-based KCF tracker can run at 292 fps on a single CPU, far exceeding the real-time requirement. In recent years, DCF research has made further progress through multi-channel features, scale estimation and the reduction of boundary effects. However, as the accuracy of DCF trackers increases, their speed drops sharply.
In recent years, tracking algorithms based on Convolutional Neural Networks (CNN) have attracted attention for their excellent performance. Unlike conventional tracking algorithms, CNN-based algorithms use deep convolutional features rather than hand-crafted features, which leads to superior results on multiple tracking benchmarks. Although CNN-based trackers perform well, these approaches either use a simple online update strategy or never update the initial appearance template, relying only on the strong representation capability of the trained CNN. This may be effective for short-term tracking without interference. However, once severe occlusion or a significant appearance change occurs, the tracker drifts onto the background and loses the target. These methods also lack an effective means of re-detecting the target after it is lost.
Therefore, the invention provides a strategy gradient-based target tracking algorithm, which learns an effective policy through the strategy gradient algorithm in reinforcement learning to recognize unreliable tracking results, and then takes measures to prevent wrong template updates and to re-detect lost targets.
The closest prior art:
[1]Tracking-Learning-Detection
[2]Long-term correlation tracking
[3]Large Margin Object Tracking with Circulant Feature Maps
[4]Reliable Re-detection for Long-term Tracking
[5]Tracking as Online Decision-Making: Learning a Policy from Streaming Videos with Reinforcement Learning
Among the above target tracking techniques, methods [1-4] address template updating and re-detection through hand-designed strategies: they generally use a fixed mathematical formula to compute a tracking confidence and update the tracking model only when that confidence is high. However, because the formula and its parameters are fixed, such methods have inherent limitations and cannot adapt well to different tracking sequences. Method [5] learns a policy through the Q-learning algorithm to decide when to update the target appearance template and whether to search the entire image globally. However, it uses a single response map to represent the state and does not consider the response diversity across different tracking sequences, so it cannot accurately assess the tracking result to make a reliable decision. In addition, the global search severely impacts the speed of the algorithm.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a target tracking method based on the strategy gradient, which can accurately identify unreliable tracking results, determine when to update the appearance template and whether to perform re-detection, so as to avoid wrong template updates and re-detect the target in time when it is lost.
A strategy gradient-based target tracking method comprises the following steps:
(1) inputting the target image into a convolutional neural network to obtain a target appearance template Z;
(2) inputting the search image into the convolutional neural network to obtain a search area feature map;
(3) comparing the template image Z with candidate areas of the same size in the search area feature map, and computing a response map ht through a similarity measurement function f; the similarity measurement function f is a Siamese network:

f(Z, X) = φ(Z) ⋆ φ(X) + b

where φ is a convolutional embedding function, ⋆ denotes the cross-correlation of the two feature maps, and b is an offset; after the target appearance template Z and the search area feature map are input, the function f generates the response map ht;
(4) inputting the response map ht obtained in step (3) together with a historical response map hi into the policy network to obtain a normalized score for each decision action; the historical response maps hi (i = 1 to N) come from a response map template pool that stores N historical response maps, each corresponding to a recent good tracking result; then the action with the highest score is selected and added to the set Ct (i = 1 to N);
(5) repeating step (4) until every historical response map in the response map template pool has been traversed; finally, the action that occurs most often in the set Ct (i = 1 to N) is executed.
Further, the policy network involves a state st, an action a, a learned policy π and a reward Rt; the state st is represented as a tuple (hi, ht), the action a includes updating, tracking and re-detection, and the reward Rt is given according to the overlap ratio between the current bounding box and the target; the policy π is learned by optimizing the deep policy network with gradient descent:

Δθ = α∇θ log πθ(at|st) Rτ   (2)

where Rτ represents the return of the whole episode; during training, action samples are drawn from the policy network, the reward Rt is then given by evaluating the selected action, and the policy is optimized by updating the parameters with this reward information so as to maximize the expected reward, resulting in a trained policy network.
Further, during training the tracking of one frame is regarded as a complete episode and back-propagation is performed using formula (2); the reward function is defined in terms of the Intersection-over-Union (IOU), which represents the overlap rate between the predicted box b and the ground-truth box g;
the strategy network consists of two 516-dimensional full-connection layers and an output layer, wherein the output layer outputs three actions of updating, tracking and re-detecting, and each full-connection layer is initialized randomly and is subjected to ReLU and batch regularization processing; the whole algorithm trains 200 cycles on an object tracking reference (OTB) data set, and each cycle is finished after an agent interacts with all training images; for each cycle, after 8192 samples are collected, the policy network starts learning.
Furthermore, after each learning step the updated policy network continues sampling for the next learning step; over the whole training process the learning rate decays from 10⁻⁶ to 10⁻⁸, and a batch size of 64 is used.
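For illustration only, the following is a minimal sketch of this sampling/learning schedule. The text only specifies 200 cycles, 8192 samples per cycle, a batch size of 64 and a learning rate decaying from 10⁻⁶ to 10⁻⁸; the log-linear decay rule, the SGD optimizer and the helper functions collect_samples and update_fn are assumptions, not the patented implementation:

```python
import torch

def train_policy(policy_net, collect_samples, update_fn,
                 cycles: int = 200, samples_per_cycle: int = 8192, batch_size: int = 64):
    """Sketch of the per-cycle sampling and policy-gradient learning loop."""
    # Assumed log-linear decay from 1e-6 to 1e-8 over all cycles.
    lrs = torch.logspace(-6, -8, steps=cycles)
    for cycle in range(cycles):
        optimizer = torch.optim.SGD(policy_net.parameters(), lr=lrs[cycle].item())
        # Interact with the training images until 8192 (state, action, return) samples are collected.
        buffer = collect_samples(policy_net, samples_per_cycle)
        for start in range(0, samples_per_cycle, batch_size):
            batch = buffer[start:start + batch_size]
            update_fn(policy_net, optimizer, batch)   # one policy-gradient step on the mini-batch
```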
Further, if the most frequent action in the set Ct is update, the target position is updated as Pt = Ptp, where Ptp is the predicted position of the current target; the response map ht is added to the response map template pool while one old response map is discarded from the pool, and the target appearance template Z is updated with the current target position information.
Further, if the most frequent action in the set Ct is tracking, the target position is updated as Pt = Ptp, where Ptp is the predicted position of the current target.
Further, if the most frequent action in the set Ct is re-detection, a search area where the target is most likely to appear is obtained through a particle filter, its response map htc is calculated to obtain the predicted target position Ptc of the re-detection area, ht = htc and Ptp = Ptc are updated, and the re-detection result is then input into the policy network again for decision-making.
Further, during re-detection the particle filter draws M candidate search regions where the target is most likely to appear; for each candidate search region the tracking network is reused to calculate a response map, and then the best candidate search region is selected by the confidence score:

Ci = max(fi) · cos(γ‖Pi − Pt‖)

where fi is the response map of the i-th candidate search region, Pi and Pt are the center positions of the i-th candidate search region and of the target in the previous frame, and γ is a predefined distance penalty parameter.
Further, when the re-detection is performed twice or more in one frame, the re-detection result is discarded and the initial tracking result is used.
A target tracking system based on the strategy gradient comprises a tracker and a decision maker; the tracker computes a response map ht from the target appearance template Z and the search area feature map through a similarity measurement function f; the similarity measurement function f is a Siamese network:

f(Z, X) = φ(Z) ⋆ φ(X) + b

where φ is a convolutional embedding function, ⋆ denotes the cross-correlation of the two feature maps, and b is an offset; after the target appearance template Z and the search area feature map are input, the function f generates the response map ht; the response map ht and a historical response map hi form the tracker state st, and the tracker selects an action a according to the policy π given by the decision maker;
the decision maker is a trained policy network; the tracker state st is input into the policy network to obtain a normalized score for each decision action; the historical response maps hi (i = 1 to N) come from a response map template pool that stores N historical response maps, each representing a recent good tracking result; the action with the highest score is selected and added to the set Ct (i = 1 to N); every historical response map in the response map template pool is traversed; finally, the action that occurs most often in the set Ct (i = 1 to N) is executed as the decision result of the decision maker.
The strategy gradient-based target tracking technology provided by the invention learns a policy network through the strategy gradient algorithm in reinforcement learning. The policy network can accurately identify unreliable tracking results; by executing the corresponding decision actions, wrong template updates are avoided and a lost target can be re-detected in time. This effectively addresses difficulties such as occlusion and deformation in target tracking, greatly improves tracking precision and robustness, and maintains a high speed. Experiments show that the proposed method improves performance by 5-6% over the original tracking framework.
Drawings
FIG. 1 is a general framework diagram of a strategy gradient-based object tracking technique;
FIG. 2 is a block flow diagram of a policy gradient-based target tracking technique;
FIG. 3 shows the distance precision results on the OTB-50 benchmark dataset;
FIG. 4 shows the overlap success rate results on the OTB-50 benchmark dataset;
FIG. 5 shows the distance precision results on the OTB-100 benchmark dataset;
FIG. 6 shows the overlap success rate results on the OTB-100 benchmark dataset.
Detailed Description
The technical solution of the invention is further explained below with reference to the accompanying drawings.
As shown in fig. 1 and 2:
(1) Based on the SiamFC tracking algorithm framework, the area where the target is located in the first frame of the video sequence is cropped and scaled to a fixed size to obtain the target image; the target image is input into a convolutional neural network to obtain the target appearance template Z, where the template image size is 127 × 127.
(2) Based on the SiamFC tracking algorithm framework, in frame t the area around the target's center position in frame t−1 is taken as the search area and cropped to obtain the search image X of size 255 × 255; the search image X is likewise scaled to a fixed size and input into the convolutional neural network to obtain the search area feature map.
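For illustration, a minimal sketch of this cropping and scaling step is given below, assuming OpenCV and NumPy. The helper crop_and_resize, the mean-color padding and the commented crop sizes are illustrative assumptions; the actual SiamFC-style preprocessing also adds a context margin around the target, which is omitted here:

```python
import cv2
import numpy as np

def crop_and_resize(frame, center, crop_size, out_size):
    """Crop a square region of side `crop_size` around `center` (x, y) and resize it to `out_size`."""
    x, y = center
    half = crop_size / 2.0
    x0, y0 = int(round(x - half)), int(round(y - half))
    x1, y1 = x0 + int(crop_size), y0 + int(crop_size)
    # Pad with the frame's mean color if the crop extends past the image border.
    pad = max(0, -x0, -y0, x1 - frame.shape[1], y1 - frame.shape[0])
    if pad > 0:
        frame = cv2.copyMakeBorder(frame, pad, pad, pad, pad, cv2.BORDER_CONSTANT,
                                   value=np.asarray(frame, dtype=np.float32).mean(axis=(0, 1)).tolist())
        x0, y0, x1, y1 = x0 + pad, y0 + pad, x1 + pad, y1 + pad
    patch = frame[y0:y1, x0:x1]
    return cv2.resize(patch, (out_size, out_size))

# Template patch Z (127 x 127) from the first frame, search patch X (255 x 255) around
# the previous target center in frame t (target_crop / search_crop are hypothetical sizes):
# z_img = crop_and_resize(first_frame, target_center, target_crop, 127)
# x_img = crop_and_resize(frame_t, prev_center, search_crop, 255)
```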
(3) The template image Z is compared with candidate regions of the same size in the search region feature map. If the two image patches depict the same target, the similarity measurement function f returns a high score; in practice, f is a deep Siamese network:

f(Z, X) = φ(Z) ⋆ φ(X) + b

where φ is a convolutional embedding function, ⋆ denotes the cross-correlation of the two feature maps, and b is an offset. After the target appearance template Z and the search area feature map are input, the function f generates a 33 × 33 response map ht.
This structure is fully convolutional with respect to the search image X: the target appearance template Z from step (1) is used as a convolution kernel and correlated with the search area feature map from step (2) to obtain the response map ht, whose maximum indicates the center position of the target to be tracked. From the response map ht the target position can be preliminarily predicted, and the similarity function is evaluated for all translated sub-windows of the search image in a single forward pass.
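To make the cross-correlation concrete, the following is a minimal PyTorch-style sketch (an assumption for illustration, not the patented implementation): the template embedding φ(Z) is used as a convolution kernel over the search embedding φ(X), and a scalar offset b is added, yielding the response map ht whose peak gives the predicted target center.

```python
import torch
import torch.nn.functional as F

def response_map(phi_z: torch.Tensor, phi_x: torch.Tensor, b: float = 0.0) -> torch.Tensor:
    """Cross-correlate the template embedding with the search embedding.

    phi_z: (C, Hz, Wz) embedding of the 127x127 template patch.
    phi_x: (C, Hx, Wx) embedding of the 255x255 search patch.
    Returns a single-channel response map of shape (1, Ho, Wo), e.g. 33x33
    depending on the embedding network's stride and padding.
    """
    # conv2d with the template as kernel == sliding-window cross-correlation.
    h = F.conv2d(phi_x.unsqueeze(0), phi_z.unsqueeze(0))
    return h.squeeze(0) + b

# The peak of the response map gives the predicted target center, e.g.:
# idx = torch.argmax(h_t.view(-1)).item(); row, col = divmod(idx, h_t.shape[-1])
```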
(4) The response map ht obtained in step (3) and a historical response map hi are input into the policy network together to obtain a normalized score for each decision action. The historical response maps hi (i = 1 to N) come from the response map template pool, which stores N historical response maps, each corresponding to a recent good tracking result. The action with the highest score is then added to the set Ct (i = 1 to N) for further, more reliable decision-making.
The reinforcement learning problem for the policy network can be viewed as a Markov Decision Process (MDP) in which an agent interacts with the environment through states, actions and rewards. In the tracking problem the tracker is treated as the agent. Given a state st, the agent selects an action a according to the policy π. After performing this action, a positive or negative reward Rt is given based on the overlap ratio (IOU) between the current bounding box and the target. By maximizing the expected reward, the agent learns an optimal policy for taking actions.
The strategy gradient algorithm learns the policy π by optimizing the deep policy network with gradient descent:

Δθ = α∇θ log πθ(at|st) Rτ   (2)

where Rτ represents the return of the whole episode. During training, action samples are drawn from the policy network and rewarded by evaluating the selected action. With this reward information, the policy can be optimized by updating the parameters so as to maximize the expected reward.
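As a hedged sketch of the update in formula (2), assuming a PyTorch policy network that maps states to scores over the three actions (the names policy_net and optimizer and the episode format are illustrative assumptions only):

```python
import torch

def reinforce_update(policy_net, optimizer, states, actions, episode_return):
    """One REINFORCE-style step: ascend E[log pi(a|s) * R], implemented as descent on the negation."""
    logits = policy_net(torch.stack(states))            # (T, 3) scores for update / track / re-detect
    log_probs = torch.log_softmax(logits, dim=-1)
    chosen = log_probs[torch.arange(len(actions)), torch.tensor(actions)]
    loss = -(chosen * episode_return).mean()             # -log pi(a_t|s_t) * R_tau
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```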
The selectable actions are tracking, updating and re-detection. The update and tracking actions determine whether the tracker uses the predicted location information to update the target's appearance template. When the re-detection action is executed, the re-detection module uses the particle filter to draw, around the previous target position, M candidate search areas where the target is most likely to appear. For each candidate search region the tracking network is reused to calculate a response map, and the best candidate search region is then selected by the confidence score:

Ci = max(fi) · cos(γ‖Pi − Pt‖)   (3)

where fi is the response map of the i-th candidate search region, Pi and Pt are the center positions of the i-th candidate search region and of the target in the previous frame, and γ is a predefined distance penalty parameter. The position of the re-detected target is given by the maximum of the response map of the best candidate search area. Finally, that response map is input into the policy network again to check the reliability of the re-detection result. If re-detection is performed twice or more in one frame, the re-detection result is discarded and the initial tracking result is used.
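The candidate selection of formula (3) can be sketched as follows (NumPy; the particle-filter sampling and the tracking-network call that produce response_maps and centers are assumed to happen elsewhere):

```python
import numpy as np

def select_best_candidate(response_maps, centers, prev_center, gamma):
    """Pick the re-detection candidate maximizing C_i = max(f_i) * cos(gamma * ||P_i - P_t||)."""
    scores = []
    for f_i, p_i in zip(response_maps, centers):
        dist = np.linalg.norm(np.asarray(p_i, dtype=float) - np.asarray(prev_center, dtype=float))
        scores.append(float(f_i.max()) * np.cos(gamma * dist))   # distance-penalized peak response
    best = int(np.argmax(scores))
    return best, scores[best]
```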
The state st can be represented as a tuple (hi, ht), where hi is a historical response map from a good tracking result and ht is the response map of the current frame. Previous approaches typically used only a single ht to describe the state st. However, because of the uncertainty of the tracking problem, the confidence implied by a response map may fluctuate across sequences: a given response map may indicate a failed tracking result in video A while indicating a successful one in another video B that contains more challenging factors. The present invention therefore combines the current response map ht with a historical response map hi to assess the reliability of the tracking result. In a sense, the policy network can be viewed as a similarity metric function that measures the similarity between hi and ht, from which it judges whether the current tracking result is good or bad and which further action to take.
During training, the tracking of one frame is regarded as a complete episode and back-propagation is performed using formula (2). The reward function is defined in terms of the Intersection-over-Union (IOU), i.e. the overlap rate between the predicted box b and the ground-truth box g, which reflects the reliability of the tracking result for the given frame.
The policy network consists of two 516-dimensional fully connected layers and one output layer that outputs the 3 actions. Each fully connected layer is randomly initialized and followed by ReLU and batch normalization. The whole algorithm is trained on the Object Tracking Benchmark (OTB) dataset for 200 cycles, each cycle ending after the agent has interacted with all training images. In each cycle, the policy network starts learning after 8192 samples have been collected. After each learning step, the updated policy network continues sampling for the next learning step. Over the whole training process the learning rate decays from 10⁻⁶ to 10⁻⁸, and a batch size of 64 is used.
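A minimal sketch of such a policy network is given below, under the assumption that the state (hi, ht) is flattened and concatenated into a single input vector (the exact input encoding is not specified in the text; the 2 × 33 × 33 dimension in the comment is an assumption based on the 33 × 33 response maps):

```python
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Two 516-d fully connected layers with batch normalization and ReLU,
    followed by an output layer scoring the three actions (update / track / re-detect)."""
    def __init__(self, state_dim: int, hidden_dim: int = 516, n_actions: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)   # raw action scores; a softmax yields normalized scores

# Assumed state encoding: concatenation of a flattened historical response map h_i and the
# flattened current response map h_t, e.g. state_dim = 2 * 33 * 33.
```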
(5) Step (4) is repeated until every historical response map in the response map template pool has been traversed, and the action that occurs most often in the set Ct (i = 1 to N) is executed. If the decided action is update, the target position and the appearance template are updated according to the prediction result; if the decided action is tracking, the position is updated but the appearance template is not; if the decided action is re-detection, neither the target position nor the appearance template is updated from the prediction result, and the re-detection module is used to search for the lost target.
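For illustration, the per-frame voting loop and the three action branches might look like the following sketch. All helper names (tracker, policy_net, template_pool, redetect) are hypothetical stand-ins for the components described above, and policy_net is assumed to be in inference (eval) mode:

```python
from collections import Counter
import torch

def decide_and_act(h_t, pred_pos, template_pool, policy_net, tracker, redetect):
    """Vote over the template pool, then execute the majority action."""
    votes = []
    for h_i in template_pool:                                   # traverse historical response maps
        state = torch.cat([h_i.flatten(), h_t.flatten()]).unsqueeze(0)
        votes.append(policy_net(state).argmax(dim=-1).item())   # 0 = update, 1 = track, 2 = re-detect
    action = Counter(votes).most_common(1)[0][0]

    if action == 0:                     # update: accept position, refresh template pool and appearance
        tracker.position = pred_pos
        template_pool.pop(0)
        template_pool.append(h_t)
        tracker.update_appearance_template(pred_pos)
    elif action == 1:                   # track: accept position only
        tracker.position = pred_pos
    else:                               # re-detect: keep old template, search for the lost target
        h_t, pred_pos = redetect(tracker)
        # (the method re-checks this result with the policy network; omitted here for brevity)
    return action, pred_pos
```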
Applying the tracking method to the OTB-50 and OTB-100 benchmark datasets yields the results shown in FIGS. 3 to 6, which show that the tracking method and system improve performance by 5 to 6 percent.
A target tracking system based on the strategy gradient is characterized by comprising a tracker and a decision maker; the tracker computes a response map ht from the target appearance template Z and the search area feature map through a similarity measurement function f; the similarity measurement function f is a Siamese network:

f(Z, X) = φ(Z) ⋆ φ(X) + b

where φ is a convolutional embedding function, ⋆ denotes the cross-correlation of the two feature maps, and b is an offset; after the target appearance template Z and the search area feature map are input, the function f generates the response map ht; the response map ht and a historical response map hi form the tracker state st, and the tracker selects an action a according to the policy π given by the decision maker;
the decision maker is a trained policy network; the tracker state st is input into the policy network to obtain a normalized score for each decision action; the historical response maps hi (i = 1 to N) come from a response map template pool that stores N historical response maps, each corresponding to a recent good tracking result; the action with the highest score is selected and added to the set Ct (i = 1 to N); every historical response map in the response map template pool is traversed; finally, the action that occurs most often in the set Ct (i = 1 to N) is executed as the decision result of the decision maker.
Claims (10)
1. A strategy gradient-based target tracking method is characterized by comprising the following steps:
(1) inputting the target image into a convolutional neural network to obtain a target appearance template Z;
(2) inputting the search image into the convolutional neural network to obtain a search area feature map;
(3) comparing the template image Z with candidate areas of the same size in the search area feature map, and computing a response map ht through a similarity measurement function f; the similarity measurement function f is a Siamese network:

f(Z, X) = φ(Z) ⋆ φ(X) + b

where φ is a convolutional embedding function, ⋆ denotes the cross-correlation of the two feature maps, and b is an offset; after the target appearance template Z and the search area feature map are input, the function f generates the response map ht;
(4) inputting the response map ht obtained in step (3) together with a historical response map hi into the policy network to obtain a normalized score for each decision action; the historical response maps hi (i = 1 to N) come from a response map template pool that stores N historical response maps, each corresponding to a recent good tracking result; then the action with the highest score is selected and added to the set Ct (i = 1 to N);
(5) repeating step (4) until every historical response map in the response map template pool has been traversed; finally, the action that occurs most often in the set Ct (i = 1 to N) is executed.
2. The method of claim 1, wherein the policy network involves a state st, an action a, a learned policy π and a reward Rt; the state st is represented as a tuple (hi, ht), the action a includes updating, tracking and re-detection, and the reward Rt is given according to the overlap ratio between the current bounding box and the target; the policy network is optimized by gradient descent:

Δθ = α∇θ log πθ(at|st) Rτ   (2)

where Rτ represents the return of the whole episode; during training, action samples are drawn from the policy network, the reward Rt is then given by evaluating the selected action, and the policy is optimized by updating the parameters with this reward information so as to maximize the expected reward, resulting in a trained policy network.
3. The method according to claim 2, wherein during training the tracking of one frame is regarded as a complete episode and back-propagation is performed using formula (2), with a reward function defined in terms of the Intersection-over-Union (IOU), which represents the overlap rate between the predicted box b and the ground-truth box g;
the policy network consists of two 516-dimensional fully connected layers and an output layer that outputs the three actions of updating, tracking and re-detection; each fully connected layer is randomly initialized and followed by ReLU and batch normalization; the whole algorithm is trained for 200 cycles on the Object Tracking Benchmark (OTB) dataset, and each cycle ends after the agent has interacted with all training images; in each cycle, the policy network starts learning after 8192 samples have been collected.
4. The method as claimed in claim 3, wherein after each learning step the updated policy network continues sampling for the next learning step, the learning rate decays from 10⁻⁶ to 10⁻⁸ over the whole training process, and a batch size of 64 is used.
5. The method of claim 1, wherein if the most frequent action in the set Ct is update, the target position Pt is updated to Ptp, the predicted position of the current target; the response map ht is added to the response map template pool while one old response map is discarded from the pool, and the target appearance template Z is updated with the current target position information.
6. The method according to claim 1, wherein if the most frequent action in the set Ct is tracking, the target position Pt is updated to Ptp, the predicted position of the current target.
7. The method of claim 1, wherein if the most frequent action in the set Ct is re-detection, a search area where the target is most likely to appear is obtained through a particle filter, its response map htc is calculated to obtain the predicted target position Ptc of the re-detection area, ht = htc and Ptp = Ptc are updated, and the re-detection result is then input into the policy network again for decision-making.
8. The method of claim 7, wherein during re-detection the particle filter draws M candidate search regions where the target is most likely to appear, for each candidate search region the tracking network is reused to calculate a response map, and the best candidate search region is then selected by the confidence score:

Ci = max(fi) · cos(γ‖Pi − Pt‖)   (3)

where fi is the response map of the i-th candidate search region, Pi and Pt are the center positions of the i-th candidate search region and of the target in the previous frame, and γ is a predefined distance penalty parameter.
9. The method of claim 7, wherein when the re-detection is performed twice or more in a frame, the re-detection result is discarded and the initial tracking result is adopted.
10. A target tracking system based on the strategy gradient, characterized by comprising a tracker and a decision maker; a target image is input into a convolutional neural network to obtain a target appearance template Z, a search image is input into the convolutional neural network to obtain a search area feature map, and the tracker computes a response map ht from the target appearance template Z and the search area feature map through a similarity measurement function f; the similarity measurement function f is a Siamese network:

f(Z, X) = φ(Z) ⋆ φ(X) + b

where φ is a convolutional embedding function, ⋆ denotes the cross-correlation of the two feature maps, and b is an offset; after the target appearance template Z and the search area feature map are input, the function f generates the response map ht; the response map ht and a historical response map hi form the tracker state st, and the tracker selects an action a according to the policy π given by the decision maker;
the decision maker is a trained policy network; the tracker state st is input into the policy network to obtain a normalized score for each decision action; the historical response maps hi (i = 1 to N) come from a response map template pool that stores N historical response maps, each corresponding to a recent good tracking result; the action with the highest score is selected and added to the set Ct (i = 1 to N); every historical response map in the response map template pool is traversed; finally, the action that occurs most often in the set Ct (i = 1 to N) is executed as the decision result of the decision maker.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910638477.5A CN110428447B (en) | 2019-07-15 | 2019-07-15 | Target tracking method and system based on strategy gradient |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910638477.5A CN110428447B (en) | 2019-07-15 | 2019-07-15 | Target tracking method and system based on strategy gradient |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110428447A CN110428447A (en) | 2019-11-08 |
CN110428447B true CN110428447B (en) | 2022-04-08 |
Family
ID=68409608
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910638477.5A Active CN110428447B (en) | 2019-07-15 | 2019-07-15 | Target tracking method and system based on strategy gradient |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110428447B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021142571A1 (en) * | 2020-01-13 | 2021-07-22 | 深圳大学 | Twin dual-path target tracking method |
CN117765031B (en) * | 2024-02-21 | 2024-05-03 | 四川盎芯科技有限公司 | Image multi-target pre-tracking method and system for edge intelligent equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108846358A (en) * | 2018-06-13 | 2018-11-20 | 浙江工业大学 | Target tracking method for feature fusion based on twin network |
CN109543559A (en) * | 2018-10-31 | 2019-03-29 | 东南大学 | Method for tracking target and system based on twin network and movement selection mechanism |
CN109636829A (en) * | 2018-11-24 | 2019-04-16 | 华中科技大学 | A kind of multi-object tracking method based on semantic information and scene information |
CN109784155A (en) * | 2018-12-10 | 2019-05-21 | 西安电子科技大学 | Visual target tracking method, intelligent robot based on verifying and mechanism for correcting errors |
CN109859241A (en) * | 2019-01-09 | 2019-06-07 | 厦门大学 | Adaptive features select and time consistency robust correlation filtering visual tracking method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9965717B2 (en) * | 2015-11-13 | 2018-05-08 | Adobe Systems Incorporated | Learning image representation by distilling from multi-task networks |
-
2019
- 2019-07-15 CN CN201910638477.5A patent/CN110428447B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108846358A (en) * | 2018-06-13 | 2018-11-20 | 浙江工业大学 | Target tracking method for feature fusion based on twin network |
CN109543559A (en) * | 2018-10-31 | 2019-03-29 | 东南大学 | Method for tracking target and system based on twin network and movement selection mechanism |
CN109636829A (en) * | 2018-11-24 | 2019-04-16 | 华中科技大学 | A kind of multi-object tracking method based on semantic information and scene information |
CN109784155A (en) * | 2018-12-10 | 2019-05-21 | 西安电子科技大学 | Visual target tracking method, intelligent robot based on verifying and mechanism for correcting errors |
CN109859241A (en) * | 2019-01-09 | 2019-06-07 | 厦门大学 | Adaptive features select and time consistency robust correlation filtering visual tracking method |
Non-Patent Citations (5)
Title |
---|
Asynchronous Methods for Deep Reinforcement Learning; Volodymyr Mnih et al.; arXiv; 2016-06-16; 1-19 *
Deep Reinforcement Learning Based Optimal Trajectory Tracking Control of Autonomous Underwater Vehicle; Runsheng Yu et al.; Proceedings of the 36th Chinese Control Conference (D); 2017-07-28; 4958-4965 *
Tracking as Online Decision-Making: Learning a Policy from Streaming Videos with Reinforcement Learning; James Supančič et al.; arXiv; 2017-07-17; 1-11 *
An adaptive duty-cycle target tracking strategy; Shen Weihua et al.; Journal of Nanchang University (Natural Science Edition); 2015-02-25; Vol. 39, No. 1; 39-49 *
A survey of deep reinforcement learning based on value functions and policy gradients; Liu Jianwei et al.; Chinese Journal of Computers; 2018-10-22; Vol. 42, No. 6; 1406-1438 *
Also Published As
Publication number | Publication date |
---|---|
CN110428447A (en) | 2019-11-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111401201B (en) | Aerial image multi-scale target detection method based on spatial pyramid attention drive | |
CN109146921B (en) | Pedestrian target tracking method based on deep learning | |
CN108470355B (en) | Target tracking method fusing convolution network characteristics and discriminant correlation filter | |
CN110120064B (en) | Depth-related target tracking algorithm based on mutual reinforcement and multi-attention mechanism learning | |
CN111462191B (en) | Non-local filter unsupervised optical flow estimation method based on deep learning | |
CN113052873B (en) | Single-target tracking method for on-line self-supervision learning scene adaptation | |
CN107424177A (en) | Positioning amendment long-range track algorithm based on serial correlation wave filter | |
CN111612817A (en) | Target tracking method based on depth feature adaptive fusion and context information | |
CN113129336A (en) | End-to-end multi-vehicle tracking method, system and computer readable medium | |
CN113902991A (en) | Twin network target tracking method based on cascade characteristic fusion | |
CN114332157B (en) | Long-time tracking method for double-threshold control | |
CN110428447B (en) | Target tracking method and system based on strategy gradient | |
CN112233145A (en) | Multi-target shielding tracking method based on RGB-D space-time context model | |
CN115761393B (en) | Anchor-free target tracking method based on template online learning | |
CN112686326A (en) | Target tracking method and system for intelligent sorting candidate frame | |
CN106485283B (en) | A kind of particle filter pedestrian target tracking based on Online Boosting | |
Li et al. | Fish trajectory extraction based on object detection | |
CN114627156A (en) | Consumption-level unmanned aerial vehicle video moving target accurate tracking method | |
CN115953570A (en) | Twin network target tracking method combining template updating and trajectory prediction | |
CN113192110A (en) | Multi-target tracking method, device, equipment and storage medium | |
CN116958057A (en) | Strategy-guided visual loop detection method | |
CN116385915A (en) | Water surface floater target detection and tracking method based on space-time information fusion | |
CN111915648B (en) | Long-term target motion tracking method based on common sense and memory network | |
CN116168060A (en) | Deep twin network target tracking algorithm combining element learning | |
CN116245913A (en) | Multi-target tracking method based on hierarchical context guidance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |