Tang et al., 2023 - Google Patents

Learning spatial-frequency transformer for visual object tracking

Document ID: 633544829314473240
Author: Tang C; Wang X; Bai Y; Wu Z; Zhang J; Huang Y
Publication year: 2023
Publication venue: IEEE Transactions on Circuits and Systems for Video Technology

Snippet

Recently, some researchers have begun to adopt the Transformer to combine or replace the widely used ResNet as their new backbone network. As the Transformer captures the long-range relations between pixels well using the self-attention scheme, which complements the …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/04—Inference methods or devices
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion

Similar Documents

Publication	Publication Date	Title
Tang et al.	2023	Learning spatial-frequency transformer for visual object tracking
Wang et al.	2018	Learning attentions: residual attentional siamese network for high performance online visual tracking
Wang et al.	2019	Inferring salient objects from human fixations
Zhang et al.	2018	Synthetic data generation for end-to-end thermal infrared tracking
Yuan et al.	2023	Robust thermal infrared tracking via an adaptively multi-feature fusion model
Zhang et al.	2022	A background-aware correlation filter with adaptive saliency-aware regularization for visual tracking
Zhang et al.	2023	Learning background-aware and spatial-temporal regularized correlation filters for visual tracking
Li et al.	2019	Visual object tracking via multi-stream deep similarity learning networks
Li et al.	2023	Dynamic feature-memory transformer network for RGBT tracking
Lu et al.	2023	Siamese graph attention networks for robust visual object tracking
Zhou et al.	2022	Regression-selective feature-adaptive tracker for visual object tracking
Gu et al.	2024	Rtsformer: a robust toroidal transformer with spatiotemporal features for visual tracking
Sun et al.	2023	Deblurring transformer tracking with conditional cross-attention
Wang et al.	2023	Siamadt: siamese attention and deformable features fusion network for visual object tracking
Wang et al.	2024	Basketball technique action recognition using 3D convolutional neural networks
Ma et al.	2022	Robust visual tracking via adaptive feature channel selection
Gong et al.	2024	ASAFormer: Visual tracking with convolutional vision transformer and asymmetric selective attention
Liu et al.	2024	Siamdmu: Siamese dual mask update network for visual object tracking
Tian et al.	2024	Toward class-agnostic tracking using feature decorrelation in point clouds
Zhang et al.	2024	A Comprehensive Review of RGBT Tracking
An et al.	2024	Self-supervised facial expression recognition with fine-grained feature selection
Guo et al.	2021	End‐to‐end feature fusion Siamese network for adaptive visual tracking
Gong et al.	2024	Visual tracking with pyramidal feature fusion and transformer based model predictor
Dong et al.	2020	Improving model drift for robust object tracking
Shao et al.	2024	High-level LoRA and hierarchical fusion for enhanced micro-expression recognition