Zhou et al., 2023 - Google Patents
Deep fusion transformer network with weighted vector-wise keypoints voting for robust 6D object pose estimation
- Document ID
- 7060722434837826813
- Author
- Zhou J
- Chen K
- Xu L
- Dou Q
- Qin J
- Publication year
- 2023
- Publication venue
- Proceedings of the IEEE/CVF International Conference on Computer Vision
Snippet
One critical challenge in 6D object pose estimation from a single RGBD image is the efficient integration of two different modalities, i.e., color and depth. In this work, we tackle this problem by a novel Deep Fusion Transformer (DFTr) block that can aggregate cross-modality …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- G06K9/6203—Shifting or otherwise transforming the patterns to accommodate for positional errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
- G06K9/00355—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Du et al. | Vision-based robotic grasping from object localization, object pose estimation to grasp estimation for parallel grippers: a review | |
He et al. | Ffb6d: A full flow bidirectional fusion network for 6d pose estimation | |
Ma et al. | Vision-centric bev perception: A survey | |
Wang et al. | Gdr-net: Geometry-guided direct regression network for monocular 6d object pose estimation | |
Hui et al. | 3d siamese voxel-to-bev tracker for sparse point clouds | |
Zhou et al. | Deep fusion transformer network with weighted vector-wise keypoints voting for robust 6d object pose estimation | |
Zeng et al. | 3dmatch: Learning local geometric descriptors from rgb-d reconstructions | |
Lu et al. | Deep learning for 3d point cloud understanding: a survey | |
Poursaeed et al. | Deep fundamental matrix estimation without correspondences | |
Cai et al. | Reconstruct locally, localize globally: A model free method for object pose estimation | |
Yang et al. | Pvt-ssd: Single-stage 3d object detector with point-voxel transformer | |
Zhou et al. | A novel depth and color feature fusion framework for 6d object pose estimation | |
Chen et al. | Oh-former: Omni-relational high-order transformer for person re-identification | |
Zhu et al. | A review of 6d object pose estimation | |
Xie et al. | A deep feature aggregation network for accurate indoor camera localization | |
Li et al. | Generative category-level shape and pose estimation with semantic primitives | |
Wang et al. | 4d unsupervised object discovery | |
Liu et al. | Mgmap: Mask-guided learning for online vectorized hd map construction | |
Huang et al. | Overview of LiDAR point cloud target detection methods based on deep learning | |
Chen et al. | An improved dense-to-sparse cross-modal fusion network for 3D object detection in RGB-D images | |
Chen et al. | Joint scene flow estimation and moving object segmentation on rotational LiDAR data | |
Zhao et al. | 3d-aware hypothesis & verification for generalizable relative object pose estimation | |
Wang et al. | Interactive multi-scale fusion of 2D and 3D features for multi-object tracking | |
Jiang et al. | MLFNet: Monocular lifting fusion network for 6DoF texture-less object pose estimation | |
Yu et al. | Improving feature-based visual localization by geometry-aided matching |