Zhou et al., 2023 - Google Patents

Deep fusion transformer network with weighted vector-wise keypoints voting for robust 6D object pose estimation

Document ID: 7060722434837826813
Author: Zhou J; Chen K; Xu L; Dou Q; Qin J
Publication year: 2023
Publication venue: Proceedings of the IEEE/CVF International Conference on Computer Vision

Snippet

One critical challenge in 6D object pose estimation from a single RGB-D image is the efficient integration of two different modalities, i.e., color and depth. In this work, we tackle this problem with a novel Deep Fusion Transformer (DFTr) block that can aggregate cross-modality …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- G06K9/6203—Shifting or otherwise transforming the patterns to accommodate for positional errors
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00335—Recognising movements or behaviour, e.g. recognition of gestures, dynamic facial expressions; Lip-reading
- G06K9/00355—Recognition of hand or arm movements, e.g. recognition of deaf sign language
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image

Similar Documents

Publication	Publication Date	Title
Du et al.	2021	Vision-based robotic grasping from object localization, object pose estimation to grasp estimation for parallel grippers: a review
He et al.	2021	FFB6D: A full flow bidirectional fusion network for 6D pose estimation
Ma et al.	2022	Vision-centric BEV perception: A survey
Wang et al.	2021	GDR-Net: Geometry-guided direct regression network for monocular 6D object pose estimation
Hui et al.	2021	3D Siamese voxel-to-BEV tracker for sparse point clouds
Zhou et al.	2023	Deep fusion transformer network with weighted vector-wise keypoints voting for robust 6D object pose estimation
Zeng et al.	2017	3DMatch: Learning local geometric descriptors from RGB-D reconstructions
Lu et al.	2020	Deep learning for 3D point cloud understanding: a survey
Poursaeed et al.	2018	Deep fundamental matrix estimation without correspondences
Cai et al.	2020	Reconstruct locally, localize globally: A model free method for object pose estimation
Yang et al.	2023	PVT-SSD: Single-stage 3D object detector with point-voxel transformer
Zhou et al.	2020	A novel depth and color feature fusion framework for 6D object pose estimation
Chen et al.	2021	OH-Former: Omni-relational high-order transformer for person re-identification
Zhu et al.	2022	A review of 6D object pose estimation
Xie et al.	2022	A deep feature aggregation network for accurate indoor camera localization
Li et al.	2023	Generative category-level shape and pose estimation with semantic primitives
Wang et al.	2022	4D unsupervised object discovery
Liu et al.	2024	MGMap: Mask-guided learning for online vectorized HD map construction
Huang et al.	2022	Overview of LiDAR point cloud target detection methods based on deep learning
Chen et al.	2024	An improved dense-to-sparse cross-modal fusion network for 3D object detection in RGB-D images
Chen et al.	2024	Joint scene flow estimation and moving object segmentation on rotational LiDAR data
Zhao et al.	2023	3D-aware hypothesis & verification for generalizable relative object pose estimation
Wang et al.	2022	Interactive multi-scale fusion of 2D and 3D features for multi-object tracking
Jiang et al.	2022	MLFNet: Monocular lifting fusion network for 6DoF texture-less object pose estimation
Yu et al.	2022	Improving feature-based visual localization by geometry-aided matching