Liu et al., 2023

Inter-modal masked autoencoder for self-supervised learning on point clouds

Document ID
16265910150963681401
Author
Liu J
Wu Y
Gong M
Liu Z
Miao Q
Ma W
Publication year
2023
Publication venue
IEEE Transactions on Multimedia

Snippet

The masked autoencoder (MAE) is a widely used self-supervised learning method that has achieved great success in NLP and computer vision. However, the potential advantages of masked pre-training for point cloud understanding have not been fully explored. There is …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6217 Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06K9/6232 Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
    • G06K9/6247 Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6267 Classification techniques
    • G06K9/6268 Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6201 Matching; Proximity measures
    • G06K9/6202 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G06K9/4604 Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes, intersections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computer systems based on biological models
    • G06N3/02 Computer systems based on biological models using neural network models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00 Digital computing or data processing equipment or methods, specially adapted for specific applications

Similar Documents

Publication    Publication Date    Title
Han et al. A survey on vision transformer
Aggarwal et al. Generative adversarial network: An overview of theory and applications
Wang et al. Learning visual relationship and context-aware attention for image captioning
Liu et al. Temporal decoupling graph convolutional network for skeleton-based gesture recognition
Tang et al. CTFN: Hierarchical learning for multimodal sentiment analysis using coupled-translation fusion network
Hu et al. SignBERT+: Hand-model-aware self-supervised pre-training for sign language understanding
Zou et al. 6D-ViT: Category-level 6D object pose estimation via transformer-based instance representation learning
Zuo et al. Exploiting modality-invariant feature for robust multimodal emotion recognition with missing modalities
Liu et al. Inter-modal masked autoencoder for self-supervised learning on point clouds
Zhang et al. Uncovering prototypical knowledge for weakly open-vocabulary semantic segmentation
Gao et al. PE-Transformer: Path enhanced transformer for improving underwater object detection
CN112651940A (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
Fang et al. GroupTransNet: Group transformer network for RGB-D salient object detection
Yao et al. Transformers and CNNs fusion network for salient object detection
Gao et al. 3D interacting hand pose and shape estimation from a single RGB image
Xiao et al. A survey of label-efficient deep learning for 3D point clouds
Zheng et al. SAR: Spatial-aware regression for 3D hand pose and mesh reconstruction from a monocular RGB image
Wang et al. Dual-perspective fusion network for aspect-based multimodal sentiment analysis
Li et al. Sequential interactive biased network for context-aware emotion recognition
Li et al. Exploiting global and instance-level perceived feature relationship matrices for 3D face reconstruction and dense alignment
Fan et al. Multi-level contrastive learning: Hierarchical alleviation of heterogeneity in multimodal sentiment analysis
Liu et al. Deep Fuzzy Multi-Teacher Distillation Network for Medical Visual Question Answering
Chen et al. Learning point-language hierarchical alignment for 3D visual grounding
Peng et al. Pattern Recognition and Computer Vision: Third Chinese Conference, PRCV 2020, Nanjing, China, October 16–18, 2020, Proceedings, Part III
Yang et al. Language-aware vision transformer for referring segmentation