Yang et al., 2021 - Google Patents
Monocular depth estimation based on multi-scale depth map fusion (Yang et al., 2021)
- Document ID
- 5517054249183354289
- Author
- Yang X
- Chang Q
- Liu X
- He S
- Cui Y
- Publication year
- 2021
- Publication venue
- IEEE Access
Snippet
Monocular depth estimation is a fundamental task in machine vision. In recent years, the performance of monocular depth estimation has improved greatly. However, most depth estimation networks rely on a very deep network to extract features, which leads to …
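The title and snippet describe multi-scale depth map fusion for monocular depth estimation. Below is a minimal, illustrative PyTorch sketch of a generic version of that idea: depth is predicted at several decoder resolutions, each prediction is upsampled to full resolution, and the stack is fused with a 1x1 convolution. The class name, channel sizes, and fusion layer are assumptions for illustration only, not the authors' actual architecture.

```python
# Illustrative sketch (not the paper's exact model): generic multi-scale
# depth map fusion. Per-scale depth predictions are upsampled and fused.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleDepthFusion(nn.Module):
    def __init__(self, feature_channels=(256, 128, 64)):
        super().__init__()
        # One lightweight depth head per feature scale (assumed design).
        self.depth_heads = nn.ModuleList(
            nn.Conv2d(c, 1, kernel_size=3, padding=1) for c in feature_channels
        )
        # Fuse the upsampled per-scale depth maps into a single map.
        self.fuse = nn.Conv2d(len(feature_channels), 1, kernel_size=1)

    def forward(self, features, out_size):
        # features: list of tensors [B, C_i, H_i, W_i], coarse to fine.
        depths = []
        for head, feat in zip(self.depth_heads, features):
            d = head(feat)                                  # [B, 1, H_i, W_i]
            d = F.interpolate(d, size=out_size,
                              mode="bilinear", align_corners=False)
            depths.append(d)
        fused = self.fuse(torch.cat(depths, dim=1))         # [B, 1, H, W]
        return F.relu(fused), depths                        # non-negative depth


if __name__ == "__main__":
    # Toy multi-scale features standing in for a decoder's outputs.
    feats = [torch.randn(1, 256, 15, 20),
             torch.randn(1, 128, 30, 40),
             torch.randn(1, 64, 60, 80)]
    model = MultiScaleDepthFusion()
    fused, per_scale = model(feats, out_size=(120, 160))
    print(fused.shape)  # torch.Size([1, 1, 120, 160])
```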
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20112—Image segmentation details
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding, e.g. from bit-mapped to non bit-mapped
- G06T9/001—Model-based coding, e.g. wire frame
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
Similar Documents
Publication | Title
---|---
Zhou et al. | LSNet: Lightweight spatial boosting network for detecting salient objects in RGB-thermal images
CN112001960B (en) | Monocular image depth estimation method based on multi-scale residual error pyramid attention network model
CN111047548B (en) | Attitude transformation data processing method and device, computer equipment and storage medium
Tang et al. | HRTransNet: HRFormer-driven two-modality salient object detection
Chen et al. | Joint hand-object 3D reconstruction from a single image with cross-branch feature fusion
Bansal et al. | Marr revisited: 2D-3D alignment via surface normal prediction
Chen et al. | Modality-induced transfer-fusion network for RGB-D and RGB-T salient object detection
Zhang et al. | Deep hierarchical guidance and regularization learning for end-to-end depth estimation
Tu et al. | Consistent 3D hand reconstruction in video via self-supervised learning
Li et al. | Learning face image super-resolution through facial semantic attribute transformation and self-attentive structure enhancement
Xue et al. | Boundary-induced and scene-aggregated network for monocular depth prediction
Yang et al. | Monocular depth estimation based on multi-scale depth map fusion
Xu et al. | THCANet: Two-layer hop cascaded asymptotic network for robot-driving road-scene semantic segmentation in RGB-D images
Feng et al. | U²-Former: Nested U-shaped Transformer for image restoration via multi-view contrastive learning
Zeng et al. | Dual Swin-Transformer based mutual interactive network for RGB-D salient object detection
Pan et al. | Multi-stage feature pyramid stereo network-based disparity estimation approach for two- to three-dimensional video conversion
Zhang et al. | Spatial-information guided adaptive context-aware network for efficient RGB-D semantic segmentation
CN116129289A (en) | Attention edge interaction optical remote sensing image saliency target detection method
Zhou et al. | Boundary-guided lightweight semantic segmentation with multi-scale semantic context
Tang et al. | Sparse2Dense: From direct sparse odometry to dense 3-D reconstruction
Cong et al. | Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image
Chen et al. | Hybrid attention fusion embedded in transformer for remote sensing image semantic segmentation
Lin et al. | Efficient and high-quality monocular depth estimation via gated multi-scale network
KR20200073967A (en) | Method and apparatus for determining target object in image based on interactive input
Yang et al. | Monocular camera based real-time dense mapping using generative adversarial network