GSVNet: Guided spatially-varying convolution for fast semantic segmentation on video
Lee et al., 2021
- Document ID
- 16157576820527993080
- Author
- Lee S
- Chen S
- Peng W
- Publication year
- 2021
- Publication venue
- 2021 IEEE International Conference on Multimedia and Expo (ICME)
Snippet
This paper addresses fast semantic segmentation on video. Video segmentation often calls for real-time, or even faster than real-time, processing. One common recipe for conserving computation arising from feature extraction is to propagate features of few selected …
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00711—Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
- G06K9/32—Aligning or centering of the image pick-up or image-field
- G06K9/3233—Determination of region of interest
- G06K9/325—Detection of text region in scene imagery, real life image or Web pages, e.g. licenses plates, captions on TV images
- G06K9/3258—Scene text, e.g. street name
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/14—Picture signal circuitry for video frequency region
- H04N5/144—Movement detection
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lee et al. | | GSVNet: Guided spatially-varying convolution for fast semantic segmentation on video |
CN110570458B (en) | | Target tracking method based on internal cutting and multi-layer characteristic information fusion |
Park et al. | | Per-clip video object segmentation |
CN109948721B (en) | | Video scene classification method based on video description |
CN110599486A (en) | | Method and system for detecting video plagiarism |
US20040227851A1 (en) | | Frame interpolating method and apparatus thereof at frame rate conversion |
CN111160295A (en) | | Video pedestrian re-identification method based on region guidance and space-time attention |
Tan et al. | | Real time video object segmentation in compressed domain |
Feng et al. | | TapLab: A fast framework for semantic video segmentation tapping into compressed-domain knowledge |
CN111445424B (en) | | Image processing method, device, equipment and medium for processing mobile terminal video |
Soh et al. | | Reduction of video compression artifacts based on deep temporal networks |
CN111968123A (en) | | Semi-supervised video target segmentation method |
US20230245283A1 (en) | | Image inpainting method and device |
CN111507215B (en) | | Video target segmentation method based on space-time convolution cyclic neural network and cavity convolution |
CN112584196A (en) | | Video frame insertion method and device and server |
WO2021073066A1 (en) | | Image processing method and apparatus |
Zhang et al. | | A crowd counting framework combining with crowd location |
Zhang et al. | | Weighted convolutional motion-compensated frame rate up-conversion using deep residual network |
Cho et al. | | Histogram shape-based scene-change detection algorithm |
CN112347852B (en) | | Target tracking and semantic segmentation method and device for sports video and plug-in |
Li et al. | | Mdqe: Mining discriminative query embeddings to segment occluded instances on challenging videos |
Lin et al. | | Multistage spatial context models for learned image compression |
Bae et al. | | Dual-dissimilarity measure-based statistical video cut detection |
Xiong et al. | | ψ-net: Point structural information network for no-reference point cloud quality assessment |
CN106611043B (en) | | Video searching method and system |