Lee et al., 2021 - Google Patents

GSVNet: Guided spatially-varying convolution for fast semantic segmentation on video

Document ID: 16157576820527993080
Author: Lee S; Chen S; Peng W
Publication year: 2021
Publication venue: 2021 IEEE International Conference on Multimedia and Expo (ICME)

Snippet

This paper addresses fast semantic segmentation on video. Video segmentation often calls for real-time, or even faster than real-time, processing. One common recipe for conserving computation arising from feature extraction is to propagate features of few selected …

Classifications

- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
- G06F17/30799—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00711—Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/20—Image acquisition
- G06K9/32—Aligning or centering of the image pick-up or image-field
- G06K9/3233—Determination of region of interest
- G06K9/325—Detection of text region in scene imagery, real life image or Web pages, e.g. licenses plates, captions on TV images
- G06K9/3258—Scene text, e.g. street name
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6201—Matching; Proximity measures
- G06K9/6202—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G06K9/00268—Feature extraction; Face representation
- G06K9/00281—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/14—Picture signal circuitry for video frequency region
- H04N5/144—Movement detection

Similar Documents

Publication	Publication Date	Title
Lee et al.	2021	GSVNet: Guided spatially-varying convolution for fast semantic segmentation on video
CN110570458B (en)	2022-02-01	Target tracking method based on internal cutting and multi-layer characteristic information fusion
Park et al.	2022	Per-clip video object segmentation
CN109948721B (en)	2021-07-09	Video scene classification method based on video description
CN110599486A (en)	2019-12-20	Method and system for detecting video plagiarism
US20040227851A1 (en)	2004-11-18	Frame interpolating method and apparatus thereof at frame rate conversion
CN111160295A (en)	2020-05-15	Video pedestrian re-identification method based on region guidance and space-time attention
Tan et al.	2020	Real time video object segmentation in compressed domain
Feng et al.	2020	TapLab: A fast framework for semantic video segmentation tapping into compressed-domain knowledge
CN111445424B (en)	2023-07-18	Image processing method, device, equipment and medium for processing mobile terminal video
Soh et al.	2018	Reduction of video compression artifacts based on deep temporal networks
CN111968123A (en)	2020-11-20	Semi-supervised video target segmentation method
US20230245283A1 (en)	2023-08-03	Image inpainting method and device
CN111507215B (en)	2022-01-28	Video target segmentation method based on space-time convolution cyclic neural network and cavity convolution
CN112584196A (en)	2021-03-30	Video frame insertion method and device and server
WO2021073066A1 (en)	2021-04-22	Image processing method and apparatus
Zhang et al.	2021	A crowd counting framework combining with crowd location
Zhang et al.	2018	Weighted convolutional motion-compensated frame rate up-conversion using deep residual network
Cho et al.	2019	Histogram shape-based scene-change detection algorithm
CN112347852B (en)	2022-07-29	Target tracking and semantic segmentation method and device for sports video and plug-in
Li et al.	2023	MDQE: Mining discriminative query embeddings to segment occluded instances on challenging videos
Lin et al.	2023	Multistage spatial context models for learned image compression
Bae et al.	2019	Dual-dissimilarity measure-based statistical video cut detection
Xiong et al.	2023	ψ-net: Point structural information network for no-reference point cloud quality assessment
CN106611043B (en)	2020-07-03	Video searching method and system