Lee et al., 2021

GSVNet: Guided spatially-varying convolution for fast semantic segmentation on video

Document ID
16157576820527993080
Author
Lee S
Chen S
Peng W
Publication year
2021
Publication venue
2021 IEEE International Conference on Multimedia and Expo (ICME)

Snippet

This paper addresses fast semantic segmentation on video. Video segmentation often calls for real-time, or even faster than real-time, processing. One common recipe for conserving computation arising from feature extraction is to propagate features of few selected …

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30781 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F17/30784 Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
    • G06F17/30799 Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre using low-level visual features of the video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00624 Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
    • G06K9/00711 Recognising video content, e.g. extracting audiovisual features from movies, extracting representative key-frames, discriminating news vs. sport content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/20 Image acquisition
    • G06K9/32 Aligning or centering of the image pick-up or image-field
    • G06K9/3233 Determination of region of interest
    • G06K9/325 Detection of text region in scene imagery, real life image or Web pages, e.g. licenses plates, captions on TV images
    • G06K9/3258 Scene text, e.g. street name
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G06K9/6201 Matching; Proximity measures
    • G06K9/6202 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221 Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • G06K9/00268 Feature extraction; Face representation
    • G06K9/00281 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/144 Movement detection

Similar Documents

Publication Publication Date Title
Lee et al. GSVNet: Guided spatially-varying convolution for fast semantic segmentation on video
CN110570458B (en) Target tracking method based on internal cutting and multi-layer characteristic information fusion
Park et al. Per-clip video object segmentation
CN109948721B (en) Video scene classification method based on video description
CN110599486A (en) Method and system for detecting video plagiarism
US20040227851A1 (en) Frame interpolating method and apparatus thereof at frame rate conversion
CN111160295A (en) Video pedestrian re-identification method based on region guidance and space-time attention
Tan et al. Real time video object segmentation in compressed domain
Feng et al. TapLab: A fast framework for semantic video segmentation tapping into compressed-domain knowledge
CN111445424B (en) Image processing method, device, equipment and medium for processing mobile terminal video
Soh et al. Reduction of video compression artifacts based on deep temporal networks
CN111968123A (en) Semi-supervised video target segmentation method
US20230245283A1 (en) Image inpainting method and device
CN111507215B (en) Video target segmentation method based on space-time convolution cyclic neural network and cavity convolution
CN112584196A (en) Video frame insertion method and device and server
WO2021073066A1 (en) Image processing method and apparatus
Zhang et al. A crowd counting framework combining with crowd location
Zhang et al. Weighted convolutional motion-compensated frame rate up-conversion using deep residual network
Cho et al. Histogram shape-based scene-change detection algorithm
CN112347852B (en) Target tracking and semantic segmentation method and device for sports video and plug-in
Li et al. Mdqe: Mining discriminative query embeddings to segment occluded instances on challenging videos
Lin et al. Multistage spatial context models for learned image compression
Bae et al. Dual-dissimilarity measure-based statistical video cut detection
Xiong et al. ψ-net: Point structural information network for no-reference point cloud quality assessment
CN106611043B (en) Video searching method and system