GB2600359A - Video interpolation using one or more neural networks - Google Patents
Info
- Publication number
- Publication number: GB2600359A (application GB2201524.2A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- neural networks
- frame
- training
- processor
- pseudo-supervised
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links
- artificial neural network: title, claims, abstract (40 occurrences)
- memory: claims (3 occurrences)
- method: abstract (1 occurrence)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0117—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving conversion of the spatial resolution of the incoming video signal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0127—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level by changing the field or frame frequency of the incoming video signal, e.g. frame rate converter
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0135—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Graphics (AREA)
- Image Analysis (AREA)
Abstract
Apparatuses, systems, and techniques to enhance video. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having a higher frame rate, higher resolution, or reduced number of missing or corrupt video frames.
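As a rough illustration of the inference step described in the abstract (not code from the patent), the sketch below doubles a clip's frame rate by inserting one predicted midpoint frame between each pair of consecutive frames; `interp_net` and the function name are hypothetical placeholders for whatever trained interpolation network is used.

```python
# Minimal sketch, assuming a PyTorch model `interp_net` that maps two frames
# to the frame halfway between them. All names are illustrative, not from the patent.
import torch

def double_frame_rate(frames, interp_net):
    """frames: list of [C, H, W] float tensors in temporal order.
    Returns a list with one predicted frame inserted between each input pair."""
    output = [frames[0]]
    for prev_frame, next_frame in zip(frames[:-1], frames[1:]):
        with torch.no_grad():
            mid = interp_net(prev_frame.unsqueeze(0),
                             next_frame.unsqueeze(0)).squeeze(0)
        output.extend([mid, next_frame])
    return output
```

Applying the same step repeatedly, or predicting several intermediate time points per frame pair, would give 4x or higher frame-rate increases.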
Claims (37)
1. A processor comprising: one or more arithmetic logic units (ALUs) to be configured to generate higher frame rate video from lower frame rate video using one or more neural networks
2. The processor of claim 1, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
3. The processor of claim 2, wherein the unsupervised training includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
4. The processor of claim 1, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
5. The processor of claim 4, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized
6. The processor of claim 1, wherein the one or more neural networks utilize one or more image interpolation algorithms
7. The processor of claim 1, wherein the one or more ALUs are further to be configured to generate enhanced video, using the one or more neural networks, having a higher resolution or lower frame drop rate than input video
8. A system comprising: one or more processors to be configured to generate higher frame rate video from lower frame rate video using one or more neural networks; and one or more memories to store the one or more neural networks
9. The system of claim 8, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
10. The system of claim 9, wherein the cycle consistency constraint includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
11. The system of claim 8, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
12. The system of claim 11, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized
13. The system of claim 8, wherein the one or more neural networks utilize one or more image interpolation algorithms
14. A machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least: generate higher frame rate video from lower frame rate video using one or more neural networks
15. The machine-readable medium of claim 14, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
16. The machine-readable medium of claim 15, wherein the cycle consistency constraint includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
17. The machine-readable medium of claim 14, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
18. The machine-readable medium of claim 17, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized
19. The machine-readable medium of claim 14, wherein the one or more neural networks utilize one or more image interpolation algorithms
20. A processor comprising: one or more arithmetic logic units (ALUs) to train one or more neural networks, at least in part, to generate higher frame rate video from lower frame rate video
21. The processor of claim 20, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
22. The processor of claim 21, wherein the cycle consistency constraint includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
23. The processor of claim 20, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
24. The processor of claim 23, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized
25. The processor of claim 20, wherein the one or more neural networks utilize one or more image interpolation algorithms
26. A system comprising: one or more processors to calculate parameters corresponding to one or more neural networks, at least in part, to generate higher frame rate video from lower frame rate video; and one or more memories to store the parameters
27. The system of claim 26, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
28. The system of claim 27, wherein the cycle consistency constraint includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
29. The system of claim 26, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
30. The system of claim 29, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized
31. The system of claim 26, wherein the one or more neural networks utilize one or more image interpolation algorithms
32. A machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least: cause one or more neural networks to be trained, at least in part, to generate higher frame rate video from lower frame rate video; and one or more memories to store the parameters
33. The machine-readable medium of claim 32, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
34. The machine-readable medium of claim 33, wherein the cycle consistency constraint includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
35. The machine-readable medium of claim 32, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
36. The machine-readable medium of claim 35, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized.
37. The machine-readable medium of claim 32, wherein the one or more neural networks utilize one or more image interpolation algorithms.
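Claims 2-5 (and their counterparts in the other claim families) describe two training signals: an unsupervised cycle-consistency loss over frame triplets, and a pseudo-supervised refinement that uses an already trained network as a teacher on a new domain. The sketch below is one plausible reading of those claims in PyTorch, not the patent's actual implementation; the function and variable names are assumptions, and the pseudo-supervised part is simplified to a single teacher prediction per adjacent-frame pair.

```python
# Hedged sketch of the training losses suggested by the claims.
import torch
import torch.nn.functional as F

def cycle_consistency_loss(net, frame0, frame1, frame2):
    """Unsupervised loss over a frame triplet: interpolate intermediates from
    (frame0, frame1) and (frame1, frame2), then interpolate between those two
    intermediates to reconstruct the middle frame, and penalize the error."""
    mid_01 = net(frame0, frame1)          # intermediate between frames 0 and 1
    mid_12 = net(frame1, frame2)          # intermediate between frames 1 and 2
    frame1_hat = net(mid_01, mid_12)      # reconstructed middle frame
    return F.l1_loss(frame1_hat, frame1)

def pseudo_supervised_loss(student, teacher, frame0, frame2):
    """Pseudo-supervised refinement on a new domain: a frozen, already trained
    teacher produces a target intermediate frame from two adjacent frames, and
    the student being refined is penalized for deviating from that target."""
    with torch.no_grad():
        target = teacher(frame0, frame2)  # pseudo ground truth
    prediction = student(frame0, frame2)
    return F.l1_loss(prediction, target)
```

A typical (assumed) usage would apply cycle_consistency_loss during unsupervised training on unlabeled video, then pseudo_supervised_loss with a frozen copy of the trained network when adapting to a target domain that lacks ground-truth high-frame-rate footage.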
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/559,312 US20210067735A1 (en) | 2019-09-03 | 2019-09-03 | Video interpolation using one or more neural networks |
PCT/US2020/046978 WO2021045904A1 (en) | 2019-09-03 | 2020-08-19 | Video interpolation using one or more neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
GB2600359A (en) | 2022-04-27 |
Family
ID=72292682
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB2201524.2A (published as GB2600359A, withdrawn) | Video interpolation using one or more neural networks | 2019-09-03 | 2020-08-19 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20210067735A1 (en) |
CN (1) | CN114303160A (en) |
DE (1) | DE112020003165T5 (en) |
GB (1) | GB2600359A (en) |
WO (1) | WO2021045904A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI729826B (en) * | 2020-05-26 | 2021-06-01 | 友達光電股份有限公司 | Display method |
US11763544B2 (en) | 2020-07-07 | 2023-09-19 | International Business Machines Corporation | Denoising autoencoder image captioning |
US11651522B2 (en) * | 2020-07-08 | 2023-05-16 | International Business Machines Corporation | Adaptive cycle consistency multimodal image captioning |
US12061672B2 (en) * | 2020-09-10 | 2024-08-13 | Canon Kabushiki Kaisha | Image processing method, image processing apparatus, learning method, learning apparatus, and storage medium |
US12003885B2 (en) * | 2021-06-14 | 2024-06-04 | Microsoft Technology Licensing, Llc | Video frame interpolation via feature pyramid flows |
CN113891027B (en) * | 2021-12-06 | 2022-03-15 | 深圳思谋信息科技有限公司 | Video frame insertion model training method and device, computer equipment and storage medium |
CN114782497B (en) * | 2022-06-20 | 2022-09-27 | 中国科学院自动化研究所 | Motion function analysis method and electronic device |
US20240037150A1 (en) * | 2022-08-01 | 2024-02-01 | Qualcomm Incorporated | Scheduling optimization in sequence space |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109068174A (en) * | 2018-09-12 | 2018-12-21 | 上海交通大学 | Video frame rate upconversion method and system based on cyclic convolution neural network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109379550B (en) * | 2018-09-12 | 2020-04-17 | 上海交通大学 | Convolutional neural network-based video frame rate up-conversion method and system |
2019
- 2019-09-03: US 16/559,312 → US20210067735A1 (abandoned)
2020
- 2020-08-19: WO PCT/US2020/046978 → WO2021045904A1 (application filing)
- 2020-08-19: CN 202080061061.2A → CN114303160A (pending)
- 2020-08-19: GB 2201524.2A → GB2600359A (withdrawn)
- 2020-08-19: DE 112020003165.9T → DE112020003165T5 (pending)
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109068174A (en) * | 2018-09-12 | 2018-12-21 | 上海交通大学 | Video frame rate upconversion method and system based on cyclic convolution neural network |
Non-Patent Citations (3)
Title |
---|
Anonymous, "GitHub - NVIDIA/unsupervised-video-interpolation: Unsupervised Video Interpolation using Cycle Consistency", (20200131), URL: https://github.com/NVIDIA/unsupervised-video-interpolation, (20201103), XP055746680 [XP] 1-37 * page 1 - page 9 * * |
Anonymous, "Volta (microarchitecture) - Wikipedia", (20190716), URL: https://en.wikipedia.org/w/index.php?title=Volta_(microarchitecture)&oldid=906608016, (20201104), XP055746770 [A] 1-37 * page 1 - page 2 * * |
FITSUM A. REDA; DEQING SUN; AYSEGUL DUNDAR; MOHAMMAD SHOEYBI; GUILIN LIU; KEVIN J. SHIH; ANDREW TAO; JAN KAUTZ; BRYAN CATANZARO: "Unsupervised Video Interpolation Using Cycle Consistency", arXiv.org, Cornell University Library, 201 Olin Library, Cornell University, Ithaca, NY 14853, 13 June 2019 (2019-06-13), XP081381443 * |
Also Published As
Publication number | Publication date |
---|---|
WO2021045904A1 (en) | 2021-03-11 |
US20210067735A1 (en) | 2021-03-04 |
CN114303160A (en) | 2022-04-08 |
DE112020003165T5 (en) | 2022-03-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
GB2600359A (en) | Video interpolation using one or more neural networks | |
GB2600346A (en) | Video upsampling using one or more neural networks | |
GB2600869A (en) | Video prediction using one or more neural networks | |
SA521420966B1 (en) | Multiple History Based Non-Adjacent MVPS for Wavefront Processing of Video Coding | |
WO2018170393A3 (en) | Frame interpolation via adaptive convolution and adaptive separable convolution | |
US10599745B2 (en) | Apparatus and methods for vector operations | |
BR112019023395A2 (en) | LOW LATENCY MATRIX MULTIPLICATION UNIT | |
GB2606066A (en) | Training one or more neural networks using synthetic data | |
WO2009042101A3 (en) | Processing an input image to reduce compression-related artifacts | |
US20170206450A1 (en) | Method and apparatus for machine learning | |
NO20084862L (en) | Bandwidth enhancement for 3D display | |
NZ608999A (en) | Composite video streaming using stateless compression | |
RU2018110382A (en) | REPRODUCING AUGMENTATION OF IMAGE DATA | |
GB2606060A (en) | Upsampling an image using one or more neural networks | |
CN101843483A (en) | Method and device for realizing water-fat separation | |
BR9507794A (en) | Apparatus for decoding a video device sequence to produce images partially dependent on the input of the interactive user device to a primarily intended data carrier device to receive a data model method to transform an image source method to generate an apparatus field to transform an image source method to generate a field and method to reduce the effects of misalignment | |
JP2017535142A5 (en) | ||
GB2600300A (en) | Image generation using one or more neural networks | |
JP2009272781A5 (en) | ||
BR112015000367A2 (en) | image switching method and device | |
Johnson et al. | Motion correction in MRI using deep learning | |
JP2019201256A5 (en) | ||
US11734557B2 (en) | Neural network with frozen nodes | |
KR102654862B1 (en) | Apparatus and Method of processing image | |
GB2605030A (en) | Image generation using one or more neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WAP | Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1) |