GB2600359A - Video interpolation using one or more neural networks - Google Patents
Info
- Publication number
- Publication number: GB2600359A (application GB2201524.2A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- neural networks
- frame
- training
- processor
- pseudo-supervised
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links
- artificial neural network: title, claims, abstract (40 occurrences)
- memory: claims (3 occurrences)
- method: abstract (1 occurrence)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0117—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving conversion of the spatial resolution of the incoming video signal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0127—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level by changing the field or frame frequency of the incoming video signal, e.g. frame rate converter
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0135—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Graphics (AREA)
- Image Analysis (AREA)
Abstract
Apparatuses, systems, and techniques to enhance video. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having a higher frame rate, higher resolution, or reduced number of missing or corrupt video frames.
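As a rough illustration of the inference step described in the abstract (not code from the patent), the sketch below doubles a clip's frame rate by inserting one predicted midpoint frame between each pair of consecutive frames; `interp_net` and the function name are hypothetical placeholders for whatever trained interpolation network is used.

```python
# Minimal sketch, assuming a PyTorch model `interp_net` that maps two frames
# to the frame halfway between them. All names are illustrative, not from the patent.
import torch

def double_frame_rate(frames, interp_net):
    """frames: list of [C, H, W] float tensors in temporal order.
    Returns a list with one predicted frame inserted between each input pair."""
    output = [frames[0]]
    for prev_frame, next_frame in zip(frames[:-1], frames[1:]):
        with torch.no_grad():
            mid = interp_net(prev_frame.unsqueeze(0),
                             next_frame.unsqueeze(0)).squeeze(0)
        output.extend([mid, next_frame])
    return output
```

Applying the same step repeatedly, or predicting several intermediate time points per frame pair, would give 4x or higher frame-rate increases.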
Claims (37)
1. A processor comprising: one or more arithmetic logic units (ALUs) to be configured to generate higher frame rate video from lower frame rate video using one or more neural networks
2. The processor of claim 1, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
3. The processor of claim 2, wherein the unsupervised training includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
4. The processor of claim 1, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
5. The processor of claim 4, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized
6. The processor of claim 1, wherein the one or more neural networks utilize one or more image interpolation algorithms
7. The processor of claim 1, wherein the one or more ALUs are further to be configured to generate enhanced video, using the one or more neural networks, having a higher resolution or lower frame drop rate than input video
8. A system comprising: one or more processors to be configured to generate higher frame rate video from lower frame rate video using one or more neural networks; and one or more memories to store the one or more neural networks
9. The system of claim 8, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
10. The system of claim 9, wherein the cycle consistency constraint includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
11. The system of claim 8, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
12. The system of claim 11, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized
13. The system of claim 8, wherein the one or more neural networks utilize one or more image interpolation algorithms
14. A machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least: generate higher frame rate video from lower frame rate video using one or more neural networks
15. The machine-readable medium of claim 14, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
16. The machine-readable medium of claim 15, wherein the cycle consistency constraint includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
17. The machine-readable medium of claim 14, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
18. The machine-readable medium of claim 17, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized
19. The machine-readable medium of claim 14, wherein the one or more neural networks utilize one or more image interpolation algorithms
20. A processor comprising: one or more arithmetic logic units (ALUs) to train one or more neural networks, at least in part, to generate higher frame rate video from lower frame rate video
21. The processor of claim 20, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
22. The processor of claim 21, wherein the cycle consistency constraint includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
23. The processor of claim 20, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
24. The processor of claim 23, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized
25. The processor of claim 20, wherein the one or more neural networks utilize one or more image interpolation algorithms
26. A system comprising: one or more processors to calculate parameters corresponding to one or more neural networks, at least in part, to generate higher frame rate video from lower frame rate video; and one or more memories to store the parameters
27. The system of claim 26, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
28. The system of claim 27, wherein the cycle consistency constraint includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
29. The system of claim 26, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
30. The system of claim 29, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized
31. The system of claim 26, wherein the one or more neural networks utilize one or more image interpolation algorithms
32. A machine-readable medium having stored thereon a set of instructions, which if performed by one or more processors, cause the one or more processors to at least: cause one or more neural networks to be trained, at least in part, to generate higher frame rate video from lower frame rate video; and one or more memories to store the parameters
33. The machine-readable medium of claim 32, wherein one or more neural networks are trained using unsupervised training with at least one cycle consistency constraint
34. The machine-readable medium of claim 33, wherein the cycle consistency constraint includes generating, from a frame triplet, a set of intermediate frames and generating a version of a middle triplet frame from the intermediate frames for determining a loss value to be minimized
35. The machine-readable medium of claim 32, wherein the one or more neural networks are refined using pseudo-supervised training for a domain other than that used to train the one or more neural networks
36. The machine-readable medium of claim 35, wherein the pseudo-supervised training includes generating, using one or more already trained neural networks, versions of an intermediate frame using each of two adjacent video frames for determining a loss value to be minimized.
37. The machine-readable medium of claim 32, wherein the one or more neural networks utilize one or more image interpolation algorithms.
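Claims 2-5 (and their counterparts in the other claim families) describe two training signals: an unsupervised cycle-consistency loss over frame triplets, and a pseudo-supervised refinement that uses an already trained network as a teacher on a new domain. The sketch below is one plausible reading of those claims in PyTorch, not the patent's actual implementation; the function and variable names are assumptions, and the pseudo-supervised part is simplified to a single teacher prediction per adjacent-frame pair.

```python
# Hedged sketch of the training losses suggested by the claims.
import torch
import torch.nn.functional as F

def cycle_consistency_loss(net, frame0, frame1, frame2):
    """Unsupervised loss over a frame triplet: interpolate intermediates from
    (frame0, frame1) and (frame1, frame2), then interpolate between those two
    intermediates to reconstruct the middle frame, and penalize the error."""
    mid_01 = net(frame0, frame1)          # intermediate between frames 0 and 1
    mid_12 = net(frame1, frame2)          # intermediate between frames 1 and 2
    frame1_hat = net(mid_01, mid_12)      # reconstructed middle frame
    return F.l1_loss(frame1_hat, frame1)

def pseudo_supervised_loss(student, teacher, frame0, frame2):
    """Pseudo-supervised refinement on a new domain: a frozen, already trained
    teacher produces a target intermediate frame from two adjacent frames, and
    the student being refined is penalized for deviating from that target."""
    with torch.no_grad():
        target = teacher(frame0, frame2)  # pseudo ground truth
    prediction = student(frame0, frame2)
    return F.l1_loss(prediction, target)
```

A typical (assumed) usage would apply cycle_consistency_loss during unsupervised training on unlabeled video, then pseudo_supervised_loss with a frozen copy of the trained network when adapting to a target domain that lacks ground-truth high-frame-rate footage.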
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/559,312 US20210067735A1 (en) | 2019-09-03 | 2019-09-03 | Video interpolation using one or more neural networks |
PCT/US2020/046978 WO2021045904A1 (en) | 2019-09-03 | 2020-08-19 | Video interpolation using one or more neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
GB2600359A (en) | 2022-04-27 |
Family
ID=72292682
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB2201524.2A (published as GB2600359A, withdrawn) | Video interpolation using one or more neural networks | 2019-09-03 | 2020-08-19 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20210067735A1 (en) |
CN (1) | CN114303160A (en) |
DE (1) | DE112020003165T5 (en) |
GB (1) | GB2600359A (en) |
WO (1) | WO2021045904A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI729826B (en) * | 2020-05-26 | 2021-06-01 | 友達光電股份有限公司 | Display method |
US11763544B2 (en) | 2020-07-07 | 2023-09-19 | International Business Machines Corporation | Denoising autoencoder image captioning |
US11651522B2 (en) * | 2020-07-08 | 2023-05-16 | International Business Machines Corporation | Adaptive cycle consistency multimodal image captioning |
US12061672B2 (en) * | 2020-09-10 | 2024-08-13 | Canon Kabushiki Kaisha | Image processing method, image processing apparatus, learning method, learning apparatus, and storage medium |
US12003885B2 (en) * | 2021-06-14 | 2024-06-04 | Microsoft Technology Licensing, Llc | Video frame interpolation via feature pyramid flows |
CN113891027B (en) * | 2021-12-06 | 2022-03-15 | 深圳思谋信息科技有限公司 | Video frame insertion model training method and device, computer equipment and storage medium |
CN114782497B (en) * | 2022-06-20 | 2022-09-27 | 中国科学院自动化研究所 | Motion function analysis method and electronic device |
US20240037150A1 (en) * | 2022-08-01 | 2024-02-01 | Qualcomm Incorporated | Scheduling optimization in sequence space |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109068174A (en) * | 2018-09-12 | 2018-12-21 | 上海交通大学 | Video frame rate upconversion method and system based on cyclic convolution neural network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109379550B (en) * | 2018-09-12 | 2020-04-17 | 上海交通大学 | Convolutional neural network-based video frame rate up-conversion method and system |
2019
- 2019-09-03: US 16/559,312 → US20210067735A1 (abandoned)
2020
- 2020-08-19: WO PCT/US2020/046978 → WO2021045904A1 (application filing)
- 2020-08-19: CN 202080061061.2A → CN114303160A (pending)
- 2020-08-19: GB 2201524.2A → GB2600359A (withdrawn)
- 2020-08-19: DE 112020003165.9T → DE112020003165T5 (pending)
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109068174A (en) * | 2018-09-12 | 2018-12-21 | 上海交通大学 | Video frame rate upconversion method and system based on cyclic convolution neural network |
Non-Patent Citations (3)
Title |
---|
Anonymous, "GitHub - NVIDIA/unsupervised-video-interpolation: Unsupervised Video Interpolation using Cycle Consistency", (20200131), URL: https://github.com/NVIDIA/unsupervised-video-interpolation, (20201103), XP055746680 [XP] 1-37 * page 1 - page 9 * * |
Anonymous, "Volta (microarchitecture) - Wikipedia", (20190716), URL: https://en.wikipedia.org/w/index.php?title=Volta_(microarchitecture)&oldid=906608016, (20201104), XP055746770 [A] 1-37 * page 1 - page 2 * * |
FITSUM A. REDA; DEQING SUN; AYSEGUL DUNDAR; MOHAMMAD SHOEYBI; GUILIN LIU; KEVIN J. SHIH; ANDREW TAO; JAN KAUTZ; BRYAN CATANZARO: "Unsupervised Video Interpolation Using Cycle Consistency", arXiv.org, Cornell University Library, 201 Olin Library, Cornell University, Ithaca, NY 14853, 13 June 2019 (2019-06-13), XP081381443 * |
Also Published As
Publication number | Publication date |
---|---|
WO2021045904A1 (en) | 2021-03-11 |
US20210067735A1 (en) | 2021-03-04 |
CN114303160A (en) | 2022-04-08 |
DE112020003165T5 (en) | 2022-03-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
GB2600359A (en) | Video interpolation using one or more neural networks | |
GB2600346A (en) | Video upsampling using one or more neural networks | |
GB2600869A (en) | Video prediction using one or more neural networks | |
SA521420966B1 (en) | Multiple History Based Non-Adjacent MVPS for Wavefront Processing of Video Coding | |
WO2018170393A3 (en) | Frame interpolation via adaptive convolution and adaptive separable convolution | |
US10599745B2 (en) | Apparatus and methods for vector operations | |
BR112019023395A2 (en) | LOW LATENCY MATRIX MULTIPLICATION UNIT | |
GB2606066A (en) | Training one or more neural networks using synthetic data | |
WO2009042101A3 (en) | Processing an input image to reduce compression-related artifacts | |
US20170206450A1 (en) | Method and apparatus for machine learning | |
NO20084862L (en) | Bandwidth enhancement for 3D display | |
NZ608999A (en) | Composite video streaming using stateless compression | |
RU2018110382A (en) | REPRODUCING AUGMENTATION OF IMAGE DATA | |
GB2606060A (en) | Upsampling an image using one or more neural networks | |
CN101843483A (en) | Method and device for realizing water-fat separation | |
BR9507794A (en) | Apparatus for decoding a video device sequence to produce images partially dependent on the input of the interactive user device to a primarily intended data carrier device to receive a data model method to transform an image source method to generate an apparatus field to transform an image source method to generate a field and method to reduce the effects of misalignment | |
JP2017535142A5 (en) | ||
GB2600300A (en) | Image generation using one or more neural networks | |
JP2009272781A5 (en) | ||
BR112015000367A2 (en) | image switching method and device | |
Johnson et al. | Motion correction in MRI using deep learning | |
JP2019201256A5 (en) | ||
US11734557B2 (en) | Neural network with frozen nodes | |
KR102654862B1 (en) | Apparatus and Method of processing image | |
GB2605030A (en) | Image generation using one or more neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WAP | Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1) |