Computer Science > Computer Vision and Pattern Recognition

arXiv:2203.09494 (cs)

[Submitted on 17 Mar 2022 (v1), last revised 9 May 2022 (this version, v3)]

Title: Transframer: Arbitrary Frame Prediction with Generative Models

Authors: Charlie Nash, João Carreira, Jacob Walker, Iain Barr, Andrew Jaegle, Mateusz Malinowski, Peter Battaglia

Abstract: We present a general-purpose framework for image modelling and vision tasks based on probabilistic frame prediction. Our approach unifies a broad range of tasks, from image segmentation to novel view synthesis and video interpolation. We pair this framework with an architecture we term Transframer, which uses U-Net and Transformer components to condition on annotated context frames, and outputs sequences of sparse, compressed image features. Transframer is state-of-the-art on a variety of video generation benchmarks, is competitive with the strongest models on few-shot view synthesis, and can generate coherent 30-second videos from a single image without any explicit geometric information. A single generalist Transframer simultaneously produces promising results on 8 tasks, including semantic segmentation, image classification and optical flow prediction, with no task-specific architectural components, demonstrating that multi-task computer vision can be tackled using probabilistic image models. Our approach can in principle be applied to a wide range of applications that require learning the conditional structure of annotated image-formatted data.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2203.09494 [cs.CV]
	(or arXiv:2203.09494v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.09494

Submission history

From: Charlie Nash
[v1] Thu, 17 Mar 2022 17:48:32 UTC (44,922 KB)
[v2] Fri, 18 Mar 2022 10:34:43 UTC (44,922 KB)
[v3] Mon, 9 May 2022 17:02:49 UTC (44,923 KB)
