- research-article, October 2024
Quantifying NBA Shot Quality: A Deep Network Approach
MMSports '24: Proceedings of the 7th ACM International Workshop on Multimedia Content Analysis in Sports, Pages 91–95, https://doi.org/10.1145/3689061.3689068
Since the introduction of player positional tracking data to the NBA in 2013, the field of basketball analytics has been steadily developing. As such, more and more teams utilize data-driven approaches to maximize the potential for their team to score a ...
- Article, November 2024
Investigating Style Similarity in Diffusion Models
- Gowthami Somepalli,
- Anubhav Gupta,
- Kamal Gupta,
- Shramay Palta,
- Micah Goldblum,
- Jonas Geiping,
- Abhinav Shrivastava,
- Tom Goldstein
Abstract: Generative models are now widely used by graphic designers and artists. Prior works have shown that these models remember and often replicate content from their training data during generation. Hence as their proliferation increases, it has become ...
- Article, November 2024
Do Text-Free Diffusion Models Learn Discriminative Visual Representations?
- Soumik Mukhopadhyay,
- Matthew Gwilliam,
- Yosuke Yamaguchi,
- Vatsal Agarwal,
- Namitha Padmanabhan,
- Archana Swaminathan,
- Tianyi Zhou,
- Jun Ohya,
- Abhinav Shrivastava
Abstract: Diffusion models have proven to be state-of-the-art methods for generative tasks. These models involve training a U-Net to iteratively predict and remove noise, and the resulting model can synthesize high-fidelity, diverse, novel images. However, ...
- Article, November 2024
Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models
Abstract: Image customization has been extensively studied in text-to-image (T2I) diffusion models, leading to impressive outcomes and applications. With the emergence of text-to-video (T2V) diffusion models, its temporal counterpart, motion customization, ...
- Article, November 2024
Latent-INR: A Flexible Framework for Implicit Representations of Videos with Discriminative Semantics
Abstract: Implicit Neural Networks (INRs) have emerged as powerful representations to encode all forms of data, including images, videos, audios, and scenes. With video, many INRs for video have been proposed for the compression task, and recent methods ...
- Article, November 2024
- Article, October 2024
LEIA: Latent View-Invariant Embeddings for Implicit 3D Articulation
Abstract: Neural Radiance Fields (NeRFs) have revolutionized the reconstruction of static scenes and objects in 3D, offering unprecedented quality. However, extending NeRFs to model dynamic objects or object articulations remains a challenging problem. ...
- Article, October 2024
Trajectory-Aligned Space-Time Tokens for Few-Shot Action Recognition
Abstract: We propose a simple yet effective approach for few-shot action recognition, emphasizing the disentanglement of motion and appearance representations. By harnessing recent progress in tracking, specifically point trajectories and self-supervised ...
- Article, October 2024
- Article, September 2024
LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors
Abstract: We present a simple self-supervised method to enhance the performance of ViT features for dense downstream tasks. Our Lightweight Feature Transform (LiFT) is a straightforward and compact postprocessing network that can be applied to enhance the ...
- research-article, December 2023
Video dynamics prior: an internal learning approach for robust video enhancements
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, Article No.: 1483, Pages 34228–34246
In this paper, we present a novel robust framework for low-level vision tasks, including denoising, object removal, frame interpolation, and super-resolution, that does not require any external training data corpus. Our proposed approach directly learns ...
- research-article, June 2023
Leveraging Hand-Object Interactions in Assistive Egocentric Vision
IEEE Transactions on Pattern Analysis and Machine Intelligence (ITPM), Volume 45, Issue 6, Pages 6820–6831, https://doi.org/10.1109/TPAMI.2021.3123303
Egocentric vision holds great promise for increasing access to visual information and improving the quality of life for blind people. While we strive to improve recognition performance, it remains difficult to identify which object is of interest to the ...
- Article, October 2022
Neural Space-Filling Curves
Abstract: We present Neural Space-filling Curves (SFCs), a data-driven approach to infer a context-based scan order for a set of images. Linear ordering of pixels forms the basis for many applications such as video scrambling, compression, and auto-...
- Article, October 2022
Burn After Reading: Online Adaptation for Cross-domain Streaming Data
Abstract: In the context of online privacy, many methods propose complex security preserving measures to protect sensitive data. In this paper we note that: not storing any sensitive data is the best form of security. We propose an online framework called “...
- Article, October 2022
Improving Closed and Open-Vocabulary Attribute Prediction Using Transformers
Abstract: We study recognizing attributes for objects in visual scenes. We consider attributes to be any phrases that describe an object’s physical and semantic properties, and its relationships with other objects. Existing work studies attribute prediction ...
- Article, October 2022
Learning Semantic Correspondence with Sparse Annotations
Abstract: Finding dense semantic correspondence is a fundamental problem in computer vision, which remains challenging in complex scenes due to background clutter, extreme intra-class variation, and a severe lack of ground truth. In this paper, we aim to ...
- research-article, December 2021
PatchGame: learning to signal mid-level patches in referential games
NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems, Article No.: 1992, Pages 26015–26027
We study a referential game (a type of signaling game) where two agents communicate with each other via a discrete bottleneck to achieve a common goal. In our referential game, the goal of the speaker is to compose a message or a symbolic representation ...
- research-article, December 2021
NeRV: neural representations for videos
NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems, Article No.: 1649, Pages 21557–21568
We propose a novel neural representation for videos (NeRV) which encodes videos in neural networks. Unlike conventional representations that treat videos as frame sequences, we represent videos as neural networks taking frame index as input. Given a ...
- research-article, May 2021
No-frills Dynamic Planning using Static Planners
2021 IEEE International Conference on Robotics and Automation (ICRA), Pages 2005–2011, https://doi.org/10.1109/ICRA48506.2021.9560762
In this paper, we address the task of interacting with dynamic environments where the changes in the environment are independent of the agent. We study this through the context of trapping a moving ball with a UR5 robotic arm. Our key contribution is an ...
- Article, August 2020
Quantization Guided JPEG Artifact Correction
Abstract: The JPEG image compression algorithm is the most popular method of image compression because of its ability to achieve large compression ratios. However, to achieve such high compression, information is lost. For aggressive quantization settings, this ...