Diversity is Definitely Needed: Improving Model-Agnostic Zero-shot Classification via Stable Diffusion
- 1. Queensland University of Technology
- 2. Sentient Vision Systems
- 3. La Trobe University
Description
In this work, we investigate the problem of Model-Agnostic Zero-Shot Classification (MA-ZSC), which refers to training non-specific classification architectures (downstream models) to classify real images without using any real images during training. Recent research has demonstrated that generating synthetic training images with diffusion models is a potential solution to MA-ZSC. However, the performance of this approach currently falls short of that achieved by large-scale vision-language models. One possible explanation is a potentially significant domain gap between synthetic and real images. Our work offers a fresh perspective on the problem by providing initial insights that MA-ZSC performance can be improved by increasing the diversity of images in the generated dataset. We propose a set of modifications to the text-to-image generation process of a pre-trained diffusion model that enhance diversity, which we refer to as our bag of tricks. Our approach shows notable improvements across various classification architectures, with results comparable to state-of-the-art models such as CLIP. To validate our approach, we conduct experiments on CIFAR10, CIFAR100, and EuroSAT; the last of these is particularly difficult for zero-shot classification because it lies in the satellite image domain. We evaluate our approach with five classification architectures, including ResNet and ViT. Our findings provide initial insights into the problem of MA-ZSC using diffusion models.
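The description does not list the individual tricks, so the following is only a hypothetical illustration of what varying the generation settings per image could look like. The function name `build_generation_specs`, the prompt templates, the context phrases, and the guidance-scale range are all assumptions made for this sketch and are not taken from the authors' method. No diffusion model is called; the sketch only prepares per-image settings that a text-to-image pipeline could consume.

```python
import random


def build_generation_specs(class_name, n_images, seed=0):
    """Return one dict of generation settings per synthetic image.

    Everything here is illustrative: the templates, contexts, and the
    guidance-scale range are assumptions, not the authors' bag of tricks.
    """
    rng = random.Random(seed)  # local RNG so results are reproducible
    mediums = ["a photo", "a drawing", "a painting", "a sketch"]
    contexts = ["in daylight", "at night", "in bad weather", "from far away"]

    specs = []
    for _ in range(n_images):
        prompt = f"{rng.choice(mediums)} of {class_name}, {rng.choice(contexts)}"
        specs.append(
            {
                "prompt": prompt,
                # Assumed range; a different scale per image varies how
                # closely the sample follows the prompt.
                "guidance_scale": round(rng.uniform(1.0, 5.0), 2),
                # A distinct noise seed per image.
                "seed": rng.randrange(2**32),
            }
        )
    return specs


if __name__ == "__main__":
    for spec in build_generation_specs("airplane", 3):
        print(spec)
```

Each dictionary would then be passed to whichever text-to-image pipeline is in use. The point of the sketch is only that prompt wording, conditioning strength, and noise seed are separate axes along which a generated dataset can be made more varied.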
Files
(4.0 GB)
Checksum | Size |
---|---|
md5:e5e221eab02817d6949cef1ab651d70c | 152.6 MB |
md5:aba34b4f0bda1527b38e6857324e4db8 | 534.4 MB |
md5:da7e68befc76006cd4bac8296aa351d3 | 148.2 MB |
md5:a868589b7b21d4b3e1f34211ee4329d2 | 160.6 MB |
md5:9af59eeb7e77e118f4aabfdd5140e1ba | 130.9 MB |
md5:eef98d135f2589aefb2f86aee0232db3 | 153.7 MB |
md5:0bf4520b321784fd7943297f60589cc0 | 1.0 GB |
md5:340f0c542e3ffb8870dee2d69d6de8a2 | 147.6 MB |
md5:b578e52acd91ba0b11ca3dcfa1df3fdb | 152.6 MB |
md5:39ef36d93b00fdda63446bd1dd051acf | 312.9 MB |
md5:3aa7edbe47f245b1e771bc397c1d7152 | 735.2 MB |
md5:d778468f3352a7c65a41206ef075e942 | 324.1 MB |
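
Because each file is listed with an MD5 checksum, a download can be checked against the listing before use. The sketch below uses only Python's standard library (Python 3.9 or later for `str.removeprefix`). The helper names `md5_of_file` and `matches_listing` are introduced here for illustration, and the file path is a placeholder, since file names are not shown in the listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a listed value such as 'md5:e5e2...'."""
    expected = listed_checksum.strip().removeprefix("md5:").lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("path/to/downloaded_file", "md5:e5e221eab02817d6949cef1ab651d70c")` returns `True` only if the local file's digest matches the first listed checksum. Reading in chunks keeps memory use flat even for the 1.0 GB file.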