Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis

Introduction

Meissonic is a non-autoregressive text-to-image model based on masked image modeling (MIM). It generates high-resolution images and is designed to run on consumer graphics cards.
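To illustrate what "non-autoregressive masked image modeling" means in general, here is a minimal sketch of iterative masked decoding. It is not Meissonic's implementation: `toy_predictor` is a hypothetical stand-in for the transformer, and the cosine schedule, token count, and vocabulary size are assumptions chosen for illustration only.

```python
import math
import random

MASK = -1  # sentinel for a masked token position (illustrative choice)


def toy_predictor(tokens, vocab_size, rng):
    """Hypothetical stand-in for the model: a (token, confidence) pair per position."""
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def masked_decode(num_tokens=16, vocab_size=8, steps=4, seed=0):
    """Fill a fully masked token sequence in a fixed number of parallel steps."""
    rng = random.Random(seed)
    tokens = [MASK] * num_tokens
    for step in range(1, steps + 1):
        # All positions are predicted at once, unlike left-to-right decoding.
        preds = toy_predictor(tokens, vocab_size, rng)
        # Assumed cosine schedule: number of positions left masked after this step.
        keep_masked = int(num_tokens * math.cos(math.pi / 2 * step / steps))
        masked_positions = [i for i, t in enumerate(tokens) if t == MASK]
        # Commit the most confident predictions and leave the rest masked.
        masked_positions.sort(key=lambda i: preds[i][1], reverse=True)
        num_to_fill = max(0, len(masked_positions) - keep_masked)
        for i in masked_positions[:num_to_fill]:
            tokens[i] = preds[i][0]
    return tokens


print(masked_decode())
```

The point of the sketch is that the number of model calls equals `steps`, not the number of tokens, which is why this family of models can be efficient at high resolution.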

Note: This is a project under development. If you encounter any specific performance issues or find significant discrepancies with the results reported in the paper, please submit an issue on the GitHub repository! Thank you for your support!

Prerequisites

Install requirements

pip install accelerate pytorch-lightning torch torchvision tqdm transformers diffusers numpy gradio --extra-index-url https://download.pytorch.org/whl/cu124

Install diffusers

git clone https://github.com/huggingface/diffusers.git
cd diffusers
pip install -e .
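After installing, a quick check like the following can confirm that the packages are importable. This is a convenience sketch and not part of the repository; the list mirrors the install command above, using import names (for example, `pytorch-lightning` is imported as `pytorch_lightning`).

```python
import importlib.util

# Import names for the packages listed in the install command.
REQUIRED = [
    "accelerate", "pytorch_lightning", "torch", "torchvision",
    "tqdm", "transformers", "diffusers", "numpy", "gradio",
]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
print("All packages found." if not missing else f"Missing: {', '.join(missing)}")
```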

Usage

text2image

python inference.py

zero-shot inpaint or outpaint

python inpaint.py --mode inpaint
python inpaint.py --mode outpaint
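The two modes differ in which region is regenerated: inpainting fills a region inside the image, while outpainting keeps a region and fills everything around it. The sketch below shows that relationship with a plain boolean mask. How `inpaint.py` actually builds its masks is not documented here, so `make_mask` and its `box` argument are hypothetical.

```python
def make_mask(height, width, box, mode):
    """Return a height x width grid of booleans; True marks pixels to regenerate.

    box is (top, left, bottom, right) with exclusive bottom/right edges.
    """
    if mode not in ("inpaint", "outpaint"):
        raise ValueError(f"unknown mode: {mode}")
    top, left, bottom, right = box
    regenerate_inside = mode == "inpaint"
    mask = []
    for r in range(height):
        row = []
        for c in range(width):
            inside = top <= r < bottom and left <= c < right
            # inpaint regenerates inside the box, outpaint regenerates outside it
            row.append(inside == regenerate_inside)
        mask.append(row)
    return mask


inpaint_mask = make_mask(4, 4, (1, 1, 3, 3), "inpaint")
outpaint_mask = make_mask(4, 4, (1, 1, 3, 3), "outpaint")
```

For the same box, the two masks are exact complements of each other.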

Some Interesting Examples

Prompt: "A pillow with a picture of a Husky on it."

Prompt: "A white coffee mug, a solid black background"

Citation

If you find this work helpful, please consider citing:

@article{bai2024meissonic,
  title={Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis},
  author={Bai, Jinbin and Ye, Tian and Chow, Wei and Song, Enxin and Chen, Qing-Guo and Li, Xiangtai and Dong, Zhen and Zhu, Lei and Yan, Shuicheng},
  journal={arXiv preprint arXiv:2410.08261},
  year={2024}
}
