BAGEL Gradio UI Fork with Extended Features

This is a fork of the official BAGEL project's Gradio WebUI, incorporating several quality-of-life improvements and features.

Based on the original work by Chaorui Deng* et al.

Latest Update (June 03, 2025)

This update adds support for dfloat11 compressed BAGEL models and enhances model management, flexibility and inference speed within the BagelUI:

DFloat11 Compressed Model Support:
- Integrated full support for loading and running DFloat11 compressed version of BAGEL model.
Dynamic Model Loading & Switching:
- Introduced new ⚙️ Models tab, allowing dynamic loading and switching between different BAGEL model checkpoints and quantizations.
Inference Optimizations:
- Made modifications to reduce memory overhead and speed up operations by disabling gradient tracking.

Special thanks to this repo for the original inference implementation of the DFloat11 model: https://github.com/LeanModels/Bagel-DFloat11/

The BagelUI-Colab.ipynb Jupyter Notebook has also been updated.

✨ Added Features

This fork builds upon the original BAGEL Gradio UI by adding the following functionalities:

Structured Image Saving: Automatically saves all generated and edited images to a configurable output directory (output/ by default) with a clear folder structure based on the tab and mode used (Text-to-Image, Image Edit Standard, Image Edit Task Breakdown projects, X/Y Plot runs).
Batch Image Generation & Editing: Use the Batch Size slider in the Text-to-Image and Image Edit tabs to generate multiple images sequentially with varying seeds (or a fixed seed if specified).
LLM-Powered Task Breakdown for Editing (Experimental):
- An experimental mode in the Image Edit tab (Enable Task Breakdown) that leverages the built-in Qwen2 LLM to break down a complex editing prompt into sequential sub-steps.
- These sub-steps are applied one after another to the image.
X/Y Plotting:
- A dedicated X/Y Plot menu in the Text to Image and Image Edit tabs.
- Allows selecting up to two hyperparameters (X and Y axes) and providing comma-separated values for each.
- Generates an image for every combination of the selected parameter values.
- Includes Prompt S/R (Search/Replace) parameter for axes, akin to the same feature in Automatic1111's Stable Diffusion webui. Search for a string in the prompt and replace it with something else (separated by commas).
- Assembles the generated images into a single grid with axis labels indicating the parameter values used for each row/column.
Batch Image Understanding/Captioning:
- Adds an Input Mode button to the Image Understanding tab for switching between single image and batch processing.
- The Batch (Files/ZIP) mode accepts multiple image files or a single ZIP file containing images.
- Processes each image file in the batch sequentially, saving the understanding result for each into a corresponding .txt file.
- Provides a downloadable ZIP file containing all processed images and their generated .txt files.

⚙️ Local Installation

(A Jupyter Notebook 'BagelUI-colab.ipynb' for easy cloud-use is also provided, L4 GPU is enough to run DFloat11)

Clone this fork:

git clone https://github.com/dasjoms/BagelUI.git
cd BagelUI

Set up environment:

conda create -n bagel python=3.11.12 -y
conda activate bagel
pip install -r requirements.txt
pip install flash_attn==2.5.8 --no-build-isolation

2.5 Manually install dfloat11 package

  pip install dfloat11

Download pretrained checkpoint and/or DFloat11 compressed model:

### Regular BAGEL-7B Model

from huggingface_hub import snapshot_download

save_dir = "models/BAGEL-7B-MoT"
repo_id = "ByteDance-Seed/BAGEL-7B-MoT"
cache_dir = save_dir + "/cache"

snapshot_download(cache_dir=cache_dir,
 local_dir=save_dir,
 repo_id=repo_id,
 local_dir_use_symlinks=False,
 resume_download=True,
 allow_patterns=["*.json", "*.safetensors", "*.bin", "*.py", "*.md", "*.txt"],
)

### DFloat11 Model - Allows for 24GB VRAM Single GPU inference without quality loss

from huggingface_hub import snapshot_download

save_dir = "models/BAGEL-7B-MoT-DF11"
repo_id = "DFloat11/BAGEL-7B-MoT-DF11"
cache_dir = save_dir + "/cache"

snapshot_download(cache_dir=cache_dir,
 local_dir=save_dir,
 repo_id=repo_id,
 local_dir_use_symlinks=False,
 resume_download=True,
 allow_patterns=["*.json", "*.safetensors", "*.bin", "*.py", "*.md", "*.txt", "*.model", "vae/*"],
)

▶️ Usage

Run the modified Gradio WebUI script:

# For 32GB+ VRAM GPU or multi GPUs. Saves output to ./output/
python app.py

# To specify a different output directory
python app.py --output_dir /path/to/your/output

# For 12~32GB VRAM GPU/NF4 quantization and Chinese UI
python app.py --mode 2 --zh

# Different Requirements apply for using the DFloat11 Model

❤️ Based on the Original BAGEL Project

This work is based on the amazing BAGEL project. Please refer to the original repository for core model details, training guidelines, and benchmarks.

✍️ Citation

@article{deng2025bagel,
  title   = {Emerging Properties in Unified Multimodal Pretraining},
  author  = {Deng, Chaorui and Zhu, Deyao and Li, Kunchang and Gou, Chenhui and Li, Feng and Wang, Zeyu and Zhong, Shu and Yu, Weihao and Nie, Xiaonan and Song, Ziang and Shi, Guang and Fan, Haoqi},
  journal = {arXiv preprint arXiv:2505.14683},
  year    = {2025}
}

📜 License

BAGEL is licensed under the Apache 2.0.

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
assets		assets
data		data
eval		eval
modeling		modeling
scripts		scripts
test_images		test_images
train		train
.gitignore		.gitignore
BagelUI-Colab.ipynb		BagelUI-Colab.ipynb
EVAL.md		EVAL.md
LICENSE		LICENSE
README.md		README.md
TRAIN.md		TRAIN.md
app.py		app.py
inference.ipynb		inference.ipynb
inferencer.py		inferencer.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

BAGEL Gradio UI Fork with Extended Features

Latest Update (June 03, 2025)

✨ Added Features

⚙️ Local Installation

▶️ Usage

❤️ Based on the Original BAGEL Project

✍️ Citation

📜 License

About

Uh oh!

Releases

Packages

Languages

License

dasjoms/BagelUI

Folders and files

Latest commit

History

Repository files navigation

BAGEL Gradio UI Fork with Extended Features

Latest Update (June 03, 2025)

✨ Added Features

⚙️ Local Installation

▶️ Usage

❤️ Based on the Original BAGEL Project

✍️ Citation

📜 License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages