Thanks to LLaVA.

If you are not using Linux, do NOT proceed.
- Clone this repository and navigate to the llava_image_tagger folder

```bash
git clone https://github.com/liushuchun/llava_image_tagger.git
cd llava_image_tagger
```
- Install Package

```bash
conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
```
- Install additional packages for training cases

```bash
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
```
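After installing, you can optionally run a quick sanity check from Python that the editable install worked and CUDA is visible. This check is not part of the upstream instructions; it only assumes the `llava` package installed above and its PyTorch dependency.

```python
# optional sanity check: the package should import and CUDA should be visible
import llava
import torch

print("llava OK, CUDA available:", torch.cuda.is_available())
```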
- Upgrade to the latest code base

```bash
git pull
pip install -e .

# if you see some import errors when you upgrade,
# please try running the command below (without #)
# pip install flash-attn --no-build-isolation --no-cache-dir
```
Example Code
```python
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "liuhaotian/llava-v1.5-7b"

tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=get_model_name_from_path(model_path)
)
```
Check out the details of the `load_pretrained_model` function in `llava/model/builder.py`.
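For reference, here is a minimal sketch of running inference manually with the objects returned above, mirroring what `eval_model` does internally. The `llava_v1` conversation template, the `process_images`/`tokenizer_image_token` helpers from `llava.mm_utils`, the local image path, and the float16 cast are assumptions based on upstream LLaVA and may differ in this fork.

```python
import torch
from PIL import Image

from llava.constants import IMAGE_TOKEN_INDEX, DEFAULT_IMAGE_TOKEN
from llava.conversation import conv_templates
from llava.mm_utils import process_images, tokenizer_image_token

# build a single-turn prompt that contains the image placeholder token
conv = conv_templates["llava_v1"].copy()  # assumed template for v1.5 checkpoints
conv.append_message(conv.roles[0], DEFAULT_IMAGE_TOKEN + "\nWhat is shown in this image?")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()

# preprocess the image and tokenize the prompt (image token expanded by the model)
image = Image.open("view.jpg").convert("RGB")  # any local image file
image_tensor = process_images([image], image_processor, model.config).to(
    model.device, dtype=torch.float16
)
input_ids = tokenizer_image_token(
    prompt, tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt"
).unsqueeze(0).to(model.device)

with torch.inference_mode():
    output_ids = model.generate(
        input_ids, images=image_tensor, do_sample=False, max_new_tokens=512
    )

print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0].strip())
```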
You can also use the `eval_model` function in `llava/eval/run_llava.py` to get the output easily. By doing so, you can use this code on Colab directly after downloading this repository.
```python
model_path = "liuhaotian/llava-v1.5-7b"
prompt = "What are the things I should be cautious about when I visit here?"
image_file = "https://llava-vl.github.io/static/images/view.jpg"

args = type('Args', (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": prompt,
    "conv_mode": None,
    "image_file": image_file,
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512
})()

eval_model(args)
```
Please check out our Model Zoo for all public LLaVA checkpoints, and the instructions on how to use the weights.
```bash
python -m llava.serve.cli --model-path /media/shuchun/data/models/llava_v1.6 --image-dir /media/shuchun/data --recursive --load-4bit --device cuda
```
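If you would rather drive the same batch tagging from Python instead of the CLI, a minimal sketch follows. It reuses the `eval_model` call shown earlier; the directory path, the tagging prompt, and the `tag_directory` helper are hypothetical and not part of this repository. Note that `eval_model` reloads the model on every call, so this sketch is illustrative rather than efficient.

```python
from pathlib import Path

from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "liuhaotian/llava-v1.5-7b"
prompt = "Describe this image with a short list of tags."

def tag_directory(image_dir):
    # walk the directory recursively and run the model on each image file
    for image_file in sorted(Path(image_dir).rglob("*")):
        if image_file.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        args = type('Args', (), {
            "model_path": model_path,
            "model_base": None,
            "model_name": get_model_name_from_path(model_path),
            "query": prompt,
            "conv_mode": None,
            "image_file": str(image_file),
            "sep": ",",
            "temperature": 0,
            "top_p": None,
            "num_beams": 1,
            "max_new_tokens": 512
        })()
        eval_model(args)  # prints the model's answer for this image

tag_directory("/media/shuchun/data")
```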