liushuchun/llava_image_tagger

🌋 LLaVA Image tagger

Thanks to the LLaVA project.

Install

If you are not using Linux, do NOT proceed

  1. Clone this repository and navigate to the llava_image_tagger folder
git clone https://github.com/liushuchun/llava_image_tagger.git
cd llava_image_tagger
  2. Install the package
conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
  3. Install additional packages for training cases
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
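
After the install steps, it can help to confirm that the packages are importable from the active conda environment. The following is a minimal sketch, not part of this repository: `check_installed` is a hypothetical helper, and the module names `llava` and `flash_attn` are assumed to be the import names of the packages installed above.

```python
import importlib.util


def check_installed(module_names):
    """Return {module_name: bool}, True if the top-level module can be located."""
    # find_spec looks a module up without importing it, so this stays fast
    # and does not initialize any GPU libraries.
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


if __name__ == "__main__":
    # Assumed import names: "llava" (this package) and "flash_attn" (flash-attn).
    for name, found in check_installed(["llava", "flash_attn"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If `flash_attn` is reported missing but you only need inference, that may be acceptable, since the steps above install it for training cases.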

Upgrade to the latest code base

git pull
pip install -e .

# if you see some import errors when you upgrade,
# please try running the command below (without #)
# pip install flash-attn --no-build-isolation --no-cache-dir

Quick Start With HuggingFace

Example Code
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "liuhaotian/llava-v1.5-7b"

tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=get_model_name_from_path(model_path)
)

For details, see the load_pretrained_model function in llava/model/builder.py.

You can also use the eval_model function in llava/eval/run_llava.py to get the output easily. By doing so, you can use this code on Colab directly after downloading this repository.

model_path = "liuhaotian/llava-v1.5-7b"
prompt = "What are the things I should be cautious about when I visit here?"
image_file = "https://llava-vl.github.io/static/images/view.jpg"

args = type('Args', (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": prompt,
    "conv_mode": None,
    "image_file": image_file,
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512
})()

eval_model(args)
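
The `type('Args', (), {...})()` call above builds a throwaway class just to get an object with attributes. A possibly more readable alternative is `types.SimpleNamespace` from the standard library. The sketch below assumes `eval_model` only reads and sets plain attributes on its argument; `make_eval_args` is a hypothetical helper, and `model_name` is passed in explicitly so the snippet runs without LLaVA installed.

```python
from types import SimpleNamespace


def make_eval_args(model_path, model_name, query, image_file):
    """Build an attribute-style args object with the same fields as the example above."""
    return SimpleNamespace(
        model_path=model_path,
        model_base=None,
        model_name=model_name,
        query=query,
        conv_mode=None,
        image_file=image_file,
        sep=",",
        temperature=0,   # 0 is the value used in the example above
        top_p=None,
        num_beams=1,
        max_new_tokens=512,
    )


args = make_eval_args(
    model_path="liuhaotian/llava-v1.5-7b",
    model_name="llava-v1.5-7b",  # in practice: get_model_name_from_path(model_path)
    query="What are the things I should be cautious about when I visit here?",
    image_file="https://llava-vl.github.io/static/images/view.jpg",
)
```

The resulting `args` object could then be passed to `eval_model(args)` in place of the anonymous class instance.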

LLaVA Weights

Please check out our Model Zoo for all public LLaVA checkpoints and instructions on how to use the weights.

Demo

CLI Inference

python -m llava.serve.cli --model-path /media/shuchun/data/models/llava_v1.6 --image-dir /media/shuchun/data --recursive --load-4bit --device cuda
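
To illustrate what `--image-dir` combined with `--recursive` plausibly does, here is a minimal sketch of collecting image files from a directory tree. It is an assumption about the behavior, not the repository's actual code: `list_images` and the `IMAGE_SUFFIXES` set are hypothetical names chosen for this example.

```python
from pathlib import Path

# Hypothetical set of extensions treated as images in this sketch.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def list_images(image_dir, recursive=False):
    """Return sorted image paths under image_dir, descending into subfolders if recursive."""
    root = Path(image_dir)
    # rglob("*") walks the whole tree; glob("*") only lists direct children.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

For example, `list_images("/media/shuchun/data", recursive=True)` would return every matching file below that folder, while omitting `recursive=True` would only return files directly inside it.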
