Thanks to LLaVA.

If you are not using Linux, do NOT proceed.
- Clone this repository and navigate to the llava_image_tagger folder

```bash
git clone https://github.com/liushuchun/llava_image_tagger.git
cd llava_image_tagger
```
- Install Package

```bash
conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
```
- Install additional packages for training cases

```bash
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
```
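After installing, you can optionally run a quick sanity check from Python that the editable install worked and CUDA is visible. This check is not part of the upstream instructions; it only assumes the `llava` package installed above and its PyTorch dependency.

```python
# optional sanity check: the package should import and CUDA should be visible
import llava
import torch

print("llava OK, CUDA available:", torch.cuda.is_available())
```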
- Upgrade to the latest code base

```bash
git pull
pip install -e .

# if you see some import errors when you upgrade,
# please try running the command below (without #)
# pip install flash-attn --no-build-isolation --no-cache-dir
```
Example Code
```python
from llava.model.builder import load_pretrained_model
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "liuhaotian/llava-v1.5-7b"

tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,
    model_base=None,
    model_name=get_model_name_from_path(model_path)
)
```
Check out the details of the `load_pretrained_model` function in `llava/model/builder.py`.
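For reference, here is a minimal sketch of running inference manually with the objects returned above, mirroring what `eval_model` does internally. The `llava_v1` conversation template, the `process_images`/`tokenizer_image_token` helpers from `llava.mm_utils`, the local image path, and the float16 cast are assumptions based on upstream LLaVA and may differ in this fork.

```python
import torch
from PIL import Image

from llava.constants import IMAGE_TOKEN_INDEX, DEFAULT_IMAGE_TOKEN
from llava.conversation import conv_templates
from llava.mm_utils import process_images, tokenizer_image_token

# build a single-turn prompt that contains the image placeholder token
conv = conv_templates["llava_v1"].copy()  # assumed template for v1.5 checkpoints
conv.append_message(conv.roles[0], DEFAULT_IMAGE_TOKEN + "\nWhat is shown in this image?")
conv.append_message(conv.roles[1], None)
prompt = conv.get_prompt()

# preprocess the image and tokenize the prompt (image token expanded by the model)
image = Image.open("view.jpg").convert("RGB")  # any local image file
image_tensor = process_images([image], image_processor, model.config).to(
    model.device, dtype=torch.float16
)
input_ids = tokenizer_image_token(
    prompt, tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt"
).unsqueeze(0).to(model.device)

with torch.inference_mode():
    output_ids = model.generate(
        input_ids, images=image_tensor, do_sample=False, max_new_tokens=512
    )

print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0].strip())
```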
You can also use the `eval_model` function in `llava/eval/run_llava.py` to get the output easily. By doing so, you can use this code on Colab directly after downloading this repository.
```python
model_path = "liuhaotian/llava-v1.5-7b"
prompt = "What are the things I should be cautious about when I visit here?"
image_file = "https://llava-vl.github.io/static/images/view.jpg"

args = type('Args', (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": prompt,
    "conv_mode": None,
    "image_file": image_file,
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512
})()

eval_model(args)
```
Please check out our Model Zoo for all public LLaVA checkpoints, and the instructions on how to use the weights.
```bash
python -m llava.serve.cli --model-path /media/shuchun/data/models/llava_v1.6 --image-dir /media/shuchun/data --recursive --load-4bit --device cuda
```
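If you would rather drive the same batch tagging from Python instead of the CLI, a minimal sketch follows. It reuses the `eval_model` call shown earlier; the directory path, the tagging prompt, and the `tag_directory` helper are hypothetical and not part of this repository. Note that `eval_model` reloads the model on every call, so this sketch is illustrative rather than efficient.

```python
from pathlib import Path

from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "liuhaotian/llava-v1.5-7b"
prompt = "Describe this image with a short list of tags."

def tag_directory(image_dir):
    # walk the directory recursively and run the model on each image file
    for image_file in sorted(Path(image_dir).rglob("*")):
        if image_file.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue
        args = type('Args', (), {
            "model_path": model_path,
            "model_base": None,
            "model_name": get_model_name_from_path(model_path),
            "query": prompt,
            "conv_mode": None,
            "image_file": str(image_file),
            "sep": ",",
            "temperature": 0,
            "top_p": None,
            "num_beams": 1,
            "max_new_tokens": 512
        })()
        eval_model(args)  # prints the model's answer for this image

tag_directory("/media/shuchun/data")
```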