Add OpenVINO backend for torch.compile node by openvino-dev-samples · Pull Request #6638 · comfyanonymous/ComfyUI · GitHub

Add OpenVINO backend for torch.compile node #6638


Open · wants to merge 8 commits into master

Conversation

@openvino-dev-samples commented Jan 29, 2025

Features description

  • Support both .safetensors models and LoRA weights with the OpenVINO runtime.
  • Support all Intel CPUs, GPUs, and NPUs.

image

Installation

To enable this integration, you only need to install the OpenVINO runtime in advance:

pip install openvino
python3 main.py --cpu --use-pytorch-cross-attention
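
To sanity-check the installation and see which devices OpenVINO can target on your machine, you can use the standard openvino Python API (a minimal sketch; the reported device names depend on your drivers):

import openvino as ov

# Lists the devices OpenVINO detects on this system, e.g. ['CPU', 'GPU', 'NPU'].
print(ov.Core().available_devices)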

Test cases

  • model without lora node -> lora node enabled -> lora node disabled
    # current issue: /
  • model with lora node -> lora node disabled -> lora node enabled
    # current issue: it needs 2 warm-up model compilations (generation is slow) before fast generation with the LoRA node

Screenshots

The TorchCompileModel output has to be connected to the input of KSampler.
image

The following model has to be selected as the checkpoint.
image

#2473

@simonlui (Contributor)

Questions, as an Intel Arc owner who has contributed to the repository.

1.) I have used both the Triton (inductor) and OpenVINO backends via custom nodes, and Triton is faster in both compilation time and inference speed in my testing. How do these backends differ, and how do you plan to support them? The benefit of adding OpenVINO seems minimal to me outside of supporting NPU devices.

2.) Are there sensible errors that pop up if you don't have a suitable OpenVINO GPU or NPU and try to run this node, and ways to diagnose and solve them if users run into them? This can be an issue: both device types, but especially NPU, currently require drivers to function properly, and I can't even get my LNL laptop to use the NPU on Linux at this time, so I also have questions about maturity.

@openvino-dev-samples (Author) commented Feb 1, 2025


Hi @simonlui, thanks a lot for your quick feedback.

  1. I believe the benefit of adding OpenVINO is to enable inference on Intel GPUs and NPUs (a minimal sketch follows below).
  2. You are right, so I added a method that first detects the devices supported by OpenVINO on the system and lists them in the UI. As for how to verify the hardware installation, I will update the documentation to cover this once the approach in this PR is accepted. You can also find this information on OpenVINO's documentation site: https://docs.openvino.ai/2024/get-started/configurations.html
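
For context, the node is built on PyTorch's torch.compile with OpenVINO's dynamo backend. A minimal standalone sketch of the same idea (not the PR's code; the toy nn.Linear model and the "GPU" device string are only examples, and the device must be one reported by ov.Core().available_devices):

import torch
import torch.nn as nn
import openvino.torch  # registers the "openvino" backend for torch.compile

model = nn.Linear(16, 16)

# The "device" option selects the OpenVINO target, e.g. "CPU", "GPU", or "NPU".
compiled = torch.compile(model, backend="openvino", options={"device": "GPU"})
print(compiled(torch.randn(1, 16)).shape)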

Panchovix added a commit to Panchovix/stable-diffusion-webui-reForge that referenced this pull request Feb 9, 2025
Finally.
Thanks to comfyanonymous/ComfyUI#6638 to use as a guide for add_patches function
@Panchovix commented Feb 9, 2025

Hi there, just passing by and wanted to say: many, many thanks! With your add_patches modifications (and others) I finally managed to make LoRAs work with torch.compile. Really appreciated!

@Panchovix commented Feb 9, 2025

Sorry for the double post, but I'm wondering: does loading a LoRA, then disabling it, and then enabling it again work fine for you?

Maybe some unpatching or recompiling is needed?

I think on the first inference with a LoRA, it will patch the keys before compiling, and it will work.

If you then disable and re-enable the LoRA, it will compile without the LoRA and add some _orig_mod prefixes to the keys, so when trying to apply the LoRA keys again on a third inference to the compiled model, the keys won't match and the LoRA won't load.

Correct me if I'm wrong, though.
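
For reference, the _orig_mod prefix behaviour described above can be reproduced with plain torch.compile outside of ComfyUI; a minimal sketch:

import torch
import torch.nn as nn

m = nn.Linear(4, 4)
print(list(m.state_dict().keys()))         # ['weight', 'bias']

# torch.compile wraps the module in an OptimizedModule, so every state_dict
# key gains an '_orig_mod.' prefix -- previously recorded keys no longer match.
compiled = torch.compile(m)
print(list(compiled.state_dict().keys()))  # ['_orig_mod.weight', '_orig_mod.bias']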

@openvino-dev-samples (Author) commented Feb 10, 2025


Hi, when your workflow starts from a checkpoint without LoRA, everything works. However, if it starts from a checkpoint with LoRA, enabling and disabling the LoRA does not work. That is:

  • model without lora -> lora enabled -> lora disabled
  • model with lora -> lora disabled -> lora enabled

In the second case, my new patch will not be triggered, so I believe it is a general issue for the torch.compile node, and I will do further investigation.

@openvino-dev-samples (Author)


I have updated the PR; however, it may need 2 warm-up inferences for the first generation with LoRA weights.

@openvino-dev-samples (Author)

Hi @comfyanonymous, could you help review?

@openvino-dev-samples (Author)

Hi @NineMeowICT, could you help check whether this PR is ready to be merged? Thanks.

@NineMeowICT

@openvino-dev-samples I'm sorry for the misunderstanding. I'm not one of ComfyUI's contributors and I don't have write access; my approval was simply to recognize your efforts.

@raymondlo84

How can we get this merged?

@Sen-sou commented Apr 8, 2025

@openvino-dev-samples Appreciate you working on implementing this. I'm currently using your fork, but it doesn't seem to be using the GPU for inference even though I've selected GPU as the device. Is there any specific OpenVINO version that I need to use? I don't think it's compiling anything and is just skipping the block.
Thank you

@openvino-dev-samples (Author)


Hi, could you share a screenshot of your workflow (node graph)? BTW, you can try the following command to update OpenVINO:
pip install --pre -U openvino --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly

@openvino-dev-samples (Author)


> Thanks for replying. I updated the OpenVINO version and it worked once: it used the GPU for inference, but after that it switched back to the CPU because I changed the sampler. I restarted a couple of times and it is still using the CPU. Do I have to trigger a recompile for the GPU to work?
>
> Here's the screenshot you asked for: workflow
>
> Thank you.

I don't think a change to the Sampler will trigger recompilation for the GPU, but I will try to reproduce your case. Thanks for sharing.

@Sen-sou commented Apr 9, 2025


Thank you for looking, but I resolved the problem. It was low memory that was causing it to fall back to the CPU; increasing memory resolved this. Thanks a lot anyway.

Edit: Changing the image size made it switch back to the CPU. An image size of 512x512 works fine with the GPU, but changing it to 768x768 causes it to use the CPU. Changing to any other image size causes it to use the CPU.

@Sen-sou commented Apr 9, 2025


To fix this, I had to switch the OpenVINO device to CPU and queue, then switch back to GPU and queue.
I think changing the image size is not triggering a model recompile.

@openvino-dev-samples (Author)


torch.compile will recompile the PyTorch model once the model changes, so I guess changing the image size updates the original PyTorch model's network.
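
For context, torch.compile guards on properties of its inputs, so with static-shape compilation a new latent/image size does force a recompile. A minimal sketch outside of ComfyUI (the counting_backend helper, the scale function, and the shapes are illustrative only; dynamic=False is assumed here to make the shape specialization explicit, and ComfyUI's wrapper may use different settings):

import torch

compile_count = 0

def counting_backend(gm, example_inputs):
    # A trivial dynamo backend that only counts how often compilation runs
    # and otherwise runs the captured graph unchanged.
    global compile_count
    compile_count += 1
    return gm.forward

@torch.compile(backend=counting_backend, dynamic=False)
def scale(x):
    return x * 2

scale(torch.randn(1, 4, 64, 64))   # first shape -> compilation #1
scale(torch.randn(1, 4, 96, 96))   # new shape   -> compilation #2 (recompile)
print(compile_count)               # 2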

@Kosinkadink (Collaborator) commented May 21, 2025

Hey, sorry that this did not get an official review for so long!

For the LoRA fix for torch.compile: even with the keys workaround, there was indeed a fundamental issue with the way torch.compile was implemented in ComfyUI via object_patches, with anything relating to keys still being broken while the model was loaded. I created a PR that just got merged and fixes this properly: #8213. torch.compile is now managed by a set_torch_compile_wrapper function in comfy_api.torch_helpers and needs no workarounds in ModelPatcher code. I also exposed a bunch of torch.compile kwargs on that new set_torch_compile_wrapper function, so that an OpenVINO implementation can pass in the 'options' param.

For the OpenVINO portion of the PR, the openvino dependency should be managed in a standardized manner; the cleanest way to do that is to implement the Torch Compile OpenVINO node in a custom node pack. This will ensure that any time a workflow uses the OpenVINO node, users are able to acquire the dependencies via ComfyUI-Manager. Creating/publishing the custom node pack is something that can be done on your end, so that you will be free to make any edits in the future.

Here are the docs for publishing a custom node to registry: https://docs.comfy.org/registry/publishing#publishing-nodes

Since the review was so delayed, I went ahead and wrote the core of the custom node code that would achieve the goal of this PR, feel free to use it/reference it:

import torch
import openvino as ov
import openvino.frontend.pytorch.torchdynamo.execute as ov_ex

from comfy_api.torch_helpers import set_torch_compile_wrapper


class TorchCompileModelOpenVINO:
    @classmethod
    def INPUT_TYPES(s):
        core = ov.Core()
        available_devices = core.available_devices

        return {
            "required": {
                "model": ("MODEL",),
                "backend": (["openvino"],),
                "openvino_device": (available_devices,),
            },
        }

    RETURN_TYPES = ("MODEL",)
    FUNCTION = "patch"

    CATEGORY = "_for_testing"
    EXPERIMENTAL = True

    def patch(self, model, backend, openvino_device):
        print(model.__class__.__name__)
        if backend == "openvino":
            options = {"device": openvino_device}
            # Reset dynamo and clear OpenVINO's torch.compile caches so the
            # selected device takes effect on the next compilation.
            torch._dynamo.reset()
            ov_ex.compiled_cache.clear()
            ov_ex.req_cache.clear()
            ov_ex.partitioned_modules.clear()
        else:
            options = None
        m = model.clone()
        set_torch_compile_wrapper(m,
                                  backend=backend,
                                  options=options,
                                  )
        return (m,)

# The node ID in NODE_CLASS_MAPPINGS should be globally unique across ComfyUI ecosystem
NODE_CLASS_MAPPINGS = {
    "OpenVINO_TorchCompileModel": TorchCompileModelOpenVINO,
}

Let me know if you have any questions or concerns, I'll do my best to answer promptly!
