Description
🐛 Describe the bug
Hi,
I'm trying to use torchdynamo with onnxrt (both the onnxrt_cuda and onnxrt_cpu backends), following the ResNet example described in https://pytorch.org/tutorials/intermediate/dynamo_tutorial.html, but I'm hitting the following error:
torch._dynamo.exc.BackendCompilerFailed: onnxrt_cuda raised Exception: Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten.convolution.default(*(FakeTensor(FakeTensor(..., device='meta', size=(16, 3, 128, 128)), cuda:0), Parameter containing: tensor([[[[...]]]], device='cuda:0', requires_grad=True),
I've tried many examples from torchbench and they all hit the same error. It does not happen with other backends, e.g. aot_cudagraphs. I've also tried force-setting the FakeTensorMode in fake_tensor.py, but that hit another error further down the line. Am I missing something obvious here?
Torch version: 2.0.0a0+gitdf46ba4
onnxrt version: 1.13.1
I suspect something obvious is missing here, so I wanted to ask before digging into this in detail.
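For reference, this is roughly the kind of workaround I tried, sketched here against the public FakeTensorMode API rather than by editing fake_tensor.py directly (where exactly the onnxrt backend builds its FakeTensorMode is an assumption on my part):

import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# Sketch only: either construct the mode with allow_non_fake_inputs=True, as the
# error message suggests, or wrap the real CUDA inputs as FakeTensors up front.
fake_mode = FakeTensorMode(allow_non_fake_inputs=True)
real_input = torch.randn(16, 3, 128, 128, device="cuda")
fake_input = fake_mode.from_tensor(real_input)  # real tensor -> FakeTensor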
Error logs
torch._dynamo.exc.BackendCompilerFailed: onnxrt_cuda raised Exception: Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten.convolution.default(*(FakeTensor(FakeTensor(..., device='meta', size=(16, 3, 128, 128)), cuda:0), Parameter containing: tensor([[[[...]]]], device='cuda:0', requires_grad=True),
Minified repro
import torch
import torch._dynamo as dynamo

# Returns the result of running `fn()` and the time it took for `fn()` to run,
# in seconds. We use CUDA events and synchronization for the most accurate
# measurements.
def timed(fn):
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    result = fn()
    end.record()
    torch.cuda.synchronize()
    return result, start.elapsed_time(end) / 1000

# Generates random input and targets data for the model, where `b` is
# batch size.
def generate_data(b):
    return (
        torch.randn(b, 3, 128, 128).to(torch.float32).cuda(),
        torch.randint(1000, (b,)).cuda(),
    )

from torchvision.models import resnet18

def init_model():
    return resnet18().to(torch.float32).cuda()

def eval(mod, inp):
    return mod(inp)

torch._dynamo.config.verbose = True
model = init_model()
eval_opt = dynamo.optimize("onnxrt_cuda")(eval)
inp = generate_data(16)[0]
print("eager:", timed(lambda: eval(model, inp))[1])
print("dynamo:", timed(lambda: eval_opt(model, inp))[1])
cc @ezyang @msaroufim @wconstab @bdhirsh @anijain2305 @zou3519 @Chillee @samdow @kshitij12345 @janeyx99 @soumith @ngimel