Description
🐛 Describe the bug
Hi,
I'm trying to use torchdynamo with onnxrt (both the onnxrt_cuda and onnxrt_cpu backends), following the ResNet example described in https://pytorch.org/tutorials/intermediate/dynamo_tutorial.html, but I'm hitting the following error:
torch._dynamo.exc.BackendCompilerFailed: onnxrt_cuda raised Exception: Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten.convolution.default(*(FakeTensor(FakeTensor(..., device='meta', size=(16, 3, 128, 128)), cuda:0), Parameter containing: tensor([[[[...]]]], device='cuda:0', requires_grad=True),
I've tried many examples from torchbench and they all hit the same error. It does not happen with other backends, e.g. aot_cudagraphs. I've also tried force-setting the FakeTensorMode in fake_tensor.py, but that hit another error further down the line. Am I missing something obvious here?
Torch version: 2.0.0a0+gitdf46ba4
onnxrt version: 1.13.1
I suspect something obvious is missing here, so I wanted to ask before digging into this in detail.
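For reference, this is roughly the kind of workaround I tried, sketched here against the public FakeTensorMode API rather than by editing fake_tensor.py directly (where exactly the onnxrt backend builds its FakeTensorMode is an assumption on my part):

import torch
from torch._subclasses.fake_tensor import FakeTensorMode

# Sketch only: either construct the mode with allow_non_fake_inputs=True, as the
# error message suggests, or wrap the real CUDA inputs as FakeTensors up front.
fake_mode = FakeTensorMode(allow_non_fake_inputs=True)
real_input = torch.randn(16, 3, 128, 128, device="cuda")
fake_input = fake_mode.from_tensor(real_input)  # real tensor -> FakeTensor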
Error logs
torch._dynamo.exc.BackendCompilerFailed: onnxrt_cuda raised Exception: Please convert all Tensors to FakeTensors first or instantiate FakeTensorMode with 'allow_non_fake_inputs'. Found in aten.convolution.default(*(FakeTensor(FakeTensor(..., device='meta', size=(16, 3, 128, 128)), cuda:0), Parameter containing: tensor([[[[...]]]], device='cuda:0', requires_grad=True),
Minified repro
import torch
import torch._dynamo as dynamo

# Returns the result of running `fn()` and the time it took for `fn()` to run,
# in seconds. We use CUDA events and synchronization for the most accurate
# measurements.
def timed(fn):
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    result = fn()
    end.record()
    torch.cuda.synchronize()
    return result, start.elapsed_time(end) / 1000

# Generates random input and targets data for the model, where `b` is
# batch size.
def generate_data(b):
    return (
        torch.randn(b, 3, 128, 128).to(torch.float32).cuda(),
        torch.randint(1000, (b,)).cuda(),
    )

from torchvision.models import resnet18

def init_model():
    return resnet18().to(torch.float32).cuda()

def eval(mod, inp):
    return mod(inp)

torch._dynamo.config.verbose = True
model = init_model()
eval_opt = dynamo.optimize("onnxrt_cuda")(eval)
inp = generate_data(16)[0]
print("eager:", timed(lambda: eval(model, inp))[1])
print("dynamo:", timed(lambda: eval_opt(model, inp))[1])
cc @ezyang @msaroufim @wconstab @bdhirsh @anijain2305 @zou3519 @Chillee @samdow @kshitij12345 @janeyx99 @soumith @ngimel