🐛 Describe the bug
layer_norm triggers INTERNAL ASSERT with input requiring grad + zero-size int tensor
import torch

# Zero-size int64 input; weight and bias require grad.
input = torch.randint(0, 8, [0, 0, 1024], dtype=torch.int64)
normalized_shape = [1024]
eps = 1e-05
weight = torch.rand([1024], dtype=torch.float64, requires_grad=True)
bias = torch.rand([1024], dtype=torch.float64, requires_grad=True)
torch.nn.functional.layer_norm(input, normalized_shape, weight=weight, bias=bias, eps=eps)
# RuntimeError: isDifferentiableType(variable.scalar_type())INTERNAL ASSERT FAILED at "/Users/distiller/project/pytorch/torch/csrc/autograd/functions/utils.h":65, please report a bug to PyTorch
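A possible workaround (not part of the original report, a sketch only): cast the input to a floating dtype before calling layer_norm, since the kernel is only implemented for floating-point types.

import torch

input = torch.randint(0, 8, [0, 0, 1024], dtype=torch.int64)
weight = torch.rand([1024], dtype=torch.float64, requires_grad=True)
bias = torch.rand([1024], dtype=torch.float64, requires_grad=True)

# Workaround sketch: cast to the weight's floating dtype first.
out = torch.nn.functional.layer_norm(
    input.to(torch.float64), [1024], weight=weight, bias=bias, eps=1e-05
)
print(out.shape)  # torch.Size([0, 0, 1024])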
layer_norm does not check the dtype of zero-size tensors, like input in this example. If input is not zero-size, it instead raises RuntimeError: "LayerNormKernelImpl" not implemented for 'Long'.
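For contrast, a minimal sketch of that non-zero-size case (the non-empty shape here is chosen arbitrarily):

import torch

# Same call, but with a non-empty int64 input: the missing-kernel error
# is raised instead of the autograd INTERNAL ASSERT.
input = torch.randint(0, 8, [2, 3, 1024], dtype=torch.int64)
weight = torch.rand([1024], dtype=torch.float64, requires_grad=True)
bias = torch.rand([1024], dtype=torch.float64, requires_grad=True)
torch.nn.functional.layer_norm(input, [1024], weight=weight, bias=bias, eps=1e-05)
# RuntimeError: "LayerNormKernelImpl" not implemented for 'Long'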
Versions
PyTorch: 1.11.0
cc @ezyang @albanD @zou3519 @gqchen @pearu @nikitaved @soulitzer @lezcano @Varal7