Description
import torch
import torch.nn as nn
from torch.autograd import Variable
lstm = nn.LSTM(2, 3, 1, bidirectional=True)
lstm.cuda()
for param in lstm.parameters():
    param.requires_grad = False  # NOTE: backward fails when this is False but succeeds when it is True
input = torch.ones(4, 5, 2)  # [T, b, i] = [seq_len, batch, input_size]
input = input.cuda()
input = Variable(input, requires_grad=True)
output, _ = lstm(input)
output.backward(torch.ones(output.size()).cuda())
When run on the CPU, or with requires_grad set to True, it succeeds.
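For reference, here is the same repro with the .cuda() calls removed; per the observation above, this CPU variant completes without error:

import torch
import torch.nn as nn
from torch.autograd import Variable

lstm = nn.LSTM(2, 3, 1, bidirectional=True)
for param in lstm.parameters():
    param.requires_grad = False  # frozen weights do not break the CPU backward

input = Variable(torch.ones(4, 5, 2), requires_grad=True)  # [T, b, i]
output, _ = lstm(input)
output.backward(torch.ones(output.size()))  # succeeds on the CPU path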
With requires_grad set to False on the GPU, however, it fails with the following error:
Traceback (most recent call last):
File "test_lstm.py", line 17, in <module>
output.backward(torch.ones(output.size()).cuda())
File "/home/jrmei/.local/lib/python2.7/site-packages/torch/autograd/variable.py", line 158, in backward
self._execution_engine.run_backward((self,), (gradient,), retain_variables)
RuntimeError: CudnnRNN returned an invalid number of gradient tensors (expected 11, but got 4)
It seems that the gradient-count check performed by the framework in the CuDNN RNN backward has a bug.
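If the problem really is specific to the CuDNN RNN backward, one possible workaround (an untested sketch, not a confirmed fix) is to disable CuDNN so that nn.LSTM falls back to the non-CuDNN implementation:

import torch
import torch.nn as nn
from torch.autograd import Variable

# Assumption: the bug lives in the CuDNN RNN backward, so forcing the
# fallback implementation should avoid it.
torch.backends.cudnn.enabled = False

lstm = nn.LSTM(2, 3, 1, bidirectional=True).cuda()
for param in lstm.parameters():
    param.requires_grad = False  # keep the weights frozen as in the repro

input = Variable(torch.ones(4, 5, 2).cuda(), requires_grad=True)
output, _ = lstm(input)
output.backward(torch.ones(output.size()).cuda())

This gives up CuDNN's speed, so it would only be a stopgap until the check itself is fixed.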