Description
The PyTorch 0.4 Migration Guide simplifies writing device-agnostic code as follows:
```python
# at beginning of the script
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

...

# then whenever you get a new Tensor or Module
# this won't copy if they are already on the desired device
input = data.to(device)
model = MyModule(...).to(device)
```
However, this is still not clean.
Ideally, we would like PyTorch to move everything over to the GPU if it's available, much like TensorFlow does.
I tried setting the global tensor type to a CUDA tensor using the `torch.set_default_tensor_type()` method.
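Concretely, that attempt looks something like this (a minimal sketch):

```python
import torch

if torch.cuda.is_available():
    # All newly constructed floating-point tensors now live on the GPU
    torch.set_default_tensor_type(torch.cuda.FloatTensor)

x = torch.zeros(3)
print(x.type())  # torch.cuda.FloatTensor when CUDA is available
```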
However, there are some fundamental problems with setting the default tensor type.
- Dataloaders give normal (non-CUDA) tensors by default. They have to be manually cast using the `Tensor.to()` method.
- Many methods are simply not implemented for `torch.cuda.*Tensor`. Thus, setting the global tensor type to CUDA fails.
- Conversions to NumPy using the `numpy()` method aren't available for CUDA tensors. One has to go `x.cpu().numpy()` (a sketch follows this list). Although this chain is agnostic, it defeats the purpose.
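To illustrate the last point, this is the extra hop in practice (a minimal sketch):

```python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
x = torch.randn(3, device=device)

# On a CUDA tensor, x.numpy() raises a TypeError, so the device-agnostic
# spelling always needs the extra hop through host memory:
arr = x.cpu().numpy()
```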
I find that I use methods like `.to(device)` and `.cpu()` far too often in my projects. In my view, it makes the code more verbose than it needs to be and makes it just a little harder to read.
I think there is room for a global `use_gpu` flag that can enable developers to run the entire subsequent code on the GPU, where required.
Specifically, my request is the following:
1. Abolish the need for the `.to(device)` suffix:
Circumvent it by letting the developer set the device using a global method like `torch.set_default_device()` or a convenience method/flag like `use_gpu`. Then, whenever an error is encountered because a CUDA tensor is expected in place of a regular tensor (or vice versa), automatically cast the tensor to the expected device.
Additionally:
a. Move `nn.Module`s automatically to the default device.
b. Move the yield of `DataLoader`s to the default device, preventing the need to manually cast to it (a sketch of this boilerplate follows the list).
2. Add the `numpy()` method to CUDA tensors:
The existing way is to move the tensor to the CPU first. Thus, we have `x.cpu().numpy()`, which is agnostic but redundant.
3. Use GPU by default if available:
PyTorch is built from the ground up with the Deep Learning community in mind. With most Deep Learning done on GPUs, they should automatically be considered the default device. Let PyTorch give first preference to the GPU.
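To illustrate requests 1a and 1b, here is the kind of boilerplate they would remove. This is a minimal sketch; `move_batch` is my own helper, not a PyTorch API:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

def move_batch(batch, device):
    # Manual casting that request 1b would make automatic:
    # recursively move every tensor in a batch to the default device.
    if torch.is_tensor(batch):
        return batch.to(device)
    if isinstance(batch, (list, tuple)):
        return type(batch)(move_batch(b, device) for b in batch)
    return batch

model = nn.Linear(4, 2).to(device)   # request 1a would make .to(device) implicit
loader = DataLoader(TensorDataset(torch.randn(8, 4)), batch_size=4)

for (inputs,) in loader:
    inputs = move_batch(inputs, device)   # request 1b would make this cast implicit
    outputs = model(inputs)
```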