🚀 The feature, motivation and pitch
Environment (Mac M2):
Python 3.10
torch 2.1.0.dev20230717
torchaudio 2.1.0.dev20230717
torchvision 0.15.2a0
I want to use the Adam optimizer to train my model, but I got this error:
NotImplementedError: The operator 'aten::lerp.Scalar_out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS
When I set PYTORCH_ENABLE_MPS_FALLBACK=1, training is still quite a bit slower than I would expect.
I'm testing a tiny ViT model on the MNIST dataset. The details are as follows:
M2 chip, CPU, Adam: about 2.4 minutes per epoch.
M2 chip, GPU, Adam (with PYTORCH_ENABLE_MPS_FALLBACK=1): about 2.0 minutes per epoch.
M2 chip, GPU, SGD: about 30 seconds per epoch.
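For context, here is a minimal sketch of how the error is triggered and how the fallback is enabled. The model and tensor sizes are illustrative placeholders, not my actual ViT setup; the key points are that the environment variable must be set before `torch` is imported, and that it is Adam's `optimizer.step()` that hits `aten::lerp.Scalar_out` on MPS:

```python
import os

# Assumption: the fallback flag must be set before importing torch.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

# Use MPS when available, otherwise fall back to CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Placeholder model standing in for the tiny ViT.
model = torch.nn.Linear(4, 2).to(device)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(8, 4, device=device)
loss = model(x).sum()
loss.backward()
opt.step()  # on MPS, Adam's internal lerp is what hits aten::lerp.Scalar_out
```

With SGD the `optimizer.step()` avoids the lerp op entirely, which matches the much faster GPU timing above.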
Alternatives
No response
Additional context
No response
cc @vincentqb @jbschlosser @albanD @janeyx99 @crcrpar @kulinseth @malfet @DenisVieriu97 @razarmehr @abhudev @ezyang @gchanan @zou3519