CPU version #2
The code is hardcoded to CUDA because the CPU would be ridiculously slow with this network. However, it is very easy to fix: all CudaTensors just have to be substituted with FloatTensors.
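For example, the changes in train.lua would look roughly like this (a sketch; the variable names are assumed from the repo and may differ in your copy):

```lua
-- GPU version (original):
--   model = model:cuda()
--   local inputs = torch.CudaTensor(opt.batchSize, 3, 32, 32)

-- CPU version: cast to float instead
model = model:float()
local inputs = torch.FloatTensor(opt.batchSize, 3, 32, 32)
```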
I have the same problem. I deleted :cuda() from train.lua and :ceil() from provider.lua. I also changed CudaTensors to FloatTensors in train.lua. Still the same error. Is there anything else I could do?
I had the same problem and finally solved it by casting the model and criterion to float:
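Something like:

```lua
model = model:float()
criterion = criterion:float()
```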
I did not delete the :ceil() in provider.lua under CPU, but it still works.
The issue was finally solved by this commit: torch/nn@738057e
You mention that this is slow on a CPU and report step times of 70 ms to 500 ms for the CUDA GPU hardware in the blog: http://torch.ch/blog/2015/07/30/cifar.html. I did a CPU test on a MacBook Pro with a 2.9 GHz i5 and an Intel Iris 6400 graphics card, using lua instead of luajit and the default Apple clang compiler, so no OpenMP. The reported CPU load on macOS is approx. 150% to 180%. I started the model with
The log looks like this:
So I have a step time of approx. 8600 ms... I played around with torch-cl, which reduced the CPU load, but the fan kept blowing out a lot of heat, so I guess the Iris card was really doing something; still, the step time only dropped slightly, to about 7500 ms. I also tried a gcc5 build with OpenMP, but since I installed the libraries via MacPorts, I suspect that the OpenBLAS library there is not compiled with OpenMP. The resulting step time was also around 7500 ms, although the OpenMP gcc5 build did increase the CPU load. So maybe I had better rent some time on Amazon EC2...
This is a test on an Amazon EC2 g2.2xlarge, but using only the CPU, i.e. not the GPU
So from the MacBook Pro i5 at approx. 8 s per step, I went up to 31 s per step... That is about 4 times slower.
And now on the Amazon EC2 g2.2xlarge with
The step time is 1676 ms! It is getting better!
This looks good. How do you run this on a CPU? Is it hardcoded to run only on (NVIDIA) GPUs? When I tried removing the references to CUDA and running on the CPU, I get an error saying:
/home/sanoob/torch/install/bin/luajit: ...ob/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:100: bad argument #1 (field finput is not a torch.FloatTensor)
stack traceback:
[C]: in function 'SpatialConvolutionMM_updateOutput'
...ob/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:100: in function 'updateOutput'
/home/sanoob/torch/install/share/lua/5.1/nn/Sequential.lua:39: in function 'updateOutput'
/home/sanoob/torch/install/share/lua/5.1/nn/Sequential.lua:39: in function 'updateOutput'
/home/sanoob/torch/install/share/lua/5.1/nn/Sequential.lua:39: in function 'forward'
train-cpu.lua:106: in function 'opfunc'
/home/sanoob/torch/install/share/lua/5.1/optim/sgd.lua:43: in function 'sgd'
train-cpu.lua:115: in function 'train'
train-cpu.lua:190: in main chunk
Should we declare inputs as FloatTensor too?
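Yes: the tensors fed to a float network have to be FloatTensors as well, and with an up-to-date torch/nn (see the commit linked above) casting the model itself also converts internal buffers such as finput, which is what this error complains about. A minimal sketch of a fully float CPU setup, assuming the variable names from train.lua:

```lua
-- cast model, criterion, and data to float for CPU training
model = model:float()          -- with current nn this also casts buffers such as finput
criterion = criterion:float()

local inputs  = torch.FloatTensor(opt.batchSize, 3, 32, 32)  -- was torch.CudaTensor
local targets = torch.FloatTensor(opt.batchSize)             -- was torch.CudaTensor
```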