8000 GitHub - hughperkins/clnn: OpenCL backend for Torch nn neural networks library
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

hughperkins/clnn

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

clnn

OpenCL backend for Torch nn neural networks library.

Installation

Please see distro-cl for installation instructions.

What works

Parameterized Modules

  • nn.Linear

Basic Tensor methods

These mostly 'just work', since based on underlying tensor methods, already implemented in cltorch. Tested with:

  • nn.Narrow

Miscellaneous modules

  • nn.Identity
  • nn.Dropout

Convolution layers

  • nn.SpatialConvolutionMM
  • nn.SpatialMaxPooling (including ceil mode)
  • nn.SpatialAveragePooling
  • nn.TemporalConvolution2 This is specific to clnn. It works on cpu and cuda too, not just on OpenCL. It is API-compatible with TemporalConvolution, and faster than TemporalConvolution, on both CUDA and OpenCL.

Transfer function layers

  • nn.Tanh
  • nn.Sigmoid
  • nn.ReLU
  • nn.ELU
  • nn.Exp
  • nn.Sqrt
  • nn.Square
  • nn.Abs
  • nn.LogSigmoid
  • nn.HardTanh
  • nn.LogSoftMax
  • nn.SoftMax (including spatial mode)

Table layers

These 'just work', since they are based on underlying torch operations, which are already implemented in cltorch. Tested with:

  • nn.CMulTable
  • nn.CAddTable

Criterions

  • nn.MSECriterion
  • nn.ClassNLLCriterion

Containers:

Containers 'just work', since they just call standard operations on the contained modules. Tested with:

  • nn.Sequential
  • nngraph

Trainers

In theory, trainers 'just work', since they just call standard torch methods on the network. The following are good first choices:

  • nn.StochasticGradient
  • optim.lbfgs
  • optim.adam

Timings

Soumith benchmark layers

Please see https://github.com/soumith/convnet-benchmarks#imagenet-winners-benchmarking

  • On a Titan X, OpenCL torch is about 3 times slower than CUDA torch
    • eg for VGG, cutorch takes 1100ms, and cltorch takes 3400ms

Example networks

Porting guidelines

Porting guidelines, for project maintainers, available here: porting-guidelines.md.

Recent changes

  • 2nd May:
    • Re-applied:
      • 26th March:
        • add TemporalConvolution2: same API and usage as TemporalConvolution, but faster on GPUs
  • 31st April:
    • Re-applied:
      • 10th March:
        • @pawni (Nick Pawlowski) added SpatialUpSamplingNearest. Thank you Nick
      • 20th February:
        • @gloine (Jaehyung Lee) added support for non-batched input to ClassNLLCriterion. Thank you Jaehyung
  • 30th April:
    • rolled back to as-of 21st February, prior to lots of THNN changes in upstream Torch
    • additionally, installation procedure is now to use a specific torch distro, for stability
  • 1st Feb:
    • merged/ported THNN phase 3. Any weird build issues, please update both nn and clnn.
  • 2nd January, 2016:
    • merged/ported THNN architecture across, and the implementation of Abs, so the unit-tests pass again now
  • 15th December:
  • 29th November:
    • added ELU
  • 25th September:
  • 23rd September:
    • ported latest cunn implementation of SpatialMaxPooling across, ie approximately Sergey's Deterministic max-pooling PR
      • this includes :ceil() implementation
  • 22nd September:
    • added non-batch implementation of LogSoftMax (previously only handled batched input)
    • added SoftMax, for both batched and non-batched
  • 20th September:
    • added non-batch implementation for SpatialMaxPooling (previously only handled batched input), for contiguous pools

Older changes

About

OpenCL backend for Torch nn neural networks library

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 5

0