Tags · MegEngine/cutlass

v2.3.0

Merge pull request NVIDIA#135 from NVIDIA/cutlass_2.3_final

CUTLASS 2.3.0

Sep 25, 2020
c2b80ad

v2.2.0

Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>. (NVIDIA#100)

- Updated mma_sm80.h to avoid perf penalty due to reinterpret_cast<>.
- Enhancement to CUTLASS Utility Library's HostTensorPlanarComplex template to support copy-in and copy-out
- Added test_examples target to build and test all CUTLASS examples
- Minor edits to documentation to point to GTC 2020 webinar

Jun 15, 2020
1ab1027

v2.1.0

update tools/library/CMakeLists to require python 3.6 according to NVIDIA#70 (NVIDIA#82)

NVIDIA#70 only updates the documentation. This commit reflects this bump in python version to the CMake configuration as well.

Apr 8, 2020
e33d90b

v2.0.0

Need Python 3.6 to use enum.auto() (NVIDIA#70)
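The requirement in this commit title follows from `enum.auto()` having been added to the standard library in Python 3.6. A minimal sketch of what that looks like, assuming a hypothetical `Layout` enum and `check_python` helper that are for illustration only and are not taken from the CUTLASS sources:

```python
import sys
from enum import Enum, auto


class Layout(Enum):
    """Hypothetical enum for illustration; not from the CUTLASS code base."""

    # auto() assigns increasing integer values starting at 1.
    COLUMN_MAJOR = auto()
    ROW_MAJOR = auto()


def check_python(minimum=(3, 6)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not check_python():
        raise SystemExit("Python 3.6 or newer is required for enum.auto()")
    for member in Layout:
        print(member.name, member.value)
```

On an interpreter older than 3.6 the `from enum import auto` line itself fails with an `ImportError`, which is why the version requirement has to be stated up front rather than checked only at runtime.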

Nov 22, 2019
7c0cd26

v1.3.3

Performance enhancement for Volta Tensor Cores TN layout (NVIDIA#53)

* Fixed performance defect with indirect access to pointer array for Volta TensorCores TN arrangement.

* Updated patch version and changelog.

* Updated patch version and changelog.

* Added link to changelog in readme.

* Fixed markdown link

Jul 10, 2019
b5cab17

v1.3.2

Performance enhancement for Volta Tensor Cores TN layout (NVIDIA#53)

* Fixed performance defect with indirect access to pointer array for Volta TensorCores TN arrangement.

* Updated patch version and changelog.

* Updated patch version and changelog.

* Added link to changelog in readme.

* Fixed markdown link

Jul 10, 2019
b5cab17

v1.3.0

Cutlass 1.3 Release (NVIDIA#42)

CUTLASS 1.3 Release
- Efficient GEMM kernel targeting Volta Tensor Cores via mma.sync instruction added in CUDA 10.1.

Mar 20, 2019
877bdca

v1.2.0

Merge pull request NVIDIA#33 from NVIDIA/cutlass_1.2

CUTLASS 1.2

Oct 26, 2018
ed2ed4d

v1.1.0

Merge pull request NVIDIA#28 from NVIDIA/cutlass_1.1

Fixed typeo

Sep 28, 2018
6877595

v1.0.1

Merge pull request NVIDIA#15 from NVIDIA/release_1.0.1_edits

Minor edits to README and changelog pursuant CUTLASS 1.0.1 patch.

Jun 26, 2018
cf0301e
