1.1.0

@Muxas

Main improvements

fp32_fast_tf32_t data type: stores data as fp32_t while performing compute-bound operations in tf32_t precision
bf16_t data type
LlamaForCausalLM model
QA: automatic testing and linting
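
The idea behind fp32_fast_tf32_t (fp32 storage, tf32 arithmetic for the heavy operations) can be illustrated with a small pure-Python emulation. This is only a sketch under stated assumptions, not the library's implementation: the helper names to_fp32, to_tf32 and dot_fp32_fast_tf32 are hypothetical, TF32 is emulated by truncating an fp32 value to 10 explicit mantissa bits (real hardware may round instead), and accumulation is assumed to stay in fp32.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_tf32(x: float) -> float:
    """Emulate TF32: keep the fp32 sign and 8-bit exponent, but only
    10 explicit mantissa bits. Truncation is an assumption made here
    for simplicity; hardware may round to nearest instead."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # clear the low 13 of the 23 fp32 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot_fp32_fast_tf32(a, b):
    """Dot product with fp32 storage, TF32 multiplicands, fp32 accumulation."""
    acc = 0.0
    for x, y in zip(a, b):
        # Inputs live in fp32; only the multiply sees reduced precision.
        prod = to_tf32(to_fp32(x)) * to_tf32(to_fp32(y))
        acc = to_fp32(acc + prod)
    return acc


if __name__ == "__main__":
    x = 1.0 + 2.0 ** -11           # exactly representable in fp32
    print(to_fp32(x) == x)         # storage keeps the value
    print(to_tf32(x) == 1.0)       # compute precision drops the last bits
    print(dot_fp32_fast_tf32([1.0, 2.0], [3.0, 4.0]))
```

The point of the split is that stored values keep full fp32 precision between operations, while the precision loss is confined to the inputs of the compute-bound step.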

What's Changed

Update readme by @Muxas in #1
Muxas/sha by @Muxas in #15
Set up basic GitHub CI workflow by @daskol in #45
Muxas/license header by @Muxas in #50
Muxas/simgrid by @Muxas in #51
Run pre-commit checks on diff with trunk by @daskol in #58
Fix linting for forks by @daskol in #61
Fp32 fast tf32 by @Muxas in #60
Run linting and building workflows on push by @daskol in #65
Lint everything on merge by @daskol in #66
Fix linting on merge by @daskol in #67
Decouple functions from tensor by @svtdanny in #62
new basic types by @Muxas in #64
Add BatchNorm2d implementation by @svtdanny in #63
Fix build by @svtdanny in #70
Muxas/fix cpp tests by @Muxas in #72
Fix invalidate_submit bug with FlashAttention logic by @Muxas in #69
Logger by @Muxas in #68
Update Dockerfile by @Muxas in #53
Add support of bf16_t for DeepRelu example by @amkatrutsa in #71
Fix base_types and tests/kernel/randn by @Muxas in #77
Run pre-commit for whitespaces and empty lines by @Muxas in #78
Add default args to starpu::Config and fix gpt2_custom_train by @Muxas in #79
Improve logger server options by @Muxas in #80
Set default env values for logger server by @Muxas in #81
Fix logger/server.py by @Muxas in #82
Add handy methods for gpt2 class by @svtdanny in #83
Add support of bf16 for gpt2 training by @amkatrutsa in #87
Add greedy generation strategy by @svtdanny in #88
Add upper level Dockerfile, update README by @Muxas in #89
Add base unoptimized inference engine by @svtdanny in #90
Rework GPT2 examples and optimizers by @Muxas in #92
add inference server + example by @svtdanny in #96
Fix SGD to support bf16 by @amkatrutsa in #97
Add SiLU activation by @amkatrutsa in #93
Adjust linting and testing CI workflows by @daskol in #94
Add workflow for nightly linting and testing by @daskol in #98
Add rmsnorm and test by @amkatrutsa in #100
Muxas/gqa by @Muxas in #101
Refurbish python tests for green trunk by @daskol in #99
Add handling bus info for logger by @multeng in #103
add usage memory size handling by @multeng in #107
Add typing stubs for a native extension by @daskol in #108
Configure regular typing checks by @daskol in #109
Incorporate Rotary positional embedding into LlamaAttention by @Muxas in #106
Maintain typing compatibility with python_version<3.12 by @daskol in #110
Add Llama model by @amkatrutsa in #111
Revise LLaMA ingredients testing by @amkatrutsa in #120
Add model LLaMaForCausalLM and tests by @amkatrutsa in #122
Svtdanny/kvcache attention by @svtdanny in #121
Improve utils/constructors.py by @Muxas in #124
Pin down version of action/upload-artifact by @daskol in #132
Conv2D layer with fwd/bwd, mixed precision and testing by @Muxas in #131
Svtdanny/dynamic layers by @svtdanny in #123
Support stride parameter for Conv2d layer by @Muxas in #133
conv2d dilation parameter by @Muxas in #136
Add Add layer into __init__.py by @Muxas in #137
Update Dockerfile and README by @Muxas in #138
Fix lint of some python files by @Muxas in #139
Ruff+isort optimizers by @Muxas in #140
Lint DeepLinear, DeepRelu, MLPMixer models by @Muxas in #141
Lint many python files by @Muxas in #142
Lint other not-currently-PRed python files by @Muxas in #143
Norm-fiber op for batchnorm2d by @gogolgrind in #105
Add simple GPT2 training example via Jupyter notebook by @Muxas in #144
Add LLaMa training script by @amkatrutsa in #125
Add Llama jupyter notebook by @Muxas in #145
Bump version; add missing copyright headers by @Muxas in #146

New Contributors

@Muxas made their first contribution in #1
@daskol made their first contribution in #45
@svtdanny made their first contribution in #62
@amkatrutsa made their first contribution in #71
@multeng made their first contribution in #103
@gogolgrind made their first contribution in #105

Full Changelog: 1.0.0...1.1.0
