- University of Toronto
- Toronto
- http://www.cs.toronto.edu/~arvie/
Stars
DiffusER: Discrete Diffusion via Edit-based Reconstruction (Reid, Hellendoorn & Neubig, 2022)
Official implementation of the ICML 2022 paper "Directed Acyclic Transformer for Non-Autoregressive Machine Translation"
RWKV (pronounced "RwaKuv") is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). Now at RWKV-7 "Goose", it combines the best of RNNs and transformers.
Fast and memory-efficient exact attention
Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask, and synthesize PII in text.
Examples of how to create colorful, annotated equations in LaTeX using TikZ.
Sparse and structured neural attention mechanisms
Code for the ALiBi method for transformer language models (ICLR 2022); a minimal sketch of the method appears after this list.
Smoothing and Shrinking the Sparse Seq2Seq Search Space
Learning to Initialize Neural Networks for Stable and Efficient Training
A comprehensive list of awesome contrastive self-supervised learning papers.
Awesome Knowledge Distillation
A LaTeX style and template for paper preprints (based on NIPS style)
😈 Awful AI is a curated list tracking current scary uses of AI, in the hope of raising awareness
A curated list of awesome self-supervised methods
[ICML 2020] Code for "PowerNorm: Rethinking Batch Normalization in Transformers" (https://arxiv.org/abs/2003.07845)
Code for the Generalized Entropy Regularization paper
Code to reproduce some of the results presented in the paper "SentenceMIM: A Latent Variable Language Model"
Code for the paper "Adaptive Transformers for Learning Multimodal Representations" (ACL SRW 2020)
Paper bank for Self-Supervised Learning
Papers & presentation materials from Hugging Face's internal science day
Torch modules that wrap blackbox combinatorial solvers according to the method presented in "Differentiating Blackbox Combinatorial Solvers"
Python implementation of projection losses.
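For reference, here is a minimal sketch of the ALiBi idea behind the starred repo above (Press et al., ICLR 2022): attention logits get a per-head linear penalty proportional to query-key distance, replacing positional embeddings. This is not the repo's code; the function names (`alibi_slopes`, `alibi_bias`) are mine, and the slope formula covers only power-of-two head counts as described in the paper.

```python
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    """Per-head ALiBi slopes: a geometric sequence 2^(-8i/n) for i = 1..n
    (power-of-two head counts, following the paper)."""
    start = 2.0 ** (-8.0 / n_heads)
    return torch.tensor([start ** (i + 1) for i in range(n_heads)])

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """bias[h, i, j] = -slope[h] * (i - j) for key position j <= query i,
    and 0 above the diagonal (those entries are causally masked anyway)."""
    slopes = alibi_slopes(n_heads)                     # (H,)
    pos = torch.arange(seq_len)
    dist = (pos[:, None] - pos[None, :]).clamp(min=0)  # (T, T) distances
    return -slopes[:, None, None] * dist               # (H, T, T)

# Usage sketch: add the bias to the attention logits before the causal
# mask and softmax, e.g. for scores of shape (batch, H, T, T):
#   scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
#   scores = scores + alibi_bias(H, T)
```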