- University of Toronto
- Toronto
- http://www.cs.toronto.edu/~arvie/
Stars
DiffusER: Discrete Diffusion via Edit-based Reconstruction (Reid, Hellendoorn & Neubig, 2022)
Official implementation of the ICML 2022 paper "Directed Acyclic Transformer for Non-Autoregressive Machine Translation"
RWKV (pronounced "RwaKuv") is an RNN with great LLM performance that can also be trained directly like a GPT transformer (parallelizable). Now at RWKV-7 "Goose", it combines the best of RNNs and transformers.
Fast and memory-efficient exact attention
Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask, and synthesize PII in text.
Examples of how to create colorful, annotated equations in LaTeX using TikZ.
Sparse and structured neural attention mechanisms
Code for the ALiBi method for transformer language models (ICLR 2022); a minimal sketch of the method appears after this list.
Smoothing and Shrinking the Sparse Seq2Seq Search Space
Learning to Initialize Neural Networks for Stable and Efficient Training
A comprehensive list of awesome contrastive self-supervised learning papers.
Awesome Knowledge Distillation
A LaTeX style and template for paper preprints (based on NIPS style)
😈 Awful AI is a curated list tracking current scary uses of AI, in the hope of raising awareness
A curated list of awesome self-supervised methods
[ICML 2020] Code for "PowerNorm: Rethinking Batch Normalization in Transformers" (https://arxiv.org/abs/2003.07845)
Code for the Generalized Entropy Regularization paper
Code to reproduce some of the results presented in the paper "SentenceMIM: A Latent Variable Language Model"
Code for the paper "Adaptive Transformers for Learning Multimodal Representations" (ACL SRW 2020)
Paper bank for Self-Supervised Learning
Papers & presentation materials from Hugging Face's internal science day
Torch modules that wrap blackbox combinatorial solvers according to the method presented in "Differentiating Blackbox Combinatorial Solvers"
Python implementation of projection losses.
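For reference, here is a minimal sketch of the ALiBi idea behind the starred repo above (Press et al., ICLR 2022): attention logits get a per-head linear penalty proportional to query-key distance, replacing positional embeddings. This is not the repo's code; the function names (`alibi_slopes`, `alibi_bias`) are mine, and the slope formula covers only power-of-two head counts as described in the paper.

```python
import torch

def alibi_slopes(n_heads: int) -> torch.Tensor:
    """Per-head ALiBi slopes: a geometric sequence 2^(-8i/n) for i = 1..n
    (power-of-two head counts, following the paper)."""
    start = 2.0 ** (-8.0 / n_heads)
    return torch.tensor([start ** (i + 1) for i in range(n_heads)])

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """bias[h, i, j] = -slope[h] * (i - j) for key position j <= query i,
    and 0 above the diagonal (those entries are causally masked anyway)."""
    slopes = alibi_slopes(n_heads)                     # (H,)
    pos = torch.arange(seq_len)
    dist = (pos[:, None] - pos[None, :]).clamp(min=0)  # (T, T) distances
    return -slopes[:, None, None] * dist               # (H, T, T)

# Usage sketch: add the bias to the attention logits before the causal
# mask and softmax, e.g. for scores of shape (batch, H, T, T):
#   scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
#   scores = scores + alibi_bias(H, T)
```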