collection of resources for understanding attention and implementing attention models
- https://towardsdatascience.com/the-fall-of-rnn-lstm-2d1594c74ce0
- RNN shortcomings in remembering long sequences -> translation models are more effective when the source sequence is reversed
- memory bandwidth limitations, as observed in WaveRNN
- alternatives
- 2D causal conv nets for seq2seq: https://arxiv.org/abs/1808.03867
- convolutional seq2seq, expands on the seq2seq inspiration for ettts: https://arxiv.org/abs/1705.03122
- transformer, attention is all you need: https://arxiv.org/abs/1706.03762
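The transformer above reduces to one core equation, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal sketch (shapes and usage are my own illustration, not code from any repo linked here):

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k); mask broadcastable to (batch, seq_q, seq_k), 0 = blocked
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)      # (batch, seq_q, seq_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))
    weights = F.softmax(scores, dim=-1)                    # one attention distribution per query
    return weights @ v, weights

q = k = v = torch.randn(2, 10, 64)
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)  # torch.Size([2, 10, 64]) torch.Size([2, 10, 10])
```

The n x n weights matrix per sequence is the O(n^2) cost noted under the wildml link below.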
- https://towardsdatascience.com/memory-attention-sequences-37456d271992
- read
- diagram attention model
- review of attention: https://blog.heuritech.com/2016/01/20/attention-mechanism/
- read
- diagram attention model
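The attention model these reviews diagram is the classic additive (Bahdanau-style) kind: a small MLP scores every encoder state against the current decoder state, and a softmax over the scores gives a context vector. A minimal sketch with illustrative dimensions (not code from the posts):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    def __init__(self, enc_dim, dec_dim, attn_dim):
        super().__init__()
        self.W_enc = nn.Linear(enc_dim, attn_dim, bias=False)
        self.W_dec = nn.Linear(dec_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, enc_states, dec_state):
        # enc_states: (batch, src_len, enc_dim); dec_state: (batch, dec_dim)
        scores = self.v(torch.tanh(self.W_enc(enc_states) + self.W_dec(dec_state).unsqueeze(1)))
        weights = F.softmax(scores.squeeze(-1), dim=-1)             # (batch, src_len)
        context = (weights.unsqueeze(-1) * enc_states).sum(dim=1)   # (batch, enc_dim)
        return context, weights

attn = AdditiveAttention(enc_dim=32, dec_dim=48, attn_dim=16)
context, weights = attn(torch.randn(4, 7, 32), torch.randn(4, 48))
print(context.shape, weights.shape)  # torch.Size([4, 32]) torch.Size([4, 7])
```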
- workshop demonstrating other memory and attention architectures: http://www.thespermwhale.com/jaseweston/ram/
- VGG architecture
- another review of attention: https://distill.pub/2016/augmented-rnns/
- read
- NTM
- Attention
- adaptive computation time
- neural programmer
- reinforcement learning
- discussion of downside of attention as well as applications: http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/
- O(n^2) attention values for a length-n sequence
- RL for attention model: http://arxiv.org/abs/1406.6247
- image captioning: http://arxiv.org/abs/1502.03044
- parse trees: http://arxiv.org/abs/1412.7449
- comprehension: http://arxiv.org/abs/1506.03340
- idea of attention as fuzzy memory - similar to NTM
- end to end memory: http://arxiv.org/abs/1503.08895
- NTM: https://github.com/dennybritz/deeplearning-papernotes/blob/master/neural-turing-machines.md
- RLNTM: http://arxiv.org/abs/1505.00521
- illustrated transformer: https://jalammar.github.io/illustrated-transformer/
- annotated transformer: http://nlp.seas.harvard.edu/2018/04/03/attention.html
- alternative architectures
- some combination of recurrence and attention
- fast weights: https://arxiv.org/abs/1610.06258
- https://medium.com/@sanyamagarwal/understanding-attentive-recurrent-comparators-ea1b741da5c3
- https://towardsdatascience.com/memory-attention-sequences-37456d271992
- hierarchical attention
- similar to temporal convolution
- similar to wavenet
- TCN vs RNN: https://arxiv.org/abs/1803.01271
- TCNs outperform RNNs (LSTMs) on a variety of sequence modeling tasks (causal conv sketch below)
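A minimal sketch of the causal, dilated 1D convolution that TCN/WaveNet-style models stack in place of recurrence; left-padding by (kernel_size - 1) * dilation keeps every output from seeing future timesteps. Channel sizes here are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):
        # x: (batch, channels, time); pad only on the left so the conv stays causal
        return self.conv(F.pad(x, (self.pad, 0)))

x = torch.randn(2, 16, 100)
layer = CausalConv1d(16, 32, kernel_size=3, dilation=2)
print(layer(x).shape)  # torch.Size([2, 32, 100]) -- same length, no future leakage
```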
- neural Turing machine
- unsupervised pre-training of a transformer language model: https://blog.openai.com/language-unsupervised/
- attention for medical image segmentation
- simple implementation of techniques
- https://github.com/e-lab/pytorch-demos/tree/master/seq-learning-basic
- reimplement in notebook
- CNN
- RNN
- ATT
- https://github.com/e-lab/pytorch-demos/tree/master/seq-learning-char
- reimplement in notebook
- CNN
- RNN
- ATT
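As a starting point for the notebook reimplementations (not taken from the e-lab code; just stock PyTorch modules for the three families, assuming a (batch, length, features) layout and illustrative sizes):

```python
import torch
import torch.nn as nn

nb, L, nh = 8, 64, 32                      # batch, sequence length, feature size
x = torch.randn(nb, L, nh)

# CNN: temporal convolution over the sequence (Conv1d wants (batch, channels, time))
cnn = nn.Conv1d(nh, nh, kernel_size=3, padding=1)
y_cnn = cnn(x.transpose(1, 2)).transpose(1, 2)          # (nb, L, nh)

# RNN: a GRU over the same sequence
rnn = nn.GRU(nh, nh, batch_first=True)
y_rnn, _ = rnn(x)                                       # (nb, L, nh)

# ATT: single-layer self-attention
att = nn.MultiheadAttention(embed_dim=nh, num_heads=4, batch_first=True)
y_att, attn_weights = att(x, x, x)                      # (nb, L, nh), (nb, L, L)

print(y_cnn.shape, y_rnn.shape, y_att.shape, attn_weights.shape)
```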
- positional encodings
- none
- absolute
- sinusoidal
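A minimal sketch of the sinusoidal variant, following the formula in the transformer paper; the absolute variant is typically just an nn.Embedding over position indices.

```python
import math
import torch

def sinusoidal_positions(seq_len, d_model):
    # PE[pos, 2i] = sin(pos / 10000^(2i/d_model)); PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
    pos = torch.arange(seq_len).unsqueeze(1).float()                  # (seq_len, 1)
    div = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe                                                         # add to (seq_len, d_model) inputs

pe = sinusoidal_positions(seq_len=100, d_model=32)
print(pe.shape)  # torch.Size([100, 32])
```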
- figure out why swapping the axes of the data and the convolution makes training faster but results worse (layout sketch below)
- fast & bad: nb x 1 x nh x L -> conv2d -> nb x nh x 1 x L
- slow & good: nb x 1 x L x nh -> conv2d -> nb x nh x L x 1
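A sketch of the two layouts being compared, assuming nb/nh/L mean batch/hidden/length and the conv2d collapses the nh axis while sliding over L; the kernel sizes are my guess at the setup, not taken from the repo. The two should agree up to a transpose, so the thing to isolate is which axis ends up contiguous in memory during the convolution.

```python
import torch
import torch.nn as nn

nb, nh, L, k = 8, 32, 128, 5

# "fast & bad": (nb, 1, nh, L) -> (nb, nh, 1, L); L is the innermost (contiguous) axis
conv_fast = nn.Conv2d(1, nh, kernel_size=(nh, k), padding=(0, k // 2))
out_fast = conv_fast(torch.randn(nb, 1, nh, L))
print(out_fast.shape)   # torch.Size([8, 32, 1, 128])

# "slow & good": (nb, 1, L, nh) -> (nb, nh, L, 1); nh is the innermost axis
conv_slow = nn.Conv2d(1, nh, kernel_size=(k, nh), padding=(k // 2, 0))
out_slow = conv_slow(torch.randn(nb, 1, L, nh))
print(out_slow.shape)   # torch.Size([8, 32, 128, 1])
```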
- test out limiting the attention window (mask sketch below)
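One way to prototype the limited window: a band mask that lets each position attend only within +/- w of itself (window size here is arbitrary).

```python
import torch

def window_mask(seq_len, w):
    idx = torch.arange(seq_len)
    # True where |i - j| <= w, i.e. allowed query/key pairs
    return (idx.unsqueeze(0) - idx.unsqueeze(1)).abs() <= w

mask = window_mask(seq_len=8, w=2)          # (8, 8) boolean
scores = torch.randn(8, 8)
scores = scores.masked_fill(~mask, float('-inf'))
weights = torch.softmax(scores, dim=-1)     # each row only mixes nearby positions
```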
- test the source implementation of attention to check whether the attention model is doing anything
- submit a pull request to the repo to correct the model
- see if attention works for generating sequences of lengths different from the training sequence length
- inspect attention matrices
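nn.MultiheadAttention already returns the head-averaged attention weights when need_weights=True, so the matrices can be pulled out and plotted as heatmaps (sizes illustrative):

```python
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

att = nn.MultiheadAttention(embed_dim=32, num_heads=4, batch_first=True)
x = torch.randn(1, 20, 32)
_, attn = att(x, x, x, need_weights=True)      # attn: (1, 20, 20), averaged over heads

plt.imshow(attn[0].detach().numpy(), aspect='auto', origin='lower')
plt.xlabel('key position')
plt.ylabel('query position')
plt.colorbar()
plt.show()
```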
- mask attention to make it causal
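A minimal sketch of the causal (lower-triangular) mask:

```python
import torch

seq_len = 8
causal = torch.tril(torch.ones(seq_len, seq_len)).bool()   # True = allowed (j <= i)
scores = torch.randn(seq_len, seq_len)
scores = scores.masked_fill(~causal, float('-inf'))
weights = torch.softmax(scores, dim=-1)   # row i puts zero weight on future positions j > i
```

PyTorch also ships a helper for this, nn.Transformer.generate_square_subsequent_mask, which returns a float mask with -inf above the diagonal.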
- figure out how to pad to the receptive field to speed up generation