- 2017-11
- 2019-02
- Peters et al. – 2018 – Deep contextualized word representations (ELMo) [pdf] [note]
- Howard and Ruder – 2018 – Universal language model fine-tuning for text classification (ULMFiT) [pdf] (two of its fine-tuning tricks are sketched after the references below)
- Radford et al. – 2018 – Improving language understanding by generative pre-training [pdf]
- Devlin et al. – 2018 – BERT: Pre-training of deep bidirectional transformers for language understanding [pdf]
- references
- Blog: The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning)
- ELMo
- Quick Start: Training an IMDb sentiment model with ULMFiT
- finetune-transformer-lm: Code and model for the paper “Improving Language Understanding by Generative Pre-Training”
- awesome-bert: BERT NLP papers, applications, and GitHub resources (BERT-related papers and GitHub projects)
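
The ULMFiT entry above (Howard and Ruder, 2018) is largely about fine-tuning tricks. Below is a minimal sketch of two of them, discriminative learning rates and gradual unfreezing, assuming PyTorch and a hypothetical tiny classifier as a stand-in for the pretrained language model; this is not the authors' implementation, just an illustration of the idea.

```python
# Sketch only: ULMFiT-style discriminative learning rates and gradual unfreezing.
# The model below is a hypothetical stand-in, not the paper's AWD-LSTM.
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self, vocab_size=10_000, emb_dim=128, hid_dim=256, n_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)       # closest to the input
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.head = nn.Linear(hid_dim, n_classes)                 # task-specific head

    def forward(self, x):
        out, _ = self.encoder(self.embedding(x))
        return self.head(out[:, -1])                              # last time step

model = TinyClassifier()

# Discriminative fine-tuning: smaller learning rates for layers closer to the input.
optimizer = torch.optim.Adam([
    {"params": model.embedding.parameters(), "lr": 1e-4},
    {"params": model.encoder.parameters(),   "lr": 5e-4},
    {"params": model.head.parameters(),      "lr": 1e-3},
])

# Gradual unfreezing: start with only the head trainable, then unfreeze
# deeper layer groups as fine-tuning progresses.
for p in model.embedding.parameters():
    p.requires_grad = False
for p in model.encoder.parameters():
    p.requires_grad = False
```
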
Machine Translation
- 2017-12
- Oda et al. – 2017 – Neural Machine Translation via Binary Code Prediction [pdf] [note]
- Kalchbrenner et al. – 2016 – Neural machine translation in linear time [pdf] [pdf (annotated)] [note]
- 2018-05
- Sutskever et al. – 2014 – Sequence to Sequence Learning with Neural Networks [pdf]
- Cho et al. – 2014 – Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation [pdf]
- Bahdanau et al. – 2014 – NMT by Jointly Learning to Align and Translate [pdf]
- Luong et al. – 2015 – Effective Approaches to Attention-based NMT [pdf]
- 2018-06
- Gehring et al. – 2017 – Convolutional sequence to sequence learning [pdf]
- Vaswani et al. – 2017 – Attention is all you need [pdf] [note1: The Illustrated Transformer] [note2: The Annotated Transformer] (scaled dot-product attention sketched below)
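
The core operation in Vaswani et al. (2017) is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. Below is a minimal NumPy sketch of that formula; the shapes and the toy usage are illustrative assumptions, not code from the paper.

```python
# Minimal NumPy sketch of scaled dot-product attention (Vaswani et al., 2017).
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (m, d_k), K: (n, d_k), V: (n, d_v) -> output: (m, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # similarity of each query to each key
    scores -= scores.max(axis=-1, keepdims=True)    # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # attention weights sum to 1 over keys
    return weights @ V                              # weighted sum of values

# Toy usage with random queries, keys, and values.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((6, 8))
V = rng.standard_normal((6, 16))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 16)
```
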
- references