FAIR Fairseq Toolkit


A FAIR Fairseq Toolkit is a sequence modeling toolkit, written in PyTorch by Facebook AI Research (FAIR), that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks.



References

2018

Fairseq provides reference implementations of various sequence-to-sequence models, including:

   Convolutional Neural Networks (CNN)
       Dauphin et al. (2017): Language Modeling with Gated Convolutional Networks
       Gehring et al. (2017): Convolutional Sequence to Sequence Learning
       Edunov et al. (2018): Classical Structured Prediction Losses for Sequence to Sequence Learning
       Fan et al. (2018): Hierarchical Neural Story Generation
   Long Short-Term Memory (LSTM) networks
       Luong et al. (2015): Effective Approaches to Attention-based Neural Machine Translation
       Wiseman and Rush (2016): Sequence-to-Sequence Learning as Beam-Search Optimization
   Transformer (self-attention) networks
       Vaswani et al. (2017): Attention Is All You Need
       Ott et al. (2018): Scaling Neural Machine Translation
       Edunov et al. (2018): Understanding Back-Translation at Scale

Fairseq features:

   multi-GPU (distributed) training on one machine or across multiple machines
   fast beam search generation on both CPU and GPU
   large mini-batch training even on a single GPU via delayed updates
   fast half-precision floating point (FP16) training
   extensible: easily register new models, criterions, and tasks (a registration sketch appears after this list)
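
As an illustration of the extensibility point above, the following Python sketch registers a new model with fairseq's plugin system. This is a minimal sketch, assuming a recent fairseq release is installed; the model name 'toy_lstm', the --toy-embed-dim flag, and the stubbed build_model body are hypothetical placeholders rather than part of the toolkit.

    # Minimal sketch of fairseq's model registration mechanism (names are illustrative).
    from fairseq.models import (
        FairseqEncoderDecoderModel,
        register_model,
        register_model_architecture,
    )

    @register_model('toy_lstm')  # hypothetical model name
    class ToyLSTMModel(FairseqEncoderDecoderModel):

        @staticmethod
        def add_args(parser):
            # Expose model-specific hyperparameters on the fairseq-train command line.
            parser.add_argument('--toy-embed-dim', type=int, metavar='N',
                                help='embedding dimension (illustrative flag)')

        @classmethod
        def build_model(cls, args, task):
            # Build the encoder and decoder from args and the task's source/target
            # dictionaries; omitted here for brevity.
            raise NotImplementedError('sketch only; see the fairseq docs for a full model')

    @register_model_architecture('toy_lstm', 'toy_lstm')
    def toy_lstm_base(args):
        # Supply defaults for any hyperparameters the user did not set.
        args.toy_embed_dim = getattr(args, 'toy_embed_dim', 256)

Once registered, the architecture can be selected at training time with fairseq-train --arch toy_lstm; new criterions and tasks follow the same pattern via register_criterion and register_task.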

We also provide pre-trained models for several benchmark translation and language modeling datasets.
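
For the pre-trained models mentioned above, one common way to load and use a translation model is through torch.hub, as in the hedged sketch below; the model identifier 'transformer.wmt19.en-de.single_model' and the tokenizer/bpe settings are assumptions that vary by model and fairseq release.

    # Hedged sketch: load a pre-trained fairseq translation model via torch.hub.
    import torch

    en2de = torch.hub.load(
        'pytorch/fairseq',
        'transformer.wmt19.en-de.single_model',  # assumed model-zoo identifier
        tokenizer='moses',
        bpe='fastbpe',
    )
    en2de.eval()  # disable dropout for inference

    # Translate a sentence with beam search (beam size 5).
    print(en2de.translate('Machine learning is great!', beam=5))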