(2018. 2) QANet



  • Submitted on 2018. 2

  • Adams Wei Yu, David Dohan, Minh-Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, Quoc V. Le

Simple Summary

Proposes a new Q&A model that does not require recurrent networks: it consists exclusively of attention and convolutions, yet achieves equivalent or better performance than existing models. On the SQuAD dataset, the model is 3x to 13x faster in training and 4x to 9x faster in inference. This speed-up makes it possible to train with much more data, so the authors combine the model with data generated by backtranslation from a neural machine translation model. This data augmentation not only adds training examples but also diversifies the phrasing of the sentences, which yields immediate accuracy improvements.
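The backtranslation step in the summary can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `Translator` interface, `backtranslate`, and `table_translator` are hypothetical names, and the lookup tables are toy stand-ins for real English -> French and French -> English NMT models.

```python
from typing import Callable, Dict, List

# Hypothetical interface: a translator maps one sentence to a list of
# candidate translations (for example, the beam outputs of an NMT model).
Translator = Callable[[str], List[str]]


def backtranslate(sentence: str, en_to_fr: Translator, fr_to_en: Translator) -> List[str]:
    """Return distinct English paraphrases of `sentence`, excluding the original."""
    paraphrases: List[str] = []
    for french in en_to_fr(sentence):
        for english in fr_to_en(french):
            if english != sentence and english not in paraphrases:
                paraphrases.append(english)
    return paraphrases


def table_translator(table: Dict[str, List[str]]) -> Translator:
    """Toy stand-in for an NMT model: a fixed lookup table (assumption)."""
    return lambda text: table.get(text, [])


en_to_fr = table_translator({
    "The model is fast.": ["Le modèle est rapide.", "Ce modèle est rapide."],
})
fr_to_en = table_translator({
    "Le modèle est rapide.": ["The model is fast.", "The model is quick."],
    "Ce modèle est rapide.": ["This model is fast.", "The model is quick."],
})

augmented = backtranslate("The model is fast.", en_to_fr, fr_to_en)
print(augmented)  # ['The model is quick.', 'This model is fast.']
```

Returning several candidates per direction is what lets the round trip produce more than one paraphrase per input. A real question-answering pipeline would also need to re-locate the answer span in the paraphrased context, which this sketch leaves out.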

  • To make machine comprehension fast, the authors remove the recurrence from these models, since recurrence is the speed bottleneck.

  • Model design:

    • Convolution captures the local structure of the text.

    • Self-attention learns the global interaction between each pair of words.

  • Data augmentation:

    • Use two translation models (English -> French, French -> English) to paraphrase the training text by backtranslation.

  • Achieves up to 13x speedup in training and up to 9x in inference, compared to the RNN counterparts.

  • A single model, trained with augmented data, achieves an F1 score of 84.6 on the SQuAD test set.
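To make the "Model design" bullets concrete, here is a rough pure-Python sketch of one recurrence-free block: a convolution mixes each token with its immediate neighbours, then self-attention lets every token interact with every other token. It is a simplified illustration under stated assumptions, not the paper's encoder: the function names are hypothetical, the convolution uses a single kernel shared by all channels, and the attention uses identity projections with no learned weights, multiple heads, layer normalization, or position encoding.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]  # one row per token


def depthwise_conv(x: Matrix, kernel: Vector) -> Matrix:
    """Mix each position with its neighbours only (zero padding at the edges)."""
    pad = len(kernel) // 2
    n, dim = len(x), len(x[0])
    out = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for offset, weight in enumerate(kernel):
            j = i + offset - pad
            if 0 <= j < n:
                for d in range(dim):
                    out[i][d] += weight * x[j][d]
    return out


def softmax(scores: Vector) -> Vector:
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: Matrix) -> Tuple[Matrix, Matrix]:
    """Scaled dot-product attention over all token pairs (identity projections)."""
    dim = len(x[0])
    weights = [
        softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in x])
        for q in x
    ]
    out = [[sum(w * v[d] for w, v in zip(row, x)) for d in range(dim)] for row in weights]
    return out, weights


def add(a: Matrix, b: Matrix) -> Matrix:
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def encoder_block(x: Matrix, kernel: Vector) -> Tuple[Matrix, Matrix]:
    x = add(x, depthwise_conv(x, kernel))  # local structure, residual connection
    attended, weights = self_attention(x)  # global pairwise interaction
    return add(x, attended), weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
out, weights = encoder_block(tokens, kernel=[0.25, 0.5, 0.25])
print(len(out), len(out[0]), len(weights), len(weights[0]))  # 4 2 4 4
```

Neither step depends on the output of a previous time step, so all positions can be computed in parallel, which is the source of the speedup over RNN-based models noted above.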
