(2016. 4) NMT Hybrid Word And Char

  • Submitted on 2016. 4

  • Minh-Thang Luong and Christopher D. Manning

Simple Summary

A word-character solution to achieving open-vocabulary NMT. The authors build hybrid systems that translate mostly at the word level and consult character components for rare words. Character-level recurrent neural networks compute source word representations and recover unknown target words when needed.

  • The core of the design is a word-level NMT model, which has the advantage of being fast and easy to train.

  • Source Character-based Representation: a character-level RNN runs over the characters of each rare source word; it is always initialized with zero states, and its final hidden state is used as the word representation (see the encoder sketch after this list).

  • Target Character-level Generation:

    1. Hidden-state Initialization

      • target character-level generation requires the current word-level context to produce meaningful translations.

      • the separate-path target generation approach creates a counterpart vector $\breve{h}_t$ that is used only to seed the character-level decoder (see the seeding sketch after this list).

      • $\breve{h}_t = \tanh\left(\breve{W}\,[c_t; h_t]\right)$

    2. Word-Character Generation Strategy: `<unk>` is fed to the word-level decoder “as is” using its corresponding word embedding.

      • training: this choice decouples the executions of the character-level decoder over all `<unk>` instances; they can run as soon as the word-level NMT pass completes.

      • test: the character-level decoder is run with beam search to generate actual words for these `<unk>` positions (see the decoding sketch after this list).

  • The paper also demonstrated the potential of purely character-based models in producing good translations.
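A minimal sketch of the source-side character encoder, in PyTorch rather than the authors' implementation: the single LSTM layer, the class name `CharWordEncoder`, and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Character-level LSTM whose final hidden state represents a rare word."""
    def __init__(self, char_vocab_size, char_emb_dim, word_emb_dim):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab_size, char_emb_dim)
        self.lstm = nn.LSTM(char_emb_dim, word_emb_dim, batch_first=True)

    def forward(self, char_ids):
        # char_ids: (batch, num_chars), one rare source word per row.
        chars = self.char_emb(char_ids)
        # h0/c0 default to zeros, matching the "always initialized with
        # zero states" choice described above.
        _, (h_n, _) = self.lstm(chars)
        # The final hidden state stands in for the word embedding.
        return h_n[-1]  # (batch, word_emb_dim)

# Usage: represent a batch of 4 rare words, each 7 characters long.
enc = CharWordEncoder(char_vocab_size=100, char_emb_dim=32, word_emb_dim=512)
word_reps = enc(torch.randint(0, 100, (4, 7)))  # (4, 512)
```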
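The separate-path seeding formula translates directly into code. A minimal sketch, assuming a hidden size of 512 and the parameter name `W_breve` (both illustrative, not from the paper):

```python
import torch
import torch.nn as nn

hidden_dim = 512  # assumed; must match the word-level decoder
# W̆ projects the concatenated [c_t; h_t] down to one hidden-size vector.
W_breve = nn.Linear(2 * hidden_dim, hidden_dim, bias=False)

def char_decoder_seed(c_t, h_t):
    # c_t: attention context, h_t: word-level decoder hidden state,
    # both (batch, hidden_dim). Returns h̆_t = tanh(W̆ [c_t; h_t]), used
    # only to seed the character-level decoder; the word-level state h_t
    # itself is left untouched, hence the "separate path".
    return torch.tanh(W_breve(torch.cat([c_t, h_t], dim=-1)))
```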
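At test time, spelling out the word behind an `<unk>` position can be sketched as a small character-level beam search. Everything here (function names, the single-layer decoder, the zeroed cell state, the beam size) is an illustrative assumption, not the authors' API:

```python
import torch

def beam_spell(char_decoder, proj, char_emb, seed, bow_id, eow_id,
               beam_size=5, max_len=30):
    # char_decoder: nn.LSTM over character embeddings; proj: nn.Linear
    # mapping hidden states to character logits; seed: h̆_t for this
    # <unk> position, shape (hidden_dim,). bow_id/eow_id mark word
    # boundaries in the character vocabulary.
    h0 = seed.view(1, 1, -1)    # hidden state seeded from h̆_t
    c0 = torch.zeros_like(h0)   # cell state zeroed (an assumption)
    beams = [(0.0, [bow_id], (h0, c0))]  # (log prob, chars, state)
    finished = []
    with torch.no_grad():
        for _ in range(max_len):
            candidates = []
            for logp, chars, state in beams:
                x = char_emb(torch.tensor([[chars[-1]]]))
                out, new_state = char_decoder(x, state)
                log_probs = torch.log_softmax(proj(out[0, -1]), dim=-1)
                top = torch.topk(log_probs, beam_size)
                for lp, idx in zip(top.values.tolist(), top.indices.tolist()):
                    candidates.append((logp + lp, chars + [idx], new_state))
            candidates.sort(key=lambda c: c[0], reverse=True)
            beams = []
            for cand in candidates[:beam_size]:
                # Hypotheses ending in EOW are complete words.
                (finished if cand[1][-1] == eow_id else beams).append(cand)
            if not beams:
                break
    best = max(finished or beams, key=lambda c: c[0])
    return best[1]  # character ids spelling out the rare target word
```

During training no search is needed: every `<unk>` position already has its seed h̆_t once the word-level pass completes, so all character-level decoder runs are decoupled and can be batched independently.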
