Bilingual Association in Neural Machine Translation
In recent years, sequence-to-sequence neural machine translation (NMT) models have substantially advanced the state of the art in machine translation, achieving considerable improvements in fluency and adequacy over statistical machine translation (SMT).
In these NMT models, a crucial component is the attention model.
Nevertheless, it is questionable to what extent current attention models can adequately attend to discontiguous source translations. As an example, when the Dutch sentence “zet vrij hoog in” is to be translated to the English “aim quite high”, both the words “zet” and “in” are necessary to come up with the correct translation “aim” in English. With such discontiguous translations, especially when the relevant words are far apart, current attention models may fail to adequately model the required discontiguous attention.
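To make the limitation concrete, the following is a minimal NumPy sketch of a standard dot-product attention step, in the style of Luong et al.; all names and the toy dimensions are illustrative, not taken from any particular implementation. For each target word, the model computes a single softmax distribution over source positions and takes the weighted average of the encoder states as the context vector:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of scores.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_context(decoder_state, encoder_states):
    """Dot-product attention: one softmax distribution over
    source positions per generated target word."""
    scores = encoder_states @ decoder_state   # (src_len,) similarity scores
    weights = softmax(scores)                 # attention weights, sum to 1
    context = weights @ encoder_states        # weighted average of states
    return weights, context

# Toy example: 4 source positions (e.g. "zet vrij hoog in"),
# hidden size 3; random vectors stand in for learned states.
rng = np.random.default_rng(0)
enc = rng.normal(size=(4, 3))   # encoder states, one per source word
dec = rng.normal(size=(3,))     # decoder state when generating "aim"
weights, context = attention_context(dec, enc)
```

Nothing in this formulation prevents the weights from splitting mass over the non-adjacent positions of “zet” and “in”, but nothing encourages or structures such discontiguous attention patterns either: each target word gets one flat distribution, with no notion of which source positions belong together.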
In my EDGE project, I aim to improve the attention model of NMT, following several approaches: relaxing the independence assumptions made by this model, adding more structure to the model, examining the role syntax could play in improving it, exploring how different learning approaches, including multi-task learning, may improve these models, and considering representations beyond mere word representations that the attention model may choose from. Special interest goes to improving the attention model with respect to discontiguous translation and reordering, as well as to creating different, more structured encodings of the source that help the attention model to be more effective.