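In practice, however, we argue that BLI is essentially a ranking problem rather than the regression problem defined by the training objective above. The goal of BLI is to find, for each source-language word, the k target-language candidates with the highest confidence. In other words, the mapping function should be able to distinguish the relative order of correct and incorrect translations. The objective functions used in previous work attend only to the distance between positive examples (word pairs that are translations of each other) and provide no explicit ranking signal, so they cannot effectively improve the model's discriminative ability.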
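The contrast between a regression-style and a ranking-style objective can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the dot-product scoring, the function names (`mse_loss`, `bpr_loss`, `combined_loss`), and the weight `lam` are assumptions made for illustration only.

```python
import numpy as np

def mse_loss(mapped_src, tgt_pos):
    """Regression-style objective: only pulls positive pairs together."""
    return float(np.mean(np.sum((mapped_src - tgt_pos) ** 2, axis=1)))

def bpr_loss(mapped_src, tgt_pos, tgt_neg):
    """BPR-style ranking objective: -log(sigmoid(s_pos - s_neg)).

    mapped_src: (n, d) mapped source embeddings
    tgt_pos:    (n, d) embeddings of the correct translations
    tgt_neg:    (n, k, d) embeddings of k sampled wrong translations
    Scores are dot products (an assumption of this sketch).
    """
    s_pos = np.sum(mapped_src * tgt_pos, axis=1)           # shape (n,)
    s_neg = np.einsum("nd,nkd->nk", mapped_src, tgt_neg)   # shape (n, k)
    diff = s_pos[:, None] - s_neg
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed stably
    return float(np.mean(np.logaddexp(0.0, -diff)))

def combined_loss(mapped_src, tgt_pos, tgt_neg, lam=1.0):
    """Hypothetical weighting of the two terms; lam is illustrative."""
    return bpr_loss(mapped_src, tgt_pos, tgt_neg) + lam * mse_loss(mapped_src, tgt_pos)

# Toy example: one source word, one correct and one wrong candidate.
src = np.array([[1.0, 0.0]])
right = np.array([[1.0, 0.0]])
wrong = np.array([[[0.0, 1.0]]])

good = bpr_loss(src, right, wrong)                 # correct word ranked first
bad = bpr_loss(src, wrong[:, 0, :], right[:, None, :])  # wrong word ranked first
```

In this toy case `good` is smaller than `bad`, because the ranking term penalizes a wrong candidate that scores above the correct one, whereas `mse_loss` never looks at the negative candidates at all.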
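▲ Table 4: Ablation study on the Householder projection

3. For the proposed ranking-based learning objective, we likewise compared the effect of removing either the BPR component or the MSE component. The results in Table 5 show that both loss functions contribute to model performance, and that performance drops more when the ranking-related BPR loss is removed. This shows that the proposed ranking objective is the more important of the two for the BLI task.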
▲ Table 5: Ablation study on the loss functions
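Conclusion

In this paper, we propose RAPO, a new bilingual lexicon induction model based on adaptive ranking-oriented learning. Unlike previous work, RAPO treats BLI as a ranking task and optimizes the model with a ranking-based learning objective. In addition, by exploiting the unique characteristics of the BLI task, we design two new modules: the Householder projection, a mapping function that remains strictly orthogonal under gradient-descent optimization, and a personalized adapter that provides a personalized offset for each word. We evaluate the model on 20 translation tasks from the MUSE dataset and conduct extensive experimental analysis, demonstrating the superiority of RAPO.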
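Why a Householder-based mapping stays orthogonal under gradient updates can be checked numerically. The sketch below is an illustration under stated assumptions, not RAPO's code: it builds a mapping as a product of Householder reflections H(v) = I - 2vvᵀ/(vᵀv), and the helper names and the number of reflections (here equal to the dimension `d`) are hypothetical choices.

```python
import numpy as np

def householder_matrix(v):
    """Reflection H = I - 2 v v^T / (v^T v); orthogonal for any nonzero v."""
    v = np.asarray(v, dtype=float)
    return np.eye(v.size) - 2.0 * np.outer(v, v) / np.dot(v, v)

def householder_projection(vectors):
    """Product of reflections, one per row of `vectors` (illustrative helper)."""
    d = len(vectors[0])
    W = np.eye(d)
    for v in vectors:
        W = W @ householder_matrix(v)
    return W

rng = np.random.default_rng(0)
d = 4
vs = rng.normal(size=(d, d))          # free parameters, one vector per reflection
W = householder_projection(vs)

# Simulate an arbitrary parameter update, as a gradient step would produce.
vs_updated = vs - 0.1 * rng.normal(size=vs.shape)
W_updated = householder_projection(vs_updated)

print(np.allclose(W.T @ W, np.eye(d)))
print(np.allclose(W_updated.T @ W_updated, np.eye(d)))
```

Because the trainable parameters are the reflection vectors rather than the matrix entries, any update to them still yields an orthogonal matrix, so no separate re-orthogonalization step is needed in this construction.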
References
[1] Tomás Mikolov, Quoc V. Le, and Ilya Sutskever. 2013. Exploiting similarities among languages for machine translation. CoRR, abs/1309.4168.
[2] Chao Xing, Dong Wang, Chao Liu, and Yiye Lin. 2015. Normalized word embedding and orthogonal transform for bilingual word translation. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1006–1011, Denver, Colorado. Association for Computational Linguistics.
[3] Alexis Conneau, Guillaume Lample, Marc’Aurelio Ranzato, Ludovic Denoyer, and Hervé Jégou. 2018. Word translation without parallel data. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings.
[4] Mikel Artetxe, Gorka Labaka, and Eneko Agirre. 2018. Generalizing and improving bilingual word embedding mappings with a multi-step framework of linear transformations. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), the 30th Innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pages 5012–5019.
[5] Xu Zhao, Zihao Wang, Hao Wu, and Yong Zhang. 2020. Semi-supervised bilingual lexicon induction with two-way interaction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2973–2984, Online. Association for Computational Linguistics.
[6] Goran Glavaš and Ivan Vulić. 2020. Non-linear instance-based cross-lingual mapping for non-isomorphic embedding spaces. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7548–7555, Online. Association for Computational Linguistics.
[7] Shuo Ren, Shujie Liu, Ming Zhou, and Shuai Ma. 2020. A graph-based coarse-to-fine method for unsupervised bilingual lexicon induction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 3476–3485, Online. Association for Computational Linguistics.