Projection-Based Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Evaluation, and Some Misconceptions
This event took place on Thursday 10 October 2019 at 15:15
Cross-lingual word embeddings (CLEs) hold the promise of multilingual modeling of meaning and cross-lingual transfer of NLP models. Early models for inducing cross-lingual word vector spaces, which required sentence- or document-level bilingual signal (i.e., parallel or comparable corpora), have recently been replaced by resource-leaner projection-based CLE models, which require cheap word-level bilingual supervision or even no supervision at all. Despite the ubiquitous use of CLEs in downstream tasks, they are almost exclusively evaluated intrinsically, on the task of bilingual lexicon induction (BLI) only. Even BLI evaluations vary greatly, preventing us from correctly interpreting the performance and behavior of different CLE models. In this talk, I will present initial steps towards a comprehensive evaluation of cross-lingual word embeddings. I will present the results of a systematic comparative evaluation of both supervised and unsupervised projection-based CLE models on a large number of language pairs, in both BLI and three diverse downstream tasks, and provide new insights into the ability of cutting-edge CLE models to support cross-lingual NLP. Our study shows that the performance of CLE models largely depends on the downstream task and that overfitting CLE models to BLI can severely hurt downstream performance. Finally, I will indicate the most robust supervised and unsupervised CLE models and emphasize the need to reassess simple baselines, which display competitive performance in many settings.
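A minimal sketch of what a projection-based CLE baseline with word-level supervision can look like: given source- and target-language embeddings for the word pairs in a seed dictionary, one common approach solves the orthogonal Procrustes problem in closed form via SVD. This is an illustrative toy example (the data, dimensions, and variable names are assumptions, not the specific models evaluated in the talk).

```python
import numpy as np

# Toy setup: X holds source-language embeddings and Y target-language
# embeddings, with rows aligned by a word-level seed dictionary.
rng = np.random.default_rng(0)
d = 50                 # embedding dimensionality (illustrative)
n = 200                # number of seed translation pairs (illustrative)
X = rng.standard_normal((n, d))
true_W = np.linalg.qr(rng.standard_normal((d, d)))[0]  # hidden rotation
Y = X @ true_W

# Orthogonal Procrustes: the W minimizing ||XW - Y||_F over orthogonal
# matrices is W = U V^T, where U S V^T is the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# W projects source vectors into the target embedding space.
print(np.allclose(X @ W, Y))  # True: recovers the alignment on this toy data
```

On real embeddings the mapping is only approximate, and translations are then retrieved by nearest-neighbor search in the shared space; this closed-form baseline is one of the simple methods whose competitiveness the talk highlights.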
Watch the webcast replay >>