Unsupervised machine translation: A novel approach to provide fast, accurate translations for more languages

One problem with existing supervised training methods for machine translation is the limited availability of parallel text "pairs" (the same sentence written in both languages) for less common language pairs, such as Welsh to Urdu. Facebook is testing a new mechanism built on "word embeddings": vector-space representations of the words in a given language, in which related words lie close together (e.g. kitty is closer to cat than to rocket). Using this model, the relative positions of words in each language's embedding space can be used to translate between languages for which little parallel text exists. Although promising, performance is still lower than that of supervised methods.
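
To make the embedding idea concrete, here is a minimal Python sketch. It is not Facebook's system. The 2-D vectors, the Spanish vocabulary, and the helper names (`cosine`, `rotate`, `translate`) are invented for illustration, and the rotation that aligns the two spaces is assumed to be known already. In a real unsupervised system, learning that alignment without parallel text is the hard part.

```python
import math

def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def rotate(v, angle):
    """Rotate a 2-D vector counter-clockwise by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

# Hypothetical toy embeddings: related words point in similar directions.
english = {"cat": (0.9, 0.1), "kitty": (0.8, 0.2), "rocket": (0.1, 0.9)}

# Assumption: the second language's space has the same geometry as the
# first, only rotated. We build it that way here to keep the toy exact.
ANGLE = math.radians(40)
spanish = {
    "gato": rotate(english["cat"], ANGLE),
    "gatito": rotate(english["kitty"], ANGLE),
    "cohete": rotate(english["rocket"], ANGLE),
}

def translate(word, angle=ANGLE):
    """Map an English vector into the Spanish space, then return the
    nearest Spanish word by cosine similarity."""
    mapped = rotate(english[word], angle)
    return max(spanish, key=lambda w: cosine(mapped, spanish[w]))

# Within one language, kitty sits closer to cat than to rocket.
print(cosine(english["kitty"], english["cat"]) >
      cosine(english["kitty"], english["rocket"]))
# Across languages, translation is a nearest-neighbour lookup.
print({w: translate(w) for w in english})
```

Because the toy target space is an exact rotation of the source space, every lookup lands on the matching word. Real embedding spaces are only approximately aligned, which is one reason such methods can trail supervised ones.
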
Automatic language translation is important to Facebook as a way to allow the billions of people who use our services to connect and communicate in their preferred language. To do this well, curren…