Notes from the field

Unsupervised machine translation: A novel approach to provide fast, accurate translations for more languages

One problem with existing (supervised) training methods for machine translation is that they need large numbers of text "pairs" – the same sentence written in both languages – which are scarce for less common pairings such as Welsh to Urdu. Facebook is testing a new unsupervised approach built on "word embeddings": vector-space representations of words in a given language (e.g. "kitty" sits closer to "cat" than to "rocket"). By aligning the embedding spaces of two languages, the relative positions of words can be used to build a translation even where little parallel material exists. Although promising, performance is still lower than with supervised methods.
https://rob.al/2Cdsi6C
Automatic language translation is important to Facebook as a way to allow the billions of people who use our services to connect and communicate in their preferred language. To do this well, curren…
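The core idea can be sketched in a few lines: learn a linear map that aligns two monolingual embedding spaces, then translate a word by nearest-neighbour lookup in the target space. The snippet below is a toy illustration of that idea, not Facebook's actual system – the embeddings, the seed word pairs, and the Welsh words are placeholder assumptions for demonstration only.

```python
# Toy sketch of cross-lingual embedding alignment (not Facebook's system):
# learn an orthogonal map W between two embedding spaces from a small seed
# lexicon, then translate by nearest-neighbour lookup in the target space.
import numpy as np

def learn_mapping(src_vecs: np.ndarray, tgt_vecs: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: find W minimising ||W @ src - tgt||."""
    u, _, vt = np.linalg.svd(tgt_vecs.T @ src_vecs)
    return u @ vt

def translate(word, src_emb, tgt_emb, W):
    """Map a source word into the target space, return the closest target word."""
    query = W @ src_emb[word]
    scores = {
        w: float(v @ query / (np.linalg.norm(v) * np.linalg.norm(query) + 1e-9))
        for w, v in tgt_emb.items()
    }
    return max(scores, key=scores.get)

# Placeholder 3-dimensional embeddings, for illustration only.
src_emb = {"kitty":  np.array([0.9, 0.1, 0.0]),
           "cat":    np.array([1.0, 0.0, 0.0]),
           "rocket": np.array([0.0, 0.0, 1.0])}
tgt_emb = {"cath":  np.array([0.0, 1.0, 0.1]),   # Welsh for "cat"
           "roced": np.array([0.1, 0.0, 1.0])}   # Welsh for "rocket"

# Seed lexicon of known pairs; in a fully unsupervised setup this would be
# induced automatically rather than supplied by hand.
seed = [("cat", "cath"), ("rocket", "roced")]
X = np.stack([src_emb[s] for s, _ in seed])
Y = np.stack([tgt_emb[t] for _, t in seed])
W = learn_mapping(X, Y)

print(translate("kitty", src_emb, tgt_emb, W))  # expected: "cath"
```

The point of the sketch is that "kitty" never appears in the seed lexicon, yet its position relative to "cat" in the source space carries over once the spaces are aligned.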

2018-09-03

LinkedIn cross-post



The standard disclaimer…

The views, thoughts, and opinions expressed in the text belong solely to me, and not necessarily to my employer, organization, committee, or any other group that I belong to or am associated with.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
© 2023 Rob Aleck, licensed under CC BY-NC 4.0