Facebook has shared its implementation of cross-lingual language model (XLM) pretraining with the machine translation community, so if you’re interested in language processing and know your way around PyTorch, go and have a look.
Natural Language Processing is one of the main research fields in machine learning, and given the amount of text Facebook harbours, it shouldn’t come as a surprise that automatically understanding language is on the company’s research agenda as well. After all, a certain degree of automation is needed to, for example, find problematic content.
To advance supervised as well as unsupervised machine translation, Facebook’s AI research team has developed an extension of masked language modeling (MLM), the pretraining objective put forward by Google senior research scientist Jacob Devlin et al. in October 2018. While MLM trains a system by hiding words in a sentence and asking it to predict them from the surrounding words, Facebook’s approach gives the system two versions of the same sentence in different languages, so that a masked word can be predicted from context in either language.
This method is named translation language modeling (TLM) and should improve the initialisation of neural machine translation systems as well as help NLP systems generalise across languages. Sharing the implementation isn’t really about finding more people to work on it; it is meant to move research in the area forward, since it gives others a way to test the approach on different data sets and thereby validate Facebook’s findings. The sketch below illustrates the basic idea.
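Conceptually, TLM can be pictured as BERT-style masking applied to a concatenated pair of parallel sentences. The following is a minimal, hypothetical PyTorch sketch of that idea; the token IDs, mask probability and helper function are invented for illustration and are not taken from Facebook’s code.

```python
# Hypothetical sketch of translation language modeling (TLM) masking.
# The real XLM code builds batches from BPE-encoded parallel corpora;
# the ids and constants below are made up for illustration only.
import torch

MASK_ID = 0        # assumed id of the [MASK] token
MASK_PROB = 0.15   # fraction of tokens hidden from the model, as in BERT-style MLM

def tlm_batch(src_ids: torch.Tensor, tgt_ids: torch.Tensor):
    """Concatenate a parallel sentence pair and mask random tokens in both halves.

    The model has to predict the masked tokens and may draw on context from
    either language, which is what distinguishes TLM from monolingual MLM.
    """
    tokens = torch.cat([src_ids, tgt_ids])          # one sequence, two languages
    mask = torch.rand(tokens.shape) < MASK_PROB     # choose positions to hide
    inputs = tokens.masked_fill(mask, MASK_ID)      # replace them with [MASK]
    targets = tokens.masked_fill(~mask, -100)       # only masked positions are scored
    return inputs, targets

# Example: ids of an English sentence followed by ids of its French translation
en = torch.tensor([12, 48, 7, 331, 5])
fr = torch.tensor([19, 53, 8, 402, 6, 77])
inputs, targets = tlm_batch(en, fr)
```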
The open source repository contains code for monolingual and cross-lingual language model pretraining, so TLM can be compared against earlier approaches and different training scenarios, as well as code for fine-tuning on XNLI and GLUE. Pretrained models are available too. To shorten the time it takes to produce a working model, XLM supports training across multiple GPUs and nodes.
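How the XLM scripts orchestrate multi-GPU and multi-node runs is specific to the repository, but the underlying PyTorch mechanism looks roughly like the following sketch; the model definition here is a stand-in, not XLM’s actual architecture.

```python
# Generic illustration of multi-GPU data parallelism in PyTorch: the model is
# replicated on every visible GPU and each batch is split between the copies.
import torch
import torch.nn as nn

# Stand-in Transformer encoder; XLM's real model and hyperparameters differ.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8), num_layers=6
)

if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

model = model.to("cuda" if torch.cuda.is_available() else "cpu")
```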
The XLM implementation is licensed under the Attribution-NonCommercial 4.0 International license as formulated by the Creative Commons Corporation and can be found on GitHub. To run it, Python 3, NumPy, PyTorch, fastBPE and Moses have to be installed. A research paper fleshing out the theoretical details can be found on the open access platform arXiv.