Working at Microsoft Research Lab

I have always wanted to work in machine learning, and during my time at Valeo I took part in Kaggle competitions to learn. Moving to the Microsoft lab gave me more industrial and research experience in the field.

Interview Process

The Microsoft Research lab interview process was long: around 6 interviews covering system design, machine learning, statistics and algorithms. For the hiring team specifically, I had a Skype interview with Hany Hassan from Redmond. The process was hard, but I was accepted and was lucky to work with brilliant people from the Cairo site and Redmond.

Team

I joined the Machine Translation sub-team, working on the conversational models used in Skype Translator. When I joined, I started the NMT project along with my manager Ahmed Tawfik and my mentor Hany Hassan. Both gave me good guidance in NMT, which was a new field to me given my previous experience.

Work

Dialectal NMT Systems

Getting a good BLEU score on dialectal Levantine-to-English systems was hard given the lack of data, so we came up with the idea of generating artificial data from Standard Arabic data to augment our corpus. It took around 2 months of experimentation to arrive at a good technique, which later led to a publication (Hassan Awadalla et al., 2017).

We then worked on tuning the Levantine model, and I gained good experience in training and tuning phrase-based and neural machine translation systems and in using both to get a good BLEU score.

To go deeper into NMT systems, I participated in an online English-to-Chinese machine translation contest. What I learned from this competition helped me build better NMT models.

Spoken Language Translation

Building on my work on Skype Translator, I got into speech recognition models, which helped me participate in the Kaggle TensorFlow competition. During that period I had the idea of training a gender-aware machine translation system (Elaraby et al., 2018) that is able to leverage gender information from input speech utterances. We worked on it as a research project during Mahmoud Khaled's internship, and the work resulted in a publication.

Chat Bots

After going deeper into seq2seq models, I added some features to a forked version of Google's seq2seq. I tried using seq2seq to train an end-to-end chat bot system; the attempt was successful, but the bot had problems with dynamic responses (those that need input from another service). During that work, a project came up with the Bot Framework team to contribute to their GitHub repo and build a multilingual bot. The bot was mainly meant to make it easier for developers to support multiple languages by training only their main language and using our services for customized machine translation.
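To make the multilingual bot idea concrete, here is a minimal sketch of the pattern as I would describe it: translate the user's message into the bot's main language, let the bot answer, then translate the reply back. Everything in it is an assumption for illustration. The names `toy_translate`, `english_bot` and `make_multilingual` are hypothetical, the lookup table stands in for a real translation service, and none of this is the Bot Framework API or the code we contributed.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for a machine translation service: a tiny lookup table
# keyed by (source language, target language, lowercased text).
TOY_TRANSLATIONS: Dict[Tuple[str, str, str], str] = {
    ("fr", "en", "bonjour"): "hello",
    ("en", "fr", "hi, how can i help?"): "salut, comment puis-je aider ?",
}


def toy_translate(text: str, source: str, target: str) -> str:
    """Translate text with the toy table; return it unchanged if unknown."""
    if source == target:
        return text
    key = (source, target, text.strip().lower())
    return TOY_TRANSLATIONS.get(key, text)


def english_bot(message: str) -> str:
    """Hypothetical bot that was only trained on its main language (English)."""
    if message.strip().lower() == "hello":
        return "Hi, how can I help?"
    return "Sorry, I did not understand."


def make_multilingual(
    bot: Callable[[str], str],
    translate: Callable[[str, str, str], str],
    bot_language: str = "en",
) -> Callable[[str, str], str]:
    """Wrap a single-language bot with translation on the way in and out."""

    def wrapped(message: str, user_language: str) -> str:
        # 1. Bring the user's message into the bot's main language.
        to_bot = translate(message, user_language, bot_language)
        # 2. Let the single-language bot produce its reply.
        reply = bot(to_bot)
        # 3. Translate the reply back into the user's language.
        return translate(reply, bot_language, user_language)

    return wrapped


multilingual_bot = make_multilingual(english_bot, toy_translate)
print(multilingual_bot("Bonjour", "fr"))
```

The point of the wrapper is that the bot itself never changes: supporting another language only means the translation step has to cover it.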

Journey's End

My journey at Microsoft Egypt came to an end when I moved on to another challenge.

References

  1. Hassan Awadalla, H., Elaraby, M., & Tawfik, A. Y. (2017, December). Synthetic Data for Neural Machine Translation of Spoken-Dialects. IWSLT 2017. https://www.microsoft.com/en-us/research/publication/synthetic-data-neural-machine-translation-spoken-dialects/
  2. Elaraby, M., Tawfik, A. Y., Khaled, M., Hassan, H., & Osama, A. (2018). Gender aware spoken language translation applied to English-Arabic. 2018 2nd International Conference on Natural Language and Speech Processing (ICNLSP), 1–6. https://doi.org/10.1109/ICNLSP.2018.8374387