QCRI Works with EML to Improve Arabic Speech Recognition Systems

Qatar Computing Research Institute (QCRI) and EML European Media Laboratory, Heidelberg (Germany), have combined efforts to advance very large vocabulary speech recognition technology and systems for Arabic. The result of the cooperation will be an automatic Arabic language transcription system.

The applications for automatic language transcription are wide-ranging. Video and audio content, from broadcast news programmes to lectures and meetings, will become searchable, enabling knowledge sharing. Transcription also benefits people who are hearing impaired, and may offer an alternative way to communicate messages in noisy environments such as airports, or quiet areas such as libraries.

Dr Ahmed Elmagarmid, Executive Director of QCRI, said:

‘Our collaboration with EML is an ideal union of expertise in language technology research… The result of our cooperation opens up numerous possibilities for the future of communication enabling and information sharing. We look forward to further strengthening our relationship with EML.’

Prof Andreas Reuter, CEO of EML, said:

‘We are excited about the collaboration with QCRI for two reasons: First, it gets us into contact with a world-class research organisation, especially in the area of language-oriented technologies, and second, we can add another extremely important language to our portfolio.’

EML’s research and development expertise lies in speech processing technologies. It builds automatic speech transcription systems for speech messaging, speech analytics, and media transcription solutions. EML has developed the EML Transcription Platform, a scalable, language-independent speech recognition platform.

The goal of QCRI’s Arabic language technologies team is to develop technology that helps break down language barriers, improving access to information and the ability to communicate. The team is undertaking major research efforts to improve machine translation and speech recognition for the Arabic language and its dialects by building language models for different domains, or uses, such as meetings, broadcast news or lectures.

The collaboration between QCRI and EML will accelerate the development of high-performance Arabic language components on the EML Transcription Platform, which will ultimately provide a rich source of Arabic language data to further enhance QCRI’s speech recognition models.