Given the advances in natural-language processing, it is hoped that one day you will be able to ask your virtual assistant what the best salad ingredients are. In the meantime, it is already possible to ask your home device to play a song or to open by voice command, a feature available on most modern devices.
If you speak Moroccan, Algerian, Egyptian, Sudanese, or any of the other dialects of Arabic, which vary immensely from region to region and are in some cases mutually unintelligible, it is a different story. If your native tongue is Arabic, Finnish, Mongolian, Navajo, or any other language with high levels of morphological complexity, you may feel left out.
This complex problem aroused Ahmed Ali's interest in finding a solution. He is a senior engineer on the Arabic Language Technologies team at the Qatar Computing Research Institute (QCRI), part of Hamad Bin Khalifa University under the Qatar Foundation, and founder of ArabicSpeech, "a community that exists for the benefit of Arabic speech science and speech technologies."
Ali became captivated by the idea of talking to cars, appliances, and gadgets years ago while at IBM. "Can we build a machine that understands different dialects, so that an Egyptian pediatrician can automate a prescription, a Syrian teacher can help children get the most out of their lessons, or a Moroccan chef can describe the best way to make couscous?" he says. However, the algorithms that power such machines cannot yet sift through the roughly 30 varieties of Arabic, let alone make sense of them. Today, most speech-recognition tools operate only in English and a handful of other languages.
The coronavirus pandemic has further increased reliance on voice technologies, as natural-language technologies have helped people follow stay-at-home guidance and keep physical distance. But while we have been using voice commands to make e-commerce purchases and manage our households, many more applications lie ahead.
Millions of people worldwide use massive open online courses (MOOCs) for their open access and unlimited participation. Speech recognition is one of the main features of MOOCs: it lets students search within the spoken portions of a course and enables translation through subtitles. Speech technology also allows digital learning platforms to display lecturers' spoken words as text in university classrooms.
According to a recent article in Speech Technology magazine, the voice- and speech-recognition market is forecast to reach $26.8 billion by 2025, as millions of consumers and companies around the world come to rely on voice bots not only to interact with their appliances or cars, but also to improve customer service, drive health-care innovation, and improve accessibility and inclusion for those with hearing, speech, or motor impairments.
In a 2019 survey, Capgemini forecast that by 2022 more than two out of three consumers would opt for voice assistants rather than visits to stores or bank branches, a share that could rise further given the home-based, physically distanced life and commerce that the pandemic has imposed on the world for more than a year and a half.
Yet these devices fail to reach vast swaths of the globe. For the roughly 30 varieties of Arabic and their millions of speakers, that is a largely missed opportunity.
Arabic for machines
English- or French-speaking voice bots are far from perfect. Yet teaching machines to understand Arabic is particularly tricky, for several reasons. Here are three commonly recognized challenges:
- Lack of diacritics. Arabic dialects are vernaculars, meaning they are primarily spoken. Most of the available text is nondiacritized: it lacks the marks, analogous to accents such as the acute (´) or grave (`), that indicate the sound values of letters. It is therefore difficult to determine where the vowels go.
- Lack of resources. There is a dearth of labeled data for the various Arabic dialects. Collectively, they also lack standardized spelling rules covering orthography, vocabulary, syntax, and stress. Such resources are crucial for training computer models, and their scarcity has hampered the development of Arabic speech recognition.
- Morphological complexity. Arabic speakers engage in a great deal of code switching. For example, in the areas once colonized by France, such as Morocco, Algeria, and Tunisia in North Africa, the dialects include many borrowed French words. Consequently, there is a large number of so-called out-of-vocabulary words, which speech-recognition technologies cannot fathom because they are not Arabic.
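The diacritics problem above can be made concrete with a short sketch. Arabic short vowels are written as optional combining marks (the harakat, Unicode range U+064B to U+0652), and most published text omits them, leaving only a consonant skeleton that can correspond to several different pronunciations. This toy function, written for illustration only, strips those marks to show what a recognizer's training text typically looks like:

```python
# Arabic short-vowel marks (harakat), U+064B (fathatan) through U+0652 (sukun).
ARABIC_DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)}

def strip_diacritics(text: str) -> str:
    """Remove Arabic short-vowel marks, leaving only the consonant skeleton."""
    return "".join(ch for ch in text if ch not in ARABIC_DIACRITICS)

# Fully vocalized "kataba" (he wrote): the letters kaf, ta, ba, each with fatha.
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"
skeleton = strip_diacritics(vocalized)
print(skeleton)  # the bare skeleton, which could also be read kutiba, kutub, ...
```

The stripped output is exactly the ambiguity described above: without the vowel marks, a model must infer the pronunciation from context.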
"But the field is moving at lightning speed," says Ali, and it is a collaborative effort among many researchers to make it move even faster. Ali's Arabic Language Technologies team is leading the ArabicSpeech project to bring together Arabic transcriptions and the dialects native to each region. For example, Arabic dialects can be divided into four regional groups: North African, Egyptian, Gulf, and Levantine. However, given that dialects do not respect borders, the division can go as fine-grained as one dialect per city; a native Egyptian speaker, for instance, can distinguish the Alexandrian dialect from that of a fellow citizen from Aswan, about 1,000 km away on the map.
Building a professional future for all
At this point, machines are almost as accurate as human transcribers, thanks in large part to advances in deep neural networks, a subfield of machine learning that relies on algorithms inspired by how the human brain works, biologically and functionally. Until recently, however, speech recognition was somewhat pieced together: the technology has a history of relying on separate modules for acoustic modeling, building pronunciation lexicons, and language modeling, all of which had to be trained independently. More recently, researchers have been training end-to-end models that convert acoustic features directly into text, which allows every component to be optimized jointly for the final task.
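One common building block of such end-to-end systems is CTC-style decoding: the network emits one symbol per audio frame, and decoding collapses consecutive repeats and drops a special "blank" symbol. This is a minimal illustrative sketch; the frame labels are invented, not the output of any real model or of QCRI's system:

```python
BLANK = "_"  # the CTC blank symbol, emitted for frames with no new character

def ctc_greedy_decode(frame_labels):
    """Collapse consecutive duplicate labels, then remove blanks."""
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            out.append(label)
        prev = label
    return "".join(out)

# Twelve audio frames predicted as: h h _ e e _ l _ l _ o o
print(ctc_greedy_decode(list("hh_ee_l_l_oo")))  # -> "hello"
```

Note how the blank between the two l frames is what lets the decoder keep a genuine double letter rather than collapsing it.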
Despite this progress, Ali still cannot give a voice command to most devices in his native Arabic. "It's 2021, and I still cannot speak to many machines in my dialect," he remarks. "I mean, now I have a device that can understand my English, but machine recognition of multi-dialect Arabic speech hasn't happened yet."
Making that happen is the focus of Ali's work, which has culminated in a transcription system for the Arabic language and its dialects, one that has performed well so far. Dubbed the QCRI Advanced Transcription System, the technology is used by the broadcasters Al-Jazeera, DW, and the BBC to transcribe online content.
There are a few reasons Ali and his team have succeeded in building these speech engines now. Primarily, he says, "there is a need to have resources across all of the dialects. We need to build up the resources to then be able to train the model." As Ali puts it, "We have a good infrastructure, good modules, and we have data that represents reality."
Researchers from QCRI and Canary AI recently built models that can achieve human parity in Arabic broadcast news. The system demonstrates the impact of subtitling Al-Jazeera's daily reports. While the English human error rate (HER) is about 5.6%, the research revealed that the Arabic HER is significantly higher and can reach 10%, owing to the morphological complexity of the language and the lack of standard orthographic rules in dialectal Arabic. Thanks to recent advances in deep learning and end-to-end architectures, the Arabic speech-recognition engine manages to outperform native speakers on broadcast news.
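Error rates like the figures above are typically computed as a word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. A minimal sketch of that metric, word error rate (the sample transcripts are invented for illustration, not taken from the study):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(substitution, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the news at nine", "the news at ten"))  # 1 substitution / 4 words = 0.25
```

On this scale, the article's numbers mean human transcribers get roughly 1 word in 18 wrong for English broadcast news, versus up to 1 in 10 for Arabic.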
While speech recognition for Modern Standard Arabic seems to work well, researchers from QCRI and Canary AI are busy testing the limits of dialectal processing and achieving strong results. Since nobody speaks Modern Standard Arabic at home, attention to dialects is what we need for our voice assistants to understand us.
This content was written by the Qatar Computing Research Institute, Hamad Bin Khalifa University, a member of Qatar Foundation. It was not written by MIT Technology Review's editorial staff.