Advocating for African Languages preservation

Bonaventure Dossou and Chris Emezue are organizing this fundraiser.

We are Bonaventure Dossou and Chris Emezue. We both hold a Bachelor of Science in Mathematics, earned with distinction in Russia, and are currently pursuing our Master's degrees in Germany:

Bonaventure Dossou is studying Data Engineering at Jacobs University and is a research intern at Mila Québec AI Institute. His interests lie in Deep Learning, especially in Computer Vision and Natural Language Processing, with a particular focus on African languages and healthcare. He is also interested in Computational Biology and Network Science approaches in biology and medicine.

Chris Emezue is studying Mathematics in Data Science at the Technical University of Munich. His interests lie in Machine and Deep Learning, Reinforcement Learning, NLP with a focus on African languages, and NLP for healthcare.

Together, we launched the FFR project and have been leading it for more than a year now. Its goal is to create efficient translation systems among African languages (more details are in the picture below, from Sep. 2020; a better version is available here).

FFR was initially intended to be publicly available by the end of 2020, but we faced delays, mostly for financial reasons. As students, we have been funding every step of the project on our own, from the crowdsourced data collection and preprocessing to training the models. For example, training the models requires substantial computational power, which costs money and became a burden for us. We used the free but very limited credits from Google Cloud and Paperspace, which quickly ran out, after which we had to pay for further computational resources. As a result, we have outstanding debts with Paperspace and Google Cloud.

While some languages, like English, dominate NLP and AI research and real-world applications, many African languages receive little to no effort or attention from big companies like Google, Facebook, etc. Thankfully, great workshops like AfricaNLP and WMT, which foster the inclusion of low-resource languages, were the main factors that led to our work being published.

It is important to mention that our interest in preserving African languages led us to other great initiatives, such as Masakhane, which works to put African languages on the NLP map, and Deep Learning Indaba, to mention just a few.

Our work on FFR has taken us to many international conferences (International Conference on Learning Representations, Association for Computational Linguistics, Empirical Methods in Natural Language Processing, AI Expo Africa, etc.) and has led to many interviews and TV and radio reports around the world, including on the BBC and many more. You can find more about it here. Additionally, all these efforts have resulted in scientific publications, which can be found here: Bonaventure Dossou, Chris Emezue.

Today, we have finished the first training stage of FFR and are working toward making it available to the public. The performance of the system has been checked by native Fon speakers, including Bonaventure Dossou. The video below showcases the current state and design of the platform we intend to release. The system translates from Fon to French and vice versa. Since we intend to keep improving it, we give users the option to suggest translations: this will allow us to gather more data and improve translation quality.

This is an ongoing call for any help or contributions to:

Pay the outstanding debts for computational resources and give us the opportunity to embark on the next stage, which involves adding more African languages. We are currently working on a Fon-Igbo translation project and therefore need money for dataset creation and model training.
Set up the web and mobile equipment, configuration, and working tools needed to publish the FFR platform (web servers, hosting services, cloud storage services, etc.).
Carry on our research to ensure that AI benefits everybody equally and in keeping with ethical principles. One of the other main projects we are working on (whose papers are also under peer review) is Automatic Speech Recognition models for Fon and Igbo. We have the first prototypes, which already perform quite well, but we are looking forward to improving them. While embarking on this research, we discovered that very few speech datasets exist for African languages. The few that do exist are tiny and/or not open-sourced. Moreover, they are often poorly processed or do not accurately represent the current speech patterns of the languages (this was the case for the Igbo speech dataset we found). Motivated to change this, we initiated efforts in both Nigeria and Benin to gather authentic speech data from native speakers. We are using our small income to finance these efforts, and some funding will help push them further.

We hope that we have brought you along on this adventure and convinced you that the cause is a good one, given what we have accomplished so far with very little.

Any contribution will be useful, and the results will be there to show you that your confidence was not misplaced.

Thank you for your time and attention,

Bonaventure Dossou & Chris Emezue.

Co-organizers (2)

Bonaventure Dossou

Organizer

Bremen, Bremen

Chris Emezue

Co-organizer

February 20th, 2021
Business
