- Bonaventure Dossou is studying Data Engineering at Jacobs University and is a research intern at Mila Québec AI Institute. His interests lie in Deep Learning, especially in Computer Vision and Natural Language Processing, with a particular focus on African Languages and Healthcare. He is also interested in Computational Biology and Network Science approaches in Biology and Medicine.
- Chris Emezue is studying Mathematics in Data Science at the Technical University of Munich. His interests lie in Machine/Deep Learning, Reinforcement Learning, NLP with a focus on African Languages, and NLP for healthcare.
While some languages like English dominate NLP and AI research and real-world applications, many African languages receive little to no effort or attention from big companies like Google, Facebook, etc. Thankfully, great workshops like AfricaNLP and WMT, which foster the inclusion of low-resource languages, were the main factors that led to our work being published.
It is important to mention that our interest in preserving African languages led us to other great initiatives like Masakhane, which works towards putting African languages on the NLP map, as well as the Deep Learning Indaba, to mention just a few.
- Pay the outstanding debts for computational resources, and give us the opportunity to embark on the next stage, which involves adding more African languages. We are currently working on a Fon-Igbo translation project and therefore need money to handle the dataset creation and model training.
- Set up the web and mobile equipment, settings, and working tools needed to publish the FFR platform (web servers, hosting services, cloud storage services, etc.).
- Carry on our research to ensure AI benefits everybody equally, in line with ethical principles. One of the other main projects we are working on (whose papers are also under peer review) is Automatic Speech Recognition models for Fon and Igbo. We have the first prototypes, which already perform quite well, but we are looking forward to improving them. While embarking on this research, we discovered that very few speech datasets exist for African languages. The few that exist are minuscule and/or not open-sourced. Moreover, they are often poorly processed or do not accurately represent the current speech patterns of the languages (this was the case for the Igbo speech dataset we found). Motivated by the need to change this, we initiated efforts in both Nigeria and Benin to gather authentic speech data from native speakers. We are using our little income to finance these efforts, and funding would help push them further.