Kencorpus: A Kenyan Language Corpus of Swahili, Dholuo and Luhya for Natural Language Processing Tasks
Indigenous African languages are categorized as under-served in Artificial Intelligence and suffer from poor digital inclusivity and information access. The challenge has been how to use machine learning and deep learning models ...
KenSwQuAD – A Question Answering Dataset for Swahili Low Resource Language
This research developed a Kencorpus Swahili Question Answering Dataset, KenSwQuAD, from raw data in Swahili, a low-resource language predominantly spoken in Eastern Africa that also has speakers in other ...
Phonemic Representation and Transcription for Speech to Text Applications for Under-resourced Indigenous African Languages: The Case of Kiswahili
(Cornell University, 2022)
Building automatic speech recognition (ASR) systems is a challenging task, especially for under-resourced languages, where corpora must be constructed nearly from scratch and training data is insufficient. It has emerged that ...