dc.contributor.author: Amol, Cynthia Jayne
dc.contributor.author: Awuor, Lilian Diana Wanzare
dc.date.accessioned: 2024-03-18T16:24:02Z
dc.date.available: 2024-03-18T16:24:02Z
dc.date.issued: 2023-08-30
dc.identifier.uri: https://repository.maseno.ac.ke/handle/123456789/6046
dc.description.abstract: In the age of freedom of speech, users of the social media platform Twitter post millions of messages per day. These messages are not always fact-checked, which results in misinformation, that is, false or misleading news. Misinformation classification involves labelling a text as either false or factual by comparing it against fact-checked news. On political matters, online misinformation can result in mistrust of political figures, polarization of communities and offline violence. Existing studies mostly address misinformation detection for messages written in a single language such as English. Among most bilingual or multilingual user groups in countries like Kenya, Swahili-English code-switching and code-mixing are common practice in informal text-based communication such as messaging on social media platforms like Twitter. There is therefore a need for more research in low-resource languages such as Swahili. The PolitiKweli dataset introduced by this study, a novel Swahili-English misinformation classification dataset, contains 6,345 Swahili-English texts, 22,957 English texts and 211 Swahili texts. The texts are labelled as fake, fact or neutral by comparison with a fact-checked dataset also created for this study. The dataset curation process, including data collection, processing and annotation, is explained. Challenges during annotation are also discussed. The results of experiments conducted using a pretrained language model demonstrate the dataset’s usefulness in training Swahili-English code-switched misinformation classification models. [en_US]
dc.publisher: Deep Learning Indaba 2023 [en_US]
dc.title: PolitiKweli: A Swahili-English Code-switched Twitter Political Misinformation Classification Dataset [en_US]
dc.type: Article [en_US]