Development of a deep learning model for classification of Swahili news text sentences.

Yiiki, Afedra Brian

dc.contributor.author	Yiiki, Afedra Brian
dc.date.accessioned	2023-08-14T09:07:41Z
dc.date.available	2023-08-14T09:07:41Z
dc.date.issued	2023-07-07
dc.identifier.citation	Yiiki, Afedra Brian. (2023). Development of a deep learning model for classification of Swahili news text sentences. (Unpublished undergraduate dissertation) Makerere University; Kampala, Uganda.	en_US
dc.identifier.uri	http://hdl.handle.net/20.500.12281/16200
dc.description	A research report submitted to the College of Engineering Design and Art in partial fulfillment of the requirements for the award of the degree of Bachelor of Science in Electrical Engineering of Makerere University.	en_US
dc.description.abstract	In Uganda, native languages, despite being understood by the majority of the population, could entirely disappear from online news spaces as English becomes dominant. The dominance of English is attributed to the availability of intelligent models for processing news in English, in contrast to native languages, for which manual practices of publishing news online result in significant time delays. Several efforts have aimed to build such intelligent models for native languages, with key interest in Swahili, which is spoken by over 100 million people and is Africa’s most spoken native language. Native languages remain low-resource, with little data available for training language processing models. There is a need for deliberate effort to collect more news data in native languages and, subsequently, to build better-performing models from it. Our case study was the Swahili language, for which we collected additional Swahili news text data to supplement that available in the literature. This data was labelled with six (6) news categories. We built five (5) multi-label classification models for Swahili news text. We evaluated these models, and our best model achieved better performance than those in the literature. We were also able to explain the performance of this model in the context of the bias within the data that led to the small amount of confusion in the classification of the test data. The best model was deployed in a real-time web application. Swahili, being widely spoken, is a fair representation of native languages, and this project therefore contributes to the body of work aimed at increasing the usage of native languages in online news media. We recommend that more data, focused on more news categories and local news scenarios, be collected as a step towards enabling transfer learning with existing language models.	en_US
dc.language.iso	en	en_US
dc.publisher	Makerere University.	en_US
dc.subject	Deep learning model	en_US
dc.subject	Swahili news text	en_US
dc.title	Development of a deep learning model for classification of Swahili news text sentences.	en_US
dc.type	Thesis	en_US

Files in this item

Name: Yiiki-CEDAT-BELE.pdf
Size: 1.514Mb
Format: PDF
Description: Undergraduate Dissertation

This item appears in the following Collection(s)

School of Engineering (SEng.) Collections
