
dc.contributor.author: Yiiki, Afedra Brian
dc.date.accessioned: 2023-08-14T09:07:41Z
dc.date.available: 2023-08-14T09:07:41Z
dc.date.issued: 2023-07-07
dc.identifier.citation: Yiiki, Afedra Brian. (2023). Development of a deep learning model for classification of Swahili news text sentences. (Unpublished undergraduate dissertation) Makerere University; Kampala, Uganda. [en_US]
dc.identifier.uri: http://hdl.handle.net/20.500.12281/16200
dc.description: A research report submitted to the College of Engineering Design and Art in partial fulfillment of the requirement for the award of the degree Bachelor of Science Electrical Engineering of Makerere University. [en_US]
dc.description.abstract: In Uganda, native languages, despite being understood by the majority of the population, could entirely disappear from online news spaces as English becomes dominant. The dominance of English is attributed to the aid that intelligent models provide in processing news in English; for native languages, by contrast, manual practices of publishing news online result in significant time delays. Several efforts have aimed to build such intelligent models for native languages, with key interest in Swahili, which is spoken by over 100 million people and is Africa’s most spoken native language. Native languages still have little data available for training language processing models. There is a need for deliberate effort to collect more news data in native languages and, subsequently, to build better-performing models from it. Our case study was Swahili, for which we collected more news text data in addition to that available in the literature. This data was labelled with six (6) news categories. We built five (5) multi-label classification models for Swahili news text. We evaluated these models, and our best model performed better than those in the literature. We were also able to explain the performance of this model in the context of the bias within the data that led to the small confusion in classification of the test data. The best model was deployed in a real-time web application. Swahili, being widely spoken, is a fair representation of native languages; therefore, this project greatly contributes to the body of work aimed at increasing the usage of native languages in online news media. We recommend that more data, focused on more news categories and local news scenarios, be collected as a step to enable transfer learning of existing language models. [en_US]
dc.language.iso: en [en_US]
dc.publisher: Makerere University. [en_US]
dc.subject: Deep learning model [en_US]
dc.subject: Swahili news text [en_US]
dc.title: Development of a deep learning model for classification of Swahili news text sentences. [en_US]
dc.type: Thesis [en_US]

