Natural Language Processing: A Luganda Part of Speech Tagger

Muhanuzi, Stewart

Undergraduate dissertation (2.391Mb)

Date

2020-12-15

Abstract

This study describes an initial experiment in designing a Hidden Markov Model (HMM)-based part-of-speech tagger for the Luganda language. Part-of-speech tagging assigns the proper tag to each word in a text based on its context. Tagging was carried out in two main steps: morphological analysis and disambiguation. The study focuses on tagging accuracy, specifically the challenge of correctly tagging each token and handling previously unseen tokens. We constructed a first-order stochastic disambiguation algorithm, using supervised learning techniques, to learn HMM parameters from hand-crafted corpora. The Viterbi algorithm was employed to determine the most probable tag sequence for each sentence, and hence the tag for each word.
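The disambiguation step described in the abstract can be illustrated with a minimal sketch of a first-order HMM tagger decoded with the Viterbi algorithm. This is not the dissertation's implementation: the function names (`train_hmm`, `log_prob`, `viterbi`), the `<s>` start marker, the add-one smoothing used to cope with unseen tokens, and the tiny English toy corpus are all assumptions made for illustration, and the morphological analysis step is not covered.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start marker (an assumption)

def train_hmm(tagged_sentences):
    """Supervised training: count tag transitions and word emissions."""
    transitions = defaultdict(Counter)  # previous tag -> counts of next tag
    emissions = defaultdict(Counter)    # tag -> counts of emitted word
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            vocab.add(word)
            prev = tag
    return transitions, emissions, vocab, sorted(emissions)

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (assumed here so unseen events stay possible)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, model):
    """Return the most probable tag sequence for a list of words."""
    if not words:
        return []
    transitions, emissions, vocab, tags = model
    n_words = len(vocab) + 1  # one extra slot reserved for unknown tokens
    n_tags = len(tags)
    # best[i][tag] = (log probability of best path ending in tag at i, previous tag)
    best = [{}]
    for tag in tags:
        score = (log_prob(transitions[START], tag, n_tags)
                 + log_prob(emissions[tag], words[0], n_words))
        best[0][tag] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for tag in tags:
            emit = log_prob(emissions[tag], words[i], n_words)
            best[i][tag] = max(
                (best[i - 1][p][0] + log_prob(transitions[p], tag, n_tags) + emit, p)
                for p in tags
            )
    # Follow back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]

# Toy hand-tagged corpus, invented for illustration only (not the study's data).
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train_hmm(toy_corpus)
print(viterbi(["a", "bird", "sleeps"], model))  # "bird" is unseen in training
```

In this sketch the unseen word "bird" receives the same smoothed emission probability under every tag, so its tag is decided by the transition probabilities of its neighbours. Working in log space avoids numerical underflow on longer sentences.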

URI

http://hdl.handle.net/20.500.12281/18673

Collections

School of Engineering (SEng.) Collections