Real time translation from Ugandan sign language to speech

Nakuwanda, Bridget Hellen; Luwaga, Micheal; Sozzi, Henry; Egesa, Alex


dc.contributor.author	Nakuwanda, Bridget Hellen
dc.contributor.author	Luwaga, Micheal
dc.contributor.author	Sozzi, Henry
dc.contributor.author	Egesa, Alex
dc.date.accessioned	2024-11-22T13:06:19Z
dc.date.available	2024-11-22T13:06:19Z
dc.date.issued	2024
dc.description	A report submitted to the School of Computing and Informatics Technology for the study leading to the implementation of a project in partial fulfillment of the requirements for the award of the Degree of Bachelor of Science in Computer Science of Makerere University.	en_US
dc.description.abstract	Sign language is the mode of communication used by deaf and hard-of-hearing communities worldwide. It uses body key points to form standard glosses that carry meaning. It differs from region to region; in Uganda specifically, Uganda Sign Language (USL) is gazetted. The Real-Time Translation from Ugandan Sign Language to Speech project intends to leverage pose estimation, hand tracking, computer vision, and sequence-to-sequence modeling to translate visual Ugandan Sign Language into English speech in real time. The core methodology included developing a comprehensive dataset of USL gestures with corresponding English annotations. Non-manual linguistic features such as hand position and orientation were emphasized and extracted using the MediaPipe library. An encoder neural network was also used to extract context-specific spatio-temporal information from the hand tracking data. Comparative pipelines using ResNet50 and VGG19 were run to compare their performance. The model architecture incorporated convolutional layers for initial feature extraction, followed by multiple transformer blocks designed to capture the long-range dependencies and contextual nuances inherent in sign language glosses. The model was trained, tested, and validated, achieving an accuracy of 100% across all classes. Precision, recall, and F1-score also reached perfect scores, demonstrating the model’s robustness and effectiveness. Training ran for 60 epochs, with consistent improvements observed in both training and validation metrics. The training loss decreased from 0.3851 to 0.3574, while the training accuracy increased from 87.16% to 88.12%. The validation loss fell as low as 0.000042, and the validation accuracy consistently reached 100%, underscoring the model’s ability to generalize well to unseen data.
To prevent overfitting, regularization techniques such as dropout were employed. The model’s performance was further confirmed through a detailed confusion matrix, which indicated no misclassifications. This project is the first to deliver an inclusive Ugandan Sign Language translation system using state-of-the-art techniques in computer vision and artificial intelligence.	en_US
dc.identifier.citation	Nakuwanda, B. H. (2024). Real time translation from Ugandan sign language to speech (Unpublished undergraduate dissertation). Makerere University, Kampala, Uganda.	en_US
dc.identifier.uri	http://hdl.handle.net/20.500.12281/19438
dc.language.iso	en	en_US
dc.publisher	Makerere University	en_US
dc.subject	Sign language	en_US
dc.title	Real time translation from Ugandan sign language to speech	en_US
dc.type	Thesis	en_US

Collections

School of Computing and Informatics Technology Collection
