Show simple item record

dc.contributor.author: Katende, Jericho
dc.contributor.author: Mawejje, Mark William
dc.contributor.author: Musemeza, Murungi Isaac
dc.date.accessioned: 2024-11-18T08:09:51Z
dc.date.available: 2024-11-18T08:09:51Z
dc.date.issued: 2024
dc.identifier.citation: Katende, J., Mawejje, M. W., & Musemeza, M. I. (2024). Explainable real-time sign language to text conversion (Unpublished undergraduate dissertation). Makerere University, Kampala, Uganda. [en_US]
dc.identifier.uri: http://hdl.handle.net/20.500.12281/19281
dc.description: A project report submitted to the School of Computing and Informatics Technology in partial fulfilment of the requirements for the award of the Degree of Bachelor of Science in Computer Science of Makerere University. [en_US]
dc.description.abstract: Sign languages are the primary means of communication for the Deaf and Hard of Hearing (DHH) community, but communication barriers remain between sign language users and the hearing population. Existing sign language translation models frequently lack transparency, which undermines trust and adoption. In this paper, we develop an explainable real-time sign language to text translation system that employs deep learning techniques and multiple interpretability methods, including SHapley Additive exPlanations (SHAP), Gradient-weighted Class Activation Mapping (Grad-CAM), and Local Interpretable Model-Agnostic Explanations (LIME). We use a pre-trained VGG-16 network for feature extraction in conjunction with a custom classification model. We also fine-tuned other pre-trained models, including VGG-19, Vision Transformers, and EfficientNet. The models were trained on the WLASL and Synthetic ASL Alphabet datasets, yielding explanations that shed light on the neural network’s decision-making process. We used SHAP, Grad-CAM, and LIME to enhance model interpretability: SHAP assigns importance values to each input feature, Grad-CAM highlights the input regions most relevant to the model’s predictions, and LIME generates local explanations for individual predictions. The combination of these methods yields a thorough understanding of the model’s behavior. We demonstrated the effectiveness of our approach through extensive experiments, achieving a translation accuracy of 97.8% on the test set and outperforming baseline methods. SHAP, Grad-CAM, and LIME explanations showed that the model relies on hand shape, movement, and facial expression features to accurately classify signs. These findings not only boost confidence in the model’s predictions but also emphasize the importance of considering multiple aspects of sign language for effective translation. We use the trained model to create a user-friendly mobile application that provides real-time sign language translation to the DHH community while also encouraging inclusive communication. Our approach not only yields accurate and interpretable results but also encourages responsible AI practices in sign language translation. Our system’s explainability, achieved through the integration of multiple interpretability techniques, promotes trust and adoption, with the potential to bridge communication gaps between sign language users and the hearing population. [en_US] (An illustrative sketch of the feature-extraction and Grad-CAM setup described here follows this record.)
dc.language.iso: en [en_US]
dc.publisher: Makerere University [en_US]
dc.subject: Sign language translation [en_US]
dc.subject: Explainable AI (XAI) [en_US]
dc.subject: SHapley Additive exPlanations (SHAP) [en_US]
dc.subject: Gradient-weighted Class Activation Mapping (Grad-CAM) [en_US]
dc.subject: Local Interpretable Model-Agnostic Explanations (LIME) [en_US]
dc.subject: Deep learning [en_US]
dc.subject: Deaf and Hard of Hearing (DHH) community [en_US]
dc.subject: Inclusive communication [en_US]
dc.title: Explainable real-time sign language to text conversion [en_US]
dc.type: Thesis [en_US]
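
The abstract above describes the core modelling approach: a pre-trained VGG-16 network used as a feature extractor with a custom classification head, explained post hoc with SHAP, Grad-CAM, and LIME. The sketch below is a minimal, hypothetical illustration of that kind of setup in TensorFlow/Keras, not the authors' implementation; the head layer sizes, the 26-class assumption, and the Grad-CAM target layer "block5_conv3" are assumptions made only for illustration.

# Illustrative sketch only: frozen VGG-16 feature extractor + small custom
# classification head, with a basic Grad-CAM heatmap. Layer sizes, NUM_CLASSES,
# and the target convolutional layer are assumptions, not values from the report.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 26  # assumption: one class per static ASL alphabet sign

# Pre-trained VGG-16 backbone, frozen so only the custom head is trained.
backbone = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
backbone.trainable = False

# Custom classification head on top of the VGG-16 feature maps.
x = layers.GlobalAveragePooling2D()(backbone.output)
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = tf.keras.Model(backbone.input, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

def grad_cam(image, class_index=None, conv_layer="block5_conv3"):
    """Grad-CAM heatmap for one (224, 224, 3) image already passed through
    tf.keras.applications.vgg16.preprocess_input."""
    # Maps the input image to the last conv feature maps and the class scores.
    grad_model = tf.keras.Model(model.input,
                                [model.get_layer(conv_layer).output, model.output])
    img = tf.convert_to_tensor(image[np.newaxis], dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(img)  # needed so gradients flow through the frozen backbone
        conv_maps, preds = grad_model(img)
        if class_index is None:
            class_index = int(tf.argmax(preds[0]))  # explain the top predicted sign
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_maps)   # d(class score) / d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))    # channel weights: pooled gradients
    cam = tf.reduce_sum(conv_maps * weights[:, None, None, :], axis=-1)[0]
    cam = tf.nn.relu(cam)                           # keep regions with positive influence
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()  # 14x14 heatmap in [0, 1]

LIME and SHAP explanations for the same classifier would be generated separately, for example with lime.lime_image.LimeImageExplainer or one of SHAP's gradient-based explainers wrapped around the model; the exact configuration used in the report is not reproduced here.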

