dc.contributor.author	Tumuramye, Ronald
dc.date.accessioned	2023-01-17T09:05:49Z
dc.date.available	2023-01-17T09:05:49Z
dc.date.issued	2023-01-17
dc.identifier.citation	Tumuramye, Ronald. (2023). Deep Learning-aided Image Captioning in Chest X-rays for TB Screening. (Unpublished undergraduate dissertation). Makerere University, Kampala, Uganda.	en_US
dc.identifier.uri	http://hdl.handle.net/20.500.12281/14309
dc.description	A report submitted to Makerere University in partial fulfilment of the requirements for the award of the degree of Bachelor of Science in Telecommunication Engineering.	en_US
dc.description.abstract	The fight against TB in the African region and other low- and middle-income countries is challenged mainly by the shortage of skilled radiologists: the few available radiologists must attend to large patient populations, and manually writing a medical report for each patient is time-consuming and laborious. A deep learning-aided image captioning system can support radiologists by automatically captioning chest X-rays (CXRs), serving as a valuable tool for TB detection and medical report writing while offering faster and more accurate results. In this work, the application of deep learning to image captioning of CXRs for TB was investigated. An open-source dataset from Indiana University and a local clinical dataset from Mengo Hospital were obtained. The Indiana University dataset contained 7,470 chest X-ray images with 2,955 associated reports in XML format; the local dataset comprised 311 images with their reports collected from Mengo Hospital. Two pretrained models, EfficientNet and CheXNet, were used as baseline feature extractors to design two models that generate captions for chest X-ray images. The EfficientNet-Transformer model used the EfficientNet CNN as a feature extractor and a vanilla transformer encoder and decoder to generate the captions; the CheXNet-LSTM model used the CheXNet CNN as a feature extractor and an encoder with an LSTM decoder to generate captions. The models were trained on the Indiana University dataset and evaluated on both the Indiana University dataset and the local dataset. The EfficientNet-Transformer model emerged as the best-performing model, with a BLEU score of 0.515, better than the results of state-of-the-art approaches. The model was deployed in a web application that allows the user to upload a chest X-ray image and receive a predicted caption in seconds.	en_US
dc.language.iso	en	en_US
dc.publisher	Makerere University	en_US
dc.subject	Deep Learning	en_US
dc.subject	Image Captioning	en_US
dc.subject	Chest X-rays	en_US
dc.subject	TB Screening	en_US
dc.title	Deep Learning-aided Image Captioning in Chest X-rays for TB Screening	en_US
dc.type	Thesis	en_US
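
The abstract above describes pairing an EfficientNet feature extractor with a vanilla transformer decoder to generate captions. The PyTorch snippet below is a minimal, hypothetical sketch of that kind of architecture, not the dissertation's actual code: the class name, vocabulary size, embedding width, and layer counts are illustrative assumptions, and positional encodings are omitted for brevity.

# A minimal sketch (illustrative assumptions throughout), not the thesis code:
# an EfficientNet backbone feeding a transformer decoder for captioning.
import torch
import torch.nn as nn
from torchvision import models

class EfficientNetCaptioner(nn.Module):
    def __init__(self, vocab_size=5000, d_model=256, nhead=8, num_layers=3):
        super().__init__()
        # EfficientNet-B0 with its classifier removed; the spatial feature
        # maps act as the "image tokens" the decoder attends to.
        backbone = models.efficientnet_b0(weights="IMAGENET1K_V1")
        self.cnn = backbone.features                    # (B, 1280, h, w)
        self.proj = nn.Linear(1280, d_model)            # match decoder width
        self.embed = nn.Embedding(vocab_size, d_model)  # caption token embeddings
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, images, captions):
        # Flatten the CNN feature map into a sequence of image features.
        feats = self.cnn(images)                        # (B, 1280, h, w)
        memory = self.proj(feats.flatten(2).transpose(1, 2))  # (B, h*w, d_model)
        tgt = self.embed(captions)                      # (B, T, d_model)
        # Causal mask: each caption position sees only earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(captions.size(1))
        hidden = self.decoder(tgt, memory, tgt_mask=mask)
        return self.out(hidden)                         # (B, T, vocab_size)

model = EfficientNetCaptioner()
logits = model(torch.randn(1, 3, 224, 224), torch.randint(0, 5000, (1, 12)))
print(logits.shape)  # torch.Size([1, 12, 5000])

At inference time such a model would be run autoregressively, feeding each predicted token back in until an end-of-caption token is produced; a real system would also add positional encodings and train on image-report pairs such as the Indiana University dataset.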

