dc.contributor.author	Tumuramye, Ronald
dc.date.accessioned	2023-01-17T09:05:49Z
dc.date.available	2023-01-17T09:05:49Z
dc.date.issued	2023-01-17
dc.identifier.citation	Tumuramye, Ronald. (2023). Deep Learning-aided Image Captioning in Chest X-rays for TB Screening. (Unpublished undergraduate dissertation). Makerere University, Kampala, Uganda.	en_US
dc.identifier.uri	http://hdl.handle.net/20.500.12281/14309
dc.description	A report submitted to Makerere University in partial fulfilment of the requirements for the award of the degree of Bachelor of Science in Telecommunication Engineering.	en_US
dc.description.abstract	The fight against TB in the African region and other low- and middle-income countries is challenged mainly by the shortage of skilled radiologists: the few available radiologists must attend to large patient populations, and manually writing a medical report for each patient is time-consuming and laborious. A deep learning-aided image captioning system can support radiologists by automatically captioning chest X-rays (CXRs), serving as a valuable tool for TB detection and medical report writing while offering faster and more accurate results. In this work, the application of deep learning to image captioning of CXRs for TB was investigated. An open-source dataset from Indiana University and a local clinical dataset from Mengo Hospital were obtained. The Indiana University dataset contained 7,470 chest X-ray images with 2,955 associated reports in XML format; the local dataset comprised 311 images with their reports collected from Mengo Hospital. Two pretrained models, EfficientNet and CheXNet, were used as baseline feature extractors to design two models that generate captions for chest X-ray images. The EfficientNet-Transformer model used the EfficientNet CNN as a feature extractor and a vanilla transformer encoder and decoder to generate the captions; the CheXNet-LSTM model used the CheXNet CNN as a feature extractor and an encoder with an LSTM decoder to generate captions. The models were trained on the Indiana University dataset and evaluated on both the Indiana University dataset and the local dataset. The EfficientNet-Transformer model emerged as the best-performing model, with a BLEU score of 0.515, better than the results of state-of-the-art approaches. The model was deployed in a web application that allows the user to upload a chest X-ray image and receive a predicted caption in seconds.	en_US
dc.language.iso	en	en_US
dc.publisher	Makerere University	en_US
dc.subject	Deep Learning	en_US
dc.subject	Image Captioning	en_US
dc.subject	Chest X-rays	en_US
dc.subject	TB Screening	en_US
dc.title	Deep Learning-aided Image Captioning in Chest X-rays for TB Screening	en_US
dc.type	Thesis	en_US
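
The abstract above describes pairing an EfficientNet feature extractor with a vanilla transformer decoder to generate captions. The PyTorch snippet below is a minimal, hypothetical sketch of that kind of architecture, not the dissertation's actual code: the class name, vocabulary size, embedding width, and layer counts are illustrative assumptions, and positional encodings are omitted for brevity.

# A minimal sketch (illustrative assumptions throughout), not the thesis code:
# an EfficientNet backbone feeding a transformer decoder for captioning.
import torch
import torch.nn as nn
from torchvision import models

class EfficientNetCaptioner(nn.Module):
    def __init__(self, vocab_size=5000, d_model=256, nhead=8, num_layers=3):
        super().__init__()
        # EfficientNet-B0 with its classifier removed; the spatial feature
        # maps act as the "image tokens" the decoder attends to.
        backbone = models.efficientnet_b0(weights="IMAGENET1K_V1")
        self.cnn = backbone.features                    # (B, 1280, h, w)
        self.proj = nn.Linear(1280, d_model)            # match decoder width
        self.embed = nn.Embedding(vocab_size, d_model)  # caption token embeddings
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, images, captions):
        # Flatten the CNN feature map into a sequence of image features.
        feats = self.cnn(images)                        # (B, 1280, h, w)
        memory = self.proj(feats.flatten(2).transpose(1, 2))  # (B, h*w, d_model)
        tgt = self.embed(captions)                      # (B, T, d_model)
        # Causal mask: each caption position sees only earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(captions.size(1))
        hidden = self.decoder(tgt, memory, tgt_mask=mask)
        return self.out(hidden)                         # (B, T, vocab_size)

model = EfficientNetCaptioner()
logits = model(torch.randn(1, 3, 224, 224), torch.randint(0, 5000, (1, 12)))
print(logits.shape)  # torch.Size([1, 12, 5000])

At inference time such a model would be run autoregressively, feeding each predicted token back in until an end-of-caption token is produced; a real system would also add positional encodings and train on image-report pairs such as the Indiana University dataset.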

