dc.contributor.author  Sebampitako, Duncan
dc.date.accessioned  2023-01-17T08:38:10Z
dc.date.available  2023-01-17T08:38:10Z
dc.date.issued  2022
dc.identifier.citation  Sebampitako, Duncan. (2022). Deep Learning-Aided Image Captioning In Chest X-Rays For TB Screening. (Unpublished undergraduate dissertation). Makerere University, Kampala, Uganda.  en_US
dc.identifier.uri  http://hdl.handle.net/20.500.12281/14306
dc.description  A research report submitted to the College of Engineering, Design, Art and Technology in partial fulfillment of the requirements for the award of the degree of Bachelor of Telecommunications Engineering of Makerere University.  en_US
dc.description.abstract  Tuberculosis (TB) is a contagious disease that is a major source of illness and one of the leading causes of mortality worldwide. TB can be screened with chest X-ray, ultrasound, computed tomography (CT), and magnetic resonance imaging (MRI). Chest X-ray is cost-effective and widely available, and is therefore the preferred modality for TB screening. Chest X-ray images must be interpreted by radiologists, who describe the findings for each part of the body inspected in the imaging scan in textual reports, specifying whether each area was determined to be normal, abnormal, or potentially abnormal. Writing medical-imaging reports is time-consuming, error-prone, and laborious for radiologists, especially those operating in rural areas where healthcare quality is low. Two deep learning models were developed in this work to automate medical report writing. The first, a CheXNet Convolutional Neural Network (CNN)-Long Short-Term Memory (LSTM) model, applied transfer learning from the pretrained CheXNet CNN to extract visual features from chest X-ray images; an LSTM then generated the medical report from the extracted visual features. The second was an EfficientNet CNN-Transformer model, which used the EfficientNet CNN to extract the visual features from the chest X-ray images. EfficientNet exploits compound scaling of network dimensions such as width, depth, and resolution to achieve high accuracy and efficiency. A Transformer was then used for vision-language attention and generation of the medical report. Both models were trained on the Indiana University chest X-ray dataset for 70 epochs. The EfficientNet CNN-Transformer model outperformed the CheXNet CNN-LSTM model on all BLEU metrics, with a BLEU-1 (unigram) score of 0.515. The results demonstrate the importance of the choice of both the visual feature extractor and the language-generation model. We also demonstrated the importance of a robust dataset in achieving the best results when training AI models: data bias is a severe problem that can degrade even the best models. For medical report generation, it is crucial that the data acquired not only covers all stages of a single pathology, but is also sufficient across all lung pathologies to produce a clinically accurate and coherent medical report. (A minimal illustrative sketch of the report-generation pipeline appears after this record.)  en_US
dc.description.sponsorship  ECUREI  en_US
dc.language.iso  en  en_US
dc.publisher  Makerere University  en_US
dc.subject  Deep Learning  en_US
dc.subject  Computer Vision  en_US
dc.subject  Medical Image Captioning  en_US
dc.title  Deep Learning-Aided Image Captioning In Chest X-Rays For TB Screening  en_US
dc.type  Thesis  en_US
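
The abstract describes the report-generation pipeline in prose only. As a purely illustrative aid, the Keras sketch below shows the general shape of an EfficientNet-CNN + Transformer captioning model of the kind the abstract summarizes: a pretrained EfficientNet backbone extracts a grid of visual features, and a Transformer-style decoder attends over its own partial report (masked self-attention) and over those features (vision-language cross-attention) to predict the next report token. The B0 variant, layer sizes, vocabulary, and report length are assumptions made for this sketch; it is not the dissertation's actual implementation.

# Illustrative sketch only: EfficientNet-CNN encoder + Transformer-style
# decoder for chest X-ray report generation. The B0 variant, layer sizes,
# and vocabulary are assumptions, not the dissertation's code. (TF >= 2.10)
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 5000   # assumed report vocabulary size
MAX_LEN = 60        # assumed maximum report length in tokens
EMBED_DIM = 256     # assumed decoder width

# Visual feature extractor: pretrained EfficientNet with classifier removed.
cnn = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
cnn.trainable = False  # transfer learning: reuse pretrained visual features

image_in = tf.keras.Input(shape=(224, 224, 3))
feats = cnn(image_in)                        # (batch, 7, 7, 1280) feature grid
feats = layers.Reshape((49, 1280))(feats)    # flatten spatial grid to a sequence
feats = layers.Dense(EMBED_DIM)(feats)       # project to the decoder width

# Decoder: masked self-attention over the partial report, then
# cross-attention to the image features ("vision-language attention").
tokens_in = tf.keras.Input(shape=(MAX_LEN,), dtype="int32")
x = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(tokens_in)
x = layers.MultiHeadAttention(num_heads=4, key_dim=64)(x, x, use_causal_mask=True)
x = layers.MultiHeadAttention(num_heads=4, key_dim=64)(x, feats)
x = layers.Dense(EMBED_DIM, activation="relu")(x)
logits = layers.Dense(VOCAB_SIZE)(x)         # next-token scores per position

model = tf.keras.Model([image_in, tokens_in], logits)
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

The BLEU-1 figure quoted in the abstract measures unigram overlap between generated and reference reports; with NLTK it can be computed as, for example (the tokenized reports here are hypothetical):

from nltk.translate.bleu_score import sentence_bleu

reference = [["no", "acute", "cardiopulmonary", "abnormality"]]  # hypothetical reference report
candidate = ["no", "acute", "abnormality"]                       # hypothetical generated report
bleu1 = sentence_bleu(reference, candidate, weights=(1, 0, 0, 0))  # BLEU-1: unigram weight only
print(round(bleu1, 3))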

