Credit card fraud detection using machine learning with deployment
Date
2025
Authors
Mwesigwa, Brian
Publisher
Makerere University
Abstract
The proliferation of digital payment systems has led to a significant increase in credit card fraud,
which poses major challenges to both financial institutions and consumers. Traditional rule-based
fraud detection systems are often ineffective at identifying sophisticated fraudulent activities,
making it necessary to use more advanced techniques like machine learning (ML). A major hurdle
in this area is the issue of imbalanced datasets, where fraudulent transactions make up a tiny
fraction of the total.
This study's main objective was to develop, evaluate, and deploy a credit card fraud detection
model using five machine learning algorithms on the 2013 European credit card transaction dataset
from Kaggle. The study utilized a quantitative, experimental research design that involved data
preprocessing, model training, and evaluation.
The methodology included exploring and preprocessing the dataset, which involved data cleaning,
feature scaling using StandardScaler, and handling class imbalance with four resampling
techniques: SMOTE, random undersampling, random oversampling, and a combination of both.
The five machine learning models (Logistic Regression, Decision Tree, Random Forest, XGBoost,
and K-Nearest Neighbors (KNN)) were then trained and tested on the data, with performance
evaluated using the Area Under the ROC Curve (AUC-ROC) and Confusion Matrix. The best
performing model was then selected for deployment on an R Shiny web dashboard prototype.
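The resampling and evaluation steps described above can be illustrated with a minimal sketch. It is not the study's code: it uses only the Python standard library, covers random oversampling and random undersampling but not SMOTE, and computes AUC from its pairwise-ranking definition. The function names and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def _split_by_class(X, y, minority):
    """Pair each row with its label and separate minority from majority rows."""
    rows = list(zip(X, y))
    minority_rows = [r for r in rows if r[1] == minority]
    majority_rows = [r for r in rows if r[1] != minority]
    return minority_rows, majority_rows


def random_oversample(X, y, minority=1, seed=0):
    """Duplicate random minority rows until both classes have the same size."""
    rng = random.Random(seed)
    minority_rows, majority_rows = _split_by_class(X, y, minority)
    n_extra = max(0, len(majority_rows) - len(minority_rows))
    extra = [rng.choice(minority_rows) for _ in range(n_extra)]
    rows = majority_rows + minority_rows + extra
    rng.shuffle(rows)
    return [x for x, _ in rows], [t for _, t in rows]


def random_undersample(X, y, minority=1, seed=0):
    """Keep all minority rows and an equally sized random subset of majority rows."""
    rng = random.Random(seed)
    minority_rows, majority_rows = _split_by_class(X, y, minority)
    n_keep = min(len(minority_rows), len(majority_rows))
    rows = rng.sample(majority_rows, n_keep) + minority_rows
    rng.shuffle(rows)
    return [x for x, _ in rows], [t for _, t in rows]


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = fraud)."""
    counts = {"tn": 0, "fp": 0, "fn": 0, "tp": 0}
    for t, p in zip(y_true, y_pred):
        key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
        counts[key] += 1
    return counts


# Toy imbalanced data (assumed for illustration): 18 legitimate rows, 2 fraud rows.
X = [[float(i)] for i in range(20)]
y = [1 if i >= 18 else 0 for i in range(20)]

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
print("original:", Counter(y))
print("oversampled:", Counter(y_over))
print("undersampled:", Counter(y_under))
```

In practice, resampling is normally applied to the training split only, so that the test set keeps the original class ratio and the reported AUC reflects realistic conditions. Whether the study followed that ordering is not stated in the abstract.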
The results show that the combination of the XGBoost algorithm and the SMOTE resampling
technique achieved the highest performance, with an AUC of 98.72%. This significantly
outperformed all other model-sampling combinations tested. The findings confirm that addressing
class imbalance is crucial for developing effective fraud detection models, as the performance of
all tested algorithms improved significantly after applying resampling techniques. Furthermore,
the study concluded that the optimal resampling strategy is highly dependent on the chosen
algorithm. For example, Logistic Regression and Random Forest performed best with
undersampling, while Decision Tree performed best with oversampling.
The study recommends that financial institutions adopt advanced models such as XGBoost and
integrate sophisticated resampling techniques, such as SMOTE, into their fraud detection
pipelines. The best-performing model should be deployed on an interactive platform to support
real-time monitoring and decision-making. Future research should focus on acquiring and
analyzing local datasets from Low- and Middle-Income Countries (LMICs) such as Uganda,
and on conducting real-world deployment studies.
Description
A dissertation submitted to the School of Statistics and Planning in partial fulfillment of the requirements for the award of the degree of Bachelor of Statistics of Makerere University
Keywords
Credit card,
Fraud detection,
Machine learning,
Machine learning deployment
Citation
Mwesigwa, B. (2025). Credit card fraud detection using machine learning with deployment. Unpublished undergraduate dissertation. Makerere University. Kampala.