Supervised learning for weather prediction model using global meteorological data
| dc.contributor.author | Kabila, Francis Edrias | |
| dc.date.accessioned | 2025-11-13T14:34:37Z | |
| dc.date.available | 2025-11-13T14:34:37Z | |
| dc.date.issued | 2025 | |
| dc.description | A dissertation submitted to the School of Statistics and Planning in partial fulfilment for the award of degree of Bachelor of Statistics of Makerere University | en_US |
| dc.description.abstract | Accurate weather prediction plays a crucial role in the stability and safety of diverse sectors such as agriculture, transportation, energy management, and disaster response. Numerical weather prediction (NWP) models, which use complex physical equations to simulate atmospheric processes, have long been the foundation of weather forecasting. Despite their successes, these models require extensive computational resources and depend heavily on dense, high-quality observational data, which are often lacking in many regions, especially across Africa. This research explores the potential of supervised machine learning as an alternative, data-driven approach that addresses these limitations by efficiently modeling complex, nonlinear relationships in global meteorological data. The study focuses on developing a supervised learning model capable of classifying weather into 15 distinct categories using a dataset of 8,141 observations from 24 African countries. The key objectives were to optimize the predictive model, identify the meteorological variables most influential to classification accuracy, assess model performance through metrics such as accuracy and the F1-score, validate the model’s robustness across diverse African climates, and generate practical insights for forecasting and climate research. The study employed data preprocessing techniques such as feature engineering (combining wind speed and direction into a single wind vector) and stratified sampling to mitigate class imbalance in the dataset. Z-score normalization standardized predictor variables including temperature, humidity, wind components, pressure, cloud cover, and “feels like” temperature. Six supervised machine learning algorithms were implemented and compared: Logistic Regression, Support Vector Machine (SVM), Decision Trees, k-Nearest Neighbors (KNN), Random Forest, and Gradient Boosting Machine (GBM).
The models were evaluated based on training accuracy, validation accuracy, and the F1-score, particularly emphasizing the latter due to the substantial class imbalance dominated by categories like "Partly cloudy" and "Sunny." Results revealed the Random Forest algorithm achieved the highest overall accuracy at 91.4%, though its perfect training accuracy of 1.0 indicated overfitting. The GBM model proved the most effective, balancing accuracy (90.8%) with superior generalization and achieving the highest F1-score of 0.4645. This metric confirmed GBM’s strength in accurately predicting minority weather classes representing critical conditions such as "Heavy rain" and "Thundery outbreaks." Feature importance analysis highlighted cloud cover, humidity, visibility, and air quality as the strongest predictors, reinforcing the soundness of the model’s learning. In conclusion, this research validates that supervised machine learning, and specifically the Gradient Boosting Machine, offers a reliable, efficient, and scalable approach to weather classification. It presents a compelling complement or alternative to traditional NWP models, especially in data-scarce regions. The developed model’s deployment and performance suggest significant potential for enhancing weather prediction capabilities, advancing operational decision-making, and improving climate resilience across Africa and similar contexts worldwide. This work contributes to the growing body of evidence supporting data-driven meteorological forecasting methods, paving the way for future innovations in the field. | en_US |
| dc.identifier.citation | Kabila, F. E. (2025). Supervised learning for weather prediction model using global meteorological data. Unpublished Undergraduate dissertation, Makerere University, Kampala | en_US |
| dc.identifier.uri | http://hdl.handle.net/20.500.12281/21054 | |
| dc.language.iso | en | en_US |
| dc.publisher | Makerere University | en_US |
| dc.subject | Machine learning | en_US |
| dc.subject | Supervised learning | en_US |
| dc.subject | Weather prediction | en_US |
| dc.subject | Weather prediction model | en_US |
| dc.subject | Global meteorological data | en_US |
| dc.title | Supervised learning for weather prediction model using global meteorological data | en_US |
| dc.type | Other | en_US |
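The preprocessing and evaluation steps named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the dissertation's code: the function names are hypothetical, the wind decomposition assumes the meteorological convention (direction the wind blows from, in degrees clockwise from north), and the F1-score is assumed to be macro-averaged, which the abstract does not state.

```python
import math
from statistics import mean, pstdev


def wind_components(speed, direction_deg):
    """Split wind speed and direction into eastward (u) and northward (v) parts.

    Assumption: direction_deg is where the wind blows FROM, clockwise from north.
    """
    theta = math.radians(direction_deg)
    u = -speed * math.sin(theta)
    v = -speed * math.cos(theta)
    return u, v


def zscore(values):
    """Standardize a list of numbers to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels: nine majority-class cases and one minority-class case.
y_true = ["Sunny"] * 9 + ["Heavy rain"]
y_pred = ["Sunny"] * 10  # a classifier that never predicts the minority class
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On these toy labels the accuracy is 0.9 while the macro F1 is about 0.47, because the missed minority class contributes an F1 of zero. Under the macro-averaging assumption, this is one way a model can report high accuracy alongside a much lower F1-score on an imbalanced dataset.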
Files
Original bundle
- Name: Kabila-cobams-bstat.pdf
- Size: 963.66 KB
- Format: Adobe Portable Document Format
- Description: Undergraduate dissertation