St. Mary's University Institutional Repository

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/8780
Full metadata record
DC Field | Value | Language
dc.contributor.author | Teklemarkos, Senait | -
dc.date.accessioned | 2025-07-01T12:18:24Z | -
dc.date.available | 2025-07-01T12:18:24Z | -
dc.date.issued | 2025-02 | -
dc.identifier.uri | http://hdl.handle.net/123456789/8780 | -
dc.description.abstract | Credit risk is an important factor influencing bank financial performance, and the capacity to foresee it enables banks to manage potential risks and maintain their profitability. Accurate credit risk prediction allows banks to make informed decisions by identifying in advance the customers who are likely to default. In this study, multiple machine learning approaches are applied to Awash Bank customer data to build a model that predicts credit risk. Missing values in numerical features are filled with the mean, while missing values in categorical features are filled with the mode. Categorical features are encoded using Label Encoding, except the 'Branch' variable, which, due to its high cardinality of 124 unique values, is encoded using the Hasher function, a method suggested for features of this type. The dataset is split 80:20 into training and testing sets, and 10% of the training set is then reserved for validation. Key features are identified by applying a correlation analysis and the ExtraTreesClassifier, and class imbalance is handled using the SMOTE oversampling approach to avoid bias toward the majority class. Five machine learning models are trained on the dataset and evaluated on accuracy, precision, recall, and F1 score: XGBoost, CatBoost, Random Forest, Support Vector Machine (SVM), and Deep Neural Networks (DNN). Hyperparameter tuning is performed using RandomizedSearchCV() to optimize the performance of each selected model. The results show that the XGBoost algorithm outperformed the others, with an accuracy of 92.2%, followed by CatBoost and Random Forest. This study contributes to the limited research on credit risk prediction in the Ethiopian banking sector by using real data from Awash Bank and demonstrating the potential of machine learning, particularly ensemble methods such as XGBoost, to improve credit risk management in the banking industry. However, a major limitation of this study is its reliance on a limited dataset focused exclusively on loans, which may not fully represent the diverse customer base of Awash Bank, particularly customers seeking other types of credit products. Future research could address this limitation by incorporating additional data sources or conducting longitudinal studies to enhance predictive accuracy and generalizability. | en_US
dc.language.iso | en | en_US
dc.publisher | St. Mary’s University | en_US
dc.subject | Banking industry, Credit risk prediction, Ensemble Machine Learning, Deep Neural Networks | en_US
dc.title | Predicting Financial Credit Risks of Banks Using Machine Learning Algorithm | en_US
dc.type | Thesis | en_US
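
The preprocessing steps named in the abstract (mean and mode imputation, hashing of a high-cardinality categorical feature, and the train/validation/test split) can be illustrated with a minimal sketch. This is not the thesis code: it uses only the Python standard library, and the function names, the MD5-based hashing, the bucket count of 16, and the sample values are all assumptions made for illustration. The thesis itself reports using a "Hasher function", whose exact configuration is not given in this record.

```python
import hashlib
from collections import Counter
from statistics import mean


def impute(values, kind):
    """Fill None entries with the mean (numeric) or the mode (categorical)."""
    present = [v for v in values if v is not None]
    if kind == "numeric":
        fill = mean(present)
    else:
        # Most frequent observed category; ties go to the first one seen.
        fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


def hash_bucket(category, n_buckets=16):
    """Map a category string to a stable bucket index (the hashing trick).

    n_buckets=16 is a hypothetical choice, not a value from the thesis.
    """
    digest = hashlib.md5(category.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def split_sizes(n_rows, test_frac=0.20, val_frac_of_train=0.10):
    """Return (train, validation, test) row counts.

    The test set is carved out first; the validation set is then taken
    as a fraction of the remaining training rows.
    """
    n_test = round(n_rows * test_frac)
    n_train_full = n_rows - n_test
    n_val = round(n_train_full * val_frac_of_train)
    return n_train_full - n_val, n_val, n_test


# Hypothetical sample values, for illustration only.
print(impute([1000.0, None, 3000.0], "numeric"))          # [1000.0, 2000.0, 3000.0]
print(impute(["house", None, "car", "house"], "categorical"))
print(hash_bucket("Branch-001"))                          # an index in range(16)
print(split_sizes(1000))                                  # (720, 80, 200)
```

Read this way, the split described in the abstract amounts to roughly 72% training, 8% validation, and 20% testing data. Hashing maps the 124 branch names onto a fixed number of columns at the cost of occasional collisions between branches.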
Appears in Collections: Master of computer science

Files in This Item:
File | Description | Size | Format
Senait Teklemarkos.pdf | | 3.69 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.