Predicting Financial Credit Risks of Banks Using Machine Learning Algorithm

Teklemarkos, Senait

Full metadata record

DC Field	Value	Language
dc.contributor.author	Teklemarkos, Senait	-
dc.date.accessioned	2025-07-01T12:18:24Z	-
dc.date.available	2025-07-01T12:18:24Z	-
dc.date.issued	2025-02	-
dc.identifier.uri	http://hdl.handle.net/123456789/8780	-
dc.description.abstract	Credit risk is an important factor influencing bank financial performance, and the capacity to foresee it enables banks to manage potential risks and maintain their profitability. Accurate credit risk prediction allows banks to make informed decisions by identifying, in advance, customers who are likely to default. In this study, multiple machine learning approaches are applied to Awash Bank customer data to build a model that predicts credit risk. Missing values in numerical features are filled with the mean, while missing values in categorical features are filled with the mode. Categorical features are encoded using Label Encoding, except the 'Branch' variable, which, due to its high cardinality of 124 unique values, is encoded using the Hasher function, a method suggested for features of this type. The dataset is split into training and testing sets using an 80:20 ratio, and 10% of the training set is then reserved for validation. Key features are identified by applying correlation analysis and the ExtraTreesClassifier, and class imbalance is handled using the SMOTE oversampling approach to avoid bias toward the majority class. Five machine learning models are trained on the dataset: XGBoost, CatBoost, Random Forest, Support Vector Machine (SVM), and Deep Neural Networks (DNN). Each is evaluated on accuracy, precision, recall, and F1 score. Hyperparameter tuning is performed using RandomizedSearchCV() to optimize the performance of each selected model. The results show that the XGBoost algorithm outperformed the others, with an accuracy of 92.2%, followed by CatBoost and Random Forest. This study contributes to the limited research on credit risk prediction in the Ethiopian banking sector by utilizing real data from Awash Bank and demonstrating the potential for machine learning, particularly ensemble methods such as XGBoost, to improve credit risk management in the banking industry.
However, a major limitation of this study is the reliance on a limited dataset focused exclusively on loans, which may not fully represent the diverse customer base of Awash Bank, particularly those seeking other types of credit products. Future research could address this limitation by incorporating additional data sources or conducting longitudinal studies to enhance predictive accuracy and generalizability.	en_US
dc.language.iso	en	en_US
dc.publisher	St. Mary’s University	en_US
dc.subject	Banking industry, Credit risk prediction, Ensemble Machine Learning, Deep Neural Networks	en_US
dc.title	Predicting Financial Credit Risks of Banks Using Machine Learning Algorithm	en_US
dc.type	Thesis	en_US
Appears in Collections:	Master of computer science
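
The preprocessing steps named in the abstract (mean and mode imputation, hashing of the high-cardinality 'Branch' feature, and the 80:20 split with 10% of the training set held out for validation) can be illustrated with a minimal plain-Python sketch. This is not the thesis code. The function names, the bucket count of 16, the random seed, and the branch label used in the usage lines are all hypothetical, and `hash_bucket` is a simplified form of the hashing trick rather than the specific Hasher function the author used.

```python
import hashlib
import random
from statistics import mean, mode


def impute(values, numeric):
    """Fill None entries with the mean (numeric) or the mode (categorical)."""
    present = [v for v in values if v is not None]
    fill = mean(present) if numeric else mode(present)
    return [fill if v is None else v for v in values]


def hash_bucket(category, n_buckets=16):
    """Map a high-cardinality category to a stable bucket index.

    Simplified hashing trick; n_buckets=16 is an assumed value.
    """
    digest = hashlib.md5(category.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


def split_indices(n_rows, test_frac=0.20, val_frac_of_train=0.10, seed=42):
    """Hold out 20% for testing, then reserve 10% of the rest for validation."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_test = round(n_rows * test_frac)
    test, rest = idx[:n_test], idx[n_test:]
    n_val = round(len(rest) * val_frac_of_train)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test


if __name__ == "__main__":
    print(impute([10.0, None, 30.0], numeric=True))      # gap filled with the mean
    print(impute(["A", None, "A", "B"], numeric=False))  # gap filled with the mode
    print(hash_bucket("Branch-001"))                     # hypothetical branch label
    train, val, test = split_indices(1000)
    print(len(train), len(val), len(test))
```

Under this reading of the split, 1,000 records yield 720 training, 80 validation, and 200 test rows, that is, 72%, 8%, and 20% of the full dataset.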