|Application of Data Mining Technique for Predicting Airtime Credit Risk: The Case of Ethio Telecom
airtime credit, risk prediction
|St. Mary's University
|Airtime credit is a service that enables prepaid mobile subscribers to keep using telecom services after running out of balance and to pay for them later. It has made the service more convenient for users and has become an additional source of revenue for operators. The service carries risk, however, because many subscribers fail to repay their credit and end up as defaulters. The risk is compounded by the fact that prepaid users are not required to present any guarantee to obtain airtime credit. This study explored the role of data mining in predicting airtime credit risk. The experiments were conducted with WEKA, an open-source data mining tool. Four classification algorithms were applied to find the best-performing model: J48 decision tree, Naïve Bayes, Multilayer Perceptron and Logistic Regression. Ethio Telecom prepaid subscribers' usage data, consisting of 86,024 instances and eleven attributes, was used to build and test the models. All experiments used WEKA's 10-fold cross-validation and percentage split test options. The models were evaluated from their confusion matrices using accuracy, precision, recall, F-measure and ROC area. Under 10-fold cross-validation, the J48 decision tree model outperformed the other classifiers, with an accuracy of 98.5632%, precision, recall and F-measure of 0.986 each, and an ROC area of 0.996. The Logistic Regression model reached an accuracy of 97.1717%, while the Multilayer Perceptron and Naïve Bayes classifiers recorded accuracies of 96.7622% and 94.6355% respectively. The selected classifier generated rules and parameters that can support airtime credit decisions. Data usage was the attribute with the greatest predictive power: a subscriber with high data usage and low usage of other services is predicted to end up as a defaulter. Voice usage and top-up channel also showed high predictive power for airtime credit risk.
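The evaluation measures named in the abstract can be illustrated with a short Python sketch. This is not the thesis's code (the study used WEKA), and the function names `binary_metrics` and `k_fold_indices` are hypothetical helpers introduced here for illustration. The sketch assumes a two-class problem (defaulter vs. non-defaulter) and computes the measures for the positive class only, whereas WEKA's summary figures are typically weighted averages over both classes. The fold generator is a plain deterministic split; WEKA's own cross-validation also shuffles and stratifies the data, which this sketch omits.

```python
from typing import Dict, List, Tuple


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Derive accuracy, precision, recall and F-measure from a 2x2 confusion matrix.

    tp/fp/fn/tn are counts for the positive class (assumed here to be "defaulter").
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k (train, test) pairs.

    Every instance appears in exactly one test fold. No shuffling or
    stratification is done, unlike WEKA's cross-validation.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    # Hypothetical counts for illustration only; these are not the thesis's results.
    print(binary_metrics(tp=80, fp=10, fn=20, tn=90))
    print(len(k_fold_indices(n=25, k=10)))
```

In a 10-fold run, a model would be trained on each `train` list and scored on the matching `test` list, with the confusion-matrix counts accumulated across folds before the measures are computed.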
|Appears in Collections:
|Master of computer science
|Thesis - Oliyad Tarekegn.pdf