St. Mary's University Institutional Repository

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/6226
Title: Telecom Voice Traffic Termination Fraud Detection Using Ensemble Learning: The Case of Ethio Telecom
Authors: Getahun, Alemeshet
Keywords: Ensemble methods, Data mining, Boosting, Bagging, Stacking, Voting
Issue Date: Jul-2020
Publisher: ST. MARY’S UNIVERSITY
Abstract: One of the major developments in machine learning is the ensemble method, which builds a highly accurate classifier by combining many moderately accurate component classifiers. This thesis proposes ensemble classification methods for detecting voice traffic termination fraud, and the proposed model provides information that can be used for decision making. A comparison study was also carried out to find the most suitable classifier for the ensemble technique used in the proposed model. Around 126,736 records were selected from two months of call detail record (CDR) data. After irrelevant and unnecessary data were eliminated, a total of 50,516 records were used for the study. Ten attributes were selected based on their relevance to the research. Data preprocessing was done to clean the dataset and to prepare the collected data in a format suitable for the data mining (DM) tasks. The study was conducted using the Waikato Environment for Knowledge Analysis (WEKA) version 3.8.3 machine learning software. Four ensemble-based classification techniques were used, namely boosting, bagging, stacking, and voting, each built on two base learners (decision tree and neural network). The models were trained using cross-validation and tested for reliability using the default percentage split (66%). Model performance was evaluated using the standard metrics of prediction accuracy, error rate, FP rate, TP rate, recall, precision, F-measure, and ROC curve, which are calculated from the predictive classification table known as the confusion matrix. The performance of each algorithm was compared to select the best-performing one. The results show that the ensemble J48 decision tree algorithm with 10-fold cross-validation registered the best performance, at 96.73%. The boosting classifier provided higher prediction accuracy than the other classifiers. Overall, the proposed ensemble methods provided a significant improvement in prediction accuracy compared to individual classifiers.
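The voting and evaluation steps named in the abstract can be illustrated with a minimal sketch. It assumes plain Python rather than the WEKA setup used in the thesis; the function names, the "fraud"/"normal" labels, and the toy predictions are hypothetical and are not taken from the thesis data or results.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine label predictions from several classifiers by simple majority vote."""
    combined = []
    for votes in zip(*predictions_per_classifier):
        # most_common(1) returns the label with the highest vote count
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def confusion_counts(actual, predicted, positive="fraud"):
    """Count TP, FP, TN, FN for a binary problem (the 'positive' label is an assumption)."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics_from_counts(tp, fp, tn, fn):
    """Derive the standard metrics from the four confusion-matrix cells."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0      # TP rate, same as recall
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + tp_rate
    f_measure = 2 * precision * tp_rate / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "tp_rate": tp_rate,
        "fp_rate": fp_rate,
        "precision": precision,
        "f_measure": f_measure,
    }


# Toy, made-up labels and predictions for three hypothetical base classifiers.
actual = ["fraud", "normal", "fraud", "normal", "normal", "fraud"]
clf_a = ["fraud", "normal", "normal", "normal", "fraud", "fraud"]
clf_b = ["fraud", "normal", "fraud", "normal", "normal", "normal"]
clf_c = ["normal", "normal", "fraud", "fraud", "normal", "fraud"]

combined = majority_vote([clf_a, clf_b, clf_c])
result = metrics_from_counts(*confusion_counts(actual, combined))
print(combined)
print(result)
```

In this toy run, each hypothetical base classifier mislabels at least one record, while the majority vote matches all six labels. This only illustrates the mechanism and says nothing about the results reported in the thesis.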
URI: http://hdl.handle.net/123456789/6226
Appears in Collections: Master of Computer Science

Files in This Item:
File: Thesis - Alemeshet Getahun.pdf
Size: 2.41 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.