St. Mary's University Institutional Repository

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/7499
Title: Stroke Risk Prediction using Machine Learning
Authors: Gebremariam, Bezawit
Keywords: Stroke Risk; Stroke Risk Prediction; Machine learning; Logistic Regression; SVM; Random Forest
Issue Date: Feb-2023
Publisher: ST. MARY’S UNIVERSITY
Abstract: Stroke occurs when the supply of oxygen, blood, and other nutrients to the brain is interrupted. Identifying and treating stroke is time-consuming and expensive, especially in developing countries such as Ethiopia. Predicting stroke risk helps to recognize, detect, and treat the disease at an early stage, which reduces the disability, death, and cost that result from stroke. When the problem is addressed early, individuals can manage their lifestyle and medical status, and the government can prepare a healthcare strategy in response. This saves lives, reduces disability, and lowers the amount the government invests in the disease. With the development of technology in the medical sector, machine learning (ML) techniques make it possible to anticipate the onset of stroke. ML is the science of feeding computers data and information so that they learn and improve their learning over time. An ideal stroke risk assessment tool, one that takes different risk factors into account and is widely applicable and accepted, does not exist. Stroke has various risk factors, including non-clinical ones such as genetics, lifestyle, and the area where an individual lives. In this study, three machine learning models are developed for stroke risk prediction. Demographic and diagnosis data from Hallelujah and Zewditu hospitals are used to analyze and build the stroke risk prediction models. After the business understanding and data understanding phases, the data are prepared by cleaning them of inconsistencies, duplicates, and errors so that they are ready for the experiments. For predictive model construction, the machine learning algorithms Logistic Regression, SVM, and Random Forest (RF) Decision Tree were used, and all experiments were conducted in Python with Anaconda. A confusion matrix is used to evaluate the performance of the models.
Based on the research findings, the Random Forest (RF) Decision Tree classifier produced an accuracy of 99.3%, SVM an accuracy of 96.63%, and Logistic Regression an accuracy of 94%. Therefore, the Random Forest (RF) Decision Tree classifier is proposed for constructing the stroke risk prediction model. Based on the model proposed in this study, we recommend that future research integrate the stroke risk prediction model with a Health Information System and use additional attributes describing patients’ addiction to cigarette smoking, drug use, and alcohol consumption, which are not included in this study.
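As a rough illustration of the evaluation step the abstract describes, the following minimal Python sketch shows how a confusion matrix for a binary stroke / no-stroke classifier yields an accuracy figure. The label lists and the function names (`confusion_matrix`, `accuracy`, `precision`, `recall`) are hypothetical and chosen only for this example; they are not the thesis data or the author's code. The abstract reports only accuracy, so precision and recall are included merely to show what else the same matrix provides.

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn) for a binary task; `positive` marks the stroke class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted stroke, actually stroke
            else:
                fp += 1  # predicted stroke, actually no stroke
        else:
            if truth == positive:
                fn += 1  # missed stroke case
            else:
                tn += 1  # correctly predicted no stroke
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that were correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp):
    """Share of predicted stroke cases that were real stroke cases."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of real stroke cases that the model detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels for ten patients (1 = stroke, 0 = no stroke).
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
print("tp, fp, fn, tn:", (tp, fp, fn, tn))
print("accuracy:", accuracy(tp, fp, fn, tn))
print("precision:", precision(tp, fp))
print("recall:", recall(tp, fn))
```

In a comparison like the one reported above, the same counts would be computed once per model (Logistic Regression, SVM, Random Forest) on a shared held-out test set, so that the accuracy figures are directly comparable.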
URI: http://hdl.handle.net/123456789/7499
Appears in Collections: Master of computer science

Files in This Item:
File: Bezawit_ID_SGS-0442-2010A_Stroke Risk Prediction.pdf
Size: 2.23 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.