Stroke Risk Prediction using Machine Learning

Gebremariam, Bezawit

Full metadata record

DC Field	Value	Language
dc.contributor.author	Gebremariam, Bezawit	-
dc.date.accessioned	2023-03-07T11:49:02Z	-
dc.date.available	2023-03-07T11:49:02Z	-
dc.date.issued	2023-02	-
dc.identifier.uri	http://hdl.handle.net/123456789/7499	-
dc.description.abstract	Stroke occurs when the supply of blood, oxygen, and other nutrients to the brain is interrupted. Identifying and treating stroke is time-consuming and expensive, especially in developing countries like Ethiopia. Predicting stroke risk helps to recognize, detect, and treat the disease at an early stage, which reduces the disability, death, and cost that result from stroke. When the problem is addressed early, individuals can manage their lifestyle and medical status, and governments can prepare healthcare strategies in response. This saves lives, reduces disability, and lowers the amount the government invests in treating the disease. With the development of technology in the medical sector, machine learning (ML) techniques make it possible to anticipate the onset of stroke. ML is the science of feeding computers data and information so that they learn and then improve that learning over time. An ideal stroke risk assessment tool that takes different risk factors into account and is widely applicable and accepted does not exist. Stroke has many risk factors, including non-clinical ones such as genetics, lifestyle, and the area where an individual lives. In this study, three machine learning models are developed for stroke risk prediction. Demographic and diagnosis data from Hallelujah and Zewditu hospitals are analyzed to build the stroke risk prediction models. After the business understanding and data understanding phases, the data were prepared by removing inconsistencies, duplicates, and errors so that they were ready for the experiments. For predictive model construction, the machine learning algorithms Logistic Regression, SVM, and Random Forest (RF) Decision Tree were used, and all experiments were conducted in Python with the Anaconda distribution. A confusion matrix is used to evaluate the performance of the models.
Based on the research findings, the Random Forest (RF) Decision Tree classifier achieved an accuracy of 99.3%, SVM 96.63%, and Logistic Regression 94%. Therefore, the Random Forest (RF) Decision Tree classifier is proposed for constructing the stroke risk prediction model. Based on this proposed model, we recommend that future research integrate the stroke risk prediction model with a Health Information System and include attributes on patients’ cigarette smoking, drug use, and alcohol consumption, which are not included in this study.	en_US
dc.language.iso	en	en_US
dc.publisher	ST. MARY’S UNIVERSITY	en_US
dc.subject	Stroke Risk; Stroke Risk Prediction; Machine learning; Logistic Regression; SVM; Random Forest	en_US
dc.title	Stroke Risk Prediction using Machine Learning	en_US
dc.type	Thesis	en_US
Appears in Collections:	Master of computer science
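
The abstract states that a confusion matrix is used to evaluate the models and reports their accuracy. As a minimal sketch of how accuracy follows from a binary confusion matrix, the Python below counts the four cells and derives accuracy and recall. The function names and the label lists are hypothetical and purely illustrative; they are not the thesis's code, data, or results.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 = stroke and 0 = no stroke."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count each (actual, predicted) pair; missing pairs count as zero.
    counts = Counter(zip(y_true, y_pred))
    tn = counts[(0, 0)]  # correctly predicted no stroke
    fp = counts[(0, 1)]  # predicted stroke, actually no stroke
    fn = counts[(1, 0)]  # predicted no stroke, actually stroke
    tp = counts[(1, 1)]  # correctly predicted stroke
    return tn, fp, fn, tp


def accuracy(tn, fp, fn, tp):
    """Share of all predictions that are correct."""
    total = tn + fp + fn + tp
    return (tp + tn) / total if total else 0.0


def recall(tn, fp, fn, tp):
    """Share of actual stroke cases that the model identified."""
    positives = tp + fn
    return tp / positives if positives else 0.0


# Made-up labels for illustration only (assumption, not taken from the study).
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 1, 0]

cells = confusion_matrix(y_true, y_pred)
print("tn, fp, fn, tp =", cells)
print("accuracy =", accuracy(*cells))
print("recall =", recall(*cells))
```

Accuracy alone can look high when stroke cases are rare in the data, so reading recall from the same matrix shows how many actual stroke cases a model misses.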