DC Field | Value | Language |
dc.contributor.author | Nekatibeb, Haimanot | - |
dc.date.accessioned | 2025-07-01T12:14:10Z | - |
dc.date.available | 2025-07-01T12:14:10Z | - |
dc.date.issued | 2025-02 | - |
dc.identifier.uri | http://hdl.handle.net/123456789/8778 | - |
dc.description.abstract | Sentiment analysis has become an essential tool for understanding public opinion on social
media platforms, particularly for organizations like the Ethiopian Broadcasting Corporation
(EBC). However, the unique linguistic features of Amharic, a morphologically rich and under-resourced language, pose significant challenges to developing accurate sentiment analysis
models. This research explores sentiment analysis on Amharic Facebook comments related to
EBC using state-of-the-art deep learning models.
A comprehensive dataset of 22,000 labeled Amharic Facebook comments is prepared, with
sentiments classified into five classes: positive, neutral, negative, strongly positive, and
strongly negative. The labeling workflow for Amharic sentiment analysis begins with manual
annotation, in which each sentence is reviewed by trained annotators (Amharic linguistic
professionals) who assign the most fitting sentiment label based on predefined criteria.
These annotators are skilled at identifying key phrases and contextual signals that align with
each sentiment category. Multiple annotators review each sentence to ensure consistency, and
any disagreements are resolved either through a voting system or by an expert review to
achieve consensus.
The data underwent rigorous preprocessing, including tokenization, character-level
normalization, and noise reduction, to address challenges such as data sparsity and mixed
sentiment labels. Four deep learning models (LSTM, Bi-LSTM, CNN, and fine-tuned
multilingual BERT) were assessed using accuracy, precision, recall, and F1-score as
performance metrics.
Among the models, Multilingual BERT achieved the highest performance, with an accuracy of
87.2%, precision of 0.796, recall of 0.7815, and F1-score of 0.8143, highlighting its capacity to
leverage pre-trained multilingual embeddings for handling Amharic's complex structure. Bi-LSTM, CNN, and LSTM models demonstrated moderate performance, with Bi-LSTM
achieving an accuracy of 81.3% and an F1-score of 0.76, CNN reaching an accuracy of 79.68%
and an F1-score of 0.743, and LSTM achieving an accuracy of 77.12% and an F1-score of
0.752, showcasing strengths in capturing sequential and local patterns but lagging behind
BERT due to the absence of pre-trained embeddings and attention mechanisms.
This study adds to the growing body of research on sentiment analysis for under-resourced
languages, providing insights into the efficacy of deep learning models for categorizing
Amharic sentiment in EBC Facebook comments. The findings underscore the importance of pre-trained
models like BERT in overcoming linguistic challenges and advancing sentiment analysis
applications for public service broadcasters. These results serve as a foundation for further
exploration of Amharic sentiment analysis, with implications for improving public engagement
and decision-making within EBC. | en_US |
dc.language.iso | en | en_US |
dc.publisher | St. Mary’s University | en_US |
dc.subject | Amharic Sentiments, Sentiment Analysis, Deep Learning, BERT, Bi-LSTM, LSTM & CNN | en_US |
dc.title | AMHARIC LANGUAGE SENTIMENT ANALYSIS USING DEEP LEARNING IN THE CASE OF ETHIOPIA BROADCASTING CORPORATION | en_US |
dc.type | Thesis | en_US |
Appears in Collections: | Master of Computer Science |
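
The abstract lists tokenization, character-level normalization, and noise reduction as preprocessing steps. A minimal sketch of what such a step could look like in Python is given below; the homophone variant table and the URL-stripping rule are illustrative assumptions, not the exact procedure used in the thesis.

import re

# Illustrative sketch only: collapse common Amharic homophone character variants
# to one canonical form before tokenization. This small variant table is an
# assumption for demonstration; a complete mapping would cover every vowel order.
HOMOPHONE_MAP = {
    "ሐ": "ሀ", "ኀ": "ሀ",   # ha-type variants
    "ሠ": "ሰ",             # se-type variants
    "ዐ": "አ",             # a-type variants
    "ፀ": "ጸ",             # tse-type variants
}

def normalize_amharic(text):
    # Replace each variant character with its canonical counterpart.
    return "".join(HOMOPHONE_MAP.get(ch, ch) for ch in text)

def preprocess(comment):
    # Noise reduction (strip URLs), then character-level normalization and
    # simple whitespace tokenization.
    comment = re.sub(r"https?://\S+", " ", comment)
    comment = normalize_amharic(comment)
    return comment.split()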
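As a rough illustration of the fine-tuned multilingual BERT classifier described in the abstract, the sketch below loads a pre-trained checkpoint with a five-way classification head via the Hugging Face transformers library; the checkpoint name, maximum sequence length, and example comment are assumptions, not the thesis's configuration.

# Sketch: multilingual BERT with a 5-class sentiment head (assumed setup).
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=5
)

# Encode one hypothetical Amharic comment and predict its sentiment class.
batch = tokenizer(["ጥሩ ዝግጅት ነው"], truncation=True, padding=True,
                  max_length=128, return_tensors="pt")
logits = model(**batch).logits           # shape: (1, 5)
predicted_class = logits.argmax(dim=-1).item()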
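The evaluation metrics reported above (accuracy, precision, recall, and F1-score over five classes) can be computed from model predictions with standard tooling; the sketch below uses scikit-learn and assumes macro averaging, which the abstract does not specify.

# Sketch of the evaluation step for the five sentiment classes.
# Macro averaging is an assumption; the thesis may aggregate differently.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

LABELS = ["strongly negative", "negative", "neutral", "positive", "strongly positive"]

def evaluate(y_true, y_pred):
    accuracy = accuracy_score(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, labels=list(range(len(LABELS))),
        average="macro", zero_division=0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}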