St. Mary's University Institutional Repository

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/8778
Title: AMHARIC LANGUAGE SENTIMENT ANALYSIS USING DEEP LEARNING IN THE CASE OF ETHIOPIA BROADCASTING CORPORATION
Authors: Nekatibeb, Haimanot
Keywords: Amharic Sentiments, Sentiment Analysis, Deep Learning, BERT, Bi-LSTM, LSTM & CNN
Issue Date: Feb-2025
Publisher: St. Mary’s University
Abstract: Sentiment analysis has become an essential tool for understanding public opinion on social media platforms, particularly for organizations like the Ethiopian Broadcasting Corporation (EBC). However, the unique linguistic features of Amharic, a morphologically rich and under-resourced language, pose significant challenges to developing accurate sentiment analysis models. This research explores sentiment analysis on Amharic Facebook comments related to EBC using state-of-the-art deep learning models. A comprehensive dataset of 22,000 labeled Amharic Facebook comments is prepared, with sentiments classified into five classes: positive, neutral, negative, strongly positive, and strongly negative. The labeling workflow begins with manual annotation: each sentence is reviewed by trained annotators, all Amharic linguistics professionals, who assign the most fitting sentiment label based on predefined criteria. These annotators are skilled at identifying key phrases and contextual signals that align with each sentiment category. Multiple annotators review each sentence to ensure consistency, and any disagreements are resolved either through a voting system or by expert review to reach consensus. The data underwent rigorous preprocessing, including tokenization, character-level normalization, and noise reduction, to address challenges such as data sparsity and mixed sentiment labels. Four deep learning models (LSTM, Bi-LSTM, CNN, and fine-tuned Multilingual BERT) were assessed using accuracy, precision, recall, and F1-score as performance metrics. Among the models, Multilingual BERT achieved the highest performance, with an accuracy of 87.2%, precision of 0.796, recall of 0.7815, and F1-score of 0.8143, highlighting its capacity to leverage pre-trained multilingual embeddings to handle Amharic's complex structure.
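The character-level normalization and tokenization steps described above can be illustrated with a minimal pure-Python sketch. The homophone mappings below (e.g. ሐ/ኀ → ሀ, ሠ → ሰ) are common choices in Amharic NLP pipelines and are an assumption here, not the exact table used in the thesis:

```python
# Minimal sketch of Amharic character-level normalization plus
# whitespace tokenization. The homophone groups below are commonly
# collapsed in Amharic text preprocessing; a full pipeline would
# cover every vowel order of each variant consonant.
NORMALIZATION_MAP = {
    "ሐ": "ሀ", "ኀ": "ሀ",   # h-variants collapsed to ሀ
    "ሠ": "ሰ",              # s-variant collapsed to ሰ
    "ዐ": "አ",              # glottal variant collapsed to አ
    "ፀ": "ጸ",              # ts'-variant collapsed to ጸ
}

# Ethiopic punctuation marks treated as token boundaries.
ETHIOPIC_PUNCT = "።፣፤፥፦፧"


def normalize(text: str) -> str:
    """Replace each variant character with its canonical form."""
    return "".join(NORMALIZATION_MAP.get(ch, ch) for ch in text)


def tokenize(text: str) -> list[str]:
    """Strip Ethiopic punctuation and split on whitespace."""
    for mark in ETHIOPIC_PUNCT:
        text = text.replace(mark, " ")
    return text.split()
```

For example, normalize("ሠላም") yields "ሰላም", so the two spellings of the word map to a single vocabulary entry, which is exactly the sparsity reduction the preprocessing step is after.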
Bi-LSTM, CNN, and LSTM demonstrated moderate performance: Bi-LSTM achieved an accuracy of 81.3% and an F1-score of 0.76, CNN an accuracy of 79.68% and an F1-score of 0.743, and LSTM an accuracy of 77.12% and an F1-score of 0.752. These models show strengths in capturing sequential and local patterns but lag behind BERT owing to the absence of pre-trained embeddings and attention mechanisms. This study adds to the growing body of research on sentiment analysis for under-resourced languages, providing insights into the efficacy of deep learning models for categorizing Amharic sentiment in EBC Facebook comments. The findings underscore the importance of pre-trained models like BERT in overcoming linguistic challenges and advancing sentiment analysis applications for public service broadcasters. These results serve as a foundation for further exploration of Amharic sentiment analysis, with implications for improving public engagement and decision-making within EBC.
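The metrics used to compare the four models can be sketched in plain Python. The toy labels below are illustrative only, not the thesis data, and the macro averaging shown is one common convention; the thesis may use a weighted average instead:

```python
# Sketch of accuracy and macro-averaged precision/recall/F1, the
# metrics used to compare the models. Macro averaging computes each
# metric per class and averages the per-class scores.
def evaluate(y_true, y_pred, classes):
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```

With the five-way label set from the abstract this would be called as evaluate(gold, predicted, ["positive", "neutral", "negative", "strongly positive", "strongly negative"]), giving one comparable (accuracy, precision, recall, F1) tuple per model.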
URI: http://hdl.handle.net/123456789/8778
Appears in Collections: Master of Computer Science

Files in This Item:
File: Haimanot Nekatibeb.pdf
Size: 3.37 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.