DC Field | Value | Language |
dc.contributor.author | Nekatibeb, Haimanot | - |
dc.date.accessioned | 2025-07-01T12:14:10Z | - |
dc.date.available | 2025-07-01T12:14:10Z | - |
dc.date.issued | 2025-02 | - |
dc.identifier.uri | http://hdl.handle.net/123456789/8778 | - |
dc.description.abstract | Sentiment analysis has become an essential tool for understanding public opinion on social
media platforms, particularly for organizations like the Ethiopian Broadcasting Corporation
(EBC). However, the unique linguistic features of Amharic, a morphologically rich and under-resourced language, pose significant challenges to developing accurate sentiment analysis
models. This research explores sentiment analysis on Amharic Facebook comments related to
EBC using state-of-the-art deep learning models.
A comprehensive dataset of 22,000 labeled Amharic Facebook comments is prepared, with
sentiments classified into five classes: positive, neutral, negative, strongly positive, and
strongly negative. The labeling workflow for Amharic sentiment analysis begins with manual
annotation, in which each sentence is reviewed by trained annotators (Amharic linguistic
professionals) who assign the most fitting sentiment label based on predefined criteria.
These annotators are skilled at identifying key phrases and contextual signals that align with
each sentiment category. Multiple annotators review each sentence to ensure consistency, and
any disagreements are resolved either through a voting system or by an expert review to
achieve consensus.
The data underwent rigorous preprocessing, including tokenization, character-level
normalization, and noise reduction, to address challenges such as data sparsity and mixed
sentiment labels. Four deep learning models (LSTM, Bi-LSTM, CNN, and fine-tuned
multilingual BERT) were assessed using accuracy, precision, recall, and F1-score as
performance metrics.
Among the models, Multilingual BERT achieved the highest performance, with an accuracy of
87.2%, precision of 0.796, recall of 0.7815, and F1-score of 0.8143, highlighting its capacity to
leverage pre-trained multilingual embeddings for handling Amharic's complex structure. Bi-LSTM, CNN, and LSTM models demonstrated moderate performance, with Bi-LSTM
achieving an accuracy of 81.3% and an F1-score of 0.76, CNN reaching an accuracy of 79.68%
and an F1-score of 0.743, and LSTM achieving an accuracy of 77.12% and an F1-score of
0.752, showcasing strengths in capturing sequential and local patterns but lagging behind
BERT due to the absence of pre-trained embeddings and attention mechanisms.
This study adds to the growing body of research on sentiment analysis for under-resourced
languages, providing insights into the efficacy of deep learning models for categorizing
Amharic sentiment in EBC Facebook comments. The findings underscore the importance of pre-trained
models like BERT in overcoming linguistic challenges and advancing sentiment analysis
applications for public service broadcasters. These results serve as a foundation for further
exploration of Amharic sentiment analysis, with implications for improving public engagement
and decision-making within EBC. | en_US |
dc.language.iso | en | en_US |
dc.publisher | St. Mary’s University | en_US |
dc.subject | Amharic Sentiments, Sentiment Analysis, Deep Learning, BERT, Bi-LSTM, LSTM & CNN | en_US |
dc.title | AMHARIC LANGUAGE SENTIMENT ANALYSIS USING DEEP LEARNING IN THE CASE OF ETHIOPIA BROADCASTING CORPORATION | en_US |
dc.type | Thesis | en_US |
Appears in Collections: | Master of Computer Science |
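
The abstract lists tokenization, character-level normalization, and noise reduction as preprocessing steps. A minimal sketch of what such a step could look like in Python is given below; the homophone variant table and the URL-stripping rule are illustrative assumptions, not the exact procedure used in the thesis.

import re

# Illustrative sketch only: collapse common Amharic homophone character variants
# to one canonical form before tokenization. This small variant table is an
# assumption for demonstration; a complete mapping would cover every vowel order.
HOMOPHONE_MAP = {
    "ሐ": "ሀ", "ኀ": "ሀ",   # ha-type variants
    "ሠ": "ሰ",             # se-type variants
    "ዐ": "አ",             # a-type variants
    "ፀ": "ጸ",             # tse-type variants
}

def normalize_amharic(text):
    # Replace each variant character with its canonical counterpart.
    return "".join(HOMOPHONE_MAP.get(ch, ch) for ch in text)

def preprocess(comment):
    # Noise reduction (strip URLs), then character-level normalization and
    # simple whitespace tokenization.
    comment = re.sub(r"https?://\S+", " ", comment)
    comment = normalize_amharic(comment)
    return comment.split()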
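As a rough illustration of the fine-tuned multilingual BERT classifier described in the abstract, the sketch below loads a pre-trained checkpoint with a five-way classification head via the Hugging Face transformers library; the checkpoint name, maximum sequence length, and example comment are assumptions, not the thesis's configuration.

# Sketch: multilingual BERT with a 5-class sentiment head (assumed setup).
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=5
)

# Encode one hypothetical Amharic comment and predict its sentiment class.
batch = tokenizer(["ጥሩ ዝግጅት ነው"], truncation=True, padding=True,
                  max_length=128, return_tensors="pt")
logits = model(**batch).logits           # shape: (1, 5)
predicted_class = logits.argmax(dim=-1).item()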
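The evaluation metrics reported above (accuracy, precision, recall, and F1-score over five classes) can be computed from model predictions with standard tooling; the sketch below uses scikit-learn and assumes macro averaging, which the abstract does not specify.

# Sketch of the evaluation step for the five sentiment classes.
# Macro averaging is an assumption; the thesis may aggregate differently.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

LABELS = ["strongly negative", "negative", "neutral", "positive", "strongly positive"]

def evaluate(y_true, y_pred):
    accuracy = accuracy_score(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, labels=list(range(len(LABELS))),
        average="macro", zero_division=0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}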