DC Field | Value | Language |
dc.contributor.author | Wolderufael, Yared | - |
dc.date.accessioned | 2024-05-15T07:20:09Z | - |
dc.date.available | 2024-05-15T07:20:09Z | - |
dc.date.issued | 2024-02 | - |
dc.identifier.uri | http://hdl.handle.net/123456789/7888 | - |
dc.description.abstract | Textual communication is globally prevalent, with individuals relying on email and social networking platforms to exchange information. Typing complete text can be time-consuming, and word prediction systems save time by anticipating the next word during data entry. Although language models have been developed for many languages, research on prediction models for Amharic is limited. Existing studies primarily use statistical language models for Amharic prediction, which struggle with data sparsity and fail to capture long-term dependencies. To address these limitations, this study proposes a deep learning approach to Amharic next-word prediction. The dataset is collected and preprocessed, yielding a vocabulary of 18,085 unique words. Bi-directional Long Short-Term Memory (Bi-LSTM) models are employed, along with popular pre-trained word embedding models (Word2vec, FastText, GloVe, and Keras) for feature extraction. Experiments cover various hyperparameter values and two optimization methods (Adam and Nadam), which significantly influence model training and performance. The models are evaluated and compared on accuracy to identify the most effective solution for Amharic word sequence prediction. Among the tested models, the FastText model combined with the Bi-LSTM architecture and the Adam optimizer achieves the highest training accuracy (97.5%) and validation accuracy (95.6%), surpassing the other embedding methods. This research contributes to Amharic language model development, demonstrating the capacity to capture long-term dependencies and accurately predict the next word in Amharic text. The findings highlight the potential of Bi-LSTM-based approaches for enhancing text prediction systems. | en_US |
dc.language.iso | en | en_US |
dc.publisher | St. Mary's University | en_US |
dc.subject | Word prediction, Amharic language, Bi-LSTM, Word embedding, FastText, Long-term dependencies | en_US |
dc.title | WORD SEQUENCE PREDICTION FOR AMHARIC LANGUAGE USING DEEP LEARNING | en_US |
dc.type | Thesis | en_US |
Appears in Collections: | Master of computer science