Sebat Bet Gurage (Chaha)-Amharic Machine Translation using Deep Learning

Yirga, Dilu

Full metadata record

DC Field	Value	Language
dc.contributor.author	Yirga, Dilu	-
dc.date.accessioned	2024-04-23T06:45:28Z	-
dc.date.available	2024-04-23T06:45:28Z	-
dc.date.issued	2024-02	-
dc.identifier.uri	http://hdl.handle.net/123456789/7875	-
dc.description.abstract	Natural Language Processing (NLP) is a set of methods that let computers analyze, understand, and derive meaning from human language. Machine translation is a branch of NLP that translates text or speech from one language to another. Since before the thirteenth century, the sociolinguistic group living in the southwest of Ethiopia, in what is now the administrative "Gurage Zone", has been referred to as "Gurage" (“ጉራጌ” for the people and “ጉራጊኛ” for the language). With the advancement of technology, there is a growing need to translate official documents, news, and other written texts between languages, and Sebat Bet Gurage-Amharic is one language pair that needs such translation technology. However, no research has been conducted on machine translation from Sebat Bet Gurage, particularly Chaha, to Amharic. In this study, we developed a Chaha-Amharic machine translation model using an encoder-decoder approach. We collected 5200 Chaha-Amharic parallel sentences from different sources and preprocessed the dataset through cleaning, normalization, and tokenization. We experimented with encoder-decoder models built on LSTM, Bi-LSTM, and GRU networks. In our experiments, the Bi-LSTM model achieved the best BLEU score: 22, compared with 20 for the GRU model and 17 for the LSTM model. The Bi-LSTM model also took a long time to train, 1:30 hours.	en_US
dc.language.iso	en	en_US
dc.publisher	St. Mary's University	en_US
dc.subject	Natural Language Processing, Machine translation	en_US
dc.title	Sebat Bet Gurage (Chaha)-Amharic Machine Translation using Deep Learning	en_US
dc.type	Thesis	en_US
Appears in Collections:	Master of computer science
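
The abstract compares the three models by BLEU score but does not say how the score was computed. As a rough illustration only, the following sketch computes a corpus-level BLEU on a 0-100 scale, assuming one reference translation per sentence, n-gram orders 1 to 4, and no smoothing. The function names and the placeholder tokens are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis, on a 0-100 scale.

    Assumption: inputs are already tokenized lists of strings.
    """
    matched = [0] * max_n   # clipped n-gram matches, per n-gram order
    total = [0] * max_n     # n-grams produced by the system, per n-gram order
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            ref_counts = ngrams(ref, n)
            hyp_counts = ngrams(hyp, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matched) == 0:
        return 0.0  # no smoothing: any order with zero matches gives BLEU 0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty: penalize system output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Placeholder tokens stand in for tokenized Amharic sentences.
reference = [["w1", "w2", "w3", "w4", "w5"]]
too_short = [["w1", "w2", "w3", "w4"]]
score = corpus_bleu(reference, too_short)
```

In this toy case every n-gram in the shorter output matches the reference, so the score is lowered only by the brevity penalty. Published scores are usually produced with an established toolkit, and tokenization and smoothing choices can change the number noticeably.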