St. Mary's University Institutional Repository

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/7506
Title: Attention-based Neural Machine Translation from English-Wolaytta
Authors: Melese, Mekdes
Keywords: Machine Translation, Neural Machine Translation, English, Wolaytta, Attention Mechanism, Encoder-Decoder Architecture, Natural Language Processing
Issue Date: Jan-2023
Publisher: ST. MARY’S UNIVERSITY
Abstract: Machine translation (MT) is an application of natural language processing that uses computers to translate text from a source language into a target language. For many years, Statistical Machine Translation (SMT) dominated the field of machine translation technology. Classical SMT breaks long sentences into small pieces, which results in poor accuracy. Neural Machine Translation (NMT), a new paradigm that emerged with the development of deep learning, swiftly superseded SMT as the predominant MT method. The NMT approach differs from SMT systems in that all components of the neural translation model are trained jointly (end to end) to maximize translation performance. In an encoder-decoder design, the entire input source sequence is condensed into a single context vector, which is then passed to the decoder to generate the output sequence. The major drawback of the encoder-decoder model is that it works well only on short sequences: it is difficult for the encoder to memorize a long sequence and compress it into a fixed-length vector. One realistic solution to this problem is the attention mechanism, which predicts the next word by concentrating on a few relevant parts of the source sequence rather than looking at the entire sequence. Hence, the objective of this research work is to develop a neural machine translation system for English-Wolaytta using an attention mechanism. The English-Wolaytta machine translation system has been trained on a parallel corpus covering religious text and frequently used sentences and phrases from day-to-day communication. A total of 27,351 parallel English-Wolaytta sentences were prepared, and the system was trained and tested using an 80/20 split. The data were preprocessed into a format suitable for neural machine translation.
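The attention mechanism described above can be sketched in a few lines. The function below is a minimal, illustrative implementation of Luong-style dot-product attention (not the thesis's actual code): at each decoding step, the decoder state is scored against every encoder hidden state, the scores are normalized with a softmax, and the weighted sum of encoder states forms the context vector, replacing the single fixed-length vector of the plain encoder-decoder.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_product_attention(decoder_state, encoder_states):
    """Return (context vector, attention weights) for one decoding step.

    Illustrative dot-product (Luong-style) attention: each encoder
    hidden state is scored by its dot product with the current decoder
    state, and the context vector is the attention-weighted sum of the
    encoder states.
    """
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return context, weights
```

Because the weights are recomputed at every step, the decoder can attend to different source words when emitting each target word, which is what lets attention models cope with long sentences.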
To build the proposed English-Wolaytta NMT model, an LSTM encoder-decoder architecture with an attention mechanism was designed within the sequence-to-sequence framework. The BLEU score metric is used to evaluate the efficiency of the proposed system, and to test the contribution of the attention mechanism, a non-attention model was developed and compared with the attention-based model. The results show that the attention-based model produces better translations, achieving a BLEU score of 5.16 and an accuracy of 88.65.
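For context on the evaluation metric, the following is a simplified, single-reference sketch of sentence-level BLEU (not the thesis's evaluation script): the geometric mean of modified n-gram precisions up to 4-grams, multiplied by a brevity penalty that discourages overly short translations.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-grams of a token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    """Simplified sentence-level BLEU against a single reference.

    Geometric mean of modified n-gram precisions (clipped by reference
    counts, lightly smoothed to avoid log(0)) times a brevity penalty.
    """
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: 1 if the hypothesis is at least as long as the
    # reference, exponentially smaller otherwise.
    if len(hypothesis) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / max(len(hypothesis), 1))
    return bp * math.exp(log_avg)
```

A perfect translation scores 1.0 on this 0-1 scale; reported corpus BLEU is conventionally that value times 100, so the thesis's 5.16 corresponds to 0.0516 here.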
URI: http://hdl.handle.net/123456789/7506
Appears in Collections: Master of Computer Science

Files in This Item:
Mekdes Melese.pdf (1.17 MB, Adobe PDF)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.