Improving Afaan Oromo Question Answering System: Definition, List and Description Question Types for Non-factoid Questions

Daba, Endale

St. Mary's University Institutional Repository

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/6240

Title:	Improving Afaan Oromo Question Answering System: Definition, List and Description Question Types for Non-factoid Questions
Authors:	Daba, Endale
Keywords:	Non-factoid Question Answering, Afaan Oromo Question Answering System, Description Question Types, Question Classification, Document Filtering, Sentence Extraction, Answer Selection, Rule-Based
Issue Date:	Jul-2021
Publisher:	ST. MARY’S UNIVERSITY
Abstract:	Question Answering (QA) goes beyond the retrieval of relevant documents and offers efficient access to information in text data. The task of QA is to find an accurate and precise answer to a natural language question from a source text. The existing Afaan Oromo QA systems handle questions that usually take named entities as answers; other Afaan Oromo question types, such as list, definition and description questions, are not addressed. The goal of this study is to propose approaches that tackle important problems in Afaan Oromo non-factoid QA, specifically in list, definition and description questions. The proposed QA system comprises document preprocessing, question analysis, document analysis, and answer extraction components. Rule-based techniques are used for question classification. The document analysis component retrieves relevant documents and filters them using filtering patterns for list, definition and description questions: a retrieved document is retained only if it contains all terms in the target in the same order as in the question. The answer extraction component works type by type. The extracted sentences are scored and ranked, and the answer selection algorithm then selects the top 5 non-redundant sentences from the candidate answer set. Finally, the sentences are ordered to keep their coherence. Question classification is evaluated by percentage ratio, with 98.3% of questions classified correctly. The document retrieval component is tested on two data sets, one analyzed by a stemmer and the other by a morphological analyzer. The F-score on the stemmed documents is 0.729, and on the other data set it is 0.764. Moreover, the average F-score of the answer extraction component is 0.592.
URI:	http://hdl.handle.net/123456789/6240
Appears in Collections:	Master of computer science
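
The document filtering rule and the top-5 answer selection step described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the regex tokenizer, the reading of "same order" as an ordered (not necessarily contiguous) match, and the Jaccard-overlap test for redundancy with its 0.5 threshold are all assumptions made for illustration, and the sample strings are English placeholders rather than Afaan Oromo data.

```python
import re


def tokenize(text):
    # Hypothetical tokenizer: lowercase word tokens, keeping apostrophes.
    return re.findall(r"[\w']+", text.lower())


def contains_terms_in_order(target, document):
    # True if every target term occurs in the document in the same order.
    # Assumption: other words may appear between the target terms.
    doc_tokens = iter(tokenize(document))
    # `term in doc_tokens` consumes the iterator up to the match, so each
    # later term is searched only after the position of the previous one.
    return all(term in doc_tokens for term in tokenize(target))


def filter_documents(target, documents):
    # Keep only the retrieved documents that satisfy the ordering rule.
    return [doc for doc in documents if contains_terms_in_order(target, doc)]


def jaccard(a, b):
    # Token-set overlap, used here as an assumed redundancy measure.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def select_answers(scored_sentences, k=5, max_overlap=0.5):
    # Walk the candidates from highest to lowest score and keep a sentence
    # only if it does not overlap too much with one already selected.
    selected = []
    ranked = sorted(scored_sentences, key=lambda pair: pair[1], reverse=True)
    for sentence, _score in ranked:
        tokens = tokenize(sentence)
        if all(jaccard(tokens, tokenize(s)) <= max_overlap for s in selected):
            selected.append(sentence)
        if len(selected) == k:
            break
    return selected


if __name__ == "__main__":
    docs = [
        "question answering system overview",
        "answering the question",
        "a question about answering",
    ]
    print(filter_documents("question answering", docs))
    print(select_answers([("a b c", 0.9), ("a b c d", 0.8), ("x y z", 0.7)], k=2))
```

Under these assumptions the second placeholder document is dropped because its terms appear in the reverse order, and the near-duplicate candidate sentence is skipped in favour of a lower-scored but distinct one.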

Files in This Item:

File	Description	Size	Format
Endale- Afan Oromo NLP 2021.pdf		1.36 MB	Adobe PDF
