Customers Identity Card Data Detection and Recognition Using Image Processing

Tamirat, Chala

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/7500

Title:	Customers Identity Card Data Detection and Recognition Using Image Processing
Authors:	Tamirat, Chala
Keywords:	Amharic text information extraction, Accuracy, Identity card, OCR, Page layout segmentation, Precision, Recall
Issue Date:	Jan-2023
Publisher:	ST. MARY’S UNIVERSITY
Abstract:	Many business sectors require the information contained in an ID card to complete their registration processes. Previously, customer data was entered manually, so a system that processes this information automatically is needed. Image processing can be used as an alternative to manual input. The process starts by extracting information from the ID card image, which is first pre-processed to isolate the necessary part of the image. This research follows an experimental approach, in which independent variables are manipulated or introduced and all other variables are carefully controlled so that the experimenter can measure the dependent variable. To conduct an extensive experiment, image data is first captured from customers’ identity cards and prepared using image pre-processing. The main objective of this study is to detect and identify Amharic text on customers’ identity cards by applying effective page segmentation that can distinguish text blocks from non-text blocks. To achieve this goal, page layout segmentation is performed to detect and identify the information captured from the ID cards. First, image pre-processing techniques, namely skew and perspective correction, are applied to make the collected document images ready for processing. Then, binarization methods are used to address lighting issues; in the experiments, Sauvola’s method worked better and faster. The second process is segmentation, which applies page layout segmentation techniques, morphological dilation, and connected component (CC) analysis to separate graphics from the text area and to segment text-line areas. For document images containing a small amount of noise, the system achieved 90.87% precision and 98.40% recall without skew correction. After the proposed skew and perspective rectification was applied, 93.6% precision and 100% recall were registered.
This study sought to detect and identify Amharic text on customers’ ID cards. Customer ID cards have varied physical and logical layouts, including complicated graphics, logos, and pictures. The study adopts Google Tesseract OCR for recognizing Amharic ID card documents; however, recognition accuracy depends on the quality of the ID cards. The study focuses on determining and identifying sample attributes. Further research is therefore needed to determine the overall layout of every scanned ID card and to extract its format and content.
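The binarization and segmentation steps named in the abstract can be illustrated with a minimal, pure-Python sketch. It is not the thesis implementation: the function names, the 3x3 window, k = 0.2, R = 128, and the horizontal structuring element are all assumptions chosen for illustration, and a real system would more likely rely on an image-processing library. The sketch applies Sauvola's local threshold T = m * (1 + k * (s / R - 1)), dilates the result horizontally so that neighbouring characters merge, and then takes the bounding boxes of the connected components as candidate text lines.

```python
import math
from collections import deque


def sauvola_binarize(gray, window=3, k=0.2, R=128.0):
    """Binarize a grey image (list of rows); 1 marks dark (ink) pixels.

    Threshold per pixel: T = m * (1 + k * (s / R - 1)), where m and s are
    the mean and standard deviation inside a window centred on the pixel.
    window, k and R are illustrative defaults, not values from the thesis.
    """
    h, w = len(gray), len(gray[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [gray[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            m = sum(vals) / len(vals)
            s = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
            t = m * (1 + k * (s / R - 1))
            out[y][x] = 1 if gray[y][x] < t else 0
    return out


def dilate_horizontal(binary, half_width=1):
    """Morphological dilation with a 1 x (2 * half_width + 1) element."""
    w = len(binary[0])
    return [[1 if any(row[max(0, x - half_width):x + half_width + 1]) else 0
             for x in range(w)]
            for row in binary]


def component_boxes(binary):
    """Bounding boxes (top, left, bottom, right) of 8-connected components."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            top, left, bottom, right = y0, x0, y0, x0
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Calling component_boxes(dilate_horizontal(sauvola_binarize(gray))) on a small grey-level image returns one box per merged run of dark pixels. On a real ID card the window size and the dilation width would need tuning to the character size, and large components such as photographs or logos would still have to be filtered out as non-text.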
URI:	http://hdl.handle.net/123456789/7500
Appears in Collections:	Master of computer science

Files in This Item:

File	Description	Size	Format
SMU_Chala_Tamirat_Customers_Identity_Card_Data_Detection_and_Recognition(2).pdf		1.7 MB	Adobe PDF