A method for Arabic handwritten diacritics characters
Optical Character Recognition (OCR) is the process of converting an image representation of a document into an editable format. People recognize characters without difficulty when reading papers or books. However, developing an OCR system that has the ability to r...
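| Main Authors: | Abdullah, Muhamad Taufik; Alotaibi, Faiz; Azmi Murad, Masrah Azrifah; O.K. Rahmat, Rahmita Wirza; Abdullah, Rusli |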
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Blue Eyes Intelligence Engineering & Sciences Publication, 2019 |
| Online Access: | http://psasir.upm.edu.my/id/eprint/80428/ http://psasir.upm.edu.my/id/eprint/80428/1/ARABIC.pdf |
| _version_ | 1848858903254663168 |
|---|---|
| author | Abdullah, Muhamad Taufik; Alotaibi, Faiz; Azmi Murad, Masrah Azrifah; O.K. Rahmat, Rahmita Wirza; Abdullah, Rusli |
| author_facet | Abdullah, Muhamad Taufik; Alotaibi, Faiz; Azmi Murad, Masrah Azrifah; O.K. Rahmat, Rahmita Wirza; Abdullah, Rusli |
| author_sort | Abdullah, Muhamad Taufik |
| building | UPM Institutional Repository |
| collection | Online Access |
| description | Optical Character Recognition (OCR) is the process of converting an image representation of a document into an editable format. People recognize characters without difficulty when reading papers or books. However, developing an OCR system that can read and recognize Arabic diacritic characters as well as a human does remains a problem. More specifically, the poor recognition rate of most optical diacritic character recognition systems is mainly attributed to failure to segment handwritten text correctly. To overcome this problem, we develop a method based on seven operations. It starts by finding the text-line height, followed by reading words from the line, and then identifies the diacritic regions. Segmentation is also applied during this operation by converting the text-line into a grayscale and a binary image. Moreover, we introduce a new model based on the k-nearest neighbors (KNN) algorithm to identify diacritics and segment characters. The KNN model is trained to predict the diacritic directly from the text-line. Finally, we offer an evaluation and discussion of optical diacritic character recognition. |
| first_indexed | 2025-11-15T12:20:50Z |
| format | Article |
| id | upm-80428 |
| institution | Universiti Putra Malaysia |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T12:20:50Z |
| publishDate | 2019 |
| publisher | Blue Eyes Intelligence Engineering & Sciences Publication |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | upm-80428 2021-01-26T20:37:35Z http://psasir.upm.edu.my/id/eprint/80428/ A method for Arabic handwritten diacritics characters. Abdullah, Muhamad Taufik; Alotaibi, Faiz; Azmi Murad, Masrah Azrifah; O.K. Rahmat, Rahmita Wirza; Abdullah, Rusli. Optical Character Recognition (OCR) is the process of converting an image representation of a document into an editable format. People recognize characters without difficulty when reading papers or books. However, developing an OCR system that can read and recognize Arabic diacritic characters as well as a human does remains a problem. More specifically, the poor recognition rate of most optical diacritic character recognition systems is mainly attributed to failure to segment handwritten text correctly. To overcome this problem, we develop a method based on seven operations. It starts by finding the text-line height, followed by reading words from the line, and then identifies the diacritic regions. Segmentation is also applied during this operation by converting the text-line into a grayscale and a binary image. Moreover, we introduce a new model based on the k-nearest neighbors (KNN) algorithm to identify diacritics and segment characters. The KNN model is trained to predict the diacritic directly from the text-line. Finally, we offer an evaluation and discussion of optical diacritic character recognition. Blue Eyes Intelligence Engineering & Sciences Publication 2019 Article PeerReviewed text en http://psasir.upm.edu.my/id/eprint/80428/1/ARABIC.pdf Abdullah, Muhamad Taufik and Alotaibi, Faiz and Azmi Murad, Masrah Azrifah and O.K. Rahmat, Rahmita Wirza and Abdullah, Rusli (2019) A method for Arabic handwritten diacritics characters. International Journal of Engineering and Advanced Technology, 8 (6S3). pp. 209-212. ISSN 2249-8958 https://www.ijeat.org/wp-content/uploads/papers/v8i6S3/F10340986S319.pdf 10.35940/ijeat.F1034.0986S319 |
| spellingShingle | Abdullah, Muhamad Taufik; Alotaibi, Faiz; Azmi Murad, Masrah Azrifah; O.K. Rahmat, Rahmita Wirza; Abdullah, Rusli. A method for Arabic handwritten diacritics characters |
| title | A method for Arabic handwritten diacritics characters |
| title_full | A method for Arabic handwritten diacritics characters |
| title_fullStr | A method for Arabic handwritten diacritics characters |
| title_full_unstemmed | A method for Arabic handwritten diacritics characters |
| title_short | A method for Arabic handwritten diacritics characters |
| title_sort | method for arabic handwritten diacritics characters |
| url | http://psasir.upm.edu.my/id/eprint/80428/ http://psasir.upm.edu.my/id/eprint/80428/1/ARABIC.pdf |
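
The description above mentions converting a text-line to a grayscale and binary image, finding the text-line height, and using a k-nearest neighbors (KNN) model to identify diacritics. The following is a minimal sketch of those three ideas in plain Python, not the authors' implementation. The function names, the fixed threshold of 128, the row-projection approach to locating ink bands, and the two-value feature vectors (relative size, vertical position) are all assumptions made for illustration; the record does not specify any of them.

```python
from collections import Counter
import math


def to_binary(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background.
    The fixed threshold is an assumption, not a value from the paper."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def ink_bands(binary):
    """Return (top, bottom) row indices of each run of rows that contain ink.
    This is a simple horizontal projection; band height = bottom - top + 1."""
    bands, start = [], None
    for y, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new band begins
        elif not has_ink and start is not None:
            bands.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the last row
        bands.append((start, len(binary) - 1))
    return bands


def knn_predict(train, query, k=3):
    """Majority vote among the k training samples nearest to `query`
    (Euclidean distance). `train` is a list of (feature_vector, label)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 6x4 grayscale image: a small mark above a larger stroke.
gray = [
    [255, 255, 255, 255],
    [255,  40, 255, 255],
    [255, 255, 255, 255],
    [ 30,  20,  25, 255],
    [ 35, 255,  30, 255],
    [255, 255, 255, 255],
]
binary = to_binary(gray)
bounds = ink_bands(binary)                    # [(1, 1), (3, 4)]
heights = [b - t + 1 for t, b in bounds]      # [1, 2]

# Hypothetical features: (relative size, vertical position with 0 = top).
train = [
    ((0.20, 0.10), "fatha"),
    ((0.25, 0.15), "fatha"),
    ((0.20, 0.90), "kasra"),
    ((0.30, 0.85), "kasra"),
]
label = knn_predict(train, (0.22, 0.12), k=3)  # "fatha"
```

In this toy example the small mark shows up as its own one-row band above the taller stroke, which is the kind of region a diacritic detector would then pass to the classifier. A real system would need adaptive thresholding, connected-component analysis, and features learned from labeled handwriting samples.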