Chinese character recognition using non-negative matrix factorization
Non-negative matrix factorization (NMF), introduced by Paatero and Tapper in 1994, is a general way of reducing the dimension of a matrix with non-negative entries. NMF is useful in many data analysis applications, such as character recognition, text min...
| Main Authors: | Chen, Huey Voon; Tang, Ker Shin; Ng, Wei Shean |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Penerbit Universiti Kebangsaan Malaysia 2024 |
| Online Access: | http://journalarticle.ukm.my/25254/ http://journalarticle.ukm.my/25254/1/kejut_24.pdf |
| _version_ | 1848816308601225216 |
|---|---|
| author | Chen, Huey Voon Tang, Ker Shin Ng, Wei Shean |
| author_facet | Chen, Huey Voon Tang, Ker Shin Ng, Wei Shean |
| author_sort | Chen, Huey Voon |
| building | UKM Institutional Repository |
| collection | Online Access |
| description | Non-negative matrix factorization (NMF), introduced by Paatero and Tapper in 1994, is a general way of reducing the dimension of a matrix with non-negative entries. It is useful in many data analysis applications, such as character recognition and text mining. This paper studies the application of NMF to Chinese character recognition. Python was used to carry out the LU factorization and the NMF of a Chinese character represented as a Boolean matrix. Preliminary analysis confirmed that the data size of and and are chosen for the NMF of the Boolean matrix. In this project, one hundred printed Chinese characters were selected and grouped into ten categories according to the number of strokes , for . The Euclidean distance between the Boolean matrix of a Chinese character and the matrix obtained after LU factorization, and likewise after NMF, was calculated for further analysis. A paired t-test confirmed that NMF factorizes the Boolean matrices of Chinese characters better than LU factorization. Finally, ten handwritten Chinese characters were selected to test whether the program can identify handwritten as well as printed Chinese characters. Experimental results showed that 70% of the characters were recognized via the least Euclidean distance obtained. NMF is suitable for Chinese character recognition because it reduces the dimension of the image, and the error between the original Boolean matrix and the matrix after NMF is less than 5%. |
| first_indexed | 2025-11-15T01:03:49Z |
| format | Article |
| id | oai:generic.eprints.org:25254 |
| institution | Universiti Kebangsaan Malaysia |
| institution_category | Local University |
| language | English |
| last_indexed | 2025-11-15T01:03:49Z |
| publishDate | 2024 |
| publisher | Penerbit Universiti Kebangsaan Malaysia |
| recordtype | eprints |
| repository_type | Digital Repository |
| spelling | oai:generic.eprints.org:25254 2025-05-23T13:30:59Z http://journalarticle.ukm.my/25254/ Chinese character recognition using non-negative matrix factorization Chen, Huey Voon Tang, Ker Shin Ng, Wei Shean Penerbit Universiti Kebangsaan Malaysia 2024 Article PeerReviewed application/pdf en http://journalarticle.ukm.my/25254/1/kejut_24.pdf Chen, Huey Voon and Tang, Ker Shin and Ng, Wei Shean (2024) Chinese character recognition using non-negative matrix factorization. Jurnal Kejuruteraan, 36 (2). pp. 653-660. ISSN 0128-0198 https://www.ukm.my/jkukm/volume-3602-2024/ |
| spellingShingle | Chen, Huey Voon Tang, Ker Shin Ng, Wei Shean Chinese character recognition using non-negative matrix factorization |
| title | Chinese character recognition using non-negative matrix factorization |
| url | http://journalarticle.ukm.my/25254/ http://journalarticle.ukm.my/25254/1/kejut_24.pdf |
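
The abstract describes factorizing the Boolean matrix of a character with NMF and recognizing a character by the least Euclidean distance. The following is a minimal sketch of that idea only, not the authors' program: it assumes Lee-Seung multiplicative updates as the NMF solver (the record does not say which solver was used), uses two hypothetical 5x5 toy glyphs instead of the paper's data, and picks rank 2 arbitrarily. The function names (`nmf`, `recognize`, and the helpers) are illustrative. The LU factorization and the paired t-test are not covered.

```python
import math
import random


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def euclidean(A, B):
    """Euclidean (Frobenius) distance between two equally sized matrices."""
    return math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(A, B) for a, b in zip(ra, rb)))


def nmf(V, rank, iters=300, seed=0, eps=1e-9):
    """Approximate V ~ W H with non-negative W (n x rank) and H (rank x m).

    Assumption: Lee-Seung multiplicative updates for the Frobenius objective.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(rh, ra, rd)]
             for rh, ra, rd in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(rw, ra, rd)]
             for rw, ra, rd in zip(W, num, den)]
    return W, H


def recognize(sample, templates, rank=2):
    """Return the label whose NMF reconstruction is closest to `sample`."""
    scores = {}
    for label, V in templates.items():
        W, H = nmf(V, rank)
        scores[label] = euclidean(sample, matmul(W, H))
    return min(scores, key=scores.get), scores


# Hypothetical toy glyphs (not from the paper): rough 5x5 bitmaps.
TEMPLATES = {
    "shi": [[0, 0, 1, 0, 0],   # stands in for the character 十
            [0, 0, 1, 0, 0],
            [1, 1, 1, 1, 1],
            [0, 0, 1, 0, 0],
            [0, 0, 1, 0, 0]],
    "kou": [[1, 1, 1, 1, 1],   # stands in for the character 口
            [1, 0, 0, 0, 1],
            [1, 0, 0, 0, 1],
            [1, 0, 0, 0, 1],
            [1, 1, 1, 1, 1]],
}

# A "handwritten" variant: the first glyph with one pixel missing.
sample = [row[:] for row in TEMPLATES["shi"]]
sample[0][2] = 0

label, scores = recognize(sample, TEMPLATES)
print("closest template:", label)
print("distances:", scores)
```

The relative error of a reconstruction can be checked the same way, by dividing `euclidean(V, matmul(W, H))` by the distance of `V` from the zero matrix; the value depends on the glyph size, the chosen rank, and the number of iterations.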