Improved Classification of Handwritten Jawi Script Based on Main Part of Script Body
Abstract
Since the arrival of Islam, many ancient relics in the archipelago have been written in Jawi script. These relics can be damaged or destroyed by human or natural factors. To avoid losing this heritage, the data must be stored as digital documents. Converting these digital documents into machine-readable text requires Optical Character Recognition (OCR) technology. In this research, OCR is applied to isolated Jawi script characters. Freeman Chain Code (FCC) is used to extract features from each isolated character, and the FCC features are then fed into a Support Vector Machine (SVM) to classify the character. SVM classification into 19 classes reached an accuracy of 81.58%, while merging the classes into 15 improved the accuracy to 84.21%. Decision rules are then applied to the classes produced by the SVM, using three additional features: dot location, number of dots, and the presence of holes. Dot location is categorized as top, middle, or bottom. The number of dots is obtained by counting the dots, and the presence of holes is determined by detecting holes in the character. Applying the decision rules to the results of the 15-class SVM classification achieved a success rate of 92.86%. Further research will be conducted to determine the effect of the dot location and the number of dots on the shape of the main part of the character.
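As an illustration of the feature extraction step named in the abstract, the following minimal sketch computes an 8-direction Freeman chain code from an ordered pixel path and reduces it to a fixed-length histogram. The abstract does not specify the direction numbering, the tracing procedure, or the exact feature vector passed to the SVM, so the numbering convention, the function names `freeman_chain_code` and `chain_code_histogram`, and the normalised histogram are all assumptions made for this example.

```python
from collections import Counter

# Direction codes 0..7, counter-clockwise starting from east (an assumed,
# commonly used convention). Offsets are (d_row, d_col) with rows growing
# downward, as in image arrays.
DIRECTIONS = {
    (0, 1): 0,    # east
    (-1, 1): 1,   # north-east
    (-1, 0): 2,   # north
    (-1, -1): 3,  # north-west
    (0, -1): 4,   # west
    (1, -1): 5,   # south-west
    (1, 0): 6,    # south
    (1, 1): 7,    # south-east
}


def freeman_chain_code(path):
    """Return the 8-direction chain code of an ordered (row, col) pixel path."""
    codes = []
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in DIRECTIONS:
            raise ValueError(
                f"pixels {(r0, c0)} and {(r1, c1)} are not 8-connected neighbours"
            )
        codes.append(DIRECTIONS[step])
    return codes


def chain_code_histogram(codes):
    """Normalised 8-bin histogram of direction codes (hypothetical feature)."""
    if not codes:
        return [0.0] * 8
    counts = Counter(codes)
    return [counts.get(d, 0) / len(codes) for d in range(8)]


# Hypothetical thinned stroke: two steps east, two south-east, one south.
stroke = [(0, 0), (0, 1), (0, 2), (1, 3), (2, 4), (3, 4)]
codes = freeman_chain_code(stroke)
features = chain_code_histogram(codes)
```

A histogram is used here only because an SVM needs inputs of equal length, while raw chain codes vary in length with the stroke. Other length normalisation schemes would serve the same purpose.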
Copyright (c) 2023 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that the RESTI Journal (System Engineering and Information Technology) is the first publisher, publishing under a Creative Commons Attribution 4.0 International License.
- Authors can enter into separate arrangements for the non-exclusive distribution of manuscripts published in this journal in other versions (e.g., deposit in the author's institutional repository or publication in a book), with an acknowledgment that the manuscript was first published in the RESTI (Rekayasa Sistem dan Teknologi Informasi) journal.