Penerapan Convolutional Neural Networks untuk Mesin Penerjemah Bahasa Daerah Minangkabau Berbasis Gambar

Application of Convolutional Neural Networks for Image-Based Minangkabau Language Translator Machines

  • Mayanda Mega Santoni UPN Veteran Jakarta
  • Nurul Chamidah UPN Veteran Jakarta
  • Desta Sandya Prasvita UPN Veteran Jakarta
  • Helena Nurramdhani Irmanda UPN Veteran Jakarta
  • Ria Astriratma UPN Veteran Jakarta
  • Reza Amarta Prayoga Kementerian Pendidikan dan Kebudayaan
Keywords: Convolutional Neural Networks, Translation, Indonesian Language, Minangkabau Regional Language, Optical Character Recognition (OCR)

Abstract

One way the Indonesian people can defend the country is to preserve and maintain its regional languages. In the current era of modernity, regional languages are perceived as old-fashioned, so most of them are no longer spoken. If this is ignored, the resulting cultural identity crisis will leave regional languages vulnerable to extinction. Technological developments offer a way to preserve them: image-based artificial intelligence that uses machine learning, such as machine translation, can help address the problem. This research uses a deep learning method, namely Convolutional Neural Networks (CNN). The data comprise 1,300 alphabetic character images, 5,000 text images, and a 200-word vocabulary of the Minangkabau regional language. The alphabetic image data are used to build the CNN classification model. This model is then used to recognize text in images, and the recognized Indonesian text is translated into the regional language. The CNN model reaches an accuracy of 98.97%, while text image recognition (OCR) reaches only 50.72%. This low accuracy is due to segmentation failures on the letters i and j. However, translation accuracy rises to 75.78% after applying the Levenshtein Distance algorithm, which can correct text classification errors. This research has therefore succeeded in implementing the Convolutional Neural Networks (CNN) method to identify text in text images and the Levenshtein Distance method to translate Indonesian text into regional-language text.
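
The correction step described in the abstract can be illustrated with a minimal sketch. It assumes that each OCR word is matched to the closest Indonesian entry in a bilingual vocabulary by Levenshtein distance and then replaced by its Minangkabau counterpart; the abstract does not spell out the exact procedure. The names `levenshtein`, `correct_and_translate`, `VOCABULARY`, and the `max_distance` threshold are hypothetical, and the two vocabulary entries are illustrative placeholders rather than items from the paper's 200-word list.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        previous = current
    return previous[-1]


# Illustrative Indonesian -> Minangkabau entries (assumed, not from the paper).
VOCABULARY = {"tidak": "indak", "apa": "apo"}


def correct_and_translate(ocr_word, vocabulary=VOCABULARY, max_distance=2):
    """Map a possibly misrecognized OCR word to the translation of the
    nearest vocabulary entry, or None if nothing is close enough.
    The max_distance cutoff is an assumption made for this sketch."""
    word = ocr_word.lower()
    best = min(vocabulary, key=lambda entry: levenshtein(word, entry))
    if levenshtein(word, best) > max_distance:
        return None
    return vocabulary[best]
```

For example, if a segmentation error turns the letter i into l and the OCR stage outputs "tldak", the nearest vocabulary entry is "tidak" at distance 1, so `correct_and_translate("tldak")` returns its paired translation. A word that is far from every entry returns `None` instead of a forced match.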

 


Published
2021-12-30
How to Cite
Mayanda Mega Santoni, Nurul Chamidah, Desta Sandya Prasvita, Helena Nurramdhani Irmanda, Ria Astriratma, & Reza Amarta Prayoga. (2021). Penerapan Convolutional Neural Networks untuk Mesin Penerjemah Bahasa Daerah Minangkabau Berbasis Gambar. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 5(6), 1153 - 1160. https://doi.org/10.29207/resti.v5i6.3614
Section
Information Technology Articles