Multi-Accent Speaker Detection Using Normalize Feature MFCC Neural Network Method

Kristiawan Nugroho; Edy Winarno; Eri Zuliarso; Sunardi

doi:10.29207/resti.v7i4.4652

Kristiawan Nugroho Universitas Stikubank
Edy Winarno Universitas Stikubank
Eri Zuliarso Universitas Stikubank
Sunardi Universitas Stikubank

Keywords: speaker recognition, classification, multi-accent, MFCC

Abstract

Speaker recognition remains an active field of research, and various methods have been developed to recognize the human voice with greater precision and accuracy. Accent recognition is a particularly challenging problem in this area: detecting the accents of speakers from different ethnic backgrounds with high accuracy is difficult. Previous research indicates that the data preprocessing stage, feature extraction, and the choice of classification method all play a very important role in determining accuracy. This study uses feature normalization as a preprocessing step, Mel-frequency cepstral coefficients (MFCC) for feature extraction, and a neural network (NN), a classification method modeled on the workings of the human brain. Using normalized features with MFCC and a neural network for multi-accent speaker recognition, the approach reaches an accuracy of 82.68%, a precision of 83%, and a recall of 82.88%.
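
The normalization step and the three reported metrics can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: z-score scaling is taken as the normalization, precision and recall are macro-averaged over accent classes, and the feature values, the labels accent_a and accent_b, and the function names are all hypothetical. MFCC extraction and the neural network itself are omitted.

```python
from statistics import mean, pstdev


def zscore_normalize(vectors):
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*vectors))
    means = [mean(col) for col in columns]
    # Guard against constant columns, whose standard deviation is zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in vectors
    ]


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall)."""
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls = [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = sum(t == label for t in y_true)
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    return accuracy, mean(precisions), mean(recalls)


# Toy stand-ins for per-utterance MFCC vectors (values are made up).
features = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
normalized = zscore_normalize(features)

# Hypothetical accent labels and classifier predictions, for illustration only.
y_true = ["accent_a", "accent_a", "accent_b", "accent_b"]
y_pred = ["accent_a", "accent_b", "accent_b", "accent_b"]
accuracy, precision, recall = macro_scores(y_true, y_pred)
```

After normalization the two toy feature columns, which differ in scale by a factor of 200, become identical. This is why such a step is often applied before training a neural network on features with very different ranges.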
