Comparative Analysis of MLP and CNN Classification Algorithms on the American Sign Language Dataset
Abstract
People who are deaf or have a speech impairment commonly use sign language to communicate. One of the most basic and flexible forms is alphabet (fingerspelling) sign language, which spells out the words the signer wants to convey. Sign language in general uses hand, finger, and facial movements to express the user's thoughts; alphabet sign language, however, relies only on gestures formed with the fingers and hands, without facial expressions. Because many people still do not understand sign language, image classification can make it easier to learn and translate. Classification accuracy is the main challenge in this task. This research compares two image classification algorithms, Convolutional Neural Network (CNN) and Multilayer Perceptron (MLP), for recognizing American Sign Language (ASL) letters, excluding "J" and "Z" because both require motion. The comparison examines the effect of CNN's convolution and pooling stages on the accuracy and F1 score obtained on the ASL dataset. Based on the comparison, a CNN preceded by Gaussian low-pass filtering preprocessing achieves the best results: 96.93% accuracy and a 96.97% F1 score.
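The pipeline described in the abstract, Gaussian low-pass filtering followed by classification, can be sketched as follows. This is a minimal illustration only, assuming the Sign Language MNIST layout (28×28 grayscale images, 24 static letter classes); synthetic random data stands in for the dataset, and an MLP on flattened pixels is used here since it needs no deep-learning framework. The `sigma` value and hidden-layer size are illustrative assumptions, not the paper's reported hyperparameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for Sign Language MNIST:
# 200 grayscale 28x28 images, 24 classes (A-Z minus the dynamic J and Z)
X = rng.random((200, 28, 28))
y = rng.integers(0, 24, size=200)

# Gaussian low-pass filtering: smooths high-frequency noise before classification
X_smooth = np.stack([gaussian_filter(img, sigma=1.0) for img in X])

# An MLP sees only flattened pixel vectors -- no convolution or pooling stages,
# which is exactly the architectural difference the paper's comparison targets
mlp = MLPClassifier(hidden_layer_sizes=(128,), max_iter=50)
mlp.fit(X_smooth.reshape(len(X_smooth), -1), y)
preds = mlp.predict(X_smooth.reshape(len(X_smooth), -1))
print(preds.shape)  # one predicted letter class per image
```

A CNN variant would replace the flattening step with convolution and pooling layers so that local spatial structure of the hand gesture is preserved, which is the property the comparison attributes the higher CNN accuracy to.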
Copyright (c) 2021 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher of the manuscript, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate, additional contractual arrangements for the non-exclusive distribution of the published version of the manuscript (e.g., depositing it in the author's institutional repository or publishing it in a book), with an acknowledgement that the manuscript was first published in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).