Investigating the Impact of ReLU and Sigmoid Activation Functions on Animal Classification Using CNN Models
Abstract
VGG16 is a convolutional neural network model widely used for image recognition. It is distinctive in that it uses only 16 weighted layers rather than relying on a very large number of hyperparameters, and it is regarded as one of the strongest vision model architectures. Nevertheless, several aspects still need to be improved to increase image recognition accuracy. In this context, this work proposes and investigates two ensemble CNNs using transfer learning and compares them with state-of-the-art CNN architectures. Specifically, this study compares the performance of the rectified linear unit (ReLU) and sigmoid activation functions in CNN models for animal classification; the aim is to identify the better activation function and the factors that influence model performance. To select the model, two architectures were tested: the default VGG16 and the proposed modified VGG16. The dataset consists of 2,000 animal images collected from Kaggle, covering five classes (cats, cows, elephants, horses, and sheep), and is divided into training and test sets at a ratio of 80:20. The CNN model has two convolutional layers and two fully connected layers, and the ReLU and sigmoid activation functions are trained with different learning rates. Evaluation metrics include accuracy, precision, recall, F1 score, and test cost. The results show that ReLU outperforms sigmoid in accuracy, precision, recall, and F1 score; the model using ReLU in both the convolutional and fully connected layers achieved the highest precision of 97.56% on the test set. ReLU is also identified as effective in mitigating the vanishing gradient problem. This study emphasizes the importance of choosing the right activation function for better classification accuracy, and these findings can guide future research on improving CNN models for animal classification.
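To make the described setup concrete, the sketch below builds a comparable classifier in TensorFlow/Keras: two convolutional layers followed by two fully connected layers, with the activation function (ReLU or sigmoid) and the learning rate passed as parameters. This is a minimal illustrative sketch, not the authors' exact configuration; the filter counts, kernel sizes, input resolution, optimizer, and dataset pipeline names are assumptions, while the overall layout (two convolutional layers, two fully connected layers, five animal classes, ReLU vs. sigmoid comparison) follows the abstract.

# Minimal sketch (assumed hyperparameters, not the paper's exact configuration).
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(activation: str = "relu", learning_rate: float = 1e-3) -> tf.keras.Model:
    """Two convolutional + two fully connected layers with a chosen activation."""
    model = models.Sequential([
        layers.Input(shape=(128, 128, 3)),          # assumed input resolution
        layers.Conv2D(32, 3, activation=activation),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation=activation),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation=activation),   # first fully connected layer
        layers.Dense(5, activation="softmax"),      # cat, cow, elephant, horse, sheep
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Hypothetical usage: train one model per activation and compare metrics.
# `train_ds` and `test_ds` are assumed tf.data pipelines over the 80:20 split;
# precision, recall, and F1 can then be computed on the test predictions,
# e.g. with sklearn.metrics.classification_report.
# for act in ("relu", "sigmoid"):
#     model = build_cnn(activation=act)
#     model.fit(train_ds, validation_data=test_ds, epochs=20)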