Peningkatan Hasil Klasifikasi pada Algoritma Random Forest untuk Deteksi Pasien Penderita Diabetes Menggunakan Metode Normalisasi
Improved Classification Results in the Random Forest Algorithm for Detection of Diabetes Patients Using the Normalization Method
Abstract
Diabetes is a disease in which blood sugar levels exceed normal limits. The number of people with diabetes in Indonesia has risen significantly: according to Basic Health Research, prevalence increased from 6.9% in 2013 to 8.5% in 2018, with an estimated number of sufferers of more than 16 million. Technology that can detect diabetes accurately and reliably is therefore needed so that the disease can be treated early, reducing the number of sufferers, disabilities, and deaths. The attributes in Gula Karya Medika’s data have different value scales, which can complicate the classification process. For this reason, this study compares two data normalization methods, min-max normalization and z-score normalization, against a baseline without data normalization, using Random Forest (RF) as the classification method in all three cases. RF has been tested in several previous studies and has produced good performance with high accuracy. Based on the research results, model 1 (min-max normalization with RF) achieves the best accuracy at 95.45%, followed by model 2 (z-score normalization with RF) at 95% and model 3 (RF without data normalization) at 92%. From these results, it can be concluded that model 1 outperforms the other two models and raises the accuracy of Random Forest classification to 95.45%.
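The two normalization methods named in the abstract can be illustrated with a minimal sketch. It uses the textbook definitions of min-max and z-score normalization and is not the authors' code. The function names, the use of the population standard deviation, and the glucose readings are assumptions made for illustration only, and the Random Forest step is not shown.

```python
from statistics import mean, pstdev


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every entry to new_min.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score_normalize(values):
    """Center values on mean 0 and scale them to unit standard deviation.

    Assumption: the population standard deviation is used; a sample
    standard deviation would be an equally plausible choice.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column has no spread; map every entry to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical glucose readings (mg/dL), not taken from the study's data.
glucose = [80.0, 100.0, 120.0, 140.0, 160.0]
glucose_min_max = min_max_normalize(glucose)
glucose_z = z_score_normalize(glucose)

print(glucose_min_max)
print([round(z, 3) for z in glucose_z])
```

In a setup like the one the abstract describes, each attribute column would be normalized this way before the data is passed to the classifier, so that attributes measured on large scales are brought into the same range as attributes measured on small scales.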
Copyright (c) 2021 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author.
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher of the work, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate arrangements for the non-exclusive distribution of the published version of the manuscript (e.g., depositing it in the author's institutional repository or publishing it in a book), provided they acknowledge that the manuscript was first published in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).