Comparison of Support Vector Machine and Modified Balanced Random Forest for Detecting Diabetes Patients
Abstract
Diabetes is a metabolic disorder marked by high blood sugar levels resulting from disorders of the pancreas and insulin. According to data from the Ministry of Health of the Republic of Indonesia, diabetes is the third-largest cause of death in Indonesia, accounting for 6.7% of deaths. This high death rate motivated the present study, which aims at early detection. The research uses a machine learning approach to classify diabetes patient data, and this paper compares two classifiers for that task: Support Vector Machine (SVM) and Modified Balanced Random Forest (MBRF). Both methods were chosen because previous studies showed that they achieve high accuracy; they are compared here to find the best classification model. Several preprocessing methods were used to prepare the data for classification. Every combination of preprocessing steps was applied for both classification methods, so that both worked from the same data. The evaluation was carried out using a confusion matrix. In testing, the best performance obtained was 87.94% with SVM and 97.8% with MBRF.
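As a rough illustration of confusion-matrix evaluation for a binary diabetic / non-diabetic classifier, the sketch below counts the four cells of the matrix and derives accuracy, precision, and recall from them. The abstract does not say which metric the reported percentages refer to, and the function names and toy labels here are hypothetical, invented only for this example; they are not the study's data or code.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) counts for a binary classification task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def summarize(y_true, y_pred, positive=1):
    """Derive common metrics from the confusion-matrix counts."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels, invented for illustration (1 = diabetic, 0 = non-diabetic).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))
print(summarize(y_true, y_pred))
```

Running the same function on the predictions of each classifier over the same test split gives figures that can be compared directly. On imbalanced data, recall on the diabetic class is often more informative than accuracy alone.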
Copyright (c) 2021 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that the RESTI Journal (System Engineering and Information Technology) is the first publisher of the article, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate arrangements for the non-exclusive distribution of manuscripts published in this journal in other versions (e.g., depositing them in the author's institutional repository or publishing them in a book), provided they acknowledge that the manuscript was first published in the RESTI (Rekayasa Sistem dan Teknologi Informasi) journal.