Performance Comparison of Supervised Classification Algorithms for Cardiovascular Disease Cases

  • Adi Nugroho, Universitas Narotama
  • Agustinus Bimo Gumelar, Universitas Narotama
  • Adri Gabriel Sooai, Universitas Katolik Widya Mandira
  • Dyana Sarvasti, Universitas Katolik Widya Mandala
  • Paul L Tahalele, Universitas Katolik Widya Mandala
Keywords: classification, cardiovascular disease, k-nearest neighbors algorithm, stochastic gradient descent, random forest, neural network, logistic regression

Abstract

One of the health problems facing Indonesia is the rising incidence of non-communicable diseases (NCDs) such as heart attack and cardiovascular disease. The risk factors for cardiovascular disease fall into two groups: those that can be modified and those that cannot. This study aims to identify which of several classification algorithms, namely k-nearest neighbors (k-NN), stochastic gradient descent (SGD), random forest (RF), neural network (NN), and logistic regression (LR), performs best at classifying cardiovascular disease from these risk factors. The performance of each algorithm is evaluated with a confusion matrix, using accuracy, precision, recall, and AUC (area under the curve) as metrics. The dataset contains 425,195 samples of cardiovascular disease diagnosis results. Testing uses percentage split and cross-validation. The experimental results show that NN gives the best predictions among the algorithms compared. In Orange, using percentage split and cross-validation, NN reaches an accuracy of 89.60%, AUC of 0.873, precision of 0.877, and recall of 0.896. In Weka, using cross-validation, it reaches an accuracy of 89.46%, AUC of 0.865, precision of 0.875, and recall of 0.895. In KNIME, using cross-validation, it reaches an accuracy of 88.55%, AUC of 0.768, precision of 0.854, and recall of 0.886.
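
As an illustration of the confusion-matrix metrics named in the abstract, the following minimal Python sketch computes accuracy, precision, and recall from true and predicted labels. The labels are toy data invented for this example, not the study's dataset, and the function names are hypothetical. The sketch also shows class-weighted averaging, under which recall always equals accuracy; the reported recall values match the reported accuracies to three decimals, which is consistent with that kind of averaging, although the abstract does not state how the per-class values were combined. AUC is not covered here because it needs predicted scores rather than hard labels.

```python
from typing import Callable, Dict, List


def confusion_counts(y_true: List[int], y_pred: List[int], positive: int) -> Dict[str, int]:
    """Count TP, FP, FN, TN, treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == positive and p == positive),
        "fp": sum(1 for t, p in pairs if t != positive and p == positive),
        "fn": sum(1 for t, p in pairs if t == positive and p != positive),
        "tn": sum(1 for t, p in pairs if t != positive and p != positive),
    }


def precision(c: Dict[str, int]) -> float:
    """TP / (TP + FP); 0.0 if the class was never predicted."""
    denom = c["tp"] + c["fp"]
    return c["tp"] / denom if denom else 0.0


def recall(c: Dict[str, int]) -> float:
    """TP / (TP + FN); 0.0 if the class never occurs."""
    denom = c["tp"] + c["fn"]
    return c["tp"] / denom if denom else 0.0


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def weighted_average(y_true: List[int], y_pred: List[int],
                     metric: Callable[[Dict[str, int]], float]) -> float:
    """Average a per-class metric, weighting each class by its share of samples."""
    n = len(y_true)
    total = 0.0
    for label in sorted(set(y_true)):
        support = y_true.count(label)
        total += support / n * metric(confusion_counts(y_true, y_pred, label))
    return total


# Toy labels (invented): 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

if __name__ == "__main__":
    print("accuracy          :", accuracy(y_true, y_pred))
    print("weighted precision:", weighted_average(y_true, y_pred, precision))
    print("weighted recall   :", weighted_average(y_true, y_pred, recall))
```

Weighting each class's recall by its support sums the correctly classified samples of every class and divides by the total, which is the definition of accuracy. Precision has no such identity, so it can differ from accuracy, as it does in the reported results.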

Published
2020-10-30
How to Cite
Adi Nugroho, Agustinus Bimo Gumelar, Adri Gabriel Sooai, Dyana Sarvasti, & Paul L Tahalele. (2020). Perbandingan Performansi Algoritma Pengklasifikasian Terpandu Untuk Kasus Penyakit Kardiovaskular. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 4(5), 998-1006. https://doi.org/10.29207/resti.v4i5.2316
Section
Information Systems Engineering Articles