Comparison of the TF-ABS and TF-IDF Methods for Helpdesk Text Classification Using K-Nearest Neighbor

Riza Adrianti Supono; Muhammad Azis Suprayogi

doi:10.29207/resti.v5i5.3403

Riza Adrianti Supono, Gunadarma University
Muhammad Azis Suprayogi, Gunadarma University, https://orcid.org/0000-0002-9860-4608

Keywords: helpdesk, term weighting, text classification, tf-abs, tf-idf

Abstract

Routing tickets to the responsible unit is a core function of a helpdesk application. When admin officers distribute tickets manually, tickets can be misrouted, and completion time grows as ticket volume increases. Automatic helpdesk text classification addresses this by assigning each ticket to the appropriate destination unit quickly. This study compares the performance of helpdesk text classification at the Directorate General of State Assets of the Ministry of Finance using the K-Nearest Neighbor (KNN) method with two term-weighting methods, TF-ABS and TF-IDF. The research proceeded through complaint document collection, preprocessing, term weighting, feature reduction, classification, and testing. KNN was run with the n_neighbor parameter (k) set to each odd value from 1 to 19 (k=1, 3, 5, 7, 9, 11, 13, 15, 17, and 19) to classify 10,537 helpdesk texts into 8 categories. Testing used a confusion matrix, from which accuracy and F1 score were computed. The results show that TF-ABS weighting outperforms TF-IDF, reaching its highest accuracy of 90.04% at 15% and k=3.
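The weighting and classification steps named in the abstract can be illustrated with a minimal sketch in plain Python. It covers only TF-IDF weighting and KNN voting; TF-ABS is not shown because the abstract does not give its formula. The toy tickets, the category names, the whitespace tokenizer, the `tf * log(N / df)` variant of TF-IDF, and cosine similarity as the neighbor measure are all assumptions made for illustration, not details taken from the study.

```python
import math
from collections import Counter

# Hypothetical toy tickets and categories (not data from the study).
TRAIN = [
    ("cannot connect to vpn network", "network"),
    ("wifi network is down", "network"),
    ("vpn connection drops", "network"),
    ("reset my account password", "account"),
    ("account locked after wrong password", "account"),
    ("cannot login to account", "account"),
]

def fit_idf(token_lists):
    """Inverse document frequency per term: log(N / df)."""
    n_docs = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log(n_docs / count) for term, count in df.items()}

def weigh(tokens, idf):
    """TF-IDF vector as a {term: weight} dict; unseen terms are ignored."""
    return {t: tf * idf[t] for t, tf in Counter(tokens).items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(query, train_vecs, train_labels, k):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Weight the training tickets once, then reuse the same IDF for queries.
train_tokens = [text.split() for text, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
IDF = fit_idf(train_tokens)
train_vecs = [weigh(tokens, IDF) for tokens in train_tokens]

def classify(text, k):
    """Classify a raw ticket text with the toy model above."""
    return knn_predict(weigh(text.split(), IDF), train_vecs, train_labels, k)

# Try more than one k, in the spirit of the study's sweep over odd values.
for k in (1, 3):
    print(k, classify("forgot password for account", k))
```

Sweeping k over odd values, as the study does, avoids tied votes in the two-class toy case; with 8 categories ties remain possible and would need an explicit tie-breaking rule.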
