Depression Detection on Twitter Social Media Using Decision Tree
Abstract
Depression is a serious mood disorder whose symptoms interfere with patients' daily activities. As technology has developed, people increasingly express themselves through social media, especially Twitter, a platform that lets users post tweets and communicate with each other. Detecting signs of depression from social media activity can therefore support early intervention before further treatment is needed. This study builds a system that detects whether a person shows indications of depression, based on the Depression Anxiety and Stress Scale - 42 (DASS-42) and the person's tweets, using the Classification and Regression Tree (CART) method with TF-IDF feature extraction. The best model achieved an accuracy of 81.25% and an F1 score of 85.71%, compared with a baseline accuracy of 62.50% and a baseline F1 score of 66.66%. In addition, changing the maximum number of features in TF-IDF and the maximum depth of the tree had a significant effect on model performance.
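The pipeline summarized in the abstract (TF-IDF features, a CART decision tree, and a comparison across maximum-feature and maximum-depth settings) can be illustrated with a minimal sketch. It assumes scikit-learn, whose `DecisionTreeClassifier` implements an optimized version of CART; the abstract does not name a library, so that choice is an assumption. The toy texts, labels, parameter grid, and the helper names `build_model` and `evaluate` are all hypothetical and are not taken from the study's data or code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data, NOT from the study.
# Label 1 = indicates depression, label 0 = does not.
train_texts = [
    "i feel hopeless and tired all the time",
    "nothing matters anymore and i cannot sleep",
    "so empty and worthless lately",
    "had a great day at the beach with friends",
    "excited about the new job next week",
    "lovely dinner and a good movie tonight",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["i feel so tired and hopeless", "great movie with friends"]
test_labels = [1, 0]


def build_model(max_features, max_depth):
    """Chain TF-IDF feature extraction with a CART-style decision tree."""
    return Pipeline([
        # max_features keeps only the most frequent terms in the vocabulary.
        ("tfidf", TfidfVectorizer(max_features=max_features)),
        # Gini impurity is the usual CART split criterion for classification.
        ("tree", DecisionTreeClassifier(criterion="gini",
                                        max_depth=max_depth,
                                        random_state=0)),
    ])


def evaluate(max_features, max_depth):
    """Fit on the toy training set and return (accuracy, F1) on the toy test set."""
    model = build_model(max_features, max_depth).fit(train_texts, train_labels)
    predicted = model.predict(test_texts)
    accuracy = accuracy_score(test_labels, predicted)
    f1 = f1_score(test_labels, predicted, zero_division=0)
    return accuracy, f1


# Hypothetical grid over the two hyperparameters the abstract highlights.
for max_features in (5, 20):
    for max_depth in (1, 3):
        accuracy, f1 = evaluate(max_features, max_depth)
        print(f"max_features={max_features} max_depth={max_depth} "
              f"accuracy={accuracy:.2f} f1={f1:.2f}")
```

With a dataset this small the printed scores carry no meaning. The sketch only shows where the two hyperparameters enter the pipeline and how accuracy and F1 would be compared across settings.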
Copyright (c) 2022 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author.
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher, publishing the work under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate, additional arrangements for the non-exclusive distribution of the published version of the manuscript (e.g., posting it to the author's institutional repository or publishing it in a book), with an acknowledgment that it was first published in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).