Detecting Fake News on Social Media Combined with the CNN Methods
Abstract
Social media platforms have developed alongside technology to facilitate human social life. Twitter is one of the most popular and frequently used platforms for exchanging information, and it disseminates real-time and complete information. Unfortunately, many tweets contain false information, often referred to as hoaxes. Hoaxes on Twitter are troubling for society because fake news can cause misunderstandings in how information is received. This research therefore aims to develop a system that detects hoaxes on Twitter in order to anticipate their spread, which can be detrimental to the parties involved. The system uses a deep learning approach based on a Convolutional Neural Network (CNN), combined with Term Frequency-Inverse Document Frequency (TF-IDF), Bidirectional Encoder Representations from Transformers (BERT), and Global Vectors (GloVe). The study reports the fake news detection results of the CNN method with the baseline, BERT, and GloVe. The data were selected according to keywords related to fake news spread in online media, such as Hoax or Not from Detik.com and CekFakta from Kompas.com. The highest accuracy, 98.57%, was obtained using CNN with a 90:10 split ratio, the unigram-bigram baseline, BERT, and the Top10 tweet+IndoNews corpus, an increase of 4.7%.
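As a rough illustration of two ingredients named in the abstract, the unigram-bigram TF-IDF baseline and the 90:10 split, the following minimal Python sketch may help. It is not the authors' implementation: the function names (`ngrams`, `tfidf`, `split_90_10`), the whitespace tokenization, and the plain `log(N / df)` weighting are assumptions made for this example, and the CNN, BERT, and GloVe stages are left out entirely.

```python
import math
import random
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all unigrams and bigrams of a token list as space-joined strings."""
    out = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def tfidf(docs):
    """Compute one TF-IDF dictionary per document over unigram+bigram terms.

    Assumed weighting for this sketch:
      tf  = term count / number of terms in the document
      idf = log(N / df), with df the number of documents containing the term
    """
    term_lists = [ngrams(d.lower().split()) for d in docs]
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))  # count each term once per document
    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def split_90_10(items, seed=0):
    """Shuffle the items and split them into 90% training and 10% test parts."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = len(items) * 9 // 10  # integer arithmetic avoids float rounding
    return items[:cut], items[cut:]
```

For a toy pair of tweets such as "hoax news spreads fast" and "real news spreads slowly", terms shared by both texts (for example "news") receive a weight of zero under this weighting, while a bigram that occurs in only one text (for example "hoax news") receives a positive weight. In a full pipeline these weights would be the input features to a classifier.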
Copyright (c) 2023 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher of the work, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate arrangements for the non-exclusive distribution of the published manuscript in other versions (e.g., depositing it in the author's institutional repository or publishing it in a book), provided they acknowledge that the manuscript was first published in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).