Sentiment Analysis on Social Media with Glove Using Combination CNN and RoBERTa

  • Diaz Tiyasya Putra Telkom University
  • Erwin Budi Setiawan Telkom University
Keywords: Sentiment Analysis, CNN, Twitter, RoBERTa, Glove

Abstract

Twitter is a popular social media platform that allows users to share opinions in short messages, known as tweets, and to engage in real-time conversations on a wide range of topics. However, tweets often have a complicated and unclear context, which makes it difficult to determine the actual emotion behind them. Sentiment analysis is therefore needed to determine whether an opinion tends to be positive, negative, or neutral. With it, researchers and institutions can learn how the public responds to and feels about a current issue and make well-informed decisions. Given the large number of Twitter users in Indonesia, this study performs sentiment analysis using a deep learning Convolutional Neural Network (CNN), Term Frequency-Inverse Document Frequency (TF-IDF), the Robustly Optimized BERT Pretraining Approach (RoBERTa), the Synthetic Minority Over-sampling Technique (SMOTE), and Global Vectors (GloVe). The dataset consists of trending topics with hashtags related to government policies on Twitter, obtained through crawling. Using 30,811 records, the highest accuracy obtained is 95.56%, achieved by CNN with a 90:10 split ratio, a unigram baseline, RoBERTa, SMOTE, and the Top10 tweet corpus, an increase of 10.1%.
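
As a rough illustration of two of the steps named in the abstract (unigram TF-IDF features and a 90:10 train/test split), the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the toy tweets, the labels, the smoothed IDF formula, and the function names `tfidf_unigram` and `split_90_10` are all assumptions made for illustration, and the CNN, RoBERTa, SMOTE, and GloVe stages are omitted.

```python
import math
import random
from collections import Counter


def tfidf_unigram(docs):
    """Return (vocab, matrix) of TF-IDF weights over whitespace unigrams."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (an assumed variant; the paper's exact formula is not given).
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        matrix.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, matrix


def split_90_10(rows, labels, seed=0):
    """Shuffle indices deterministically and split 90% train / 10% test."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(len(idx) * 0.9))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


# Hypothetical toy data standing in for the crawled tweets.
tweets = [
    "policy good", "policy bad", "policy fine", "policy great", "policy poor",
    "policy okay", "policy helpful", "policy harmful", "policy fair", "policy unfair",
]
labels = ["pos", "neg", "neu", "pos", "neg", "neu", "pos", "neg", "pos", "neg"]

vocab, X = tfidf_unigram(tweets)
X_train, y_train, X_test, y_test = split_90_10(X, labels)
```

In this toy corpus the word "policy" appears in every tweet, so it receives a lower weight than words that appear only once, which is the behaviour TF-IDF is meant to provide. In a full pipeline, any oversampling such as SMOTE would normally be applied to the training portion only, so that the held-out 10% remains untouched.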

Published
2023-06-01
How to Cite
Diaz Tiyasya Putra, & Erwin Budi Setiawan. (2023). Sentiment Analysis on Social Media with Glove Using Combination CNN and RoBERTa. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 7(3), 457 - 563. https://doi.org/10.29207/resti.v7i3.4892
Section
Information Technology Articles
