Using Social Media Data to Monitor Natural Disaster: A Multi Dimension Convolutional Neural Network Approach with Word Embedding
Abstract
Social media plays a significant role in natural disaster management, both as an early warning channel and as a means of monitoring disasters as they occur. Artificial intelligence can make fuller use of disaster-related social media messages for this purpose. The artificial intelligence system classifies social media message texts into three categories: eyewitness, non-eyewitness, and don't-know. Eyewitness messages are essential because they can provide the time and location of a natural disaster. A common problem in text classification research is that feature extraction techniques ignore word meaning, discard word order information, and produce high-dimensional data. This study uses a feature extraction approach that preserves word order and word meaning by means of three word embedding techniques: word2vec, fastText, and GloVe. The result is data in 1D, 2D, and 3D form. This study also proposes a data formation technique that builds new features by combining the data from all three word embedding techniques. The classification models are built with three Convolutional Neural Network (CNN) variants: 1D CNN, 2D CNN, and 3D CNN. The best accuracies obtained in this study were 78.33% for earthquakes, 81.97% for forest fires, and 78.33% for floods. The average accuracy shows that the 2D and 3D v1 data formation techniques work better than the other techniques. The results also show that the proposed technique produces better average accuracy.
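The abstract does not spell out how the 1D, 2D, and 3D data are formed, so the following is only a minimal sketch of one plausible reading. It assumes that 2D data is a token-by-dimension matrix from a single embedding, that 1D data is that matrix flattened in word order, and that 3D data stacks the matrices from the three embeddings. The lookup tables are random stand-ins for trained word2vec, fastText, and GloVe models, and `EMBED_DIM`, `MAX_LEN`, `make_toy_embedding`, and `message_to_matrix` are hypothetical names introduced for illustration.

```python
import numpy as np

EMBED_DIM = 8  # hypothetical embedding size
MAX_LEN = 5    # hypothetical fixed message length, in tokens


def make_toy_embedding(vocab, dim, seed):
    """Random stand-in for a trained word2vec/fastText/GloVe lookup table."""
    rng = np.random.default_rng(seed)
    return {word: rng.standard_normal(dim) for word in vocab}


def message_to_matrix(tokens, table, max_len=MAX_LEN, dim=EMBED_DIM):
    """Look up tokens in order; truncate to max_len rows and zero-pad the rest."""
    matrix = np.zeros((max_len, dim))
    for i, token in enumerate(tokens[:max_len]):
        # Words missing from the table keep the zero vector (an assumption).
        matrix[i] = table.get(token, np.zeros(dim))
    return matrix


vocab = ["the", "ground", "is", "shaking", "here"]
# Three toy tables standing in for the three embedding techniques.
tables = [make_toy_embedding(vocab, EMBED_DIM, seed) for seed in (0, 1, 2)]

tokens = ["ground", "is", "shaking"]

# 2D: one row per token position, so word order is preserved.
data_2d = message_to_matrix(tokens, tables[0])
# 1D: the same matrix flattened row by row.
data_1d = data_2d.reshape(-1)
# 3D: one matrix per embedding, stacked along a new leading axis.
data_3d = np.stack([message_to_matrix(tokens, t) for t in tables], axis=0)

print(data_1d.shape, data_2d.shape, data_3d.shape)
```

Under these assumptions, each form matches the input a 1D, 2D, or 3D convolution layer would expect once a batch axis (and, depending on the framework, a channel axis) is added. The actual data formation used in the study, including the "3D v1" variant, may differ.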
Copyright (c) 2022 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher of the work, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate arrangements for the non-exclusive distribution of the published version of the manuscript (e.g., depositing it in the author's institutional repository or publishing it in a book), provided they acknowledge that the manuscript was first published in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).