Optimizing Sentiment Analysis for Lombok Tourism Using SMOTE and Chi-Square with Machine Learning
Abstract
Tourism is a vital economic sector for Lombok Island, a leading destination renowned for its natural beauty and cultural richness. The rapid growth of tourism in Lombok requires a deep understanding of tourists' perceptions and sentiments to ensure optimal service quality. Sentiment analysis of online reviews helps identify service strengths and weaknesses and address tourists' needs more effectively. This not only enhances tourist satisfaction but also aids in the design of more effective marketing strategies. However, analyzing text data from online reviews presents particular challenges, such as noise, class imbalance, and a large number of features, all of which may affect classification results. Therefore, this study aims to classify tourist sentiment toward Lombok tourism using machine learning methods combined with feature selection and oversampling techniques. It focuses on optimizing sentiment analysis of tourism-related tweets by combining SMOTE oversampling with Chi-Square feature selection to improve classification performance without hyperparameter tuning. The machine learning methods applied are SVM and Naive Bayes. The dataset consists of sentiment data on Lombok tourism obtained from Twitter in 2023, comprising 940 instances divided into three classes: Negative, Neutral, and Positive. The findings show that SMOTE and Chi-Square improve the accuracy of both SVM and Naive Bayes. Without optimization, SVM achieved an accuracy of 73.93% and Naive Bayes 67.02%. After optimization with SMOTE and Chi-Square, accuracy increased to 90% for SVM and to 84% for Naive Bayes in classifying tourist sentiment toward Lombok tourism.
These results indicate that combining data balancing using SMOTE with feature selection via Chi-Square effectively improves the performance of sentiment classification models for tourist opinions on Lombok tourism.
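To make the two optimization steps concrete, the following is a minimal sketch in plain Python, not the implementation used in the study. It assumes a toy term-count matrix with two classes (the study itself uses three classes and 940 tweets), a simplified SMOTE that interpolates between a minority sample and one of its nearest minority neighbours, and a Chi-Square score that compares per-class feature totals with the totals expected under independence. The function names `smote` and `chi2_scores`, the toy data, and the choice of keeping two features are all illustrative assumptions.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Simplified SMOTE: build n_new synthetic rows by interpolating between
    a randomly chosen minority row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` inside the minority class, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def chi2_scores(X, y):
    """Chi-Square statistic of each non-negative feature against the class label."""
    classes = sorted(set(y))
    class_prob = {c: y.count(c) / len(y) for c in classes}
    scores = []
    for j in range(len(X[0])):
        total = sum(row[j] for row in X)
        score = 0.0
        for c in classes:
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            expected = class_prob[c] * total
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


# Hypothetical term counts; columns stand for "beautiful", "dirty", "lombok".
X = [[2, 0, 1], [3, 0, 1], [2, 1, 1], [3, 0, 1], [0, 2, 1], [0, 3, 1]]
y = ["positive"] * 4 + ["negative"] * 2

# Step 1: oversample the minority class until both classes have equal size.
minority = [row for row, label in zip(X, y) if label == "negative"]
new_rows = smote(minority, n_new=y.count("positive") - y.count("negative"))
X_bal = X + new_rows
y_bal = y + ["negative"] * len(new_rows)

# Step 2: score every feature and keep the top_k most class-dependent ones.
scores = chi2_scores(X_bal, y_bal)
top_k = 2
selected = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:top_k]
X_reduced = [[row[j] for j in selected] for row in X_bal]

print("Chi-Square scores:", scores)
print("Selected feature indices:", selected)
```

In this toy example the third column appears equally often in both classes, so it receives the lowest score and is dropped, while the two sentiment-bearing columns are kept. The reduced, balanced matrix `X_reduced` would then be passed to a classifier such as SVM or Naive Bayes. In practice, oversampling is usually applied to the training split only, so that synthetic samples do not leak into the evaluation data.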
Copyright (c) 2025 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that the RESTI Journal (System Engineering and Information Technology) is the first publisher of the work, which is published under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate arrangements for the non-exclusive distribution of the manuscript published in this journal in other versions (e.g., depositing it in the author's institutional repository or publishing it in a book), with an acknowledgement that the manuscript was first published in the RESTI (Rekayasa Sistem dan Teknologi Informasi) journal.