Evaluation of Hidden Topics Based on Aspect Extraction Using Extensions of Latent Dirichlet Allocation

Dinda Adimanggala; Fitra Abdurrachman Bachtiar; Eko Setiawan

doi:10.29207/resti.v5i3.3075

Dinda Adimanggala Universitas Brawijaya
Fitra Abdurrachman Bachtiar Universitas Brawijaya https://orcid.org/0000-0002-0845-247X
Eko Setiawan Universitas Brawijaya

Keywords: sentiment analysis, aspect, topic, extraction, LDA, evaluation

Abstract

Sentiment analysis is now widely used to detect opinions expressed about products or services. Aspect-level sentiment analysis is the category that focuses on extracting product aspects. A common method for aspect extraction is Latent Dirichlet Allocation (LDA) with random topic identification, but this method does not always yield an acceptable topic, even when some aspects have been found. Topics that cannot be determined are referred to as hidden topics. The purpose of this study is to evaluate and compare how well human and computer evaluation identify hidden topics. The study also focuses on aspect extraction using several LDA variants. The data come from an e-commerce case study. The data were processed with feature selection and grouped using the LDA variants. The results were then processed with Latent Topic Identification based on subjective and objective evaluations. The identified hidden topics were evaluated using several semantic and lexicon tests. The evaluation shows that the two hidden-topic identification assessments are closely comparable, with an average difference reaching 6%. As a result, computer calculations can assist humans in determining topics when a topic has a low coherence value.
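
To make the notion of a "low coherence value" concrete, the following is a minimal sketch of one common coherence score, a UMass-style measure averaged over word pairs. The abstract does not say which coherence measure the study used, so the choice of measure, the function name `umass_coherence`, the smoothing constant `eps`, and the toy review data are all assumptions for illustration only.

```python
import math


def umass_coherence(topic_words, documents, eps=1.0):
    """UMass-style coherence for one topic, averaged over word pairs.

    topic_words: top words of a topic, ordered from most to least probable.
    documents:   list of tokenized documents (lists of strings).
    Higher (less negative) scores mean the words co-occur more often.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score, pairs = 0.0, 0
    for i in range(1, len(topic_words)):
        for j in range(i):
            w_i, w_j = topic_words[i], topic_words[j]
            d_j = doc_freq(w_j)
            if d_j == 0:
                continue  # the conditioning word never occurs; skip the pair
            score += math.log((doc_freq(w_i, w_j) + eps) / d_j)
            pairs += 1
    return score / pairs if pairs else float("nan")


# Hypothetical tokenized e-commerce reviews (not data from the study).
reviews = [
    ["delivery", "fast", "courier"],
    ["delivery", "late", "courier"],
    ["price", "cheap", "discount"],
    ["price", "discount", "voucher"],
    ["courier", "delivery", "package"],
]

shipping_topic = ["delivery", "courier", "package"]  # words that co-occur
mixed_topic = ["delivery", "price", "voucher"]       # words that rarely co-occur

print(umass_coherence(shipping_topic, reviews))
print(umass_coherence(mixed_topic, reviews))
```

On this toy data the mixed topic scores lower than the shipping topic. A topic with such a low score is the kind of case in which, according to the abstract, computed evaluation can help a human decide what the topic is.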
