Topic Classification of Quranic Verses in English Translation Using Word Centrality Measurement
Abstract
Muslims believe that the Quran is a miracle and the word of God (Kalamullah), revealed to the Prophet Muhammad SAW to be conveyed to humankind. It serves as a guide for dealing with problems in every aspect of life. Studying the Quran requires knowing which topics each verse discusses. With the help of technology, topics can be assigned to Quranic verses automatically. This task is a multilabel classification problem, in which each input can be assigned to one or more categories. This research applies multilabel classification to assign the verses of the Quran in English translation to 10 topics, using word centrality as the term-weighting value. Four classification methods are then compared: SVM, Naïve Bayes, KNN, and Decision Tree. The centrality measurement shows that, in the scenario with stopword removal, the word ‘Allah’ is the most central word across the whole Quran document. Furthermore, using word centrality values as term weights in feature extraction can improve the performance of the classification system.
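As an illustration of the idea of weighting terms by word centrality, the following is a minimal sketch rather than the authors' implementation. The abstract does not state which centrality measure, co-occurrence window, or stopword list was used, so degree centrality, a window of one following word, the `STOPWORDS` set, and all function names here are assumptions made for the example. The sample sentences are placeholders, not verses.

```python
from collections import defaultdict

# Hypothetical stopword list; the list used in the paper is not given in the abstract.
STOPWORDS = {"the", "is", "of", "and", "to", "in", "a"}


def tokenize(text, remove_stopwords=True):
    """Lowercase the text, strip basic punctuation, optionally drop stopwords."""
    words = [w.strip(".,;:!?'\"").lower() for w in text.split()]
    return [w for w in words if w and not (remove_stopwords and w in STOPWORDS)]


def build_cooccurrence_graph(documents, window=1):
    """Build an undirected graph: words are nodes, and two words are linked
    when one appears within `window` tokens after the other (assumed scheme)."""
    graph = defaultdict(set)
    for doc in documents:
        tokens = tokenize(doc)
        for i, word in enumerate(tokens):
            graph[word]  # make sure isolated words still become nodes
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word].add(other)
                    graph[other].add(word)
    return graph


def degree_centrality(graph):
    """Degree centrality: number of neighbours divided by (number of nodes - 1)."""
    n = len(graph)
    if n < 2:
        return {word: 0.0 for word in graph}
    return {word: len(neighbours) / (n - 1) for word, neighbours in graph.items()}


def centrality_weighted_vector(text, centrality, vocabulary):
    """Represent a text as term counts scaled by each word's centrality."""
    tokens = tokenize(text)
    return [tokens.count(word) * centrality.get(word, 0.0) for word in vocabulary]


# Placeholder sentences standing in for translated verses.
docs = ["mercy and guidance", "guidance and prayer", "prayer and charity"]
graph = build_cooccurrence_graph(docs)
centrality = degree_centrality(graph)
vocabulary = sorted(centrality)
features = [centrality_weighted_vector(d, centrality, vocabulary) for d in docs]
```

Each row of `features` could then be passed, together with a binary topic-label matrix, to any multilabel-capable classifier such as the four compared in the study. Dropping stopwords before building the graph matters here, because very frequent function words would otherwise link to almost every node and dominate the centrality ranking.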
Copyright (c) 2022 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher of the article, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate arrangements for the non-exclusive distribution of the version of the manuscript published in this journal (e.g., depositing it in the author's institutional repository or publishing it in a book), provided they acknowledge that the manuscript was first published in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).