Classifying Quranic Verse Topics using Word Centrality Measure

  • Ferdian Yulianto, Telkom University
  • Kemas Muslim Lhaksmana, Telkom University
  • Danang Triantoro Murdiansyah, Telkom University
Keywords: The Holy Quran, centrality, topic classification, SVM, naive Bayes, multilabel classification


Muslims believe that the Quran, as the speech of Allah, is a miracle with distinctive characteristics of its own. Among the characteristics that have been studied are regularities in the number of letters, words, vocabulary items, etc. In the past, early Islamic scholars identified these regularities manually, i.e., by counting the occurrences of each word by hand. This research tackles this problem by utilizing centrality in Quranic verse topic classification. The goal of this research is to analyze the effect of a Quran word centrality measure on the topic classification of Quran verses. To achieve this objective, a Quran word graph is constructed, and the resulting centrality scores are included as one of the features in verse topic classification. The effect of centrality is observed with support vector machine (SVM) and naïve Bayes classifiers under two scenarios (with and without stopword removal). The results show that, according to the centrality measure, the word “الله” (Allah) is the most central in the Quran. The performance evaluation of the classification models shows that the use of centrality improves the Hamming loss (lower is better) from 0.43 to 0.21 for the naïve Bayes classifier with stopword removal. Finally, both classification methods perform better on the word graph built with stopword removal.
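The pipeline summarized above (word graph, centrality score as a feature, Hamming loss as the metric) can be illustrated with a minimal Python sketch. The abstract does not state how the word graph is built or which centrality measure is used, so the sketch assumes an undirected graph linking adjacent words within a verse and uses degree centrality; the function names and the toy tokens are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_word_graph(verses):
    """Build an undirected word graph (assumed construction: words that are
    adjacent within a verse are connected by an edge)."""
    graph = defaultdict(set)
    for verse in verses:
        tokens = verse.split()
        for token in tokens:
            graph[token]  # register the word as a node even if it has no edges
        for left, right in zip(tokens, tokens[1:]):
            if left != right:
                graph[left].add(right)
                graph[right].add(left)
    return dict(graph)


def degree_centrality(graph):
    """Degree centrality: number of neighbours divided by (n - 1)."""
    n = len(graph)
    if n < 2:
        return {word: 0.0 for word in graph}
    return {word: len(neighbours) / (n - 1) for word, neighbours in graph.items()}


def verse_centrality_feature(verse, centrality):
    """Hypothetical feature: mean centrality of the words in a verse."""
    tokens = verse.split()
    if not tokens:
        return 0.0
    return sum(centrality.get(token, 0.0) for token in tokens) / len(tokens)


def hamming_loss(y_true, y_pred):
    """Fraction of label slots predicted incorrectly in a multilabel task."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


# Toy placeholder tokens, not real Quranic text.
toy_verses = ["a b c", "a d", "e a"]
toy_graph = build_word_graph(toy_verses)
toy_centrality = degree_centrality(toy_graph)
most_central_word = max(toy_centrality, key=toy_centrality.get)
```

In a full experiment, the per-verse centrality feature would be appended to the other verse features before training the SVM or naïve Bayes classifier, and the Hamming loss would be computed on held-out multilabel topic predictions.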



