DiG-MFV: Dual-integrated Graph for Multilingual Fact Verification

  • Nova Agustina Universitas Amikom Yogyakarta
  • Kusrini Universitas Amikom Yogyakarta
  • Ema Utami Universitas Amikom Yogyakarta
  • Tonny Hidayat Universitas Amikom Yogyakarta
Keywords: fact verification, multilingual model, LaBSE, XLM-R, mBERT, graph fusion, political claim

Abstract

The proliferation of misinformation in political domains, especially across multilingual platforms, presents a major challenge to maintaining public information integrity. Existing models often fail to verify claims effectively when the evidence spans multiple languages and lacks a structured format. To address this issue, this study proposes a novel architecture called Dual-integrated Graph for Multilingual Fact Verification (DiG-MFV), which combines semantic representations from multilingual language models (namely mBERT, XLM-R, and LaBSE) with two graph-based components: an evidence graph and a semantic fusion graph. These components are processed through a dual-path architecture that integrates the outputs of a text encoder and a graph encoder, enabling deeper semantic alignment and cross-evidence reasoning. The PolitiFact dataset was used as the source of claims and evidence, split 70% for training, 20% for validation, and 10% for testing. The training process employed the AdamW optimizer, cross-entropy loss, and regularization techniques, including dropout and early stopping based on the F1-score. The evaluation results show that DiG-MFV with LaBSE achieved an accuracy of 85.80% and an F1-score of 85.70%, outperforming the mBERT and XLM-R variants as well as the DGMFP baseline model (76.1% accuracy). The model also converged stably during training, indicating its robustness in cross-lingual political fact verification tasks. These findings encourage further exploration of graph-based multilingual fact verification systems.
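The evaluation protocol described in the abstract (a 70/20/10 train/validation/test split and early stopping on the validation F1-score) can be sketched as below. This is a minimal illustration, not the authors' code: the function names, the random seed, and the patience value are assumptions.

```python
import random

def split_70_20_10(samples, seed=13):
    """Shuffle claim-evidence pairs and split them 70/20/10
    into train, validation, and test sets."""
    rng = random.Random(seed)
    data = list(samples)
    rng.shuffle(data)
    n = len(data)
    n_train = int(0.7 * n)
    n_val = int(0.2 * n)
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

class F1EarlyStopping:
    """Signal a stop when the validation F1-score has not improved
    for `patience` consecutive epochs (patience value is assumed)."""
    def __init__(self, patience=3):
        self.patience = patience
        self.best_f1 = -1.0
        self.epochs_without_gain = 0

    def step(self, val_f1):
        if val_f1 > self.best_f1:
            self.best_f1 = val_f1
            self.epochs_without_gain = 0
        else:
            self.epochs_without_gain += 1
        return self.epochs_without_gain >= self.patience  # True -> stop training
```

In a training loop, `step` would be called once per epoch with the current validation F1, and training would halt as soon as it returns `True`, keeping the checkpoint with the best score.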

Published
2025-07-27
How to Cite
Agustina, N., Kusrini, Utami, E., & Hidayat, T. (2025). DiG-MFV: Dual-integrated Graph for Multilingual Fact Verification. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 9(4), 729–736. https://doi.org/10.29207/resti.v9i4.6695
Section
Artificial Intelligence