Question Answering through Transfer Learning on Closed-Domain Educational Websites
Abstract
Navigating complex educational websites poses challenges for users seeking specific information. This research addresses the problem of efficient information search on closed-domain educational platforms, focusing on the Universitas Indonesia website. Leveraging Natural Language Processing (NLP), we explore the effectiveness of transfer learning models in closed-domain Question Answering (QA). The performance of three BERT-based models, IndoBERT, RoBERTa, and XLM-RoBERTa, is compared under transfer learning and non-transfer learning scenarios. Our results reveal that transfer learning significantly improves QA model performance: models trained under the transfer learning scenario achieved up to a 4.91% improvement in F-1 score over their non-transfer counterparts. XLM-RoBERTa base outperforms all other models, achieving an F-1 score of 61.72%. This study provides valuable insights into Indonesian-language NLP tasks, emphasizing the efficacy of transfer learning for closed-domain QA on educational websites. It advances our understanding of effective information retrieval strategies, with implications for improving user experience and efficiency in accessing information from educational websites.
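The F-1 scores reported above are the standard token-overlap metric used in SQuAD-style extractive QA evaluation. As a minimal illustrative sketch (the answer strings below are invented examples, not drawn from the paper's dataset), the metric can be computed as follows:

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a gold answer span."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either answer is empty, F1 is 1 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Count tokens shared between prediction and reference.
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: 3 of the predicted tokens overlap a 4-token gold answer.
score = token_f1("fakultas ilmu komputer", "fakultas ilmu komputer ui")
print(round(score, 4))  # → 0.8571
```

In SQuAD-style evaluation this per-example score is averaged over the test set (taking the maximum over multiple gold answers when available), which yields aggregate figures such as the 61.72% reported for XLM-RoBERTa base.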
Copyright (c) 2025 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher of the article, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate, additional contractual arrangements for the non-exclusive distribution of the published version of the manuscript (e.g., depositing it in the author's institutional repository or publishing it in a book), with an acknowledgement that the manuscript was first published in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).