Enhancing Premier League Match Outcome Prediction Using Support Vector Machine with Ensemble Techniques: A Comparative Study on Bagging and Boosting

  • Agus Perdana Windarto, STIKOM Tunas Bangsa
  • Putrama Alkhairi, STIKOM Tunas Bangsa
  • Johan Muslim, Universitas Budi Luhur
Keywords: Support Vector Machine (SVM), Ensemble Techniques, Bagging, Boosting, Model Accuracy, Sports Analytics

Abstract

Predicting football match outcomes is a significant challenge in sports analytics, requiring models that are both accurate and resilient. This study evaluates how effectively two ensemble techniques, Bagging and Boosting, improve the performance of Support Vector Machine (SVM) models for predicting match outcomes in the English Premier League. The dataset comprises detailed statistics from 1,520 matches across multiple seasons, including features such as team performance, player statistics, and match outcomes. Four models were examined: a baseline SVM, SVM with Bagging, SVM with Boosting, and a combined SVM + Bagging + Boosting approach. Each model was assessed on accuracy, recall, precision, F1 score, and ROC-AUC. Experimental results indicate that ensemble methods substantially improve model accuracy and stability, with the SVM + Bagging + Boosting combination achieving perfect scores in accuracy, recall, precision, and F1 score, alongside an ROC-AUC value of 0.88. However, this model's slightly lower ROC-AUC compared to the other models and its high computational cost point to a potential risk of overfitting and a need for significant resources. These findings underscore the practical potential of combining Bagging and Boosting with SVM for robust and accurate predictions. Limitations include the dataset's focus on a single league and the high resource requirements of ensemble methods. Future research could extend this approach to other sports and leagues, improve computational efficiency, and explore real-time predictive applications.
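
To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how accuracy, precision, recall, F1 score, and ROC-AUC can be computed from a model's predictions. It is not the authors' code. It assumes a binary labeling of outcomes (for example, home win = 1, otherwise 0), which the abstract does not specify, and the function names, labels, scores, and 0.5 threshold are hypothetical values chosen purely for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


# Hypothetical labels and classifier scores for eight matches (illustration only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
# Assumed decision threshold of 0.5 turns scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics)  # accuracy, precision, recall, and f1 are each 0.75 for this toy data
print(auc)      # 0.9375 (15 of 16 positive/negative pairs ranked correctly)
```

Accuracy, precision, recall, and F1 depend on the hard predictions at one chosen threshold, whereas ROC-AUC depends only on how the scores rank positives against negatives, which is why the two kinds of metric are reported separately.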

Published
2025-02-06
How to Cite
Agus Perdana Windarto, Putrama Alkhairi, & Johan Muslim. (2025). Enhancing Premier League Match Outcome Prediction Using Support Vector Machine with Ensemble Techniques: A Comparative Study on Bagging and Boosting. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 9(1), 94–103. https://doi.org/10.29207/resti.v9i1.6173
Section
Information Technology Articles