Imputation Missing Value to Overcome Sparsity Problems in The Recommendation System

  • Sri Lestari, Institut Informatika dan Bisnis Darmajaya
  • M. Elrico Afdila, Institut Informatika dan Bisnis Darmajaya
  • Yan Aditiya Pratama, Institut Informatika dan Bisnis Darmajaya
Keywords: missing value, stochastic hot-deck, imputation

Abstract

A recommendation system suggests products or services to its users. One problem such systems face is sparsity: too little rating data is available for analysis, so the system performs poorly and cannot give suitable recommendations. To address this, the study applies two methods for imputing missing values, mean imputation and stochastic hot-deck imputation, with the aim of improving recommendation quality. The experimental results show that hot-deck imputation outperforms mean imputation, yielding smaller RMSE and MAE values of 2.706 and 2.691, respectively.
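The two imputation strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how donors are selected or how means are grouped, so the sketch assumes per-item (column) means and a donor pool made of the observed ratings for the same item. The function names, the toy rating matrix, and the random seed are all hypothetical.

```python
import math
import random


def mean_impute(ratings):
    """Fill each missing rating (None) with the mean of the observed ratings for that item."""
    filled = [row[:] for row in ratings]
    for j in range(len(ratings[0])):
        observed = [row[j] for row in ratings if row[j] is not None]
        if not observed:
            continue  # no observed ratings for this item, leave it missing
        item_mean = sum(observed) / len(observed)
        for row in filled:
            if row[j] is None:
                row[j] = item_mean
    return filled


def stochastic_hot_deck_impute(ratings, rng):
    """Fill each missing rating with a value drawn at random from the item's observed ratings.

    Assumption: the donor pool is every observed rating in the same column.
    """
    filled = [row[:] for row in ratings]
    for j in range(len(ratings[0])):
        donors = [row[j] for row in ratings if row[j] is not None]
        if not donors:
            continue  # no donors available for this item
        for row in filled:
            if row[j] is None:
                row[j] = rng.choice(donors)
    return filled


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical user-by-item rating matrix; None marks a missing rating.
ratings = [
    [5.0, 3.0, None],
    [4.0, None, 2.0],
    [None, 1.0, 4.0],
    [2.0, 5.0, 3.0],
]

rng = random.Random(0)  # fixed seed so the random donor draws are repeatable
mean_filled = mean_impute(ratings)
hot_deck_filled = stochastic_hot_deck_impute(ratings, rng)
```

To compare the two methods in the way the abstract describes, one would hide a set of known ratings, impute them with each function, and pass the hidden and imputed values to `rmse` and `mae`.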

Published
2023-11-26
How to Cite
Sri Lestari, M. Elrico Afdila, & Yan Aditiya Pratama. (2023). Imputation Missing Value to Overcome Sparsity Problems in The Recommendation System. Jurnal RESTI (Rekayasa Sistem Dan Teknologi Informasi), 7(6), 1285 - 1291. https://doi.org/10.29207/resti.v7i6.5300
Section
Information Technology Articles