Improving the Accuracy of C4.5 Algorithm with Chi-Square Method on Pure Tea Classification Using Electronic Nose
Tea is one of the plantation commodities overseen by the Ministry of Agriculture of the Republic of Indonesia and is a mainstay commodity that supports the Indonesian economy. Each type of tea has distinct properties, and its aroma can be used to gauge its quality. The human sense of smell is too limited to classify pure tea types reliably, so a device is needed to measure tea aroma: an electronic nose. The device, fitted with several gas sensors, records aroma data from pure tea and produces a value for each tea type, yielding a dataset that can be tested with data mining algorithms. This study uses the C4.5 algorithm as the classification method because it is robust to noisy data and missing values and handles both discrete and continuous variables. Chi-square is used for attribute selection during data preprocessing to increase classification accuracy. Testing the pure tea dataset with all four attributes (CO2, CO, H2, and CH4) using the C4.5 algorithm yielded an accuracy of 93.65%. With Chi-square feature selection retaining the two highest-scoring attributes, the accuracy of the C4.5 algorithm rose to 94.27%.
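The attribute selection step described in the abstract can be illustrated with a minimal sketch. It scores each sensor attribute with the chi-square statistic of independence against the class label and keeps the two highest-scoring attributes. The abstract does not say how continuous readings were discretized, so the median split used here is an assumption, as are the function names, the class labels, and the toy readings, which are invented for illustration and are not the study's data or results.

```python
from collections import Counter


def median(values):
    """Median of a list of numbers."""
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2


def chi_square_score(values, labels, threshold):
    """Chi-square statistic between a binarized attribute and the class.

    Assumption: the continuous reading is split into two bins at `threshold`.
    """
    bins = [v > threshold for v in values]
    n = len(values)
    observed = Counter(zip(bins, labels))
    bin_totals = Counter(bins)
    class_totals = Counter(labels)
    score = 0.0
    for b, b_total in bin_totals.items():
        for c, c_total in class_totals.items():
            expected = b_total * c_total / n  # count expected under independence
            score += (observed[(b, c)] - expected) ** 2 / expected
    return score


def select_top_k(rows, labels, names, k=2):
    """Rank attributes by chi-square score and return the k best names."""
    scores = {}
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores[name] = chi_square_score(column, labels, median(column))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy readings (invented values, two invented tea classes).
names = ["CO2", "CO", "H2", "CH4"]
rows = [
    [410, 10, 50, 5],
    [420, 20, 52, 15],
    [430, 30, 54, 25],
    [440, 40, 20, 35],
    [300, 11, 21, 6],
    [310, 21, 22, 16],
    [320, 31, 23, 26],
    [330, 41, 56, 36],
]
labels = ["black"] * 4 + ["green"] * 4

selected, scores = select_top_k(rows, labels, names, k=2)
print(selected)
print(scores)
```

In a full pipeline, only the selected columns would then be passed to the decision tree learner. Which attributes rank highest depends entirely on the data, so the ranking from these toy values says nothing about the attributes retained in the study.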
Copyright (c) 2023 Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright in each article belongs to the author.
- The author acknowledges that Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) is the first publisher of the article, under a Creative Commons Attribution 4.0 International License.
- Authors may enter into separate arrangements for the non-exclusive distribution of the manuscript published in this journal in other versions (e.g., depositing it in the author's institutional repository or publishing it in a book), provided they acknowledge that the manuscript was first published in Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi).