KFCM-PSOTD: An Imputation Technique for Missing Values in Incomplete Data Classification
Data mining is an important process for extracting interpretable patterns from data, and data preprocessing is one of its crucial steps. Missing values are one of the primary issues in data preprocessing. They are commonly handled with mean or median imputation because these techniques are easy to implement. However, their use is not recommended because they ignore the variance of the data. This research develops the Kernel Fuzzy C-Means Optimized by the Particle Swarm Optimizer with Two Differential Mutations (KFCM-PSOTD). KFCM imputation is applied to obtain better estimates of the missing values because of its proven ability to recognize patterns in the data. In addition, the PSOTD algorithm is used as an optimization tool to boost KFCM's performance. PSOTD is adopted because it balances exploration and exploitation better than classical PSO. Datasets imputed with KFCM-PSOTD are classified using the Decision Tree algorithm. The classification results are evaluated using accuracy, precision, recall, and F1 score to assess the quality of the imputed values. The results show that KFCM-PSOTD outperforms the other imputation techniques compared, with evaluation scores up to 10% higher.
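The concern that mean imputation ignores the variance of the data can be illustrated with a small sketch. The toy feature values and the helper name `mean_impute` below are hypothetical and are not taken from the paper. The sketch shows only the baseline technique, not KFCM-PSOTD itself.

```python
from statistics import mean, pvariance


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical toy feature; None marks a missing value.
feature = [2.0, 4.0, None, 8.0, None, 10.0]

observed = [v for v in feature if v is not None]
imputed = mean_impute(feature)

# Population variance before and after imputation.
var_observed = pvariance(observed)
var_imputed = pvariance(imputed)

print("variance of observed values:", var_observed)
print("variance after mean imputation:", var_imputed)
```

Every imputed entry sits exactly at the mean, so it adds nothing to the sum of squared deviations while increasing the sample count. The variance of the completed feature is therefore smaller than that of the observed values. Imputation methods that estimate each missing value from patterns in the data, such as clustering-based approaches, aim to avoid this distortion.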
DOI: https://doi.org/10.18860/ca.v9i1.25138
Copyright (c) 2024 Muhaimin Ilyas, Syaiful Anam, Trisilowati Trisilowati

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
