Mall Customer Segmentation Using K-Means Clustering Optimized by the Elbow Method

Rossima Eva Yuliana, Diah Mariatul Ulya, Mohammad Jamhuri

Abstract


This study evaluates K-Means clustering for segmenting mall customers by demographic and behavioral features, using the Mall Customers dataset. Segmentation uses three numerical attributes (age, annual income, and spending score) plus one engineered feature, the spending-to-income ratio. After min-max normalization and log transformation, the Elbow Method was used to determine the optimal number of clusters ($K=5$). The resulting clusters were evaluated with internal validation metrics: the Silhouette Score, Davies-Bouldin Index, and Calinski-Harabasz Index. K-Means achieved the best overall performance compared with Gaussian Mixture Models (GMM), DBSCAN, and Agglomerative Hierarchical Clustering. Five interpretable customer profiles emerged, ranging from high-spending young professionals to low-engagement senior customers. The clusters were visualized with PCA for dimensionality reduction and interpreted through descriptive statistics and domain-based labeling. Business implications were derived by aligning each cluster with strategic marketing recommendations. Overall, the findings reaffirm that classical clustering frameworks such as K-Means, when rigorously validated and thoughtfully interpreted, can yield actionable insights in customer analytics.
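As an illustration of the workflow summarized above, the following is a minimal sketch of feature preparation and an Elbow Method sweep, written with the Python standard library only. It is not the authors' implementation. Several details are assumptions made for the example: the ratio is taken as log1p(spending score / annual income), min-max scaling is applied to all four columns afterwards, the elbow is picked by a simple "farthest below the chord" heuristic (the abstract does not say how the elbow was read), and the data are synthetic stand-ins rather than the Mall Customers dataset. The names build_features, kmeans_wcss, and elbow_k are hypothetical helpers.

```python
import math
import random


def build_features(customers):
    """Turn (age, annual_income, spending_score) tuples into scaled rows.

    Assumption: the engineered ratio is log1p(score / income), and min-max
    scaling is applied to all four columns afterwards.
    """
    rows = [[a, inc, s, math.log1p(s / inc)] for a, inc, s in customers]
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(r, lo, hi)]
        for r in rows
    ]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_wcss(points, k, n_init=3, n_iter=50, seed=0):
    """Run Lloyd's algorithm n_init times; return the lowest WCSS found."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_init):
        centers = rng.sample(points, k)
        for _ in range(n_iter):
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
                clusters[nearest].append(p)
            # Recompute each centroid; an empty cluster keeps its old center.
            updated = [
                [sum(col) / len(c) for col in zip(*c)] if c else centers[i]
                for i, c in enumerate(clusters)
            ]
            if updated == centers:
                break
            centers = updated
        wcss = sum(min(sq_dist(p, c) for c in centers) for p in points)
        best = min(best, wcss)
    return best


def elbow_k(ks, wcss):
    """Heuristic: pick the K lying farthest below the chord that joins the
    first and last points of the WCSS curve (both axes scaled to [0, 1])."""
    x0, x1, y0, y1 = ks[0], ks[-1], wcss[0], wcss[-1]

    def gap(k, w):
        x = (k - x0) / (x1 - x0)
        y = (w - y1) / (y0 - y1) if y0 > y1 else 0.0
        return (1 - x) - y  # the chord runs from (0, 1) to (1, 0)

    return max(zip(ks, wcss), key=lambda kw: gap(*kw))[0]


# Synthetic stand-in data (NOT the real Mall Customers dataset).
rng = random.Random(42)
group_means = [(25, 80, 80), (45, 85, 20), (30, 25, 75), (55, 25, 20), (40, 55, 50)]
customers = [
    (rng.gauss(a, 3), max(1.0, rng.gauss(inc, 3)), max(1.0, rng.gauss(s, 3)))
    for a, inc, s in group_means
    for _ in range(20)
]
features = build_features(customers)
ks = list(range(1, 9))
curve = [kmeans_wcss(features, k) for k in ks]
for k, w in zip(ks, curve):
    print(f"K={k}  WCSS={w:.3f}")
print("elbow heuristic suggests K =", elbow_k(ks, curve))
```

The sweep records the within-cluster sum of squares (WCSS) for each candidate K and looks for the point where further increases in K stop paying off. In practice the curve is usually also inspected visually, and the chosen K is cross-checked against internal validation metrics such as those named in the abstract.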

Keywords


Clustering, Customer Segmentation, Data Mining, K-Means, Elbow Method, Mall Customers.


DOI: https://doi.org/10.18860/jrmm.v4i5.33389
