Utilizing the K-Means Algorithm for Breast Cancer Diagnosis: A Promising Approach for Improved Early Detection
Abstract
Breast cancer is a major non-communicable disease that predominantly affects women, and its incidence is rising. In 2020, it ranked among the most common cancers in Indonesia. Timely detection and precise diagnosis are pivotal for effective breast cancer management. To enhance diagnostic accuracy, the K-means clustering method is applied to group patients based on shared attributes. This research aims to contribute to breast cancer diagnosis by leveraging the K-means method, potentially improving patient survival rates.
The research process involves data collection, preprocessing, K-means application, evaluation, and visualization. A dataset of 569 breast cancer patient records with 32 attributes, obtained from Kaggle, is used. The K-means algorithm is evaluated using accuracy and yields a value of 0.8457, indicating good performance. Malignant cases (211) and benign cases (301) are shown in a scatter plot that distinguishes the two groups.
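The clustering and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the eight two-feature points are made-up toy values rather than the Kaggle records, the function names `nearest`, `kmeans`, and `cluster_accuracy` are hypothetical, and scoring clusters by their majority label is only one plausible way an accuracy figure could be computed for an unsupervised method.

```python
import random
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
    )


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    assignments = None
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        new_assignments = [nearest(p, centroids) for p in points]
        if new_assignments == assignments:
            break  # no point changed cluster, so the algorithm has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments


def cluster_accuracy(assignments, labels):
    """Score clusters against known labels by giving each cluster its majority label."""
    correct = 0
    for c in set(assignments):
        member_labels = [l for a, l in zip(assignments, labels) if a == c]
        correct += Counter(member_labels).most_common(1)[0][1]
    return correct / len(labels)


# Toy, already-scaled feature vectors (assumed values, for illustration only).
benign = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.9), (0.9, 1.1)]
malignant = [(3.0, 3.1), (3.2, 2.9), (2.8, 3.0), (3.1, 3.3)]
points = benign + malignant
labels = ["B"] * len(benign) + ["M"] * len(malignant)

centroids, assignments = kmeans(points, k=2)
accuracy = cluster_accuracy(assignments, labels)
print("cluster assignments:", assignments)
print("accuracy:", accuracy)
```

On real patient records the features would normally be standardized during preprocessing, since K-means relies on Euclidean distance and attributes on large scales would otherwise dominate the clustering.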
In conclusion, this study presents an initial step in utilizing the K-means algorithm for breast cancer diagnosis, offering promising results. Further research and the development of more advanced models are imperative to address the global health challenge posed by breast cancer among women.
Index Terms—breast cancer; clustering; K-Means Algorithm
DOI: https://doi.org/10.18860/mat.v15i2.23644
Copyright (c) 2023 Nur Fitriyah Ayu Tunjung Sari, Maharini Nabela, Muhammad Falah Abdurrohman
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
_______________________________________________________________________________________________________________
Editorial Office:
Informatics Engineering Department
Faculty of Science and Technology
Universitas Islam Negeri Maulana Malik Ibrahim Malang
Jalan Gajayana 50 Malang, Jawa Timur, Indonesia 65144
Email: matics@uin-malang.ac.id
_______________________________________________________________________________________________________________
© All rights reserved 2015. MATICS, ISSN: 1978-161X | e-ISSN: 2477-2550