Feature Selection Risk Factors Cervical Cancer Using Hybrid Methods Random Forest and FOX-Inspired Optimization Algorithm
Abstract
Cervical cancer is the fourth leading cause of cancer death among women worldwide, with about 604,000 new cases and 324,000 deaths each year. Human papillomavirus (HPV) infection is a main factor in almost 99% of cervical cancer cases. Besides HPV, other risk factors such as smoking, long-term use of oral contraceptives, and a weakened immune system also play an important role. To support early detection, machine learning algorithms, including Random Forest (RF), have been widely used to analyze cervical cancer risk. A main challenge in early detection is the large amount of irrelevant and redundant data, which can reduce prediction accuracy and makes feature selection essential. Swarm intelligence (SI) algorithms can be combined with other algorithms to improve feature selection performance; one SI-based optimization algorithm is the FOX-Inspired Optimization Algorithm. Using the hybrid RF-FOX method, this study found that the Num of pregnancies feature is the most influential factor in detecting a patient's risk of cervical cancer. The features First sexual intercourse, Number of sexual partners, age, and Hormonal Contraceptives complete the top five most influential features. The hybrid RF-FOX method therefore allows the model's performance to be better optimized, helping to identify patients at risk of cervical cancer more precisely.
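The idea of pairing a classifier with an optimizer for feature selection can be illustrated with a minimal, dependency-free sketch. It is not the authors' RF-FOX implementation and does not reproduce the FOX update rule: a simple greedy bit-flip search stands in for the optimizer, and `toy_error` is a hypothetical stand-in for the Random Forest error. The names `fitness`, `bit_flip_search`, `toy_error`, and the weight `alpha=0.99` are all assumptions made for illustration.

```python
import random


def fitness(mask, error_fn, alpha=0.99):
    """Score a feature subset; lower is better.

    Combines classification error with the fraction of features kept.
    alpha is an assumed weight, not a value taken from the paper.
    """
    n_selected = sum(mask)
    if n_selected == 0:
        return float("inf")  # an empty subset cannot train a classifier
    return alpha * error_fn(mask) + (1 - alpha) * n_selected / len(mask)


def bit_flip_search(n_features, error_fn, iterations=200, seed=0):
    """Greedy bit-flip search over binary masks (a stand-in for FOX)."""
    rng = random.Random(seed)
    best = [1] * n_features  # start with every feature selected
    best_fit = fitness(best, error_fn)
    for _ in range(iterations):
        candidate = best[:]
        i = rng.randrange(n_features)
        candidate[i] = 1 - candidate[i]  # toggle one feature
        cand_fit = fitness(candidate, error_fn)
        if cand_fit < best_fit:  # keep only strict improvements
            best, best_fit = candidate, cand_fit
    return best, best_fit


def toy_error(mask):
    """Hypothetical error model: only features 0 and 1 are informative.

    In a real pipeline this would train a Random Forest on the masked
    columns and return 1 - cross-validated accuracy.
    """
    return 0.5 - 0.2 * (mask[0] + mask[1])


best_mask, best_fit = bit_flip_search(5, toy_error)
print(best_mask, round(best_fit, 4))
```

With these toy settings the search keeps only the two informative features. Swapping `toy_error` for a function that evaluates a trained classifier on the selected columns turns the same loop into a wrapper-style feature selector.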
DOI: https://doi.org/10.18860/ca.v9i2.29582
Copyright (c) 2024 Afidatul Masbakhah, Umu Sa'adah, Mohamad Muslikh
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
CAUCHY: Jurnal Matematika Murni dan Aplikasi is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.