Genetic Algorithm for Variable Selection and Parameter Optimization in SVM and Fuzzy SVM for Colon Cancer Microarray Classification

Irhamah Irhamah, Elok Faiqah, Heri Kuswanto, NLP Satyaning Pradnya Paramita

Abstract


Colon cancer is the second leading cause of cancer-related deaths in the world, so continued improvement in research on this disease is needed. Recent advances in microarray technology allow the expression levels of a large set of genes to be monitored simultaneously. Microarray data are high-dimensional, with hundreds or even thousands of genes (features), while the number of patients observed (observations) is usually much smaller than the number of features. This study uses a colon cancer microarray dataset containing two classes, normal and tumor. The aim of this study is to develop a classification model using fuzzy support vector machines (FSVM) hybridized with a genetic algorithm (GA) for classifying individuals based on gene expression. Fuzzy memberships are used in the SVM to deal with imbalanced microarray data. The role of the GA is, first, to select the relevant genes as features and, second, to optimize the parameters of the FSVM, since GA can handle high-dimensional nonlinear optimization problems, is adaptable, and is easily combined with other methods. Classification using fast correlation-based filter (FCBF) selection has higher accuracy than classification without selection. The results also show that FSVM optimized using GA has the highest accuracy among the classification methods used in this study.
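The two roles the abstract assigns to the GA (gene selection and parameter tuning) can be pictured as a single chromosome: a binary mask over genes followed by the classifier parameters. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The abstract gives neither the membership function nor the GA settings, so the class-ratio memberships, the log2 search ranges for C and gamma, the tournament selection, the mutation rate, and the `toy_fitness` stand-in are all hypothetical choices.

```python
import math
import random

# Assumed search ranges for log2(C) and log2(gamma); not taken from the paper.
LOG2_C_RANGE = (-5.0, 15.0)
LOG2_GAMMA_RANGE = (-15.0, 3.0)


def class_memberships(labels):
    """Hypothetical fuzzy memberships for imbalanced data: the smallest class
    gets 1.0 and larger classes are down-weighted by their size ratio."""
    counts = {c: labels.count(c) for c in set(labels)}
    smallest = min(counts.values())
    return [smallest / counts[y] for y in labels]


def random_chromosome(n_genes, rng):
    """Binary gene mask followed by two real-valued parameter exponents."""
    mask = [rng.randint(0, 1) for _ in range(n_genes)]
    return mask + [rng.uniform(*LOG2_C_RANGE), rng.uniform(*LOG2_GAMMA_RANGE)]


def decode(chrom, n_genes):
    """Return (gene mask, C, gamma) encoded by a chromosome."""
    return chrom[:n_genes], 2.0 ** chrom[n_genes], 2.0 ** chrom[n_genes + 1]


def crossover(a, b, rng):
    """One-point crossover; positions stay aligned between parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(chrom, n_genes, rng, rate=0.05):
    """Flip mask bits and jitter the exponents, each with probability `rate`."""
    child = list(chrom)
    for i in range(n_genes):
        if rng.random() < rate:
            child[i] = 1 - child[i]
    ranges = (LOG2_C_RANGE, LOG2_GAMMA_RANGE)
    for i, (lo, hi) in zip((n_genes, n_genes + 1), ranges):
        if rng.random() < rate:
            child[i] = min(hi, max(lo, child[i] + rng.gauss(0.0, 1.0)))
    return child


def run_ga(fitness, n_genes, pop_size=20, generations=30, seed=0):
    """Tournament selection with elitism; returns the best chromosome found."""
    rng = random.Random(seed)
    pop = [random_chromosome(n_genes, rng) for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        def pick():
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        children = [best]  # elitism: carry the current best forward unchanged
        while len(children) < pop_size:
            children.append(mutate(crossover(pick(), pick(), rng), n_genes, rng))
        pop = children
        best = max(pop, key=fitness)
    return best


def toy_fitness(chrom, n_genes=6):
    """Stand-in for cross-validated accuracy: rewards an arbitrary target
    mask and exponents near log2(C) = 0 and log2(gamma) = -3."""
    mask, C, gamma = decode(chrom, n_genes)
    hits = sum(m == t for m, t in zip(mask, [1, 0, 0, 1, 0, 0]))
    return hits - 0.1 * abs(math.log2(C)) - 0.1 * abs(math.log2(gamma) + 3.0)


best = run_ga(toy_fitness, n_genes=6)
selected_mask, C, gamma = decode(best, 6)
```

In a real experiment, `toy_fitness` would be replaced by the cross-validated accuracy of a classifier trained on the genes where the mask is 1. One possible stand-in for FSVM is scikit-learn's `SVC(C=C, gamma=gamma)` fitted with `sample_weight` set to the values from `class_memberships`, since per-sample weights rescale the penalty C for each observation.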

Keywords


Feature Selection; Fuzzy SVM; Genetic Algorithm; Parameter Optimization; SVM






DOI: https://doi.org/10.18860/ca.v7i1.10337



Copyright (c) 2021 Irhamah Irhamah, Elok Faiqah, Heri Kuswanto, NLP Satyaning Pradnya Paramita

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

CAUCHY: Jurnal Matematika Murni dan Aplikasi is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.