Feature Selection Based on Artificial Bee Colony and Gradient Boosting Decision Tree for Hotel Reservation Cancellation Prediction Using Random Forest

Hamida Maulana Lailatal Baroah, Lukman Hakim

Abstract


This study predicts hotel booking cancellations with machine learning, aiming to improve prediction accuracy and operational efficiency. The methods used are Random Forest, Artificial Bee Colony (ABC), and Gradient Boosting Decision Tree (GBDT). ABC excels at optimization but is prone to becoming trapped in local optima, so it is combined with GBDT for feature selection. The dataset is Hotel_Bookings from Kaggle, with 119,390 entries and 28 features. The data is cleansed, normalized, and split into 75% for training and 25% for testing. Feature selection is performed using ABC and GBDT, and the prediction model is built using Random Forest. Evaluation with a confusion matrix and the precision, recall, F1-score, and accuracy metrics shows accuracies of 86.17% and 86.65% for ABC and GBDT, respectively. Increasing the number of trees and features generally improves model performance, and models with feature selection perform significantly better than models without it.
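As an illustration of the evaluation step, the following minimal sketch derives precision, recall, F1-score, and accuracy from a binary confusion matrix. It is not the authors' code: the function names, the label encoding (1 = canceled, 0 = not canceled), and the eight example labels are assumptions made here for illustration, and the numbers it prints are unrelated to the results reported in the paper.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, FN, TN for a binary task (assumed encoding: 1 = canceled)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["fn" if actual == 1 else "tn"] += 1
    return counts


def classification_metrics(counts: Dict[str, int]) -> Dict[str, float]:
    """Compute precision, recall, F1-score, and accuracy from confusion counts."""
    tp, fp, fn, tn = counts["tp"], counts["fp"], counts["fn"], counts["tn"]
    total = tp + fp + fn + tn
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical true and predicted labels for eight bookings (illustration only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(counts)
print(counts)
print(metrics)
```

In practice the same quantities would be computed on the 25% test split, once for each feature-selection variant, so that the resulting accuracies can be compared directly.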






DOI: https://doi.org/10.18860/mat.v16i2.28862





Copyright (c) 2024 Hamida Maulana Lailatal Baroah, Lukman Hakim

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

The journal is indexed by: Dimensions, Sinta, CrossRef, Google Scholar, Index Copernicus, Moraref, Portal Garuda

 


Editorial Office:
Informatics Engineering Department
Faculty of Science and Technology
Universitas Islam Negeri Maulana Malik Ibrahim Malang
Jalan Gajayana 50 Malang, Jawa Timur, Indonesia 65144
Email: matics@uin-malang.ac.id
© All rights reserved 2015. MATICS, ISSN: 1978-161X | e-ISSN: 2477-2550