Text Similarity Identification Using Class-Indexing-Based Weighting and Cosine Similarity for Complaint Document Classification

Syahroni Wahyu Iriananda, Muhammad Aziz Muslim, Harry Soekotjo Dachlan


Report handling on the "LAPOR!" system depends on a system administrator who manually reads every incoming report [3]. Manual reading can lead to errors in handling complaints [4]. When the data flow is very large and grows rapidly, handling can take at least three days and is sensitive to inconsistencies [3]. In this study, the authors propose a computerized model that measures and identifies the similarity between a Query (an incoming report) and a Document (an archived report). The authors employ the Class-Based Indexing term weighting scheme and Cosine Similarity to analyze document similarity. The CoSimTFIDF, CoSimTFICF, and CoSimTFIDFICF values are defined as the feature sets for text classification using the K-Nearest Neighbor (K-NN) method. The best evaluation results were obtained when preprocessing employed stemming. Across all features, the best result was 84% accuracy on the CoSimTFIDF feature with a 75% training and 25% test data split. The value k = 5 gave a high accuracy of 84.12%.
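The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact weighting formulas or how the similarity values are passed to K-NN, so everything below is an assumption: the IDF and ICF factors use the common form 1 + log(N/n), the classifier takes a majority vote over the k most similar archived documents, and the function names (`build_stats`, `weigh`, `cosine`, `similarity_features`, `knn_label`) and the toy tokens are hypothetical.

```python
import math
from collections import Counter

def build_stats(docs, labels):
    """Count document frequency (df) and class frequency (cf) of each term."""
    df, cf, class_terms = Counter(), Counter(), {}
    for tokens, label in zip(docs, labels):
        df.update(set(tokens))
        class_terms.setdefault(label, set()).update(tokens)
    for terms in class_terms.values():
        cf.update(terms)
    return df, cf, len(docs), len(class_terms)

def weigh(tokens, df, cf, n_docs, n_classes, scheme):
    """Weight tokens with 'tfidf', 'tficf', or 'tfidficf' (assumed formulas)."""
    vec = {}
    for term, tf in Counter(tokens).items():
        if term not in df:  # terms unseen in the archive are skipped
            continue
        idf = 1 + math.log(n_docs / df[term])
        icf = 1 + math.log(n_classes / cf[term])
        factor = {"tfidf": idf, "tficf": icf, "tfidficf": idf * icf}[scheme]
        vec[term] = tf * factor
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def similarity_features(query, doc, stats):
    """Return the three similarity features for one Query/Document pair."""
    return {
        "CoSim" + s.upper(): cosine(weigh(query, *stats, s), weigh(doc, *stats, s))
        for s in ("tfidf", "tficf", "tfidficf")
    }

def knn_label(query, docs, labels, scheme="tfidf", k=5):
    """Majority vote over the k archived documents most similar to the query."""
    stats = build_stats(docs, labels)
    q = weigh(query, *stats, scheme)
    scored = sorted(
        ((cosine(q, weigh(d, *stats, scheme)), lab) for d, lab in zip(docs, labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return Counter(lab for _, lab in scored[:k]).most_common(1)[0][0]

# Hypothetical, already preprocessed (tokenized and stemmed) toy archive.
docs = [
    ["road", "damage", "hole"],
    ["road", "flood", "drain"],
    ["school", "teacher", "fee"],
    ["school", "book", "fee"],
]
labels = ["infrastructure", "infrastructure", "education", "education"]
query = ["road", "hole", "repair"]

stats = build_stats(docs, labels)
print(similarity_features(query, docs[0], stats))
print(knn_label(query, docs, labels, scheme="tfidf", k=3))
```

In this sketch the class frequency is counted per term across categories, so a term confined to one category receives a larger ICF factor than a term spread over all categories. A real evaluation would also apply the stemming and the 75%/25% split mentioned in the abstract, which are omitted here.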


Complaints; Text Similarity; Class-Based Indexing; Cosine Similarity; K-Nearest Neighbor; LAPOR!


A. Sofyan and S. Santosa, “Text Mining Untuk Klasifikasi Pengaduan Pada Sistem Lapor Menggunakan Metode C4.5 Berbasis Forward Selection,” Cyberku J., vol. 12, no. 1, pp. 8–8, 2016.

I. Surjandari, “Application of Text Mining for Classification of Textual Reports: A Study of Indonesia’s National Complaint Handling System,” in 6th International Conference on Industrial Engineering and Operations Management (IEOM 2016), Kuala Lumpur, Malaysia.

A. Fauzan and M. L. Khodra, “Automatic multilabel categorization using learning to rank framework for complaint text on Bandung government,” in 2014 International Conference of Advanced Informatics: Concept, Theory and Application (ICAICTA), 2014, pp. 28–33.

S. Tjandra, A. A. P. Warsito, and J. P. Sugiono, “Determining citizen complaints to the appropriate government departments using KNN algorithm,” in 2015 13th International Conference on ICT and Knowledge Engineering (ICT Knowledge Engineering 2015), 2015, pp. 1–4.

W. H. Gomaa and A. A. Fahmy, “A Survey of Text Similarity Approaches,” Int. J. Comput. Appl., vol. 68, no. 13, pp. 13–18, 2013.

M. A. Rosid, G. Gunawan, and E. Pramana, “Centroid Based Classifier With TF – IDF – ICF for Classfication of Student’s Complaint at Appliation E-Complaint in Muhammadiyah University of Sidoarjo,” J. Electr. Electron. Eng.-UMSIDA, vol. 1, no. 1, pp. 17–24, Feb. 2016.

R. R. M. Putri, R. Y. Herlambang, and R. C. Wihandika, “Implementasi Metode K-Nearest Neighbour Dengan Pembobotan TF.IDF.ICF Untuk Kategorisasi Ide Kreatif Pada Perusahaan,” J. Teknol. Inf. Dan Ilmu Komput., vol. 4, no. 2, pp. 97–103, May 2017.

C. F. Suharno, M. A. Fauzi, and R. S. Perdana, “Klasifikasi Teks Bahasa Indonesia Pada Dokumen Pengaduan Sambat Online Menggunakan Metode K-Nearest Neighbors (K-NN) dan Chi-Square,” J. Pengemb. Teknol. Inf. Dan Ilmu Komput., vol. 1, no. 10, Jul. 2017.

N. H. A. Sari, M. A. Fauzi, and P. P. Adikara, “Klasifikasi Dokumen Sambat Online Menggunakan Metode K-Nearest Neighbor dan Features Selection Berbasis Categorical Proportional Difference,” J. Pengemb. Teknol. Inf. Dan Ilmu Komput., vol. 2, no. 8 (2018), Oct. 2017.

A. A. Prasanti, M. A. Fauzi, and M. T. Furqon, “Klasifikasi Teks Pengaduan Pada Sambat Online Menggunakan Metode N-Gram dan Neighbor Weighted K-Nearest Neighbor (NW-KNN),” J. Pengemb. Teknol. Inf. Dan Ilmu Komput., vol. 2, no. 2 (2018), Aug. 2017.

S. Dong and Z. Wang, “Evaluating service quality in insurance customer complaint handling throught text categorization,” in 2015 International Conference on Logistics, Informatics and Service Sciences (LISS), 2015, pp. 1–5.

D. Wang and H. Zhang, “Inverse-Category-Frequency based supervised term weighting scheme for text categorization,” J. Inf. Sci. Eng., vol. 29, no. 2, pp. 209–225, Dec. 2010.

F. Ren and M. G. Sohrab, “Class-indexing-based term weighting for automatic text classification,” Inf. Sci., vol. 236, pp. 109–125, Jul. 2013.

S. W. Iriananda, “Klasifikasi Dokumen Pengaduan Dengan Cosine Similariy Berbasis Class-Based Indexing,” Open Science Framework, Jul. 2018.




Copyright (c) 2019 Syahroni Wahyu Iriananda, Muhammad Aziz Muslim, Harry Soekotjo Dachlan

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


Editorial Office:
Jurusan Teknik Informatika
Fakultas Sains dan Teknologi
Universitas Islam Negeri Maulana Malik Ibrahim Malang
Jalan Gajayana 50 Malang, Jawa Timur, Indonesia 65144

© All rights reserved 2015. MATICS, ISSN: 1978-161X | e-ISSN: 2477-2550