Application of the C4.5 Algorithm on an Imbalanced Dataset to Predict Property Installment Failure

Yodi Susanto, Devit Setiono, Muhammad Syafrullah

Abstract

This research studied the patterns of consumers who failed to pay, with the aim of building a model that can predict which customers are likely to fail to pay. The research followed the Cross-Industry Standard Process for Data Mining (CRISP-DM), comprising business understanding, data understanding, data preparation, modeling, evaluation, and deployment/interpretation. The dataset was drawn from sales, cancellation, and consumer data from January 2016 to December 2019. Because the dataset was imbalanced, the researchers applied the Synthetic Minority Oversampling Technique (SMOTE) to handle the class imbalance. The research compared accuracy, precision, recall, F-measure, and Area Under the ROC Curve (AUC) between the original dataset and the SMOTE-augmented dataset across several algorithms, including C4.5, K-NN, and Naïve Bayes. The attributes used were source of funds, purpose of purchase, age, selling price, occupation, total installments, percentage of total installments, monthly installments, percentage of late installments, and status. The comparison showed that the C4.5 algorithm on the SMOTE 480% dataset achieved the highest accuracy, 97.62%, with precision of 0.976, recall of 0.976, F-measure of 0.976, and AUC of 0.986, which falls in the Excellent Classification category. The model built on the imbalanced dataset with the C4.5 algorithm and SMOTE is therefore expected to be usable for predicting consumer installment failures.
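To make the oversampling step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic record is placed at a random point on the line segment between a minority-class record and one of its nearest minority-class neighbours. It is not the authors' implementation. The function name `smote`, the parameter names, the round-robin choice of base records, and the toy data are all assumptions made for illustration, and the sketch handles numeric attributes only (categorical attributes such as occupation would need encoding or a variant of the technique). The percentage is read here as "number of synthetic records per 100 minority records", which is one plausible interpretation of "SMOTE 480%".

```python
import math
import random


def smote(minority, percent, k=5, seed=0):
    """Generate synthetic minority-class samples (illustrative sketch).

    minority: list of numeric feature vectors from the minority class
    percent:  amount of oversampling; 480 yields about 4.8 synthetic
              samples per original minority sample (assumed interpretation)
    k:        number of nearest minority neighbours to choose from
    """
    rng = random.Random(seed)
    n_new = round(len(minority) * percent / 100)
    synthetic = []
    for i in range(n_new):
        # Cycle through the minority samples so each is used about equally.
        base_index = i % len(minority)
        base = minority[base_index]
        # Find the k nearest minority neighbours, excluding the base itself.
        others = [p for j, p in enumerate(minority) if j != base_index]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: five minority-class records with two numeric features.
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5], [1.5, 1.0]]
synthetic = smote(minority, percent=480, k=3)
print(len(synthetic))  # 24 synthetic records (5 * 4.8)
```

The synthetic records would be appended to the training data only; evaluating on oversampled test data would inflate the reported metrics.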
