Application of the C4.5 Algorithm with the Resample Technique for Employee Performance Prediction at PT X

Authors

Wahyu Saputro Politeknik Prasetiya Mandiri

DOI:

https://doi.org/10.61132/mars.v3i4.1049

Keywords:

C4.5 (J48) Algorithm, Classification, Employee Performance Prediction, Resample, Testing

Abstract

Human Resource Management (HRM) plays a strategic role in improving organizational competitiveness through sound management of employee placement, training, and performance evaluation. Supporting these goals requires a predictive model that gives an accurate picture of employee performance. This study uses an HRM dataset of 1,200 records and compares the effectiveness of six classification algorithms: J48 (C4.5), Random Forest, Naive Bayes, K-Nearest Neighbor (KNN), Logistic Regression, and Support Vector Machine (SVM). To improve results, the study applies a resampling technique to balance the class distribution and attribute selection based on correlation attribute evaluation, with the aim of raising model accuracy. In testing, the J48 decision tree performed best, with an accuracy of 95.41%, a kappa of 0.8925, a mean absolute error (MAE) of 0.0432, a precision of 0.955, a recall of 0.954, and an area under the ROC curve of 0.964. These results indicate that J48 has stronger predictive capability than the other algorithms tested. The study also found that the most influential variables in determining employee performance are the percentage of the last salary increase (EmpLast Salary Hike Percent), the level of work environment satisfaction (Emp Environment Satisfaction), the length of time since the last promotion (Years Since Last Promotion), and experience in the current role (Experience Years in Current Role). Overall, the results suggest that the C4.5 algorithm combined with resampling is an effective approach to building an employee performance prediction system. Such a model could serve as a basis for managerial decision-making, particularly in designing HR development strategies and policies to improve organizational performance.
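Two ideas in the abstract can be made concrete with a small sketch: resampling toward a balanced class distribution, and the kappa statistic reported alongside accuracy. The Python below is illustrative only. The function names `resample_balanced` and `cohen_kappa`, the toy records, and the choice to draw an equal number of rows per class are assumptions made for this example; they are not the study's actual tooling or filter settings.

```python
import random
from collections import Counter, defaultdict


def resample_balanced(rows, label_of, seed=0):
    """Draw rows with replacement so that every class is equally represented.

    Assumption: this imitates a resample biased fully toward a uniform class
    distribution. It is not the exact configuration used in the study.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    per_class = len(rows) // len(by_class)  # equal share for each class
    sample = []
    for members in by_class.values():
        sample.extend(rng.choices(members, k=per_class))
    rng.shuffle(sample)
    return sample


def cohen_kappa(actual, predicted):
    """Cohen's kappa: agreement beyond what chance alone would produce.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical imbalanced records: (employee id, performance label).
    records = [(f"e{i}", "good") for i in range(8)] + [("e8", "poor"), ("e9", "poor")]
    balanced = resample_balanced(records, label_of=lambda r: r[1])
    print(Counter(label for _, label in balanced))
    print(cohen_kappa([1, 1, 0, 0], [1, 1, 0, 1]))
```

Kappa is worth reporting next to accuracy because accuracy alone can look high on an imbalanced dataset when a classifier mostly predicts the majority class, whereas kappa discounts the agreement expected by chance.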
