Application of the C4.5 Algorithm with the Resample Technique for Predicting Employee Performance at PT X
DOI:
https://doi.org/10.61132/mars.v3i4.1049
Keywords:
C4.5 (J48) Algorithm, Classification, Employee Performance Prediction, Resample, Testing
Abstract
Human Resource Management (HRM) plays a strategic role in improving organizational competitiveness through sound management of employee placement, training, and performance evaluation. Supporting these goals requires a predictive model that gives an accurate picture of employee performance. This study uses an HRM dataset of 1,200 records and compares the effectiveness of six classification algorithms: J48 (C4.5), Random Forest, Naive Bayes, K-Nearest Neighbor (KNN), Logistic Regression, and Support Vector Machine (SVM). To improve the results, the study applies a resampling technique and attribute selection based on correlation attribute evaluation (Correlation Attribute Eval), so that the class distribution becomes more balanced and model accuracy increases. In testing, the J48 decision tree performed best, with an accuracy of 95.41%, a kappa value of 0.8925, a mean absolute error (MAE) of 0.0432, a precision of 0.955, a recall of 0.954, and an area under the ROC curve of 0.964. These findings indicate that J48 has stronger predictive capability than the other algorithms compared. The study also found that the variables most influential in determining employee performance are the percentage of the last salary increase (Emp Last Salary Hike Percent), the level of work environment satisfaction (Emp Environment Satisfaction), the length of time since the last promotion (Years Since Last Promotion), and experience in the current role (Experience Years in Current Role). Overall, the results indicate that the C4.5 algorithm combined with resampling is well suited to building an employee performance prediction system. The model can therefore serve as a strong basis for managerial decision-making, particularly in designing HR development strategies and policies to improve organizational performance.
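The two ideas the abstract leans on, resampling toward a balanced class distribution and the kappa statistic used to report agreement, can be illustrated with a minimal sketch in plain Python. This is not the study's pipeline: the abstract does not state the resampling settings or the tool configuration, so the helper names (`resample_balanced`, `cohen_kappa`), the class labels, and the toy data below are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def resample_balanced(rows, label_of, size=None, seed=0):
    """Draw rows with replacement so every class gets an equal share.

    Hypothetical helper: it mimics the general idea of resampling toward
    a uniform class distribution, not any specific tool's filter.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    size = len(rows) if size is None else size
    per_class = size // len(by_class)
    sample = []
    for label in sorted(by_class):
        # Sampling with replacement lets minority classes be oversampled.
        sample.extend(rng.choices(by_class[label], k=per_class))
    rng.shuffle(sample)
    return sample


def cohen_kappa(actual, predicted):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if expected == 1:
        # Degenerate case: only one class appears on both sides.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy, made-up data: an imbalanced set of (label, record_id) pairs.
rows = (
    [("Good", i) for i in range(70)]
    + [("Excellent", i) for i in range(20)]
    + [("Poor", i) for i in range(10)]
)
balanced = resample_balanced(rows, label_of=lambda r: r[0], size=90)
class_counts = Counter(label for label, _ in balanced)

# Toy predictions: three of four labels agree.
kappa = cohen_kappa(["a", "a", "b", "b"], ["a", "a", "b", "a"])
```

In the toy run, each of the three classes ends up with 30 rows, and the four-label example yields a kappa of 0.5 (observed agreement 0.75 against a chance agreement of 0.5). Kappa is worth reporting next to accuracy because accuracy alone can look high on an imbalanced dataset simply by favoring the majority class.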
License
Copyright (c) 2025 Mars : Jurnal Teknik Mesin, Industri, Elektro Dan Ilmu Komputer

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



