Comparative Analysis of Linear Regression and Decision Tree Algorithms for Student Dropout Prediction

Authors

  • Abdah Syakiroh Gustian, Universitas Sebelas April
  • Asep Saeppani, Universitas Sebelas April

DOI:

https://doi.org/10.61132/merkurius.v4i1.1362

Keywords:

Comparison, Decision Tree, Linear Regression, Machine Learning, Student Dropout

Abstract

This study develops a predictive model for identifying students at risk of academic dropout and compares two algorithms for the task: Decision Tree and Linear Regression. The data come from the public Kaggle dataset Students Dropout and Academic Success, which contains demographic, socioeconomic, and per-semester academic performance variables. Preprocessing consists of data cleaning, label encoding of categorical variables, normalization of numeric features, and restriction of the target to two classes, Dropout and Graduate, so that the task becomes binary classification. The two algorithms are compared on accuracy, precision, and recall. The results show that the Decision Tree outperforms Linear Regression at capturing non-linear patterns in student data. Feature importance analysis identifies the number of curricular units in the second semester and tuition payment status as the main predictors of dropout risk. These findings are expected to help educational institutions intervene early to improve student academic success.
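The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: it assumes scikit-learn is available, uses a small synthetic stand-in for the Kaggle data, and the feature names (`units_sem2`, `fees_status`, `age`), the tree depth, and the 0.5 cut-off that turns Linear Regression output into a class label are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 600

# Synthetic stand-in for the dataset; all feature names are hypothetical.
units_sem2 = rng.integers(0, 9, size=n)   # curricular units, 2nd semester
fees_status = rng.choice(["paid", "overdue"], size=n, p=[0.8, 0.2])
age = rng.integers(17, 40, size=n)

# A deliberately non-linear labelling rule with 5% label noise.
risk = (units_sem2 <= 2) | ((fees_status == "overdue") & (units_sem2 <= 5))
flip = rng.random(n) < 0.05
outcome = np.where(risk ^ flip, "Dropout", "Graduate")
# A third class that the binary setup discards.
target = np.where(rng.random(n) < 0.15, "Enrolled", outcome)

# Target class adjustment: keep only Dropout and Graduate.
keep = target != "Enrolled"
y = (target[keep] == "Dropout").astype(int)   # 1 = Dropout, 0 = Graduate

# Label encoding for the categorical feature.
fees_encoded = LabelEncoder().fit_transform(fees_status[keep])
X = np.column_stack([units_sem2[keep], fees_encoded, age[keep]]).astype(float)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Numeric feature normalization, fitted on the training split only.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)


def evaluate(y_true, y_pred):
    """Return the three metrics named in the abstract."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, zero_division=0),
        "recall": recall_score(y_true, y_pred, zero_division=0),
    }


tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
tree_pred = tree.predict(X_test)
tree_scores = evaluate(y_test, tree_pred)

# Linear Regression predicts a continuous score; threshold it at 0.5
# (an assumption) to obtain a class label.
lin = LinearRegression().fit(X_train, y_train)
lin_pred = (lin.predict(X_test) >= 0.5).astype(int)
lin_scores = evaluate(y_test, lin_pred)

print("Decision Tree:", tree_scores)
print("Linear Regression:", lin_scores)
print("Feature importances (units_sem2, fees_status, age):",
      tree.feature_importances_)
```

The scores printed here describe only the synthetic data and say nothing about the results reported in the article. Fitting the scaler on the training split alone is a design choice that keeps test-set statistics out of the preprocessing step.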

Published

2026-01-17

How to Cite

Abdah Syakiroh Gustian, & Asep Saeppani. (2026). Analisis Perbandingan Algoritma Regresi Linear dan Decision Tree untuk Prediksi Dropout Mahasiswa. Merkurius : Jurnal Riset Sistem Informasi Dan Teknik Informatika, 4(1), 155–164. https://doi.org/10.61132/merkurius.v4i1.1362
