Data Pipeline Engineering for LSTM Seismicity Forecasting Through Integration of the ETL Process for the Indonesian Earthquake Catalog
DOI:
https://doi.org/10.61132/merkurius.v4i1.1426
Keywords:
Data Pipeline, Earthquake Catalog, ETL, LSTM, Seismicity Forecasting
Abstract
Indonesia is one of the most seismically active countries in the world and needs an accurate earthquake forecasting system built on the BMKG earthquake catalog. This research implements ETL-based data pipeline engineering to convert 92,887 earthquake catalog entries from the 2008-2023 period into a ready-to-use daily time series for an LSTM seismicity forecasting model. The ETL process covers raw data extraction, cleaning of focal mechanism parameter columns that are 97% missing, datetime conversion, daily resampling that yields 5,200 entries with three features (earthquake count, total magnitude, and average magnitude), and Min-Max Scaler normalization for LSTM compatibility. The dataset was processed in Google Colab and fed to a stacked LSTM architecture with two layers of 50 and 25 units, dropout of 0.2, the Adam optimizer, and a 30-day sequence window to predict the daily earthquake count. Trained for 100 epochs, the model captures stable seismic activity trends with a consistently decreasing MSE loss, although it deviates on extreme spikes caused by aftershock sequences. The ETL pipeline proved crucial for ensuring temporal consistency, 100% data completeness, and a physically relevant representation, resulting in a reproducible end-to-end framework for disaster mitigation.
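The transform steps named in the abstract (datetime conversion, daily resampling into count, total, and average magnitude, Min-Max normalization, and 30-day windowing) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the column names `datetime` and `magnitude`, the helper names, and the toy catalog are all assumptions, and the scaling is written out by hand in the same form as a default Min-Max scaler. The sketch scales the whole series at once for brevity; in practice the scaling parameters would normally be fitted on the training split only.

```python
import numpy as np
import pandas as pd


def build_daily_features(catalog: pd.DataFrame) -> pd.DataFrame:
    """Aggregate an event-level catalog into one row per calendar day.

    Assumes hypothetical columns 'datetime' and 'magnitude'.
    """
    df = catalog.copy()
    # Parse timestamps; unparseable values become NaT and are dropped.
    df["datetime"] = pd.to_datetime(df["datetime"], errors="coerce")
    df = df.dropna(subset=["datetime", "magnitude"])
    df = df.set_index("datetime").sort_index()
    # Daily resampling: event count, summed magnitude, mean magnitude.
    daily = df["magnitude"].resample("D").agg(["count", "sum", "mean"])
    daily.columns = ["quake_count", "total_magnitude", "avg_magnitude"]
    # Days without events have an undefined mean; fill with 0 so the
    # series has no gaps.
    return daily.fillna(0.0)


def min_max_scale(daily: pd.DataFrame) -> pd.DataFrame:
    """Scale each column to [0, 1] using (x - min) / (max - min)."""
    lo = daily.min()
    span = (daily.max() - lo).replace(0, 1.0)  # guard constant columns
    return (daily - lo) / span


def make_windows(values: np.ndarray, target_col: int, window: int = 30):
    """Build (samples, window, features) inputs and next-day targets."""
    X, y = [], []
    for start in range(len(values) - window):
        X.append(values[start:start + window])
        y.append(values[start + window, target_col])
    return np.asarray(X), np.asarray(y)


# Toy catalog (made-up values) with one unparseable timestamp.
toy = pd.DataFrame({
    "datetime": ["2008-01-01 03:10:00", "2008-01-01 18:45:00",
                 "2008-01-03 07:00:00", "not a date"],
    "magnitude": [4.0, 5.0, 3.0, 4.5],
})
daily = build_daily_features(toy)
scaled = min_max_scale(daily)
# window=2 only because the toy series is three days long; the abstract
# reports a 30-day window.
X, y = make_windows(scaled.to_numpy(), target_col=0, window=2)
```

With the 30-day window reported in the abstract, `X` would have shape `(samples, 30, 3)`, which matches the three-dimensional input a stacked LSTM expects, and `y` would hold the scaled earthquake count for the day after each window.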
License
Copyright (c) 2026 Merkurius : Jurnal Riset Sistem Informasi dan Teknik Informatika

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



