Machine Learning Analysis of Netflix Shows Data to Classify Genre Trends and Film Characteristics
DOI:
https://doi.org/10.61132/mars.v3i6.1389
Keywords:
Content Classification, Data Mining, Genre Analysis, Naive Bayes, Netflix
Abstract
The rapid growth of digital streaming platforms such as Netflix has produced a large volume of content data with diverse characteristics, which calls for effective analytical methods to uncover emerging patterns and trends. This study classifies Netflix content into two categories, movies and television shows, and analyzes genre trends and content characteristics using a data mining approach based on the Naive Bayes algorithm. The dataset is the Netflix Shows dataset, comprising 8,809 content entries; the primary features analyzed are genre, rating, and country of production. The research process begins with data exploration and preprocessing, including data cleaning, handling of missing values, and transformation of categorical features so that a model can be built from them. The dataset is then divided into training and testing sets to build and evaluate the Naive Bayes classification model. Model performance is evaluated using accuracy, precision, recall, and F1-score. The experimental results show that the Naive Bayes algorithm classifies Netflix content into the Movie and TV Show categories with accuracy, precision, recall, and F1-score each equal to 100%. The confusion matrix shows no misclassifications, suggesting that the genre, rating, and country of production features separate the two content classes very clearly. The results further reveal distinct differences between movies and television shows in genre and production attributes. This study is therefore expected to contribute to the development of content recommendation systems and strategic content management within the streaming industry.
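The workflow summarized in the abstract (handling missing values, encoding categorical features, splitting into training and testing sets, fitting Naive Bayes, and computing the four metrics plus a confusion matrix) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy rows are invented for illustration, the column names `type`, `listed_in`, `rating`, and `country` are assumed, each row carries a single genre for simplicity, and the choice of scikit-learn's `BernoulliNB` over one-hot features is an assumption, since the abstract does not name a Naive Bayes variant.

```python
import pandas as pd
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

# Hypothetical toy rows; NOT taken from the study's dataset.
data = pd.DataFrame({
    "type": ["Movie"] * 8 + ["TV Show"] * 8,
    "listed_in": [
        "Dramas", "Comedies", "Documentaries", "Dramas",
        "Comedies", "Documentaries", "Dramas", "Comedies",
        "TV Dramas", "TV Comedies", "Docuseries", "TV Dramas",
        "TV Comedies", "Docuseries", "TV Dramas", "TV Comedies",
    ],
    "rating": [
        "PG-13", "R", "PG", "R", "PG-13", "PG", "R", "PG-13",
        "TV-MA", "TV-14", "TV-PG", "TV-MA", "TV-14", "TV-PG", "TV-MA", "TV-14",
    ],
    "country": [
        "United States", None, "India", "United States",
        "India", None, "United States", "India",
        "United States", "Japan", None, "South Korea",
        "Japan", "United States", None, "South Korea",
    ],
})

# Handle missing values: replace an absent country with an explicit category.
data["country"] = data["country"].fillna("Unknown")

# Transform categorical features into 0/1 indicator columns (one-hot encoding).
X = pd.get_dummies(data[["listed_in", "rating", "country"]], dtype=int)
y = data["type"]

# Hold out 25% of the rows for testing, keeping the class balance (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Fit a Bernoulli Naive Bayes model on the binary indicators (assumed variant).
model = BernoulliNB()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Evaluate with the four metrics named in the abstract, treating "Movie" as positive.
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary", pos_label="Movie", zero_division=0
)
cm = confusion_matrix(y_test, y_pred, labels=["Movie", "TV Show"])

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
print(cm)  # rows: true Movie / TV Show, columns: predicted Movie / TV Show
```

In the confusion matrix, the off-diagonal cells count misclassified items, so a result with no misclassification corresponds to both off-diagonal cells being zero.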
License
Copyright (c) 2025 Mars: Jurnal Teknik Mesin, Industri, Elektro Dan Ilmu Komputer

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



