Fuzzy Logic Deep Learning Control System for Detecting Arabic Tweets Spam Based on Large Language Models

Ghosoon K.munahy

doi:10.61132/mars.v3i1.713

Authors

Ghosoon K.munahy University of Kerbala

Keywords:

Spam Detecting, Arabic Tweets, Fuzzy Logic, Deep Learning, LLM

Abstract

Spam is the posting of unsolicited messages or advertisements on social media, particularly Twitter. These messages are typically designed to promote specific products, services, or links. In this research, we developed a fuzzy control system that detects Arabic spam tweets using deep learning with large language models. First, we cleaned the text and transformed it into vectors using AraGPT2 and AraBERT. Next, we used a multi-layer perceptron network to extract the essential features. Finally, we applied the fuzzy logic control system to classify spam tweets using the features filtered by the deep networks. The proposed fuzzy logic control system reached nearly 100%, compared with almost 99% throughput when using only the deep neural networks with either large language model; the F1 score was 100% for the AraGPT2 model and 99% for the AraBERT model.
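
The first and last stages of the pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the cleaning rules, the membership functions, the three rules, and the 0.5 decision threshold are all assumptions chosen for illustration, and the function names (clean_tweet, fuzzy_spam_degree, is_spam) are hypothetical. The embedding and multi-layer perceptron stages are omitted because they require pretrained models; the sketch assumes they produce a single spam score between 0 and 1, and it uses a zero-order Sugeno-style weighted average as a stand-in for whatever inference scheme the paper uses.

```python
import re

# Arabic diacritics (U+064B..U+0652) and tatweel (U+0640).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
_URL_OR_MENTION = re.compile(r"https?://\S+|www\.\S+|@\w+")


def clean_tweet(text: str) -> str:
    """Assumed cleaning step: drop URLs, mentions, diacritics, '#', extra spaces."""
    text = _URL_OR_MENTION.sub(" ", text)
    text = _DIACRITICS.sub("", text)
    text = text.replace("#", " ")
    return re.sub(r"\s+", " ", text).strip()


def _clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def _triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_spam_degree(score: float) -> float:
    """Map a network score in [0, 1] to a fuzzy spam degree in [0, 1].

    Assumed rule base (zero-order Sugeno, weighted-average defuzzification):
      IF score is low    THEN output is 0.0 (ham)
      IF score is medium THEN output is 0.5 (uncertain)
      IF score is high   THEN output is 1.0 (spam)
    """
    x = _clamp01(score)
    low = _clamp01((0.5 - x) / 0.5)        # 1 at x=0, falls to 0 at x=0.5
    medium = _triangular(x, 0.25, 0.5, 0.75)
    high = _clamp01((x - 0.5) / 0.5)       # 0 at x=0.5, rises to 1 at x=1
    weights = (low, medium, high)
    outputs = (0.0, 0.5, 1.0)
    total = sum(weights)
    if total == 0.0:                        # defensive; not reached for x in [0, 1]
        return 0.5
    return sum(w * z for w, z in zip(weights, outputs)) / total


def is_spam(score: float, threshold: float = 0.5) -> bool:
    """Assumed decision rule: label as spam when the fuzzy degree exceeds the threshold."""
    return fuzzy_spam_degree(score) > threshold
```

With a single input the rule base adds little beyond a smooth threshold; a fuzzy controller becomes more useful when several features from the deep network feed separate membership functions and the rules combine them.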
