Evaluating ChatGPT-4 and Machine Learning Models for Sentiment Analysis on a Multi-Script Moroccan Arabic Corpus: Insights, Challenges, and Future Directions

Mohamed HANNANI; Abdelhadi SOUDI; Kristof Van Laerhoven

doi:10.46298/jdmdh.15092

jdmdh:15092 - Journal of Data Mining & Digital Humanities, April 2, 2025, NLP4DH - https://doi.org/10.46298/jdmdh.15092

Authors: Mohamed HANNANI ¹; Abdelhadi SOUDI ²; Kristof Van Laerhoven ¹

1 University of Siegen
2 École Nationale de L'Industrie Minérale in Rabat

The application of Large Language Models (LLMs) to low-resource languages and dialects, such as Moroccan Arabic (MA), remains relatively unexplored. This study evaluates the performance of ChatGPT-4, fine-tuned BERT models, FastText embeddings, and traditional machine learning approaches for sentiment analysis on MA. Using two publicly available MA datasets, the Moroccan Arabic Corpus (MAC) from X (formerly Twitter) and the Moroccan Arabic YouTube Corpus (MYC), we assess the ability of these models to detect sentiment across different contexts. Although fine-tuned models performed well, ChatGPT-4 exhibited substantial potential for sentiment analysis, even in zero-shot scenarios. However, performance on MA was generally lower than on Modern Standard Arabic (MSA), a gap we attribute to factors such as regional variability, lack of standardization, and limited data availability. Future work should focus on expanding and standardizing MA datasets, as well as developing new methods, such as combining FastText and BERT embeddings with attention mechanisms, to improve performance on this challenging dialect.

Source: zenodo.org:14968542

Volume: NLP4DH

Section: Digital humanities in languages

Published on: April 2, 2025

Accepted on: February 9, 2025

Submitted on: January 16, 2025

License: Attribution 4.0 International (CC BY 4.0)

Files

Name	Size
Evaluating_ChatGPT_4_and_Machine_Learning_Models_for_Sentiment_Analysis_on_a_Multi_Script_Moroccan_Arabic_Corpus_Insights__Challenges_FV (1).pdf md5: bb71f1e556558a31957cfb382aa10656	280.38 KB
