Erkki Mervaala ; Ilona Kousa - Out of Context! Managing the Limitations of Context Windows in ChatGPT-4o Text Analyses

jdmdh:15090 - Journal of Data Mining & Digital Humanities, March 7, 2025, NLP4DH - https://doi.org/10.46298/jdmdh.15090
Out of Context! Managing the Limitations of Context Windows in ChatGPT-4o Text Analyses

Authors: Mervaala, Erkki; Kousa, Ilona

In recent years, large language model (LLM) applications have surged in popularity, and academia has followed suit. Researchers frequently seek to automate text annotation, often a tedious task, and, to some extent, text analysis. Notably, popular LLMs such as ChatGPT have been studied as both research assistants and analysis tools, revealing several concerns regarding transparency and the nature of AI-generated content. This study assesses ChatGPT’s usability and reliability for text analysis – specifically keyword extraction and topic classification – within an “out-of-the-box” zero-shot or few-shot context, emphasizing how the size of the context window and varied text types influence the resulting analyses. Our findings indicate that text type and the order in which texts are presented both significantly affect ChatGPT’s analysis. At the same time, context-building tends to be less problematic when analyzing similar texts. However, lengthy texts and documents pose serious challenges: once the context window is exceeded, “hallucinated” results often emerge. While some of these issues stem from the core functioning of LLMs, some can be mitigated through transparent research planning.


Volume: NLP4DH
Published on: March 7, 2025
Accepted on: February 9, 2025
Submitted on: January 16, 2025
Keywords: Large language models, ChatGPT, Text analysis, Green transition, Parliamentary speeches

Files

Name: OutOfContext_Mervaala_Kousa
Size: 918.36 KB
md5: 3eb9025bcaf998d283fd5c060824e116

Publications

Is a new version of:
Mervaala, E., & Kousa, I. (2024). Order Up! Micromanaging Inconsistencies in ChatGPT-4o Text Analyses. In Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities (pp. 521-535). Association for Computational Linguistics. DOI: 10.18653/v1/2024.nlp4dh-1.51
