Erkki Mervaala ; Ilona Kousa - Out of Context! Managing the Limitations of Context Windows in ChatGPT-4o Text Analyses

jdmdh:15090 - Journal of Data Mining & Digital Humanities, March 7, 2025, NLP4DH - https://doi.org/10.46298/jdmdh.15090
Out of Context! Managing the Limitations of Context Windows in ChatGPT-4o Text Analyses

Authors: Mervaala, Erkki (1); Kousa, Ilona (2)

  • 1 Finnish Environment Institute
  • 2 University of Helsinki

In recent years, large language model (LLM) applications have surged in popularity, and academia has followed suit. Researchers frequently seek to automate text annotation, often a tedious task, and, to some extent, text analysis. Popular LLMs such as ChatGPT have been studied as both research assistants and analysis tools, and these studies have revealed several concerns regarding transparency and the nature of AI-generated content. This study assesses ChatGPT’s usability and reliability for text analysis, specifically keyword extraction and topic classification, in an “out-of-the-box” zero-shot or few-shot setting, with emphasis on how the size of the context window and varied text types influence the resulting analyses. Our findings indicate that text type and the order in which texts are presented both significantly affect ChatGPT’s analysis. Context-building tends to be less problematic when the texts analyzed are similar to one another. Lengthy texts and documents, however, pose serious challenges: once the context window is exceeded, “hallucinated” results often emerge. While some of these issues stem from the core functioning of LLMs, others can be mitigated through transparent research planning.
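
As an illustration of the planning step mentioned above, the following is a minimal sketch of how a set of texts might be grouped into batches that stay under a chosen token budget before being sent to a model. It is not the authors' procedure. The 4-characters-per-token ratio is a rough assumption rather than a real tokenizer, and the function names and the budget value are hypothetical.

```python
from typing import List

# Assumption: roughly 4 characters per token. This is a coarse heuristic,
# not a real tokenizer; actual counts depend on the model and the language.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate (ceiling of length / CHARS_PER_TOKEN)."""
    if not text:
        return 0
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def batch_texts(texts: List[str], token_budget: int) -> List[List[int]]:
    """Group text indices, in order, into batches that fit the budget.

    A text whose estimate alone exceeds the budget still gets its own
    batch, so it is never silently dropped; use oversized() to flag it.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, text in enumerate(texts):
        cost = estimate_tokens(text)
        # Start a new batch when adding this text would exceed the budget.
        if current and used + cost > token_budget:
            batches.append(current)
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        batches.append(current)
    return batches


def oversized(texts: List[str], token_budget: int) -> List[int]:
    """Return indices of texts that exceed the budget on their own."""
    return [i for i, t in enumerate(texts) if estimate_tokens(t) > token_budget]


if __name__ == "__main__":
    # Hypothetical inputs: two short texts, one long document, one short text.
    sample = ["a" * 40, "b" * 40, "c" * 400, "d" * 8]
    print(batch_texts(sample, token_budget=25))
    print(oversized(sample, token_budget=25))
```

Recording the batch composition and any oversized texts alongside the results is one way to make the order and size of the inputs transparent; a real tokenizer for the specific model would give more reliable counts than this heuristic.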


Volume: NLP4DH
Published on: March 7, 2025
Accepted on: February 9, 2025
Submitted on: January 16, 2025
Keywords: Large language models, ChatGPT, Text analysis, Green transition, Parliamentary speeches

Files

Name: OutOfContext_Mervaala_Kousa.pdf
Size: 918.36 KB
md5: 3eb9025bcaf998d283fd5c060824e116

Publications

isNewVersionOf
Mervaala, E., & Kousa, I. (2024). Order Up! Micromanaging Inconsistencies in ChatGPT-4o Text Analyses. In Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities (pp. 521-535). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.nlp4dh-1.51 (source: Zenodo)
