Jani Marjanen ; Jussi Kurunmäki ; Lidia Pivovarova ; Elaine Zosa - The expansion of isms, 1820-1917: Data-driven analysis of political language in digitized newspaper collections

jdmdh:6159 - Journal of Data Mining & Digital Humanities, December 18, 2020, HistoInformatics - https://doi.org/10.46298/jdmdh.6159
The expansion of isms, 1820-1917: Data-driven analysis of political language in digitized newspaper collections

Authors: Jani Marjanen; Jussi Kurunmäki; Lidia Pivovarova; Elaine Zosa

Words with the suffix -ism are reductionist terms that help us navigate complex social issues by giving them a simple one-word label. On the one hand, they are often associated with political ideologies; on the other, they are present in many other domains of language, especially culture, science, and religion. This has not always been the case. This paper studies isms in a historical record of digitized newspapers published in Finland from 1820 to 1917 to find out how the language of isms developed historically. We use diachronic word embeddings and affinity propagation clustering to trace how new isms entered the lexicon and how they relate to one another over time. We show how they became more common and entered a growing number of domains. Still, the uses of isms as traditions for political action and thinking stand out in our analysis.


Volume: HistoInformatics
Published on: December 18, 2020
Accepted on: September 11, 2020
Submitted on: February 26, 2020
Keywords: isms, ideology, political language, diachronic word embeddings, affinity propagation clustering, [SHS.HIST] Humanities and Social Sciences/History, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]
Funding:
    Source: OpenAIRE Graph
  • NewsEye: A Digital Investigator for Historical Newspapers; Funder: European Commission; Code: 770299
  • Cross-Lingual Embeddings for Less-Represented Languages in European News Media; Funder: European Commission; Code: 825153

Datasets

Is related to
Hengchen, S., Ros, R., & Marjanen, J. (2019). A data-driven approach to the changing vocabulary of the ‘nation’ in English, Dutch, Swedish and Finnish newspapers, 1750-1950 (1–) [Dataset]. DataverseNL. 10.34894/AVBD7A 1
Hengchen, S., Ros, R., & Marjanen, J. (2019). Models for "A data-driven approach to the changing vocabulary of the ’nation’ in English, Dutch, Swedish and Finnish newspapers, 1750-1950" (Version 1.0.0, 1–) [Dataset]. Zenodo. 10.5281/ZENODO.3270648 1
  • 1 ScholeXplorer
