Jani Marjanen ; Jussi Kurunmäki ; Lidia Pivovarova ; Elaine Zosa - The expansion of isms, 1820-1917: Data-driven analysis of political language in digitized newspaper collections

jdmdh:6159 - Journal of Data Mining & Digital Humanities, December 18, 2020, HistoInformatics - https://doi.org/10.46298/jdmdh.6159

Authors: Jani Marjanen 1; Jussi Kurunmäki 2; Lidia Pivovarova 1; Elaine Zosa 1

  • 1 Helsingin yliopisto = Helsingfors universitet = University of Helsinki
  • 2 University of Tampere [Finland]

Words with the suffix -ism are reductionist terms that help us navigate complex social issues by giving them a simple one-word label. They are often associated with political ideologies, but they are also present in many other domains of language, especially culture, science, and religion. This has not always been the case. This paper studies isms in a historical record of digitized newspapers published in Finland from 1820 to 1917 to find out how the language of isms developed historically. We use diachronic word embeddings and affinity propagation clustering to trace how new isms entered the lexicon and how they relate to one another over time. We show how isms became more common and entered more and more domains. Still, the uses of isms as traditions for political action and thinking stand out in our analysis.

Volume: HistoInformatics
Published on: December 18, 2020
Accepted on: September 11, 2020
Submitted on: February 26, 2020
Keywords: isms, ideology, political language, diachronic word embeddings, affinity propagation clustering, [SHS.HIST] Humanities and Social Sciences/History, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]
Funding (source: OpenAIRE Graph)
  • NewsEye: A Digital Investigator for Historical Newspapers; Funder: European Commission; Code: 770299
  • Cross-Lingual Embeddings for Less-Represented Languages in European News Media; Funder: European Commission; Code: 825153

Linked publications - datasets - softwares

Source : ScholeXplorer IsRelatedTo DOI 10.5281/zenodo.4274481
Source : ScholeXplorer IsRelatedTo DOI 10.5281/zenodo.4274482
Source : ScholeXplorer IsRelatedTo DOI 10.5281/zenodo.4301658
Source : ScholeXplorer IsRelatedTo DOI 10.5334/johd.22
A collection of Swedish diachronic word embedding models trained on historical newspaper data

