Blouin Baptiste ; Cécile Armand ; Christian Henriot - HistText: An Application for leveraging large-scale historical textbases

jdmdh:11756 - Journal of Data Mining & Digital Humanities, November 10, 2023, Volume 2023 - https://doi.org/10.46298/jdmdh.11756

Authors: Blouin Baptiste; Cécile Armand; Christian Henriot


This paper introduces HistText, a pioneering tool devised to facilitate large-scale data mining in historical documents, specifically targeting Chinese sources. Developed in response to the challenges posed by the massive Modern China Textual Database, HistText emerges as a solution to efficiently extract and visualize valuable insights from billions of words spread across millions of documents. With a user-friendly interface, advanced text analysis techniques, and powerful data visualization capabilities, HistText offers a robust platform for digital humanities research. This paper explores the rationale behind HistText, underscores its key features, and provides a comprehensive guide for its effective utilization, thus highlighting its potential to substantially enhance the realm of computational humanities.


Volume: 2023
Section: Project presentations
Published on: November 10, 2023
Accepted on: November 3, 2023
Submitted on: August 22, 2023
Keywords: [SHS] Humanities and Social Sciences, [INFO] Computer Science [cs], [en] natural language processing, data mining, text analysis, history, Chinese, document
Funding:
    Source: HAL
  • Elites, networks, and power in modern urban China (1830-1949); Funder: European Commission; Code: 788476; Call ID: ERC-2017-ADG; Funder project: ERC-2017-ADG

Datasets

Reference
Ehrmann, M., Bunout, E., & Düring, M. (2019). Survey of digitized newspaper interfaces (dataset and notebooks) (Version v1.0) [Dataset]. Zenodo. 10.5281/ZENODO.3369874 1
Ehrmann, M., Bunout, E., & Düring, M. (2019). Survey of digitized newspaper interfaces (dataset and notebooks) (Version v1.0) [Dataset]. Zenodo. 10.5281/ZENODO.3369875 1
  • 1 ScholeXplorer

1 document citing this article
