Blouin Baptiste ; Cécile Armand ; Christian Henriot - HistText: An Application for leveraging large-scale historical textbases

jdmdh:11756 - Journal of Data Mining & Digital Humanities, November 10, 2023, Volume 2023 - https://doi.org/10.46298/jdmdh.11756
HistText: An Application for leveraging large-scale historical textbases

Authors: Blouin Baptiste; Cécile Armand; Christian Henriot


This paper introduces HistText, a pioneering tool devised to facilitate large-scale data mining in historical documents, specifically targeting Chinese sources. Developed in response to the challenges posed by the massive Modern China Textual Database, HistText emerges as a solution to efficiently extract and visualize valuable insights from billions of words spread across millions of documents. With a user-friendly interface, advanced text analysis techniques, and powerful data visualization capabilities, HistText offers a robust platform for digital humanities research. This paper explores the rationale behind HistText, underscores its key features, and provides a comprehensive guide for its effective utilization, thus highlighting its potential to substantially enhance the realm of computational humanities.


Volume: 2023
Section: Project presentations
Published on: November 10, 2023
Accepted on: November 3, 2023
Submitted on: August 22, 2023
Keywords: [SHS]Humanities and Social Sciences, [INFO]Computer Science [cs], [en] natural language processing, data mining, Text analysis, history, Chinese, document
Funding:
    Source: HAL
  • Elites, networks, and power in modern urban China (1830-1949).; Funder: European Commission; Code: 788476; Call ID: ERC-2017-ADG; Project Financing: ERC-2017-ADG
