Towards a Digital Ecosystem: NLP. Corpus infrastructure. Methods for Retrieving Texts and Computing Text Similarities (9 articles)


  • Methods for the detection of intertexts and text reuse, manual (e.g. crowd-sourcing) or automatic (e.g. algorithms);

  • Infrastructure for the preservation of digital texts and quotations between different text passages;

  • Linguistic preprocessing and data normalisation, such as lemmatisation of historical languages, root stemming, normalisation of variants, etc.


Managing different types of text re-uses (3 articles)


This part focuses on conceptual definitions, the modelling of the unstable notion of “quotation”, and the XML-TEI encoding used to characterize it.


Visualisation of intertextuality and text reuse (3 articles)

Project presentations (11 articles)

Digital libraries and virtual exhibitions (2 articles)

Data deluge: which skills for which data? (5 articles)

Dataset (1 article)

A "Data paper" is a publication that describes how a dataset was built and what kinds of processing can be applied to exploit it. An ontology can also be published, i.e. a knowledge organization associated with a type of dataset; in that case a detailed description is required.

Project (2 articles)

A position paper describes the goals of a specific project. Sponsorship is required. A detailed description of all packages helps readers understand how the contributions complement one another within the framework of the project.

HistoInformatics (7 articles)

Digital humanities in languages (9 articles)

Sciences of Antiquity and digital humanities (4 articles)

Editors: Julien Cavero ; Marie-Laure Massot

I. Historical and linguistic approaches (1 article)

II. Pedagogical practices (3 articles)

III. Biotranslation vs. Machine Translation (2 articles)

IV. Challenges for professional translation (3 articles)

V. The contribution of corpora (2 articles)

VI. Feedback from professional translators (3 articles)