Sciences of Antiquity and Digital Humanities

Editors: Julien Cavero ; Marie-Laure Massot

EpiSearch. Identifying Ancient Inscriptions in Epigraphic Manuscripts

Calvelli, Lorenzo ; Boschetti, Federico ; Tommasi, Tatiana.

Epigraphic documents are an essential source of evidence for our knowledge of the ancient world. Nonetheless, a significant number of inscriptions have not been preserved in their material form, and their texts can only be recovered from handwritten materials, in particular the so-called epigraphic manuscripts. EpiSearch is a pilot project that explores the use of digital technologies to retrieve the epigraphic evidence found in these sources. Applying Handwritten Text Recognition (HTR) to epigraphic manuscripts is challenging, given the nature and graphic layout of these documents. Yet our research shows that, within certain limits, HTR technologies can be used successfully.

Preparing Big Manuscript Data for Hierarchical Clustering with Minimal HTR Training

Perdiki, Elpida.

HTR (Handwritten Text Recognition) technologies have progressed enough to offer high-accuracy results in recognising handwritten documents, even on a synchronous level. Despite state-of-the-art algorithms and software, historical documents (especially those written in Greek) remain a real-world challenge for researchers. A large number of unedited or under-edited works of Greek literature (ancient or Byzantine, especially the latter) exist to this day because of the complexity of producing critical editions. To critically edit a literary text, scholars need to pinpoint textual variations across several manuscripts, which requires fully (or at least partially) transcribed manuscripts. For a large manuscript tradition (i.e., a large number of manuscripts transmitting the same work), this can be a painstaking and time-consuming process. HTR algorithms that train AI models can significantly assist here, even when they do not produce entirely accurate transcriptions. Deep learning models, though, require a large amount of data to be effective. This, in turn, intensifies the same problem: big (transcribed) data require heavy loads of manual transcription as training sets. In the absence of such transcriptions, this study experiments with training sets of various sizes to determine the minimum amount of manual transcription needed to produce usable results. HTR models are trained through the Transkribus platform on manuscripts from multiple works of a single Byzantine author, […]

Contribution to the recent history of archaeology by using some digital humanities methods and techniques applied to field recording documents of an archaeological site excavated in 1970s

Tuffery, Christophe.

This article presents the results of an archaeological archive research project that drew on field recording documents from the Rivaux site in France, which was excavated from the 1970s to the 1990s. After digitising a set of field notebook pages, the author developed an application called Archeotext, which enables these documents to be transcribed and georeferenced. Some of the results point to new ways of exploiting this type of archive using methods and techniques from the digital humanities.

Publishing open-access bibliographical data on Ancient Greek and Latin texts: challenges, constraints, progression

Giovacchini, Julie ; Capron, Laurent.

We present here both some of our thoughts on methodology concerning the specific constraints that complicate the structuring of, and access to, bibliographical data in the Sciences of Antiquity, and the solutions adopted by the IPhiS-CIRIS project to deal with these constraints. The project began in 2014 in a general scientific environment that was still being standardised and structured, with digital bibliographical resources in this disciplinary field becoming increasingly numerous, although of uneven quality and hard to access and/or private.