2019

JDMDH papers published in year 2019. The volume also comprises papers published in special issues.

1. A Hackathon for Classical Tibetan

Almogi , Orna ; Dankin , Lena ; Dershowitz , Nachum ; Wolf , Lior.
We describe the course of a hackathon dedicated to the development of linguistic tools for Tibetan Buddhist studies. Over a period of five days, a group of seventeen scholars, scientists, and students developed and compared algorithms for intertextual alignment and text classification, along with some basic language tools, including a stemmer and word segmenter.
Section: Towards a Digital Ecosystem: NLP. Corpus infrastructure. Methods for Retrieving Texts and Computing Text Similarities

2. Computer Analysis of Architecture Using Automatic Image Understanding

Wei, Fan ; Li, Yuan ; Shamir, Lior.
In the past few years, computer vision and pattern recognition systems have been becoming increasingly more powerful, expanding the range of automatic tasks enabled by machine vision. Here we show that computer analysis of building images can perform quantitative analysis of architecture, and quantify similarities between city architectural styles in a quantitative fashion. Images of buildings from 18 cities and three countries were acquired using Google StreetView, and were used to train a machine vision system to automatically identify the location of the imaged building based on the image visual content. Experimental results show that the automatic computer analysis can automatically identify the geographical location of the StreetView image. More importantly, the algorithm was able to group the cities and countries and provide a phylogeny of the similarities between architectural styles as captured by StreetView images. These results demonstrate that computer vision and pattern […]

3. Transcribing Foucault’s handwriting with Transkribus

Massot , Marie-Laure ; Sforzini , Arianna ; Ventresque , Vincent.
The Foucault Fiches de Lecture (FFL) project aims both to explore and to make available online a large set of Michel Foucault’s reading notes (organized citations, references and comments) held at the BnF since 2013. Therefore, the team is digitizing, describing and enriching the reading notes that the philosopher gathered while preparing his books and lectures, thus providing a new corpus that will allow a new approach to his work. In order to release the manuscripts online, and to collectively produce the data, the team is also developing a collaborative platform, based on RDF technologies, and designed to link together archival content and bibliographic data. This project is financed by the ANR (2017-2020) and coordinated by Michel Senellart, professor of philosophy at the ENS Lyon. It benefits from the partnerships of the ENS/PSL and the BnF. In addition, a collaboration with the European READ/Transkribus project has been started so as to produce automatic transcription of the […]
Section: Digital libraries and virtual exhibitions

4. Mapping the Bentham Corpus: Concept-based Navigation

Ruiz Fabo , Pablo ; Poibeau , Thierry.
British philosopher and reformer Jeremy Bentham (1748-1832) left over 60,000 folios of unpublished manuscripts. The Bentham Project, at University College London, is creating a TEI version of the manuscripts, via crowdsourced transcription verified by experts. We present here an interface to navigate these largely unedited manuscripts, and the language technologies the corpus was enriched with to facilitate navigation, i.e Entity Linking against the DBpedia knowledge base and keyphrase extraction. The challenges of tagging a historical domain-specific corpus with a contemporary knowledge base are discussed. The concepts extracted were used to create interactive co-occurrence networks, that serve as a map for the corpus and help navigate it, along with a search index. These corpus representations were integrated in a user interface. The interface was evaluated by domain experts with satisfactory results , e.g. they found the distributional semantics methods exploited here applicable in […]
Section: Data deluge: which skills for wich data?

5. Causal reasoning and symbolic relationships in Medieval Illuminations

Diarra, Djibril ; Clouzot, Martine ; Nicolle, Christophe.
This work applies knowledge engineering’s techniques to medieval illuminations. Inside it, an illumination is considered as a knowledge graph which was used by some elites in the Middle Ages to represent themselves as a social group and exhibit the events in their lives, and their cultural values. That graph is based on combinations of symbolic elements linked each to others with semantic relations. Those combinations were used to encode visual metaphors and influential messages whose interpretations are sometimes tricky for not experts. Our work aims to describe the meaning of those elements through logical modelling using ontologies. To achieve that, we construct logical reasoning rules and simulate them using artificial intelligence mechanisms. The goal is to facilitate the interpretation of illuminations and provide, in a future evolution of current social media, logical formalisation of new encoding and information transmission services.