Niko Partanen ; Jack Rueter ; Rogier Blokland - Old Permic Universal Dependencies Treebank

jdmdh:13306 - Journal of Data Mining & Digital Humanities, June 4, 2024, NLP4DH - https://doi.org/10.46298/jdmdh.13306
Authors: Partanen, Niko (1); Rueter, Jack (1); Blokland, Rogier (2)

  • 1 University of Helsinki
  • 2 Uppsala University

Old Permic, also known as Old Komi, is an extinct variety of Komi that was spoken in the late Middle Ages in the lower Vychegda river basin in northeastern European Russia, in an area that is no longer Komi-speaking. The variety is attested in fragmentary records from the 14th to the 17th centuries, written both in the Old Permic alphabet and in Cyrillic. These records are of significant importance for research on the history of the Komi language. Here we introduce our work towards a new Universal Dependencies treebank that will eventually contain the existing corpus of Old Permic in a structured, CoNLL-U annotated format. This will be the first time this material is made openly available in digital form, and our contribution describes the current state of the work and the remaining challenges.
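
For readers unfamiliar with the CoNLL-U format mentioned above, the following is a minimal sketch of how such a file can be read. It assumes only the standard ten tab-separated columns of CoNLL-U (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC) and `# key = value` comment lines. The names `Token`, `parse_conllu`, and `SAMPLE` are hypothetical, and the sample sentence is an English placeholder, not data from the Old Permic treebank.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One CoNLL-U token line; fields follow the ten standard columns."""
    id: str
    form: str
    lemma: str
    upos: str
    xpos: str
    feats: str
    head: str
    deprel: str
    deps: str
    misc: str


def parse_conllu(text):
    """Return a list of (metadata dict, token list) pairs, one per sentence."""
    sentences = []
    meta, tokens = {}, []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if tokens:
                sentences.append((meta, tokens))
            meta, tokens = {}, []
        elif line.startswith("#"):
            # Comment lines of the form "# key = value" carry sentence metadata.
            key, sep, value = line[1:].partition("=")
            if sep:
                meta[key.strip()] = value.strip()
        else:
            cols = line.split("\t")
            if len(cols) != 10:
                raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
            tokens.append(Token(*cols))
    if tokens:
        # Handle a final sentence that is not followed by a blank line.
        sentences.append((meta, tokens))
    return sentences


# Placeholder English sentence (an assumption for illustration only).
SAMPLE = "\n".join([
    "# sent_id = placeholder-1",
    "# text = Dogs bark.",
    "\t".join(["1", "Dogs", "dog", "NOUN", "_", "Number=Plur", "2", "nsubj", "_", "_"]),
    "\t".join(["2", "bark", "bark", "VERB", "_", "_", "0", "root", "_", "SpaceAfter=No"]),
    "\t".join(["3", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"]),
]) + "\n\n"

sentences = parse_conllu(SAMPLE)
for meta, tokens in sentences:
    print(meta.get("sent_id"), [(t.form, t.upos, t.deprel) for t in tokens])
```

This sketch does not treat multiword-token ranges or empty nodes specially; it stores their IDs as plain strings, so a real pipeline would likely rely on an established CoNLL-U library instead.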


Volume: NLP4DH
Published on: June 4, 2024
Accepted on: April 9, 2024
Submitted on: March 27, 2024
Keywords: Uralic, Permic languages, Old Permic, Old Komi, CoNLL-U, Universal Dependencies

Files

Name: Towards an Old Permic Universal Dependencies Treebank.pdf
Size: 1.47 MB
md5: 814a1962a57707e6ef51e776413e754a
