Martti Mäkinen - Stylo visualisations of Middle English documents

jdmdh:5614 - Journal of Data Mining & Digital Humanities, December 23, 2020, Special issue on Visualisations in Historical Linguistics
Authors: Martti Mäkinen 1

  • 1 Hanken School of Economics

Automated approaches to identifying the authorship of a text have become commonplace in stylometric studies. This article applies an unsupervised stylometric approach to Middle English documents, using the R package Stylo, in an attempt to distinguish between texts from different dialectal areas. The approach is based on the distribution of character 3-grams generated from the texts of the Corpus of Middle English Local Documents (MELD). The article adopts a middle ground in the study of Middle English spelling variation, between the concept of relational linguistic space and the real linguistic continuum of medieval England. Stylo can distinguish between Middle English dialects when it uses the less frequent character 3-grams.
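To make the notion of a character 3-gram distribution concrete, the following is a minimal sketch in plain Python. It is not Stylo and not the article's pipeline: the function names `char_ngrams` and `relative_frequencies` are hypothetical, and the two input strings are toy spellings chosen for illustration, not excerpts from MELD.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a string, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def relative_frequencies(text, n=3):
    """Map each character n-gram of the lowercased text to its relative frequency."""
    grams = char_ngrams(text.lower(), n)
    counts = Counter(grams)
    total = sum(counts.values())
    if total == 0:
        # Text shorter than n characters: no n-grams to count.
        return {}
    return {gram: count / total for gram, count in counts.items()}


# Toy spellings (assumption: illustrative only, not taken from the corpus).
profile_a = relative_frequencies("chirche")
profile_b = relative_frequencies("kirke")

# 3-grams that occur in only one of the two profiles.
only_in_a = sorted(set(profile_a) - set(profile_b))
only_in_b = sorted(set(profile_b) - set(profile_a))
print(only_in_a, only_in_b)
```

Frequency profiles of this kind, computed per document, are the sort of input on which an unsupervised comparison of texts can be based; the choice of which frequency band of 3-grams to compare is a separate step that this sketch does not cover.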

Volume: Special issue on Visualisations in Historical Linguistics
Published on: December 23, 2020
Accepted on: December 12, 2020
Submitted on: July 8, 2019
Keywords: Middle English, historical dialectology, diatopical variation, unattended analysis, stylometry, authorship attribution, R, non-standard spelling, [SHS.LANGUE] Humanities and Social Sciences/Linguistics, [SHS] Humanities and Social Sciences

