Character Segmentation in Asian Collector's Seal Imprints: An Attempt to Retrieval Based on Ancient Character Typeface

Kangying Li; Biligsaikhan Batjargal; Akira Maeda

doi:10.46298/jdmdh.6102

jdmdh:6102 - Journal of Data Mining & Digital Humanities, January 11, 2021, HistoInformatique - https://doi.org/10.46298/jdmdh.6102

Authors: Kangying Li (1, 2); Biligsaikhan Batjargal (3, 2); Akira Maeda (4, 2)

1 Graduate School of Information Science and Engineering, Ritsumeikan University, Japan
2 Ritsumeikan University
3 Kinugasa Research Organization, Ritsumeikan University, Japan
4 College of Information Science and Engineering, Ritsumeikan University, Japan

Collector's seals provide important clues about the ownership of a book. They contain much information pertaining to the essential elements of ancient materials, and they show, among other things, the details of possession and its relation to the book, the identity of the collectors, and their social status and wealth. Asian collectors have typically used artistic ancient characters rather than modern ones to make their seals. In addition to the owner's name, seals often include other words that express more profound meanings. A system that automatically recognizes these characters can help enthusiasts and professionals better understand the background information of these seals. However, training data and labelled images are lacking: samples of some seals are scarce, and most available images are degraded. New ways are therefore needed to make full use of such scarce data. Although these data are available online, they do not include information on the characters' positions. The goal of this research is to assist in obtaining more labelled data through user interaction and to provide retrieval tools that use only standard character typefaces extracted from font files. In this paper, a character segmentation method is proposed that predicts candidate character areas without any labelled training data containing character coordinate information. A retrieval-based recognition system that focuses on a single character is also proposed to support seal retrieval and matching. The experimental results demonstrate that the proposed character segmentation method performs well on Asian collector's seals, with 85% of the test data being correctly segmented.

Source: HAL:hal-02476910v5

Volume: HistoInformatique

Section: HistoInformatique

Published on: January 11, 2021

Accepted on: July 3, 2020

Submitted on: February 13, 2020

Keywords: [INFO]Computer Science [cs], [INFO.INFO-DL]Computer Science [cs]/Digital Libraries [cs.DL], [en] Character segmentation, Ancient document image processing, Asian seal imprint

License: Hal authorisation v1
