Individual vs. Collaborative Methods of Crowdsourced Transcription

Samantha Blickhan; Coleman Krawczyk; Daniel Hanson; Amy Boyer; Andrea Simenstad; Victoria van Hyning

doi:10.46298/jdmdh.5759

jdmdh:5759 - Journal of Data Mining & Digital Humanities, 3 December 2019, Special Issue on Collecting, Preserving, and Disseminating Endangered Cultural Heritage for New Understandings through Multilingual Approaches - https://doi.org/10.46298/jdmdh.5759

Authors: Samantha Blickhan (1, 2); Coleman Krawczyk (3); Daniel Hanson (4, 5); Amy Boyer (1); Andrea Simenstad (4, 5); Victoria van Hyning (6)

1 Adler Planetarium
2 Adler Planetarium [Chicago]
3 University of Portsmouth
4 University of Minnesota
5 University of Minnesota [Twin Cities]
6 Library of Congress

While online crowdsourced text transcription projects have proliferated in the last decade, there is a need within the broader field to understand differences in project outcomes as they relate to task design, as well as to experiment with different models of online crowdsourced transcription that have not yet been explored. The experiment discussed in this paper involves the evaluation of newly built tools on the Zooniverse.org crowdsourcing platform, attempting to answer the research question: "Does the current Zooniverse methodology of multiple independent transcribers and aggregation of results render higher-quality outcomes than allowing volunteers to see previous transcriptions and/or markings by other users? How does each methodology impact the quality and depth of analysis and participation?" To answer these questions, the Zooniverse team ran an A/B experiment on the project Anti-Slavery Manuscripts at the Boston Public Library. This paper will share results of this study, and also describe the process of designing the experiment and the metrics used to evaluate each transcription method. These include the comparison of aggregate transcription results with ground truth data; evaluation of annotation methods; the time it took for volunteers to complete transcribing each dataset; and the level of engagement with other project elements such as posting on the message board or reading supporting documentation. Particular focus will be given to the (at times) competing goals of data quality, efficiency, volunteer engagement, and user retention, all of which are of high importance for projects that focus on data from galleries, libraries, archives and museums. Ultimately, this paper aims to provide a model for impactful, intentional design and study of online crowdsourced transcription methods, as well as shed light on the associations between project design, methodology and outcomes.

Source: HAL:hal-02280013v2

Volume: Special Issue on Collecting, Preserving, and Disseminating Endangered Cultural Heritage for New Understandings through Multilingual Approaches

Published on: 3 December 2019

Accepted on: 29 November 2019

Submitted on: 12 September 2019

Keywords: [SHS.STAT] Humanities and Social Sciences/Methods and statistics, [SHS.MUSEO] Humanities and Social Sciences/Cultural heritage and museology, [SHS.INFO] Humanities and Social Sciences/Library and information sciences

