Erkki Mervaala ; Jari Lyytimäki - Towards efficient and reliable utilization of automated data collection: Media scrapers applied to news on climate change

jdmdh:13123 - Journal of Data Mining & Digital Humanities, 29 April 2024, NLP4DH - https://doi.org/10.46298/jdmdh.13123

Authors: Mervaala, Erkki (1); Lyytimäki, Jari (1)

  • 1 Finnish Environment Institute

Abstract: Automated data collection offers tempting opportunities for social sciences and humanities research. The abundant data accumulating in digital archives allow more comprehensive, timely and cost-efficient harvesting and processing of information. Automated methods ease or even remove some key problems: laborious and time-consuming data collection, potential errors and biases arising from subjective coding of materials, and distortions caused by a focus on small samples. They also bring new risks, such as poor understanding of the context of the data, or failure to recognize underlying systematic errors or missing information. Results from testing different methods of collecting data on newspaper coverage of climate change in Finland show that relying fully on automatable tools such as media scrapers has its limitations: document acquisition for research can be extensive yet still incomplete. Many of these limitations can, however, be addressed, and not all of the remedies require manual control.


Volume: NLP4DH
Published on: 29 April 2024
Accepted on: 9 April 2024
Submitted on: 27 February 2024
Keywords: Text scraping, Media analysis, Climate change communication

Files

Name: Towards_efficient_and_reliable_utilization_of_automated_data_collection.pdf
Size: 1.05 MB
md5: 046ddbb7fcbc1ee591040cbc72b58343
