Simon Gonzalez - Interactive Analysis and Visualisation of Annotated Collocations in Spanish (AVAnCES)

jdmdh:11383 - Journal of Data Mining & Digital Humanities, 22 September 2023, NLP4DH - https://doi.org/10.46298/jdmdh.11383

Author: Simon Gonzalez

Phraseology studies have been enhanced by Corpus Linguistics, which has become an interdisciplinary field in whose development current technologies play an important role. Computational tools have been implemented in recent decades with positive results in the identification of phrases in different languages. One specific technology that has impacted these studies is social media. As researchers, we have turned our attention to collecting data from these platforms, which brings great advantages as well as its own challenges. One of the challenges is how we design and build corpora relevant to the questions emerging in this type of language expression. This has been approached from different angles, but one that has yielded invaluable outputs is the building of linguistic corpora with the use of online web applications. In this paper, we take a multidimensional approach to the collection, design, and deployment of a phraseology corpus for Latin American Spanish from Twitter data, extracting features using NLP techniques and presenting the corpus in an interactive online web application. We expect to contribute to the methodologies used for Corpus Linguistics in the current technological age. Finally, we make this tool publicly available to any researcher interested in the data itself or in the technological tools developed here.


Volume: NLP4DH
Section: Digital humanities in languages
Published on: 22 September 2023
Accepted on: 6 July 2023
Submitted on: 29 May 2023
Keywords: corpus-based phraseology, collocations, Latin American Spanish, social media, web-based applications, Natural Language Processing, parts of speech tagging

Files

Interactive Analysis and Visualisation of Annotated Collocations in Spanish.pdf.pdf (751.72 KB)
md5: 19525cb13cafe9eecd43be864780f1e3
NLP4DH_Avances_gonzalez.pdf.pdf (456.62 KB)
md5: 76d339527f79de237bf1e3c722fff3f4
