Enhancing Legal Argument Mining with Domain Pre-training and Neural Networks

Zhang, Gechuan; Nulty, Paul; Lillis, David

doi:10.46298/jdmdh.9147


jdmdh:9147 - Journal of Data Mining & Digital Humanities, 10 June 2022, NLP4DH - https://doi.org/10.46298/jdmdh.9147


Authors: Zhang, Gechuan ¹; Nulty, Paul ²; Lillis, David ¹

1 University College Dublin
2 Birkbeck, University of London

The contextual word embedding model BERT has proved its ability on downstream tasks with limited quantities of annotated data. BERT and its variants help to reduce the burden of complex annotation work in many interdisciplinary research areas, for example, legal argument mining in digital humanities. Argument mining aims to develop text analysis tools that can automatically retrieve arguments and identify relationships between argumentation clauses. Since argumentation is one of the key aspects of case law, argument mining tools for legal texts are applicable to both academic and non-academic legal research. Domain-specific BERT variants (pre-trained on corpora from a particular domain) have also achieved strong performance in many tasks. To our knowledge, previous machine learning studies of argument mining on judicial case law still rely heavily on statistical models. In this paper, we provide a broad study of both classic and contextual embedding models and their performance on practical case law from the European Court of Human Rights (ECHR). We also explore a number of neural networks in combination with different embeddings. Our experiments provide a comprehensive overview of a variety of approaches to the legal argument mining task. We conclude that domain pre-trained transformer models have great potential in this area, although traditional embeddings can also achieve strong performance when combined with additional neural network layers.


Source: zenodo.org:6630045

Volume: NLP4DH

Published on: 10 June 2022

Accepted on: 6 April 2022

Submitted on: 1 March 2022

License: Attribution 4.0 International (CC BY 4.0)

Files

Name	Size
JDMDH_submission.pdf (md5: 5ed2ef657fb43940d8fd181684da2aa9)	304.04 KB

Publications

Other

10.5281/zenodo.6316771

1 Zenodo


8 documents citing this article

Alexandra Schofield;Siqi Wu;Theo Bayard de Volo;Tatsuki Kuze;Alfredo Gomez;et al., 2025, "My Very Subjective Human Interpretation": Domain Expert Perspectives on Navigating the Text Analysis Loop for Topic Models, Proceedings of the ACM on Human-Computer Interaction, 9, 1, pp. 1-30, 10.1145/3701201, https://doi.org/10.1145/3701201.

Miloš Bogdanović;Milena Frtunić Gligorijević;Jelena Kocić;Leonid Stoimenov, 2025, Improving Text Recognition Accuracy for Serbian Legal Documents Using BERT, Applied Sciences, 15, 2, pp. 615, 10.3390/app15020615, https://doi.org/10.3390/app15020615.

Mirko Locatelli;Lavinia Chiara Tagliabue;Giuseppe M. Di Giuda, 2024, A multi-label text classifier: application on an Italian public tender procedure, project ISCOL@, Journal of Information Technology in Construction, 29, pp. 864-893, 10.36680/j.itcon.2024.038, https://doi.org/10.36680/j.itcon.2024.038.

Xuran Wang;Xinguang Zhang;Vanessa Hoo;Zhouhang Shao;Xuguang Zhang, 2024, LegalReasoner: A Multi-Stage Framework for Legal Judgment Prediction via Large Language Models and Knowledge Integration, IEEE Access, 12, pp. 166843-166854, 10.1109/access.2024.3496666, https://doi.org/10.1109/access.2024.3496666.

Abdullah Al Zubaer;Michael Granitzer;Jelena Mitrović, 2023, Performance analysis of large language models in the domain of legal argument mining, Frontiers in Artificial Intelligence, 6, 10.3389/frai.2023.1278796, https://doi.org/10.3389/frai.2023.1278796.

Charles F. O. Viegas;Bruno C. Costa;Renato P. Ishii, 2023, JurisBERT: A New Approach that Converts a Classification Corpus into an STS One, Lecture Notes in Computer Science, pp. 349-365, 10.1007/978-3-031-36805-9_24.

Daniele Licari;Giovanni Comandè, 2023, ITALIAN-LEGAL-BERT models for improving natural language processing tasks in the Italian legal domain, Computer Law & Security Review, 52, pp. 105908, 10.1016/j.clsr.2023.105908.

Georgi Karadzhov;Tom Stafford;Andreas Vlachos, 2023, DeliData: A Dataset for Deliberation in Multi-party Problem Solving, Proceedings of the ACM on Human-Computer Interaction, 7, CSCW2, pp. 1-25, 10.1145/3610056, https://doi.org/10.1145/3610056.

Sources: OpenCitations, OpenAlex & Crossref


Usage statistics

This page has been viewed 3154 times.

The PDF of this article has been downloaded 938 times.