Abstract
Legal inference is fundamental to building and verifying hypotheses in police investigations. In this study, we build a Korean Natural Language Inference (NLI) dataset for the legal domain, focusing on criminal court verdicts. We developed an adversarial hypothesis collection tool that challenges annotators and gives us a deeper understanding of the data, as well as a hypothesis network construction tool with visualized graphs that demonstrates a use-case scenario for the developed model. The data is augmented with a combination of Easy Data Augmentation techniques and round-trip translation, since crowd-sourcing is often not an option for datasets containing sensitive data. We discuss in detail the challenges we encountered, such as annotators' limited domain knowledge, issues in the data augmentation process, and problems with handling long contexts, and we suggest possible solutions. Our work shows that creating legal inference datasets with limited resources is feasible, and we point to further research in this area.
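The abstract describes the augmentation pipeline only briefly. The following is a minimal, illustrative sketch of how EDA-style operations can be combined with round-trip translation; it is not the authors' exact implementation, and the `translate` callable and all function names are hypothetical placeholders for whatever MT client and Korean-aware tokenization a real setup would use.

```python
# Illustrative sketch only: simple EDA-style perturbations plus a round-trip
# translation paraphrase. `translate(text, src, tgt)` is a hypothetical
# stand-in for any machine-translation client (e.g. a Korean->English->Korean
# round trip); whitespace tokenization stands in for proper Korean tokenization.
import random


def random_swap(tokens, n=1):
    """Swap two random token positions n times (an EDA operation)."""
    tokens = tokens[:]
    for _ in range(n):
        if len(tokens) < 2:
            break
        i, j = random.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p=0.1):
    """Delete each token with probability p, keeping at least one token."""
    if not tokens:
        return tokens
    kept = [t for t in tokens if random.random() > p]
    return kept or [random.choice(tokens)]


def round_trip(sentence, translate, source="ko", pivot="en"):
    """Round-trip translation: source -> pivot -> source paraphrase."""
    return translate(translate(sentence, source, pivot), pivot, source)


def augment(sentence, translate, n_eda=2):
    """Produce a few EDA variants plus one round-trip paraphrase."""
    tokens = sentence.split()
    variants = [" ".join(random_swap(tokens)) for _ in range(n_eda)]
    variants.append(" ".join(random_deletion(tokens)))
    variants.append(round_trip(sentence, translate))
    return variants


if __name__ == "__main__":
    identity = lambda text, src, tgt: text  # placeholder MT client
    print(augment("the defendant denied the charge", identity))
```

Note that perturbed or paraphrased hypotheses may no longer preserve the original entailment label, so augmented premise-hypothesis pairs generally need a verification pass before being added to the dataset.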
Data availability
The data is available on GitHub (https://github.com/onspark/LEAP_NLI_v2.0).
Notes
Both the police manual for creating investigation result reports and the manual for creating investigation review reports were internal documents. We therefore conducted expert interviews and solicited detailed written explanations to understand these processes more accurately.
Acknowledgements
This research was supported and funded by the Korean National Police Agency [Project Name: AI-Based Crime Investigation Support System; Project Number: PR10-02-000-21]. The authors also thank the Legal Informatics and Forensic Science (LIFS) institute at Hallym University and its researchers for their indispensable help in creating the data.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix 1
See Table 13. Key phrases are highlighted in bold.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Park, S., James, J.I. Lessons learned building a legal inference dataset. Artif Intell Law 32, 1011–1044 (2024). https://doi.org/10.1007/s10506-023-09370-x