research-article

Joining metadata and textual features to advise administrative courts decisions: a cascading classifier approach

Published: 18 February 2023

Abstract

Decisions of regulatory government bodies and courts affect many aspects of citizens’ lives. These organizations and courts are expected to provide timely and coherent decisions, although they struggle to keep up with the increasing demand. The ability of machine learning (ML) models to predict such decisions based on past cases under similar circumstances has been assessed in several recent works. The dominant conclusion is that the prediction goal is achievable with high accuracy. Nevertheless, most of those works do not consider aspects of ML models that can impact performance and affect real-world usefulness, such as consistency, out-of-sample applicability, generality, and explainability preservation. To our knowledge, none considered all those aspects, and no previous study addressed the joint use of metadata and text-extracted variables to predict administrative decisions. We propose a predictive model, based on a two-stage cascade classifier, that addresses the abovementioned concerns. The model employs a first-stage prediction based on textual features extracted from the original documents and a second-stage classifier that includes proceedings’ metadata. The study was conducted using time-based cross-validation, built on data available before the predicted judgment. It provides predictions as soon as the decision date is scheduled and only considers the first document in each proceeding, along with the metadata recorded when the infringement is first registered. Finally, the proposed model provides local explainability by preserving visibility of the textual features and employing SHapley Additive exPlanations (SHAP). Our findings suggest that this cascade approach surpasses the standalone stages and achieves relatively high Precision and Recall when both text and metadata are available while preserving real-world usefulness. With a weighted F1 score of 0.900, the results outperform the text-only baseline by 1.24% and the metadata-only baseline by 5.63%, with better discriminative properties evaluated by the receiver operating characteristic and precision-recall curves.
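
The two-stage cascade and the time-based validation described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the naive Bayes text stage and the logistic regression metadata stage are simple stand-ins rather than the models used in the article, the six toy cases and the single metadata flag are hypothetical, and the helper names (`time_based_folds`, `TextStage`, `MetaStage`, `fit_cascade`, `predict_cascade`) are invented here.

```python
import math
from collections import Counter

def time_based_folds(dates, n_folds):
    """Expanding-window splits: each test block comes later in time than its training block."""
    order = sorted(range(len(dates)), key=lambda i: dates[i])
    size = len(order) // (n_folds + 1)
    for k in range(1, n_folds + 1):
        yield order[:k * size], order[k * size:(k + 1) * size]

class TextStage:
    """Stage 1 (stand-in): multinomial naive Bayes over whitespace tokens, Laplace smoothed."""
    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.prior = Counter(labels)
        for text, y in zip(texts, labels):
            self.counts[y].update(text.lower().split())
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def predict_proba(self, text):
        log_odds = math.log((self.prior[1] + 1) / (self.prior[0] + 1))
        totals = {c: sum(self.counts[c].values()) + len(self.vocab) for c in (0, 1)}
        for tok in text.lower().split():
            if tok in self.vocab:  # unseen tokens carry no evidence
                log_odds += math.log((self.counts[1][tok] + 1) / totals[1])
                log_odds -= math.log((self.counts[0][tok] + 1) / totals[0])
        return 1.0 / (1.0 + math.exp(-log_odds))

class MetaStage:
    """Stage 2 (stand-in): logistic regression fitted by batch gradient descent."""
    def fit(self, rows, labels, lr=0.5, epochs=500):
        self.w = [0.0] * (len(rows[0]) + 1)  # last weight is the bias
        for _ in range(epochs):
            grad = [0.0] * len(self.w)
            for x, y in zip(rows, labels):
                err = self.predict_proba(x) - y
                for j, xj in enumerate(list(x) + [1.0]):
                    grad[j] += err * xj
            self.w = [w - lr * g / len(rows) for w, g in zip(self.w, grad)]
        return self

    def predict_proba(self, x):
        z = sum(w * xj for w, xj in zip(self.w, list(x) + [1.0]))
        return 1.0 / (1.0 + math.exp(-z))

def fit_cascade(texts, meta, labels):
    """Fit stage 1 on text, then stage 2 on [stage-1 probability] + metadata."""
    stage1 = TextStage().fit(texts, labels)
    rows = [[stage1.predict_proba(t)] + list(m) for t, m in zip(texts, meta)]
    return stage1, MetaStage().fit(rows, labels)

def predict_cascade(stage1, stage2, text, m):
    return stage2.predict_proba([stage1.predict_proba(text)] + list(m))

# Hypothetical toy cases: (registration date, first document text, [metadata flag], outcome).
cases = [
    ("2019-01-10", "late payment of claim", [1.0], 1),
    ("2019-03-02", "report filed on time", [0.0], 0),
    ("2019-06-15", "late reply to regulator", [1.0], 1),
    ("2019-09-30", "documents filed correctly", [0.0], 0),
    ("2020-02-11", "late payment again", [1.0], 1),
    ("2020-05-20", "report filed correctly", [0.0], 0),
]
dates, texts, meta, labels = map(list, zip(*cases))

for train, test in time_based_folds(dates, n_folds=2):
    s1, s2 = fit_cascade([texts[i] for i in train],
                         [meta[i] for i in train],
                         [labels[i] for i in train])
    hits = sum((predict_cascade(s1, s2, texts[i], meta[i]) >= 0.5) == bool(labels[i])
               for i in test)
    print(f"train={len(train)} test={len(test)} correct={hits}")
```

In this sketch the second stage sees the first-stage probability as one extra feature next to the metadata, which is the general idea of cascade generalization. One simplification to keep in mind: stage 2 is trained on stage-1 scores computed for the same cases stage 1 was fitted on, whereas a more careful setup would typically use out-of-fold stage-1 predictions to limit leakage.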


Published In

Artificial Intelligence and Law, Volume 32, Issue 1
Mar 2024
288 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 18 February 2023
Accepted: 27 January 2023

Author Tags

  1. Administrative decision prediction
  2. Cascade generalization
  3. Legal assistance
  4. Machine learning
  5. Natural language processing

Qualifiers

  • Research-article

Funding Sources

  • Universidade Nova de Lisboa
