Artificial intelligence and Law
the transforming impact on the organization of large collections of legal texts
DOI:
https://doi.org/10.35699/2965-6931.2023.47689
Keywords:
natural language processing, language models, topic modeling, law, organization of legal collections
Abstract
Recent advances in artificial intelligence and natural language processing have driven several changes in the legal field. Brazilian law is undergoing constant modernization, focused primarily on transparency and access to information. The large volume of legal documents creates room for developing and using intelligent tools that organize this collection and make it easier to manage. In this work, we show how language models combined with topic modeling techniques can organize and extract knowledge from these extensive legal collections, revealing themes that are often implicit and unknown. This benefits several applications, such as the search for similar documents and the recommendation of legal texts.