Controlled Vocabulary and Artificial Intelligence in Indexing

A Literature Review

Authors

  • Mariângela Spotti Lopes Fujita, UNESP - São Paulo State University, Faculty of Philosophy and Science, Marília Campus
  • Nuno Miguel Teixeira Sousa, University of Lisbon, Faculty of Arts, Center for Classical Studies, Portugal

Keywords

Artificial Intelligence, Librarians, Manual indexing, Automatic indexing, Information representation and retrieval, Libraries

Abstract

Artificial intelligence has given librarians an ally in indexing, helping them organize vast data sets. Guided by the question "How can artificial intelligence help librarians with indexing?", this study aims to understand the relationship between artificial intelligence and indexing. A qualitative approach was adopted in a bibliographic study which, after applying inclusion criteria (title, abstract, keywords, and thematic relevance) and exclusion criteria (duplicates and articles without full-text access), analyzed 27 articles (17 from WoS; 10 from EBSCO). The findings were that: (i) artificial intelligence and/or automatic indexing do not replace librarians, but function as support; (ii) automatic indexing has evolved positively, with gains in accuracy and consistency; (iii) effective tools are already applied in libraries, such as Annif, Finto AI, and Kratt; (iv) these tools play an important role in modernizing library processes; and (v) it is inevitable that libraries will invest in continuing education and in the implementation of AI solutions. The use of artificial intelligence in indexing represents a strategic opportunity to improve the efficiency and quality of library services. AI brings indexing closer to the natural language of users while preserving the librarian's role and expanding it in information management.

Author Biographies

  • Mariângela Spotti Lopes Fujita, UNESP - São Paulo State University, Faculty of Philosophy and Science, Marília Campus

    She holds a PhD in Communication Sciences from the University of São Paulo (1992), became Full Professor (2003) in Documentary Analysis and Alphabetic Documentary Languages, and was Full Professor at the São Paulo State University Júlio de Mesquita Filho (UNESP) from 2010 to 2017. She carried out teaching, research, and extension activities focused on Indexing and Indexing Languages, as well as management and teaching in the undergraduate course in Library Science and Archives at UNESP, Marília Campus, from 1980 to February 2017. She currently provides voluntary services in postgraduate teaching, research, and extension at the Faculty of Philosophy and Sciences of UNESP, Marília Campus. She is a permanent professor in the Postgraduate Program in Information Science at UNESP, in the research line Production and Organization of Information. She was Coordinator of the Postgraduate Program in Information Science at UNESP in 2003-2004 and in 2007, and Vice Coordinator in 2002-2003, 2004-2007, and 2007-2010. As a researcher, she works in the research groups "Thematic Representation of Information" (leader) and "Reading, organization, representation, production and use of information" at UFPB (member). She conducts research at UNESP with a CNPq Research Productivity scholarship, level 1B. She is a member of the scientific societies in her field: in Brazil, the National Association of Research and Postgraduate Studies in Information Science (ANCIB) and the Brazilian Chapter of the International Society for Knowledge Organization (ISKO); abroad, the International Society for Knowledge Organization (ISKO). Professionally, she served as Coordinator of the General Coordination of Libraries at UNESP from April 1999 to January 2005.

    She was Director of the Faculty of Philosophy and Sciences at UNESP - Marília Campus (2008/2012); Pro-Rector of University Extension at UNESP (2013/2017); Member of the Permanent Evaluation Committee at UNESP; Member of the Faculty Hiring Committee at UNESP; Member of the Editorial Board of Scientific Journals at UNESP; Advisor to the Office of the Rector of UNESP for Library matters; Vice-Coordinator of the PPGCI at UNESP - Marília Campus; and member of several local and central collegiate bodies at UNESP. At CNPq, she served as Advisor and President of the Committee for the area of Communication, Arts and Information Science from 2012 to 2017. She is currently Supervisor of the Institute of Public Policies of Marília, President of the Permanent Committee for Publications and of the Council of Editors of Scientific Journals of the Faculty of Philosophy and Sciences of UNESP, and deputy coordinator in Information Science of the Communication and Information area of CAPES. She is an ad hoc reviewer for funding agencies and serves as a reviewer and scientific committee member for scientific events and journals in Information Science in Brazil and abroad.

  • Nuno Miguel Teixeira Sousa, University of Lisbon, Faculty of Arts, Center for Classical Studies, Portugal

    Nuno Miguel Teixeira Sousa completed his Bachelor's degree in Information Science in 2021 and his Master's degree in Information Science in 2023, both at the Faculty of Letters of the University of Coimbra. Since 2024 he has been studying for a PhD in Information Science at the Faculty of Letters of the University of Lisbon, where he is a researcher at the Center for Classical Studies. He is a Senior Library Technician at the Faculty of Law of the University of Coimbra and has received 4 awards. He works in the Social Sciences, with an emphasis on Communication Sciences and Information Sciences.

Published

2025-11-18

How to Cite

Controlled Vocabulary and Artificial Intelligence in Indexing: A Literature Review. (2025). Perspectivas em Ciência da Informação, 30, e56745. https://periodicos.ufmg.br/index.php/pci/article/view/56745