Exploiting semantic similarities applied to recommender systems with Content-Based Filtering
DOI:
https://doi.org/10.1590/1983-3652.2026.51173
Keywords:
Semantic similarity, Recommender systems, Content-Based Filtering
Abstract
This work developed and evaluated alternative methods for similarity calculation that combine conventional approaches, such as cosine similarity, with the Wu-Palmer similarity computed over WordNet's semantic network, with the aim of improving recommendation quality in Content-Based Filtering recommender systems. The experiments used the MovieLens small movie dataset and the Python programming environment of Google Colaboratory. The results indicate that Content-Based Filtering can be improved by methods that leverage semantic similarity measures. Wu-Palmer was the best-performing similarity measure according to both Mean Reciprocal Rank and Mean Average Precision. Specifically regarding Mean Reciprocal Rank, the Wu-Palmer similarity consistently achieved better results across all ranking positions, and its maximum average exceeded that of the other measures. Concerning Mean Reciprocal Rank outcomes, the algorithm based on Wu-Palmer similarity also showed the best overall performance in the experiment, reaching a maximum Mean Reciprocal Rank of 0.67 at position ten.
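As an illustration of the two evaluation metrics named in the abstract, the following is a minimal Python sketch of Mean Reciprocal Rank and Mean Average Precision at a cutoff position k. It is not the authors' implementation: the function names, the toy rankings, and the choice to normalize average precision by min(number of relevant items, k) are assumptions made for this example, and other normalizations are also common.

```python
from typing import Callable, Dict, Sequence, Set


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Return 1/position of the first relevant item in the top k, or 0.0 if none."""
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Average the precision values at each relevant position in the top k.

    Assumption: the sum is normalized by min(len(relevant), k).
    """
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    denominator = min(len(relevant), k)
    return precision_sum / denominator if denominator else 0.0


def mean_over_users(
    metric: Callable[[Sequence[str], Set[str], int], float],
    rankings: Dict[str, Sequence[str]],
    relevants: Dict[str, Set[str]],
    k: int,
) -> float:
    """Average a per-user metric over every user that has a ranking."""
    scores = [metric(rankings[u], relevants.get(u, set()), k) for u in rankings]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two users, three recommended items each.
rankings = {"u1": ["a", "b", "c"], "u2": ["d", "e", "f"]}
relevants = {"u1": {"b"}, "u2": {"d", "f"}}

mrr_at_3 = mean_over_users(reciprocal_rank, rankings, relevants, k=3)
map_at_3 = mean_over_users(average_precision, rankings, relevants, k=3)
print(f"MRR@3 = {mrr_at_3:.3f}, MAP@3 = {map_at_3:.3f}")
```

Under this reading, a Mean Reciprocal Rank "at position ten" corresponds to calling `mean_over_users(reciprocal_rank, ..., k=10)` on the ranked lists produced by each similarity measure.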
Data Availability Statement
Research data is available in the body of the document.
License
Copyright (c) 2026 Gabriel Gonçalves Faria Costa, Diego Correa da Silva, Guilherme Souza Brandão, Vítor Hugo Barbosa dos Santos, Frederico Araújo Durão

This work is licensed under a Creative Commons Attribution 4.0 International License.
This is an open access article that allows unrestricted use, distribution and reproduction in any medium as long as the original article is properly cited.








