A Holistic Hybrid Algorithm for User Recommendation on Twitter

Authors

  • Sara Guimarães, Universidade Federal de Minas Gerais
  • Marco Túlio Ribeiro, Universidade Federal de Minas Gerais
  • Renato Assunção, Universidade Federal de Minas Gerais
  • Wagner Meira Jr., Universidade Federal de Minas Gerais

Keywords:

data mining algorithms, logistic regression, social media, topic models, user recommendation

Abstract

As Twitter grows, finding interesting users to follow becomes increasingly difficult, which makes it a natural setting for recommender systems. Previous research has shown that there is value in combining different recommendation algorithms, since each has its own strengths and weaknesses. However, previous work has either focused on specific classes of recommendation algorithms or combined different algorithms naïvely. In contrast, we present a holistic hybrid algorithm that simultaneously takes into account content-based, collaborative and user-based information. Our algorithm uses a Logistic Regression model to learn from the data itself how to combine different sources of evidence (including the output of other algorithms). Therefore, instead of the importance of each source being set manually or, worse, all sources being weighted equally, the emphasis given to each source comes from the data. Our experiments on a real Twitter dataset show that our algorithm outperforms current state-of-the-art algorithms. In addition, we propose new user representations for content-based algorithms (such as those based on tf-idf and LDA) that capture users' interests more fully by also taking into account the content posted by the people they follow. Our experiments also show that content-based algorithms using these new representations outperform their traditional counterparts.
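
The idea of learning how much weight each source of evidence deserves can be illustrated with a minimal sketch. It is not the authors' implementation: the three feature names, the toy training pairs, the learning rate and the candidate names are all hypothetical, and a plain gradient-descent logistic regression stands in for whatever fitting procedure the paper uses. Each row holds the scores that three separate recommenders might assign to one (user, candidate) pair, and the label records whether the user actually follows the candidate.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, x):
    # weights[0] is the bias; the rest align with the features in x.
    z = weights[0] + sum(w * xj for w, xj in zip(weights[1:], x))
    return sigmoid(z)

def train_logistic_regression(rows, labels, lr=0.5, epochs=2000):
    # Batch gradient descent on the mean log-loss.
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

# Hypothetical toy data: [content_score, collaborative_score, user_based_score]
# for (user, candidate) pairs; label 1 means the user follows the candidate.
rows = [
    [0.9, 0.8, 0.3], [0.8, 0.9, 0.7], [0.7, 0.6, 0.2], [0.6, 0.9, 0.5],
    [0.2, 0.1, 0.6], [0.3, 0.2, 0.4], [0.1, 0.3, 0.8], [0.4, 0.1, 0.3],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights = train_logistic_regression(rows, labels)

# Rank hypothetical candidates for one user by predicted follow probability.
candidates = {
    "candidate_a": [0.9, 0.9, 0.5],
    "candidate_b": [0.1, 0.1, 0.5],
    "candidate_c": [0.5, 0.6, 0.5],
}
ranked = sorted(candidates, key=lambda c: predict(weights, candidates[c]), reverse=True)
```

In this toy data the first two scores separate followed from non-followed candidates and the third does not, so the fitted model leans on the first two. That is the point the abstract makes: the emphasis comes from the data rather than from manual or equal weighting.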

Published

2013-09-13

Section

SBBD Articles