A Holistic Hybrid Algorithm for User Recommendation on Twitter
Keywords: data mining algorithms, logistic regression, social media, topic models, user recommendation
Abstract: As Twitter grows, finding interesting users to follow becomes increasingly difficult, which makes it a natural setting for recommender systems. Previous research has shown that there is value in combining different recommendation algorithms, since each has its own strengths and weaknesses. However, previous work has either focused on specific classes of recommendation algorithms or combined different algorithms naïvely. In contrast, we present a holistic hybrid algorithm that simultaneously takes into account content-based, collaborative and user-based information. Our algorithm uses a logistic regression model to learn from the data itself how to combine different sources of evidence, including the output of other algorithms. Thus, instead of manually setting the importance of each source or, worse, weighting all sources equally, the emphasis given to each source in our model is determined by the data. Our experiments on a real Twitter dataset show that our algorithm outperforms current state-of-the-art algorithms. In addition, we propose new user representations for content-based algorithms (such as those based on tf-idf and LDA) that capture users' interests more fully by also taking into account the content posted by the people they follow. Our experiments show that these new representations also outperform traditional content-based algorithms.
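The idea of learning source weights from data can be illustrated with a minimal sketch. It is not the paper's implementation: the three feature names, the synthetic follow labels, and the plain gradient-descent training loop are all assumptions made for illustration, since the abstract does not specify the actual features or training procedure.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, bias, x):
    # Estimated probability that the candidate is a good recommendation.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_logistic(features, labels, lr=0.5, epochs=500):
    # Batch gradient descent on the average log-loss.
    n, d = len(features), len(features[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(features, labels):
            err = predict(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical evidence sources for a (user, candidate) pair, each in [0, 1]:
#   content_sim   - similarity of posted content (e.g. from tf-idf or LDA)
#   collab_score  - output of a collaborative algorithm
#   noise         - an uninformative source, included for contrast
random.seed(0)
features, labels = [], []
for _ in range(200):
    content_sim, collab_score, noise = (random.random() for _ in range(3))
    # Synthetic ground truth (an assumption): follows depend on the
    # first two sources only.
    followed = 1 if 0.7 * content_sim + 0.3 * collab_score > 0.5 else 0
    features.append([content_sim, collab_score, noise])
    labels.append(followed)

weights, bias = train_logistic(features, labels)

# Rank two hypothetical candidates by their combined score.
strong = predict(weights, bias, [0.9, 0.9, 0.5])
weak = predict(weights, bias, [0.1, 0.1, 0.5])
print("learned weights:", [round(w, 2) for w in weights])
print("strong candidate:", round(strong, 3), "weak candidate:", round(weak, 3))
```

In this toy setting the weights are fitted rather than hand-set, so an informative source should end up with more influence on the ranking than an uninformative one. Candidates would then be recommended in decreasing order of predicted probability.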