A Comparative Study of Learning-to-Rank Techniques for Tag Recommendation
Keywords: Tag Recommendation, Relevance Metrics, Learning-to-Rank
Abstract: Tags have become very popular on the Web 2.0 as they facilitate and encourage users to create and share their own content. In this context, there is great interest in developing strategies to recommend relevant and useful tags for a target object, improving the quality of the generated tags and of the Information Retrieval (IR) services that use them as a data source. Several existing tag recommendation strategies treat the problem as a multiple candidate tag ranking problem, recommending the tags that occupy the top positions of the generated ranking. This motivates the use of Learning-to-Rank (L2R) based strategies to automatically "learn" good tag ranking functions. However, previous work has explored only three different L2R techniques, namely, Genetic Programming (GP), RankSVM and RankBoost, comparing at most two of them with respect to effectiveness. In contrast, we here perform a much more comprehensive comparative study of the use of L2R techniques for tag recommendation. Specifically, we compare eight different L2R techniques, namely, Random Forest (RF), MART, λ-MART, ListNet, AdaRank and the three aforementioned techniques, with respect to both effectiveness (i.e., precision, NDCG) and efficiency (i.e., time complexity). We perform experiments using real data collected from five popular Web 2.0 applications, namely, Bibsonomy, LastFM, MovieLens, YahooVideo and YouTube. Our results show that the best L2R based strategy significantly outperforms the best state-of-the-art unsupervised technique (by up to 29% in NDCG). Moreover, unlike existing comparisons of different L2R techniques in other domains, we find that, for tag recommendation, there is a clear winning group of methods (RF, MART and λ-MART), with a slight advantage of two of them (RF and λ-MART) over the third, and with gains in NDCG ranging from 4% to 12% over the best of the remaining alternatives considered. We also find that recommendation time, despite some variation among the different methods, is under 1.3 seconds on average (in the worst case scenario) for all L2R methods, which confirms the feasibility of the L2R approach for tag recommendation.
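For readers unfamiliar with the effectiveness metrics named above, the following is a minimal sketch of precision@k and NDCG@k for a single ranked list of candidate tags. It assumes binary relevance (a tag is either relevant to the object or not) and the common log2 position discount; the abstract does not state the exact evaluation setup, so these choices, the function names and the example tags are illustrative assumptions only.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k positions of a ranking."""
    # Position 0 is discounted by log2(2) = 1, position 1 by log2(3), and so on.
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_tags, relevant_tags, k):
    """NDCG@k under the assumed binary relevance model."""
    gains = [1.0 if tag in relevant_tags else 0.0 for tag in ranked_tags]
    # The ideal ranking places every relevant tag first (at most k of them).
    ideal_hits = min(len(relevant_tags), k)
    ideal = dcg_at_k([1.0] * ideal_hits, k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


def precision_at_k(ranked_tags, relevant_tags, k):
    """Fraction of the top-k recommended tags that are relevant."""
    return sum(1 for tag in ranked_tags[:k] if tag in relevant_tags) / k


# Hypothetical ranking produced by some tag recommender for one object.
ranked = ["rock", "live", "guitar", "2009", "indie"]
relevant = {"rock", "guitar", "indie"}

print(precision_at_k(ranked, relevant, 5))
print(ndcg_at_k(ranked, relevant, 5))
```

Unlike precision@k, NDCG@k rewards placing relevant tags nearer the top, which is why two rankings with the same precision can differ in NDCG.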