Is Learning to Rank Worth it? A Statistical Analysis of Learning to Rank Methods in the LETOR Benchmarks
Keywords: Information Retrieval, Learning to Rank, Statistical Analysis
Abstract: The Learning to Rank (L2R) research field has experienced fast-paced growth over the last few years, with a wide variety of benchmark datasets and baselines available for experimentation. We here investigate the main assumption behind this field: that the use of sophisticated L2R algorithms and models produces significant gains over more traditional and simpler information retrieval approaches. Our experimental results on the LETOR benchmarks surprisingly indicate that many L2R algorithms, when put up against the best individual features of each dataset, may not produce statistically significant differences, even if the absolute gains seem large. We also find that most of the reported baselines are statistically tied, with no clear winner.
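The kind of significance test the abstract alludes to is typically a paired test over per-query effectiveness scores of two rankers. The sketch below is an illustration, not the paper's actual procedure: it computes a paired t-statistic in pure Python over hypothetical per-query NDCG@10 values for an L2R model and a single best feature (all scores invented for the example).

```python
import math

def paired_t_statistic(a, b):
    """Paired t-statistic for two matched samples (e.g. per-query NDCG)."""
    assert len(a) == len(b) and len(a) > 1
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the per-query differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)

# Hypothetical per-query NDCG@10 scores: an L2R model vs. the best single feature.
l2r_scores  = [0.52, 0.61, 0.47, 0.55, 0.58, 0.49, 0.60, 0.53]
feat_scores = [0.50, 0.63, 0.46, 0.54, 0.55, 0.51, 0.57, 0.52]

t = paired_t_statistic(l2r_scores, feat_scores)
print(round(t, 3))
```

A small |t| (relative to the critical value for n-1 degrees of freedom) means the two rankers are statistically tied, which is the situation the abstract reports for many L2R methods against strong individual features.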
SBBD 2012 Short Papers