2007 AdaRank

From GM-RKB
Jump to navigation Jump to search

Subject Headings: AdaRank Algorithm, Boosting Algorithm, Information Retrieval, Rank Function Learning Task.

Notes

Cited By

  • ~74 …

Quotes

Abstract

In this paper we address the issue of learning to rank for document retrieval. In the task, a model is automatically created with some training data and then is utilized for ranking of documents. The goodness of a model is usually evaluated with performance measures such as MAP (Mean Average Precision) and NDCG (Normalized Discounted Cumulative Gain). Ideally a learning algorithm would train a ranking model that could directly optimize the performance measures with respect to the training data. Existing methods, however, are only able to train ranking models by minimizing loss functions loosely related to the performance measures. For example, Ranking SVM and RankBoost train ranking models by minimizing classification errors on instance pairs. To deal with the problem, we propose a novel learning algorithm within the framework of boosting, which can minimize a loss function directly defined on the performance measures. Our algorithm, referred to as AdaRank, repeatedly constructs 'weak rankers' on the basis of reweighted training data and finally linearly combines the weak rankers for making ranking predictions. We prove that the training process of AdaRank is exactly that of enhancing the performance measure used. Experimental results on four benchmark datasets show that AdaRank significantly outperforms the baseline methods of BM25, Ranking SVM, and RankBoost.

1. Introduction

Several methods for learning to rank have been developed and applied to document retrieval. For example, Herbrich et al. [13] propose a learning algorithm for ranking on the basis of Support Vector Machines, called Ranking SVM. Freund et al. [8] take a similar approach and perform the learning by using boosting, referred to as RankBoost. All the existing methods used for document retrieval [2, 3, 8, 13, 16, 20] are designed to optimize loss functions loosely related to the IR performance measures, not loss functions directly based on the measures. For example, Ranking SVM and RankBoost train ranking models by minimizing classification errors on instance pairs.

References

  • 13. R. Herbrich, T. Graepel, and K. Obermayer. Large Margin rank boundaries for ordinal regression. MIT Press, Cambridge, MA, 2000.

,

 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
2007 AdaRankHang Li
Jun Xu
AdaRank: A Boosting Algorithm for Information RetrievalProceedings of the 30th annual international ACM SIGIR conferencehttp://research.microsoft.com/en-us/people/junxu/sigir2007-adarank.pdf10.1145/1277741.12778092007