2009 BBMBayesianBrowsingModelfromPet


Subject Headings:


Cited By


Author Keywords

Bayesian Models, Click log Analysis, Web Search


Given a quarter of a petabyte of click-log data, how can we estimate the relevance of each URL for a given query? In this paper, we propose the Bayesian Browsing Model (BBM), a new modeling technique with the following advantages: (a) it does exact inference; (b) it is single-pass and parallelizable; (c) it is effective. We present two sets of experiments to test model effectiveness and efficiency. On the first set of over 50 million search instances of 1.1 million distinct queries, BBM outperforms the state-of-the-art competitor by 29.2% in log-likelihood while being 57 times faster. On the second click-log set, spanning a quarter of a petabyte of data, we showcase the scalability of BBM: we implemented it on a commercial MapReduce cluster, and it took only 3 hours to compute the relevance for 1.15 billion distinct query-URL pairs.
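
The "single-pass and parallelizable" property can be illustrated with a much simpler stand-in model. The sketch below is not BBM: it ignores position bias and assumes a plain Beta-Bernoulli click model, chosen only because its posterior depends on additive counts that can be gathered in one pass per shard and merged in any order. The log format `(query, url, clicked)`, the function names, and the prior parameters `alpha` and `beta` are all assumptions made for illustration.

```python
from collections import Counter
from functools import reduce


def map_shard(shard):
    """One pass over a shard of (query, url, clicked) records (hypothetical format).

    Returns additive counts: how often each (query, url) was shown and clicked.
    """
    counts = Counter()
    for query, url, clicked in shard:
        counts[(query, url, "shown")] += 1
        if clicked:
            counts[(query, url, "clicked")] += 1
    return counts


def merge(a, b):
    """Combine counts from two shards; addition makes the order irrelevant."""
    return a + b


def posterior_mean(counts, query, url, alpha=1.0, beta=1.0):
    """Exact posterior mean of the click probability under a Beta(alpha, beta) prior.

    This is the simplified stand-in model, not the BBM posterior.
    """
    shown = counts[(query, url, "shown")]
    clicked = counts[(query, url, "clicked")]
    return (clicked + alpha) / (shown + alpha + beta)


# Two toy shards standing in for partitions of a click log.
shard_1 = [("q", "u1", True), ("q", "u2", False)]
shard_2 = [("q", "u1", True), ("q", "u1", False)]

total = reduce(merge, (map_shard(s) for s in (shard_1, shard_2)))
estimate_u1 = posterior_mean(total, "q", "u1")
estimate_u2 = posterior_mean(total, "q", "u2")
```

Because each shard is reduced to counts independently and the counts are merged by addition, the same pattern maps naturally onto a MapReduce-style job; how BBM itself defines its sufficient statistics is described in the paper, not here.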



Author: Chao Liu, Fan Guo, Christos Faloutsos
Title: BBM: Bayesian Browsing Model from Petabyte-scale Data
Journal: KDD-2009 Proceedings
DOI: 10.1145/1557019.1557081
Year: 2009