Home Contact

Research:
 Events Seminars

People:

 Faculty members Graduate students Postdoctoral fellows Alumni Postdoctoral positions

Prospective Students:
 Admission Life at Waterloo International Students

 Algorithms and Complexity Seminar Date: Wednesday, December 17, 2008 Time: 1:30 PM Place: DC1304 Title: Showing Relevant Ads via Context Multi-Armed Bandits Speaker: David Pal, School of Computer Science, University of Waterloo
 Abstract: We study \emph{context} multi-armed bandit problems where the context comes from a metric space and the payoff satisfies a Lipschitz condition with respect to the metric. Abstractly, a context multi-armed bandit problem models a situation where, in a sequence of independent trials, an online algorithm chooses an action based on a given \emph{context} (side information) from a set of possible actions so as to maximize the total payoff of the chosen actions. The payoff depends on both the action chosen and the context. In contrast, context-\emph{free} multi-armed bandit problems, a focus of much previous research, model situations where no side information is available and the payoff depends only on the action chosen. Our problem is motivated by sponsored web search, where the task is to display ads to a user of an Internet search engine based on her search query (context) so as to maximize the click-through rate of the ads displayed. We cast this problem as a context multi-armed bandit problem where queries and ads form metric spaces and the payoff function is Lipschitz with respect to both the metrics. For any $\epsilon > 0$ we present an algorithm with regret $O(T^{\frac{a+b+1}{a+b+2} + \epsilon})$ where $a,b$ are the covering dimensions of the query space and the ad space respectively. We prove a lower bound $\Omega(T^{\frac{\tilde{a}+\tilde{b}+1}{\tilde{a}+\tilde{b}+2} - \epsilon})$ for the regret of any algorithm where $\tilde{a}, \tilde{b}$ are packing dimensions of the query spaces and the ad space respectively. For finite spaces or bounded subsets of Euclidean spaces, this gives an (almost) matching upper and lower bound. joint work with Tyler Lu and Martin Pal