Research:
People:
Prospective Students:
|
|
Algorithms and Complexity Seminar
|
| |
|
| Date: |
Wednesday, December 17, 2008
|
| Time: |
1:30 PM
|
| Place: |
DC1304
|
| |
|
| |
|
| Title: |
Showing Relevant Ads via Context Multi-Armed Bandits
|
| Speaker: |
David Pal, School of Computer Science, University of Waterloo
|
|
|
Abstract: |
| |
We study \emph{context} multi-armed bandit problems where the context
comes from a metric space and the payoff satisfies a Lipschitz
condition with respect to the metric. Abstractly, a context
multi-armed bandit problem models a situation where, in a sequence of
independent trials, an online algorithm chooses an action based on a
given \emph{context} (side information) from a set of possible actions
so as to maximize the total payoff of the chosen actions. The payoff
depends on both the action chosen and the context. In contrast,
context-\emph{free} multi-armed bandit problems, a focus of much
previous research, model situations where no side information is
available and the payoff depends only on the action chosen.
Our problem is motivated by sponsored web search, where the task is to
display ads to a user of an Internet search engine based on her search
query (context) so as to maximize the click-through rate of the ads
displayed. We cast this problem as a context multi-armed bandit
problem where queries and ads form metric spaces and the payoff
function is Lipschitz with respect to both the metrics. For any
$\epsilon > 0$ we present an algorithm with regret
$O(T^{\frac{a+b+1}{a+b+2} +
\epsilon})$ where $a,b$ are the covering dimensions of the query space
and the ad space respectively. We prove a lower bound
$\Omega(T^{\frac{\tilde{a}+\tilde{b}+1}{\tilde{a}+\tilde{b}+2} -
\epsilon})$ for the regret of any algorithm where $\tilde{a}, \tilde{b}$
are packing dimensions of the query spaces and the ad space
respectively. For finite spaces or bounded subsets of Euclidean
spaces, this gives an (almost) matching upper and lower bound.
joint work with Tyler Lu and Martin Pal
|
|
|