Abstract:
We consider a sequential decision problem where at each time step the decision maker has to choose how to distribute the future loss between k alternatives, and then observes the loss of each alternative. Motivated by load balancing and job scheduling, we consider a global cost function (over the losses incurred by each alternative), rather than a summation of the instantaneous losses as done traditionally in online learning. Such global cost functions include the makespan (the maximum over the alternatives) and the $L_d$ norm (over the alternatives).
We study two setups: an online learning setup where the losses are arbitrary and a stochastic setup where the losses are generated by an IID (albeit unknown) random process. For the stochastic setup we show that several naive algorithms have expected regret of O(sqrt{T}). Moreover, even if the statistics of the random process are known, playing the best static allocation leads to a regret of O(sqrt{T}). However, it is possible to obtain logarithmic regret with high probability using a carefully crafted dynamic allocation policy.
For the online setup we design an algorithm based on approachability that guarantees vanishing regret, where the regret is measured with respect to the best static decision that selects the same distribution over alternatives at every time step. For the special case of makespan cost we devise a simple and efficient algorithm. In contrast, we show that for concave global cost functions, such as $L_d$ norms for d < 1, the worst-case average regret does not vanish.
Biography:
Shie Mannor received the B.Sc. degree in electrical engineering, the B.A. degree in mathematics, and the Ph.D. degree in electrical engineering from the Technion-Israel Institute of Technology, Haifa, Israel, in 1996, 1996, and 2002, respectively. From 2002 to 2004, he was a Fulbright scholar and a postdoctoral associate at M.I.T. He was on the faculty of the Department of Electrical Engineering at McGill University from 2004 to 2010 where he was a Canada Research chair in Machine Learning. He has been a Horev Fellow and an associateprofessor at the Faculty of Electrical Engineering at the Technion since 2008. His research interests include machine learning and pattern recognition, planning and control, multi-agent systems, and communications.