CSC 665: Online Learning and Multi-armed Bandits - Spring 2020

This course continues the line of topics from CSC 665 Section 2, Machine Learning Theory, and delves into online learning and multi-armed bandits (knowledge from the ML theory course is not required). Students will learn, through the lens of mathematical foundations, how and when machines can learn in an online manner. Specifically, the course offers mathematical formulations of learning environments (e.g., stochastic and adversarial worlds with possibly limited feedback), fundamental limits of learning in these environments, and various algorithms, with attention to sample efficiency, computational efficiency, and generality. Throughout, students will not only learn the fundamental mathematical tools underpinning the research community's current understanding of sequential decision making but also develop the skills to adapt these techniques to their own research needs, such as developing new algorithms.

Why online learning / multi-armed bandits?

Backbone of stochastic gradient descent algorithms.
How can ‘learning’ be possible when the data is arbitrarily manipulated?
Learn how companies learn your preferences by interacting with you in recommendation systems and online advertisements.
Besides, there are beautiful mathematical results and algorithms, along with some practical algorithms.

Logistics info

Monday and Wednesday, 12:30pm-1:45pm
Saguaro Room 223
Office hours: Tuesdays 4-5pm or by email appointment
Piazza link access code: bandits
Gradescope entry code: M46EWY
D2L

Instructor

Kwang-Sung Jun
k[lastname]@cs.arizona.edu
Gould-Simpson 746

Textbook

There is no designated textbook for this course. Much of the course material will be based on the following sources (in order of appearance in the class schedule):

Lecture notes by Francesco Orabona (FO).
Bandit algorithms by Tor Lattimore and Csaba Szepesvári (LS)
Understanding machine learning: from theory to algorithms by Shai Shalev-Shwartz and Shai Ben-David (SSBD)

The following surveys and books also provide good coverage of the relevant material:

Online learning and online convex optimization by Shai Shalev-Shwartz
Introduction to online convex optimization by Elad Hazan (H)
Regret analysis of stochastic and nonstochastic multi-armed bandit problems by Sebastien Bubeck and Nicolo Cesa-Bianchi
Introduction to Multi-Armed Bandits by Alex Slivkins

Review for prerequisites

Here are some excellent notes for probability review and linear algebra review.

Machine learning courses at UA

CSC 665-2 Machine Learning Theory by Chicheng Zhang (the course number will change once it is approved as an official course)
CSC 535 Probabilistic Graphical Models by Kobus Barnard
ISTA 457/INFO 557 Neural Networks by Steven Bethard
INFO 521 Introduction to Machine Learning by Clayton Morrison
CSC 665-1 Advanced Topics in Probabilistic Graphical Models by Jason Pacheco (Fall 2019)
CSC 580 Principles of Machine Learning by Carlos Scheidegger
MIS 601 Statistical Foundations of Machine Learning by Junming Yin
MATH 574M Statistical Machine Learning by Helen Zhang

Similar courses at other institutions

The following is a far-from-complete list of learning theory courses offered at other institutions:

Introduction to Online Learning by Francesco Orabona
Online and Adaptive Methods for Machine Learning by Kevin Jamieson
Introduction to Online Learning by Haipeng Luo
Online Methods in Machine Learning by Alexander Rakhlin
Bandit Algorithms by Csaba Szepesvári