This course continues the line of topics from CSC 665 Section 2 (Machine Learning Theory), delving into online learning and multi-armed bandits; knowledge from the ML theory course is not required. Students will learn, through the lens of mathematical foundations, how and when machines can learn in an online manner. Specifically, the course covers mathematical formulations of learning environments (e.g., stochastic and adversarial worlds with possibly limited feedback), the fundamental limits of learning in these environments, and various algorithms, with attention to their sample efficiency, computational efficiency, and generality. Throughout, students will not only learn the fundamental mathematical tools that underpin the research community's current understanding of sequential decision making, but also develop the skills to adapt these techniques to their own research needs, such as developing new algorithms.
Why online learning / multi-armed bandits?
There is no designated textbook for this course. Much of the course material will be based on the following sources (in order of appearance in the class schedule):
The following surveys and books also provide good coverage of relevant material:
Here are some excellent notes for probability review and linear algebra review.
The following is a far-from-complete list of learning theory courses offered at other institutions: