You can choose to do one of the following types of
projects:
Literature survey. You need to
choose a topic (see below for some ideas) and read at least
6 papers on it. Be sure to select papers judiciously, so
that both classical and state-of-the-art perspectives are
covered. The Introduction and Related Work sections of
recent papers will often help you find the most relevant
papers. If you are in doubt about whether a paper is
relevant, come to my office hours and we can figure it out.
It is important that you read the papers critically and
form your own opinions: What are the major open problems in
this area? What are the pros and cons of existing
approaches? (The latter can be summarized with a table, for
example.)
Implementation. You are asked to
conduct experiments that deploy theoretically principled
learning algorithms on synthetic or real data. If you
choose to implement an algorithm from a paper that has no
theorem statements (or only ‘trivial’ theorems with
straightforward proofs), it may not be a good fit for this
course. In addition, your experiments must be designed to
test the theoretical results obtained in the paper. You are
expected to analyze your experimental results critically,
by answering questions such as: How well do the
experimental results agree with the proposed theory? If
they do not agree well, which assumptions in the theory are
violated? You should also provide a list of the datasets
you are using.
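As an illustration, here is a minimal sketch, in Python and
assuming only numpy, of the kind of theory-versus-experiment
comparison we have in mind: a simple halfspace learner is run
on synthetic Gaussian data, and its empirical test error is
compared against the O(sqrt(d/n)) rate suggested by standard
uniform-convergence arguments. The learner and the comparison
rate here are illustrative placeholders, not requirements;
substitute the algorithm and the bound from the paper you
choose.

# Minimal sketch: empirical error of a halfspace learner vs. the
# sqrt(d/n) rate. Assumes only numpy; all names are placeholders.
import numpy as np

rng = np.random.default_rng(0)
d = 20                               # ambient dimension
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)     # ground-truth halfspace

def sample(n):
    # Standard Gaussian features with noiseless halfspace labels.
    X = rng.standard_normal((n, d))
    return X, np.sign(X @ w_star)

def learn(X, y):
    # Averaging estimator: E[y x] is proportional to w_star under
    # this distribution, so the estimate converges to the truth.
    w = (X * y[:, None]).mean(axis=0)
    return w / np.linalg.norm(w)

X_test, y_test = sample(100_000)     # large held-out set ~ true error
for n in [50, 100, 200, 400, 800, 1600]:
    errs = []
    for _ in range(20):              # average over repeated draws
        X, y = sample(n)
        errs.append(np.mean(np.sign(X_test @ learn(X, y)) != y_test))
    print(f"n={n:5d}  empirical error={np.mean(errs):.4f}  "
          f"sqrt(d/n)={np.sqrt(d / n):.4f}")

Plotting the empirical error against the predicted rate on
log-log axes, and checking whether the slopes match, is one
convincing way to present such a comparison.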
Research. Research projects, roughly
speaking, come in two styles: first, attacking an existing
open problem in the literature; second, formulating a new
(theoretically interesting and practically relevant)
problem and proposing a feasible solution. Completing a
research project naturally requires a thorough literature
survey as a first step: you need to ensure that your
approach or model has not already been proposed in prior
work. Note that a research project may require a
substantially larger amount of time than the first two
project types, so I suggest being careful with this choice
if you already have a heavy workload this semester. The
upside of a research project is that your work may result
in a publication.
Project timelines
Project Proposal. A 2-page project
proposal, which should consist of the following parts:
a brief description of the project topic
a justification of why the topic is relevant to learning
theory
for a literature survey, a planned reading list; for an
implementation project, a list of algorithms to implement
and a set of driving questions the experiments should
answer; for a research project, a list of research
questions to attack (it is good to have one concrete
question, along with a few important sub-questions toward
solving it)
If you need help choosing a project, please schedule an
appointment with me and I will help you brainstorm
one.
Final Report. A 4-page summary of
the project (8 pages for teams). You can use an appendix if
there are additional details you want to share. The report
will be judged on both clarity and quality. The report must
be typeset in LaTeX (a minimal skeleton is sketched
below).
For literature surveys, provide a critical summary of the
papers you have read; discuss the connections among these
papers and their impact on the broader field.
For implementation projects, present your experimental
results and discuss whether they agree with the
theory.
For research projects, your report should have a
Related Work section discussing why your result is
novel relative to existing work. You should also describe
your approach (if you have a new algorithm, provide its
pseudocode; if you propose a new learning model, define the
key new concepts in your model; if you show that your
proposed algorithm has a good performance guarantee, write
down a theorem statement).
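If you have not typeset a report in LaTeX before, a minimal
skeleton along the following lines is enough to get started;
the section layout and the refs.bib file name are
placeholders, and any standard conference style file is also
fine.

\documentclass[11pt]{article}
\usepackage{amsmath,amssymb,amsthm} % math symbols, theorem environments
\usepackage{graphicx}               % for experimental figures
\newtheorem{theorem}{Theorem}

\title{Project Title}
\author{Your Name(s)}

\begin{document}
\maketitle

\section{Introduction}
\section{Related Work}   % required for research projects
\section{Main Results}   % survey: critical summary; research: theorems
\section{Experiments}    % implementation projects: results vs. theory

\bibliographystyle{plain}
\bibliography{refs}      % assumes your references live in refs.bib

\appendix
\section{Additional Details} % optional appendix for extra material
\end{document}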
Final Presentation. It will be a
10-15 minute presentation.
Please make sure the problem definition is well
explained.
The evaluation will be based more on how you present than
on what you present, but note that the distinction blurs if
what you are presenting is itself unclear.
Below are a few example research directions in machine
learning theory, each with a few “seed papers”; you can use
the Related Work sections of these papers, or the “cited
by” functionality in e.g. Google Scholar, to
find more papers on the same topic. Please also refer to
the proceedings pages of recent machine learning / learning
theory conferences and workshops, such as COLT, ALT, ICML,
NeurIPS, and AISTATS, for more project ideas.
Active learning
Disagreement-based active learning
M.F. Balcan, A. Beygelzimer, J. Langford. Agnostic
active learning. ICML 2006.
Sanjoy Dasgupta, Daniel J. Hsu, Claire Monteleoni. A
general agnostic active learning algorithm. NeurIPS
2007.
S. Hanneke. Theory of Disagreement-Based Active
Learning. Foundations and Trends in Machine Learning.
2014.
M.F. Balcan and P. Long. Active and Passive Learning of
Linear Separators under Log-concave Distributions. COLT
2013.
Chicheng Zhang and Kamalika Chaudhuri. Beyond
Disagreement-Based Agnostic Active Learning. NeurIPS
2014.
Tzu-Kuo Huang, Alekh Agarwal, Daniel J. Hsu, John
Langford, Robert E. Schapire. Efficient and Parsimonious
Agnostic Active Learning. NeurIPS 2015.
Nonparametric active
learning
Sanjoy Dasgupta and Daniel Hsu. Hierarchical sampling
for active learning. ICML 2008.
R. Urner, S. Wulff, S. Ben-David. PLAL: Cluster-based
Active Learning. COLT 2013.
Alexandra Carpentier, Andrea Locatelli, Samory Kpotufe.
Adaptivity to Noise Parameters in Nonparametric Active
Learning. COLT 2017.
Alexandra Carpentier, Andrea Locatelli, Samory Kpotufe.
An Adaptive Strategy for Active Learning with Smooth
Decision Boundary. ALT 2018.
Rui M Castro and Robert D Nowak. Minimax bounds for
active learning. IEEE Transactions on Information Theory,
2008.
Stanislav Minsker. Plug-in approach to active learning.
Journal of Machine Learning Research, 2012.
A. Kontorovich, S. Sabato, R. Urner. Active
nearest-neighbor learning in metric spaces. JMLR, 2017.
Other new topics
Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal
Daume III, John Langford. Active Learning for Cost-Sensitive
Classification. ICML 2017.
Mina Karzand, Robert D. Nowak. MaxiMin Active Learning
in Overparameterized Model Classes. Allerton 2019.
Daniel M. Kane, Shachar Lovett, Shay Moran, Jiapeng
Zhang. Active classification with comparison queries. FOCS
2017.
Efficient,
noise-tolerant PAC learning of specific hypothesis classes
(e.g. linear classifiers)
Positive results
Avrim Blum, Alan M. Frieze, Ravi Kannan, and Santosh S.
Vempala. A polynomial-time algorithm for learning noisy
linear threshold functions. FOCS 1996.
Pranjal Awasthi, Maria Florina Balcan, and Philip M
Long. The power of localization for efficiently learning
linear separators with noise. JACM 2017.
Pranjal Awasthi, Maria-Florina Balcan, Nika Haghtalab,
and Hongyang Zhang. Learning and 1-bit compressed sensing
under asymmetric noise. COLT 2016.
Ilias Diakonikolas, Themis Gouleakis, and Christos
Tzamos. Distribution-independent PAC learning of halfspaces
with Massart noise. NeurIPS 2019.
Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos,
and Nikos Zarifis. Learning halfspaces with Massart noise
under structured distributions. COLT 2020.
Chicheng Zhang, Jie Shen, Pranjal Awasthi. Efficient
active learning of sparse halfspaces with arbitrary bounded
noise. NeurIPS 2020.
Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis,
Christos Tzamos, Nikos Zarifis. A Polynomial Time Algorithm
for Learning Halfspaces with Tsybakov Noise. arXiv
2020.
Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau.
Classification Under Misspecification: Halfspaces,
Generalized Linear Models, and Connections to Evolvability.
NeurIPS 2020.
Computational hardness
Venkatesan Guruswami and Prasad Raghavendra. Hardness of
learning halfspaces with noise. SIAM Journal on Computing,
2009.
Vitaly Feldman, Parikshit Gopalan, Subhash Khot, and
Ashok Kumar Ponnuswami. New results for learning noisy
parities and halfspaces. FOCS 2006.
Adam Klivans and Pravesh Kothari. Embedding Hard
Learning Problems Into Gaussian Space. APPROX/RANDOM
2014.
Ilias Diakonikolas, Daniel M. Kane. Hardness of Learning
Halfspaces with Massart Noise. arXiv 2020.
Adversarial robustness
Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal
Frossard. Robustness of classifiers: from adversarial to
random noise. NeurIPS 2016.
Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt,
Dimitris Tsipras, Adrian Vladu. Towards Deep Learning Models
Resistant to Adversarial Attacks. ICLR 2018.
Aditi Raghunathan, Jacob Steinhardt, Percy Liang.
Certified Defenses against Adversarial Examples. ICLR
2018.
Yizhen Wang, Somesh Jha, Kamalika Chaudhuri. Analyzing
the Robustness of Nearest Neighbors to Adversarial Examples.
ICML 2018.
Sébastien Bubeck, Eric Price, Ilya Razenshteyn.
Adversarial examples from computational constraints. NeurIPS
2018.
Daniel Cullina, Arjun Nitin Bhagoji, Prateek Mittal.
PAC-learning in the presence of evasion adversaries. NeurIPS
2018.
Saeed Mahloujifar and Mohammad Mahmoody. Can
Adversarially Robust Learning Leverage Computational
Hardness? ALT 2019.
Omar Montasser, Steve Hanneke, Nathan Srebro. VC Classes
are Adversarially Robustly Learnable, but Only Improperly.
COLT 2019.
Bandit theory
General
introduction (see the chapters in the books for topic
ideas)
Sébastien Bubeck, Nicolò Cesa-Bianchi. Regret Analysis
of Stochastic and Nonstochastic Multi-armed Bandit Problems.
Foundations and Trends in Machine Learning. 2012.
Aleksandrs Slivkins. Introduction to Multi-Armed
Bandits. Foundations and Trends in Machine Learning. 2019.
Tor Lattimore and Csaba Szepesvári. Bandit Algorithms.
Cambridge University Press. 2020.
Continuum-armed
Bandits
Robert Kleinberg. Nearly Tight Bounds for the
Continuum-Armed Bandit Problem. NeurIPS 2004.
Robert Kleinberg, Aleksandrs Slivkins, Eli Upfal.
Multi-Armed Bandits in Metric Spaces. STOC 2008.
Rémi Munos. From Bandits to Monte-Carlo Tree Search: The
Optimistic Principle Applied to Optimization and Planning.
Foundations and Trends in Machine Learning. 2014.
Andrea Locatelli, Alexandra Carpentier. Adaptivity to
Smoothness in X-armed bandits. COLT 2018.
Hedi Hadiji. Polynomial Cost of Adaptation for X-Armed
Bandits. NeurIPS 2019.
Akshay Krishnamurthy, John Langford, Aleksandrs
Slivkins, Chicheng Zhang. Contextual Bandits with Continuous
Actions: Smoothing, Zooming, and Adapting. COLT 2019.
Contextual
bandits via reduction to multiclass classification
John Langford and Tong Zhang. The Epoch-Greedy Algorithm
for Contextual Multi-armed Bandits. NeurIPS 2007.
Miroslav Dudik, Daniel Hsu, Satyen Kale, Nikos
Karampatziakis, John Langford, Lev Reyzin, and Tong Zhang.
Efficient Optimal Learning for Contextual Bandits. UAI
2011.
Alekh Agarwal, Daniel Hsu, Satyen Kale, John Langford,
Lihong Li, Robert E. Schapire. Taming the Monster: A Fast
and Simple Algorithm for Contextual Bandits. ICML 2014.
Contextual
bandits with general regressor classes
Dylan J. Foster, Alekh Agarwal, Miroslav Dudík, Haipeng
Luo, Robert E. Schapire. Practical Contextual Bandits with
Regression Oracles. ICML 2018.
Alekh Agarwal, Miroslav Dudik, Satyen Kale, and John
Langford. Contextual Bandit Learning with Predictable
Rewards. AISTATS 2012.
Dylan J. Foster, Alexander Rakhlin, David Simchi-Levi,
Yunzong Xu. Instance-Dependent Complexity of Contextual
Bandits and Reinforcement Learning: A Disagreement-Based
Perspective. arXiv 2020.
David Simchi-Levi, Yunzong Xu. Bypassing the Monster: A
Faster and Simpler Optimal Algorithm for Contextual Bandits
under Realizability. arXiv 2020.
Implicit
regularization
Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya
Gunasekar, Nathan Srebro. The Implicit Bias of Gradient
Descent on Separable Data. JMLR 2018.
Ziwei Ji, Matus Telgarsky. The implicit bias of gradient
descent on nonseparable data. COLT 2019.
Yuanzhi Li, Tengyu Ma, and Hongyang Zhang. Algorithmic
Regularization in Over-parameterized Matrix Sensing and
Neural Networks with Quadratic Activations. COLT 2018.
Imitation learning
Interactive
Imitation Learning
Hal Daumé III, John Langford, Daniel Marcu. Search-based
Structured Prediction. Machine Learning Journal 2009.
Stephane Ross, Geoffrey J. Gordon, J. Andrew Bagnell. A
Reduction of Imitation Learning and Structured Prediction to
No-Regret Online Learning. AISTATS 2011.
Stephane Ross, J. Andrew Bagnell. Reinforcement and
Imitation Learning via Interactive No-Regret Learning.
NeurIPS 2014.
Wen Sun, Arun Venkatraman, Geoffrey J. Gordon, Byron
Boots, J. Andrew Bagnell. Deeply AggreVaTeD: differentiable
imitation learning for sequential prediction. ICML
2017.
Ching-An Cheng and Byron Boots. Convergence of Value
Aggregation for Imitation Learning. AISTATS 2018.
Wen Sun, Anirudh Vemula, Byron Boots, J. Andrew Bagnell.
Provably Efficient Imitation Learning from Observation
Alone. ICML 2019.
Apprenticeship
Learning and Inverse Reinforcement Learning
Pieter Abbeel and Andrew Y. Ng. Apprenticeship Learning
via Inverse Reinforcement Learning. ICML 2004.
Brian D. Ziebart, Andrew Maas, J. Andrew Bagnell, Anind
K. Dey. Maximum Entropy Inverse Reinforcement Learning. AAAI
2008.
Brian D. Ziebart, J. Andrew Bagnell, Anind K. Dey.
Modeling Interaction via the Principle of Maximum Causal
Entropy. ICML 2010.
Umar Syed and Robert E. Schapire. A Game-Theoretic
Approach to Apprenticeship Learning. NeurIPS 2007.
Alekh Agarwal, Ashwinkumar Badanidiyuru, Miroslav Dudík,
Robert Schapire, Aleksandrs Slivkins. Robust
Multi-objective Learning with Mentor Feedback. COLT
2014.
Jonathan Ho, Stefano Ermon. Generative Adversarial
Imitation Learning. NeurIPS 2016.
Data poisoning
Yizhen Wang, Kamalika Chaudhuri. An Investigation of
Data Poisoning Defenses for Online Learning. 2019.
Overparameterization
/ neural network learning
Roi Livni, Shai Shalev-Shwartz, Ohad Shamir. On the
Computational Efficiency of Training Neural Networks.
NeurIPS 2014.
Shai Shalev-Shwartz, Ohad Shamir, Shaked Shammah.
Failures of gradient-based deep learning. ICML 2017.
Mikhail Belkin, Daniel Hsu, Partha Mitra. Overfitting or
perfect fitting? Risk bounds for classification and
regression rules that interpolate. NeurIPS 2018.
Yuanzhi Li and Yingyu Liang. Learning Overparameterized
Neural Networks via Stochastic Gradient Descent on
Structured Data. NeurIPS 2018.
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song. A Convergence
Theory for Deep Learning via Over-Parameterization. ICML
2019.
Simon S. Du, Xiyu Zhai, Barnabas Poczos, Aarti Singh.
Gradient Descent Provably Optimizes Over-parameterized
Neural Networks. ICLR 2019.
Sanjeev Arora, Simon S. Du, Wei Hu, Zhiyuan Li, Ruosong
Wang. Fine-Grained Analysis of Optimization and
Generalization for Overparameterized Two-Layer Neural
Networks. ICML 2019.
PAC Reinforcement
Learning
Michael Kearns and Satinder Singh. Near-Optimal
Reinforcement Learning in Polynomial Time. Machine Learning,
2002.
Ronen I. Brafman and Moshe Tennenholtz. R-max – A
General Polynomial Time Algorithm for Near-Optimal
Reinforcement Learning. JMLR 2002.
Sham Kakade. On the sample complexity of reinforcement
learning. PhD thesis, University College London, 2003.
Lihong Li, Michael L. Littman, Thomas J. Walsh,
Alexander L. Strehl. Knows what it knows: a framework for
self-aware learning. Machine Learning, 2011.
Christoph Dann, Tor Lattimore, Emma Brunskill. Unifying
PAC and Regret: Uniform PAC Bounds for Episodic
Reinforcement Learning. NeurIPS 2017.
Chi Jin, Zeyuan Allen-Zhu, Sebastien Bubeck, Michael I.
Jordan. Is Q-learning Provably Efficient? NeurIPS 2018.
Andrea Zanette, Emma Brunskill. Tighter
Problem-Dependent Regret Bounds in Reinforcement Learning
without Domain Knowledge using Value Function Bounds. ICML
2019.
Theory of generative
modeling
Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, Yi
Zhang. Generalization and Equilibrium in Generative
Adversarial Nets (GANs). ICML 2017.
Shuang Liu, Olivier Bousquet, Kamalika Chaudhuri.
Approximation and Convergence Properties of Generative
Adversarial Learning. NeurIPS 2017.
Sanjeev Arora, Andrej Risteski, Yi Zhang. Do GANs learn
the distribution? Some Theory and Empirics. ICLR 2018.