home

Assistant Professor
Computer Science
Statistics GIDP, Applied Math GIDP (affiliated)
University of Arizona
kjun å† cs ∂ø† arizona ∂ø† edu
Gould-Simpson Rm 746, 1040 E. 4th St., Tucson, AZ 85721
Google scholar
CV

intro
Broadly, I work on interactive machine learning.
I spend most of my time developing and analyzing adaptive decision-making/sampling methods, including bandit algorithms and Monte Carlo tree search methods.
Recently, I have also been looking into applications in efficient matrix decomposition, geoscience (with some black-box/Bayesian optimization involved), and materials science problems.
I also had some fun in the past with machine learning applied to psychology.
I was previously a postdoc with Francesco Orabona (whom I call the 'master of coin') at Boston University.
Before that, I spent 9 years at UW-Madison for my PhD with Xiaojin (Jerry) Zhu and a postdoc at the Wisconsin Institute for Discovery with Robert Nowak, Rebecca Willett, and Stephen Wright.
news
09/22: 2 papers accepted at NeurIPS
05/22: I gave a talk at the RL theory virtual seminars on Maillard sampling.
01/22: 1 paper accepted at AAAI, 3 papers accepted at AISTATS.
05/21: 2 papers accepted at ICML.
05/21: 1 paper accepted at ISIT.
07/20: I gave a talk at the RL theory virtual seminars on structured bandits with asymptotic optimality.
07/20: Chicheng and I have a new work on structured bandits that is accepted to ICML’20 workshop on theoretical foundations of reinforcement learning!
11/19: In Spring 2020, I will be teaching CSC 665 Online Learning and Multi-armed Bandits.
10/19: Chicheng Zhang, Jason Pacheco, and I are organizing a machine learning reading group at UA.
10/19: Our paper on kernel regression is accepted at NeurIPS’19.
10/19: Our paper on parameter-free SGD with local differential privacy is accepted to PriML’19 (a workshop at NeurIPS).
multi-armed bandits
The multi-armed bandit is a stateless version of reinforcement learning (RL).
But why study an easier version of RL when RL seems to be solving all the problems these days?
Bandits are simpler, and thus enjoy strong theoretical guarantees, yet still have abundant real-world applications.
Furthermore, developments in bandits can potentially improve RL algorithms, either by transferring ideas from bandits to RL or by using bandits directly for Monte Carlo planning in MDPs (e.g., the Monte Carlo tree search algorithm used in AlphaGo originated from this paper).
Informally speaking, bandits learn to make better decisions over time in a feedback loop.
The decisions necessarily affect the feedback information, so the data collected so far is no longer i.i.d.; most traditional learning guarantees do not apply.
Bandits are actively studied in both theory and applications, including deployable web services and hyperparameter optimization (check the ray implementation).
The New Yorker's cartoon caption contest also uses bandit algorithms to efficiently crowdsource caption evaluations (this article)!
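To make the feedback-loop idea concrete, here is a minimal sketch of one classic bandit algorithm, UCB1 ("optimism in the face of uncertainty"): the learner only observes rewards for the arms it actually pulls, and balances exploration and exploitation via a confidence bonus. The function names and the toy Bernoulli arms below are my own illustration, not code from any particular paper.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds.

    `pull(arm)` returns a stochastic reward in [0, 1]; the learner
    observes rewards only for the arms it chooses (bandit feedback).
    """
    counts = [0] * n_arms   # times each arm was pulled
    sums = [0.0] * n_arms   # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1     # pull each arm once to initialize
        else:
            # pick the arm maximizing empirical mean + confidence bonus
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        r = pull(arm)
        counts[arm] += 1
        sums[arm] += r
    return counts

# toy example: three Bernoulli arms with unknown means
random.seed(0)
means = [0.2, 0.5, 0.8]
counts = ucb1(lambda a: float(random.random() < means[a]), len(means), 2000)
```

Because the confidence bonus shrinks as an arm is sampled more, the algorithm gradually concentrates its pulls on the best arm (here, the one with mean 0.8) while still occasionally revisiting the others.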
talks
07/20: At RL theory virtual seminars, “Crush Optimism with Pessimism: Structured Bandits Beyond Asymptotic Optimality.” [video]
05/20: At the Los Alamos-Arizona Days Conference, “Adaptive data collection for accelerating discovery rates.”
09/19: At TRIPODS seminar, the U of Arizona, “Adaptive data collection for accelerating discovery rates.”
09/19: At TRIPODS RWG6, the U of Arizona, “Accelerating discovery rate in adaptive experiments via bandits with low-rank structure.”
07/19: At Microsoft New England, “Accelerating discovery rate in adaptive experiments via bandits with low-rank structure.”
04/19: At the U of Arizona, “Accelerating discovery rate in adaptive experiments via bandits with low-rank structure.”
10/18: At Open AIR: Industry Open House, Boston University, “Adapting to changing environments in online learning.”
10/17: At SILO, “Scalable Generalized Linear Bandits: Online Computation and Hashing.” [abstract]
04/17: At AISTATS, “Improved Strongly Adaptive Online Learning using Coin Betting.”
06/16: At CPCP Annual Retreat, “Multi-Armed Bandit Algorithms and Applications to Experiment Selection.” [abstract & video]
06/16: At ICML, “Anytime Exploration for Multi-armed Bandits using Confidence Information.” [video]
03/16: At SILO, “Top Arm Identification in Multi-Armed Bandits with Batch Arm Pulls.” [abstract & video]
03/16: At Soongsil University, two talks on human memory search.
11/15: At HAMLET (interdisciplinary seminar series at UW-Madison), “Measuring semantic structure from verbal fluency data with the initial-visit-emitting (INVITE) random walk.”
03/15: At TTIC, “Learning from Human-Generated Lists.”
06/13: At ICML, “Learning from Human-Generated Lists.” [video]
