[introductory/intermediate] Bayesian Optimization over Continuous, Discrete, or Hybrid Spaces
Bayesian optimization is a sample-efficient method for finding a global optimum of an expensive-to-evaluate black-box function. It has emerged as a promising technique in applications such as automated machine learning, hyperparameter optimization, clinical drug trials, and material design. This lecture consists of three parts, each one and a half hours long. Part I begins with the standard formulation of Bayesian optimization and gives an overview of sequential global optimization, explaining its two key ingredients: (1) surrogate models and (2) acquisition functions. At each step, a surrogate model is built to estimate the unknown reward (or objective) function from the data gathered so far, and the next point at which to evaluate the reward function is chosen by optimizing an acquisition function. Part II introduces recent advances in Bayesian optimization, including Bayesian optimization with black-box constraints, scaling to high-dimensional spaces, and batch Bayesian optimization, where a batch of candidate inputs is selected at once. We also consider the case where each input corresponds to a set, leading to Bayesian optimization over sets. Standard Bayesian optimization assumes that the input space is a real-valued d-dimensional vector space. Part III considers Bayesian optimization over categorical spaces, where the input consists of multiple categorical variables. Finally, we describe black-box optimization over hybrid spaces, where both continuous and categorical inputs are present.
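The surrogate-model/acquisition-function loop described above can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: it assumes a one-dimensional input space, a zero-mean Gaussian process surrogate with a squared-exponential kernel, expected improvement as the acquisition function, and grid search in place of a proper acquisition optimizer; all function names are illustrative.

```python
import numpy as np
from math import erf, exp, pi, sqrt

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared-exponential kernel on 1-D inputs, unit prior variance.
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xs, noise=1e-5):
    # GP regression: posterior mean and std at test points Xs given data (X, y).
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - np.sum(v**2, axis=0), 1e-12, None)  # prior variance is 1.0
    return mu, np.sqrt(var)

def expected_improvement(mu, sigma, best):
    # EI for maximization: E[max(f(x) - best, 0)] under the Gaussian posterior.
    ei = np.zeros_like(mu)
    for i in range(len(mu)):
        z = (mu[i] - best) / sigma[i]
        Phi = 0.5 * (1.0 + erf(z / sqrt(2.0)))        # standard normal CDF
        phi = exp(-0.5 * z * z) / sqrt(2.0 * pi)      # standard normal PDF
        ei[i] = (mu[i] - best) * Phi + sigma[i] * phi
    return ei

def bayes_opt(f, bounds=(0.0, 1.0), n_init=3, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(bounds[0], bounds[1], size=n_init)   # initial design
    y = np.array([f(x) for x in X])
    grid = np.linspace(bounds[0], bounds[1], 200)        # grid stands in for an acquisition optimizer
    for _ in range(n_iter):
        mu, sigma = gp_posterior(X, y, grid)             # fit surrogate
        x_next = grid[np.argmax(expected_improvement(mu, sigma, y.max()))]
        X = np.append(X, x_next)                         # evaluate and augment data
        y = np.append(y, f(x_next))
    i = np.argmax(y)
    return X[i], y[i]
```

For example, `bayes_opt(lambda x: -(x - 0.3) ** 2)` typically locates the maximizer near 0.3 after a handful of evaluations, illustrating the sample efficiency that motivates the method.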
- Overview of Bayesian optimization
- Surrogate models: Tree-based models and Gaussian process regression
- Acquisition functions: PI, EI, GP-UCB, Thompson sampling
- Recent advances in Bayesian optimization
- Bayesian optimization with constraints
- Bayesian optimization in high-dimensional spaces
- Batch Bayesian optimization
- Bayesian optimization over sets
- Bayesian optimization over hybrid spaces
- Bayesian optimization over categorical spaces
- Bayesian optimization over continuous + categorical spaces
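The acquisition functions in the outline above (PI, EI, GP-UCB) all have simple closed forms given a Gaussian posterior at a candidate point; Thompson sampling instead draws a sample from the posterior and maximizes it. A minimal sketch for the maximization setting, with illustrative names (`mu` and `sigma` are the posterior mean and standard deviation, `best` is the incumbent value, `beta` the GP-UCB exploration weight):

```python
import math

def acquisitions(mu, sigma, best, beta=2.0):
    # Posterior belief at candidate x: f(x) ~ N(mu, sigma^2).
    z = (mu - best) / sigma
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal PDF
    pi_ = Phi                                 # PI: probability of improving on `best`
    ei = (mu - best) * Phi + sigma * phi      # EI: expected improvement over `best`
    ucb = mu + math.sqrt(beta) * sigma        # GP-UCB: optimism in the face of uncertainty
    return pi_, ei, ucb
```

Note how all three trade off the posterior mean (exploitation) against the posterior standard deviation (exploration), just through different functional forms.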
 B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. de Freitas, “Taking the human out of the loop: A review of Bayesian optimization,” Proceedings of the IEEE, vol. 104, no. 1, pp. 148–175, 2016.
 P. I. Frazier, “A tutorial on Bayesian optimization,” arXiv preprint arXiv:1807.02811, 2018.
 J. Močkus, V. Tiesis, and A. Žilinskas, “The application of Bayesian methods for seeking the extremum,” Toward Global Optimization, 1978.
 N. Srinivas, A. Krause, S. Kakade, and M. Seeger, “Gaussian process optimization in the bandit setting: No regret and experimental design,” in Proceedings of the International Conference on Machine Learning (ICML), Haifa, Israel, 2010.
 W. R. Thompson, “On the likelihood that one unknown probability exceeds another in view of the evidence of two samples,” Biometrika, vol. 25, pp. 285–294, 1933.
 J. Kim and S. Choi, “On local optimizers of acquisition functions in Bayesian optimization,” in Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), 2020.
 J. R. Gardner, M. J. Kusner, Z. E. Xu, K. Q. Weinberger, and J. P. Cunningham, “Bayesian optimization with inequality constraints,” in Proceedings of the International Conference on Machine Learning (ICML), Beijing, China, 2014.
 M. A. Gelbart, J. Snoek, and R. P. Adams, “Bayesian optimization with unknown constraints,” in Proceedings of the Annual Conference on Uncertainty in Artificial Intelligence (UAI), 2014.
 K. Kandasamy, J. Schneider, and B. Póczos, “High dimensional Bayesian optimisation and bandits via additive models,” in Proceedings of the International Conference on Machine Learning (ICML), 2015.
 Z. Wang, M. Zoghi, F. Hutter, D. Matheson, and N. de Freitas, “Bayesian optimization in high dimensions via random embeddings,” in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2013.
 J. González, Z. Dai, P. Hennig, and N. Lawrence, “Batch Bayesian optimization via local penalization,” in Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), Cadiz, Spain, 2016.
 J. Kim, M. McCourt, T. You, S. Kim, and S. Choi, “Bayesian optimization with approximate set kernels,” Machine Learning, vol. 110, pp. 857–879, 2021.
 R. Baptista and M. Poloczek, “Bayesian optimization of combinatorial structures,” in Proceedings of the International Conference on Machine Learning (ICML), Stockholm, Sweden, 2018.
 C. Oh, J. M. Tomczak, E. Gavves, and M. Welling, “Combinatorial Bayesian optimization using the graph cartesian product,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 32, 2019.
 B. Ru, A. S. Alvi, V. Nguyen, M. A. Osborne, and S. J. Roberts, “Bayesian optimisation over multiple continuous and categorical inputs,” in Proceedings of the International Conference on Machine Learning (ICML), 2020.
The reference list is not exhaustive; only a few representative references are listed here.
Familiarity with tree-based regression and Gaussian process regression is helpful but not necessary. Basic knowledge of probability is recommended.
Seungjin Choi received B.S. and M.S. degrees in electrical engineering from Seoul National University, Korea, in 1987 and 1989, respectively, and a Ph.D. degree in electrical engineering from the University of Notre Dame, Indiana, in 1996. In 1997 he was a Frontier Researcher with the Laboratory for Artificial Brain Systems, RIKEN, Japan, working with Prof. Andrzej Cichocki and Prof. Shun-ichi Amari on independent component analysis. He was an Assistant Professor in the School of Electrical and Electronics Engineering, Chungbuk National University, from 1997 to 2000. From 2001 to 2019, he was a Professor of Computer Science at Pohang University of Science and Technology, Korea. He was the director of the Machine Learning Research Center, in which about 15 professors from the top five universities in Korea participated. He has held advisory professor positions at the Shinhan Card Big Data Center, Samsung Research, and the Samsung Advanced Institute of Technology. He is currently an executive advisor at BARO AI Academy, where he develops online lectures on machine learning, deep learning, and mathematics for machine learning. He was the president of the AI Society of the Korea Institute of Information Scientists and Engineers. He serves or has served as an Area Chair or Senior Program Committee member for top-tier machine learning and AI conferences, including NeurIPS, ICML, ICLR, AISTATS, AAAI, and IJCAI. His current research interests include Bayesian optimization, meta-learning, and deep probabilistic models.