Do the recent advances in deep learning spell the demise of classifier ensembles? I would say no. While the philosophies of the two approaches are different, there is still a way to harness the benefits of both. Ensemble learning sprang from the idea of combining simple classifiers to solve complex classification problems. Deep learning, on the other hand, throws one, monolithic, complex model at the problem. There is room, however, for combining deep learners in a quest to boost the deep learner’s accuracy. Ensembles of deep learners do not follow the standard design methodologies. For example, we cannot afford to build a large ensemble due to computational limitation. Such ensembles require bespoke design strategies and flexible combination rules. This tutorial will introduce the fundamentals of classifier ensembles and will bring examples of successful applications of deep learning ensembles.
Basic knowledge of machine learning and pattern recognition.
Ludmila (Lucy) I. Kuncheva is a Professor of Computer Science at Bangor University, UK. Her interests include pattern recognition, and specifically classifier ensembles. She has published two monographs and over 200 research papers. Lucy has won two Best Paper Awards (2006 IEEE TFS and 2003 IEEE TSMC). She is a Fellow of International Association of Pattern Recognition (IAPR). https://scholar.google.co.uk/citations?user=WIc3assAAAAJ&hl=en
The field of graph signal processing extends classical signal processing tools to signals (data) with an irregular structure that can be characterized my means of a graph (e.g., network data). One of the cornerstones of this field are graph filters, direct analogues of time-domain filters, but intended for signals defined on graphs. In this course, we introduce the field of graph signal processing and specifically give an overview of the graph filtering problem. We look at the family of finite impulse response (FIR) and infinite impulse response (IIR) graph filters and show how they can be implemented in a distributed manner. To further limit the communication and computational complexity of such a distributed implementation, we also generalize the state-of-the-art distributed graph filters to filters whose weights show a dependency on the nodes sharing information. These so-called edge-variant graph filters yield significant benefits in terms of filter order reduction and can be used for solving specific distributed optimization problems with an extremely fast convergence. Finally, we will overview how graph filters can be used in deep learning applications involving data sets with an irregular structure. Different types of graph filters can be used in the convolution step of graph convolutional networks leading to different trade-offs in performance and complexity. The numerical results presented in this talk illustrate the potential of graph filters in distributed optimization and deep learning.
Basics in digital signal processing, linear algebra, optimization and machine learning.
Geert Leus received the M.Sc. and Ph.D. degree in Electrical Engineering from the KU Leuven, Belgium, in June 1996 and May 2000, respectively. Geert Leus is now an "Antoni van Leeuwenhoek" Full Professor at the Faculty of Electrical Engineering, Mathematics and Computer Science of the Delft University of Technology, The Netherlands. His research interests are in the broad area of signal processing, with a specific focus on wireless communications, array processing, sensor networks, and graph signal processing. Geert Leus received a 2002 IEEE Signal Processing Society Young Author Best Paper Award and a 2005 IEEE Signal Processing Society Best Paper Award. He is a Fellow of the IEEE and a Fellow of EURASIP. Geert Leus was a Member-at-Large of the Board of Governors of the IEEE Signal Processing Society, the Chair of the IEEE Signal Processing for Communications and Networking Technical Committee, a Member of the IEEE Sensor Array and Multichannel Technical Committee, and the Editor in Chief of the EURASIP Journal on Advances in Signal Processing. He was also on the Editorial Boards of the IEEE Transactions on Signal Processing, the IEEE Transactions on Wireless Communications, the IEEE Signal Processing Letters, and the EURASIP Journal on Advances in Signal Processing. Currently, he is the Chair of the EURASIP Technical Area Committee on Signal Processing for Multisensor Systems, a Member of the IEEE Signal Processing Theory and Methods Technical Committee, a Member of the IEEE Big Data Special Interest Group, an Associate Editor of Foundations and Trends in Signal Processing, and the Editor in Chief of EURASIP Signal Processing.
The past few years have seen a dramatic increase in the performance of recognition systems thanks to the introduction of deep networks for representation learning. However, the mathematical reasons for this success remain elusive. For example, a key issue is that the neural network training problem is nonconvex, hence optimization algorithms are not guaranteed to return a global minima. The first part of this tutorial will overview recent work on the theory of deep learning that aims to understand how to design the network architecture, how to regularize the network weights, and how to guarantee global optimality. The second part of this tutorial will present sufficient conditions to guarantee that local minima are globally optimal and that a local descent strategy can reach a global minima from any initialization. Such conditions apply to problems in matrix factorization, tensor factorization and deep learning. The third part of this tutorial will present an analysis of dropout for matrix factorization, and establish connections
Basic understanding of sparse and low-rank representation and non-convex optimization.
Rene Vidal is a Professor of Biomedical Engineering and the Innaugural Director of the Mathematical Institute for Data Science at The Johns Hopkins University. His research focuses on the development of theory and algorithms for the analysis of complex high-dimensional datasets such as images, videos, time-series and biomedical data. Dr. Vidal has been Associate Editor of TPAMI and CVIU, Program Chair of ICCV and CVPR, co-author of the book 'Generalized Principal Component Analysis' (2016), and co-author of more than 200 articles in machine learning, computer vision, biomedical image analysis, hybrid systems, robotics and signal processing. He is a fellow of the IEEE, IAPR and Sloan Foundation, a ONR Young Investigator, and has received numerous awards for his work, including the 2012 J.K. Aggarwal Prize for "outstanding contributions to generalized principal component analysis (GPCA) and subspace clustering in computer vision and pattern recognition” as well as best paper awards in machine learning, computer vision, controls, and medical robotics.