Many familiar dilemmas that we find in the application of data-driven AI have their origins in technical-mathematical choices that we have made along the way to this version of AI. Several of them might need to be reconsidered in order for the field to move forward. After reviewing some of the current problems related to AI, we trace their cultural, technical and economic origins, then we discuss possible solutions.
Nello Cristianini is Professor of Artificial Intelligence at the University of Bristol. His research covers machine learning methods, and applications of AI to the analysis of media content, as well as the social and ethical implications of AI. Cristianini is the co-author of two widely known books in machine learning, as well as a book in bioinformatics. He is a recipient of the Royal Society Wolfson Research Merit Award, and of a European Research Council Advanced Grant. Before joining the University of Bristol, he was a professor of statistics at the University of California, Davis. Currently he is working on social and ethical implications of AI. His animated videos dealing with the social aspects of AI can be found here: https://www.youtube.com/seeapattern
Recently, computer vision approaches, especially those assisted by deep learning techniques, have shown unexpected advances that practically solve problems never imagined to be automatable, such as face recognition or automated driving. However, food image recognition, due to its high complexity and ambiguity, still remains far from being solved. In this project, we focus on how to combine two challenging research lines: deep learning and uncertainty modeling (epistemic and aleatoric uncertainty). After discussing our methodology to advance in this direction, we comment on potential applications, as well as the social and economic impact of the research on food image analysis.
Prof. Petia Radeva is a Full Professor at the Universitat de Barcelona (UB), PI of the Consolidated Research Group “Computer Vision and Machine Learning” at UB (CVUB, www.ub.edu/cvub) and Senior Researcher at the Computer Vision Center (www.cvc.uab.es). She has been PI for UB in 4 European, 3 international and more than 20 national projects devoted to applying computer vision and machine learning to real problems such as food intake monitoring (e.g. for patients with kidney transplants and for older people). Petia Radeva has been a REA-FET-OPEN vice-chair since 2015 and an international mentor in the Wild Cards EIT program since 2017.
She is an Associate Editor of the Pattern Recognition journal (Q1) and the International Journal of Visual Communication and Image Representation (Q2).
Petia Radeva has been an IAPR Fellow since 2015 and, since 2014, has held the ICREA Academia distinction, assigned to the 30 best scientists in Catalonia for their scientific merits. She has received several international awards (“Aurora Pons Porrata” of CIARP, Prize “Antonio Caparrós” for the best technology transfer of UB, etc).
She has supervised 18 PhD students and published more than 100 SCI journal publications and 250 international chapters and proceedings. Her Google Scholar h-index is 44, with more than 7600 citations.
Deep learning, the latest extension of machine learning, has pushed the accuracy of algorithms to unseen limits, especially for perceptual problems such as the ones tackled by computer vision and image analysis. This workshop will cover the foundations of the field, the communities organized around it, some important tools and resources to get started with these techniques, and the latest applications of deep learning in the field of bioimage analysis. In particular, we will focus on the problems of semantic and instance segmentation of biological images, unsupervised image denoising and deep learning-based super-resolution. All classes will have a theoretical part followed by a hands-on practical session.
Mathematics at the level of an undergraduate degree in computer science: basic multivariate calculus, probability theory, and linear algebra.
Ignacio Arganda-Carreras is an Ikerbasque Research Fellow at the department of Computer Science and Artificial Intelligence of the UPV/EHU, also associated with the Donostia International Physics Center (DIPC). He is one of the founders of Fiji, one of the most popular open source image processing packages in the world, and widely used by the bio-image analysis community. His lab is focused on image processing and machine learning, especially to develop open source computer vision methods for biomedical images. For publications, see https://scholar.google.com/citations?user=02VpQlGwa_kC&hl=en
The course will focus on machine learning for computer vision in the specific topic of human behavior understanding (HBU). Here, supervised and self-supervised learning are often adopted, using annotated or partially annotated datasets. Despite the apparent narrowness of the topic (it concerns only humans!) and the long-standing research activity in people detection, tracking, action analysis, interactive behavior and 3D crowd modeling, definitive solutions with high robustness and accuracy are still far from being achieved. In addition, these application fields are now classified as “high risk” under the EU Artificial Intelligence Act and thus must be rethought with special care for the trustworthiness of datasets free of bias, for privacy awareness, and for the transparency and sustainability of the process. These aspects must correspond to specific technical solutions, and much research is now oriented toward sustainable machine learning for HBU. The course will address these concerns and technical challenges, more specifically in people detection by pose in 2D and 3D, in appearance-based people tracking and re-id using synthetic and real datasets (e.g. JTA and the new large MOTsyn vs MOT dataset), and in human action analysis by graph-based spatio-temporal self-attentive architectures. Finally, some aspects of human-AI interaction with NLP generation will be discussed, to explain what the machine sees when it detects human behavior. The talks will discuss some theoretical and technical questions as well as possible applications in the automotive, industry and smart-city fields, presenting some projects at AImagelab, UNIMORE.
Basic Machine Learning and Computer Vision knowledge.
Rita Cucchiara (Laurea in Ing. Elettronica, 1989 and PhD in Ing. Informatica, 1993, University of Bologna) is Full Professor of Computer Engineering at the University of Modena and Reggio Emilia, Department of Engineering "Enzo Ferrari". In Modena she coordinates the AImagelab research lab, which gathers more than 35 researchers in AI, Artificial Vision, Machine Learning, Pattern Recognition and Multimedia (www.aimagelab.unimore.it). She is in charge of the joint laboratory with Ferrari, Red-Vision, and, since 2020, of the NVIDIA AI Technical Center (NVAITC@UNIMORE), and is the director of the Modena unit of the European network ELLIS. She coordinates several international, European and national projects on topics related to AI applied to human-AI interaction, video-surveillance, automotive, industry 4.0 and cultural heritage. Since 2018 Rita Cucchiara has been Director of the Lab. National CINI of Artificial Intelligence and Intelligent Systems (AIIS). She was President of the Italian Ass. in Computer Vision, Pattern Recognition and Machine Learning from 2016 to 2018. Since 2017 she has been a member of the Board of Directors of the Italian Institute of Technology. She is currently responsible for the working group on Artificial Intelligence of the National Research Plan PNR2021-2027 of the Ministry of University and Research. She has more than 450 scientific publications in international journals and conferences. She is Associate Editor of T-PAMI and will be GC of ECCV2022.
How can we develop machine learning methods that we can trust in high-risk applications? We need ML methods that know their own range of competence so that they can detect when input queries lie outside that range. This class will present several related areas of ML research that seek to achieve this goal including (a) classification with a "reject" option, (b) calibrated confidence estimation, (c) out of distribution detection, and (d) open category detection. The first two topics focus on problems where the training set and test set come from the same distribution, and the classifier must assess its own competence on each test instance. The second two topics can be viewed as applications of anomaly detection, so we will study anomaly detection methods for both featurized data and for signal data (e.g., images) where a good feature space must be learned. Our discussion of anomaly detection will be complementary to Peter Rousseeuw's course (which makes a good companion).
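As a rough illustration of topics (a) and (b) above, the following minimal sketch shows a classifier that abstains when its top softmax probability falls below a threshold, with an optional temperature that softens over-confident scores. The function names, the threshold of 0.8 and the example logits are assumptions made for illustration only; they are not taken from the course material.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature > 1 softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_reject(logits, threshold=0.8, temperature=1.0):
    """Return the predicted class index, or None (reject) if confidence is too low."""
    probs = softmax(logits, temperature)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None

# Hypothetical logits: one clear-cut query and one ambiguous query.
confident = predict_with_reject([4.0, 0.5, 0.1])   # top probability is about 0.95
ambiguous = predict_with_reject([1.0, 0.9, 0.8])   # top probability is about 0.37
softened = predict_with_reject([4.0, 0.5, 0.1], temperature=3.0)
```

In practice the temperature and the threshold would be chosen on held-out data; here they are fixed by hand only to show how a softer (better calibrated) score can turn an accepted prediction into a rejection.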
Niculescu-Mizil, A., & Caruana, R. (2005). Predicting good probabilities with supervised learning. Proceedings of the 22nd International Conference on Machine Learning (ICML ’05), 625–632. http://doi.org/10.1145/1102351.1102430
Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On Calibration of Modern Neural Networks. http://arxiv.org/abs/1706.04599
Romano, Y., Patterson, E., & Candès, E. J. (2019). Conformalized Quantile Regression. http://arxiv.org/abs/1905.03222
Shafer, G., & Vovk, V. (2008). A tutorial on conformal prediction. Journal of Machine Learning Research, 9, 371–421. Retrieved from http://arxiv.org/abs/0706.3188
Cortes, C., DeSalvo, G., & Mohri, M. (2016). Learning with rejection. Lecture Notes in Artificial Intelligence, 9925 LNAI, 67–82. http://doi.org/10.1007/978-3-319-46379-7_5
Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Transactions on Knowledge Discovery from Data, 6(1), 1–39. http://doi.org/10.1145/2133360.2133363
Emmott, A., Das, S., Dietterich, T., Fern, A., & Wong, W.-K. (2015). Systematic construction of anomaly detection benchmarks from real data. https://arxiv.org/abs/1503.01158
Siddiqui, A., Fern, A., Dietterich, T. G., & Das, S. (2016). Finite Sample Complexity of Rare Pattern Anomaly Detection. In Proceedings of UAI-2016 (p. 10). http://auai.org/uai2016/proceedings/papers/226.pdf
Bulusu, S., Kailkhura, B., Li, B., Varshney, P. K., & Song, D. (2020). Anomalous Instance Detection in Deep Learning: A Survey. ArXiv, 2003.06979(v1). http://arxiv.org/abs/2003.06979
Bendale, A., & Boult, T. (2016). Towards Open Set Deep Networks. In CVPR 2016 (pp. 1563–1572). http://doi.org/10.1109/CVPR.2016.173
Liu, S., Garrepalli, R., Dietterich, T. G., Fern, A., & Hendrycks, D. (2018). Open Category Detection with PAC Guarantees. Proceedings of the 35th International Conference on Machine Learning, PMLR, 80, 3169–3178. http://proceedings.mlr.press/v80/liu18e.html
Boult, T. E., Cruz, S., Dhamija, A., Gunther, M., Henrydoss, J., & Scheirer, W. (2019). Learning and the Unknown: Surveying Steps Toward Open World Recognition. AAAI 2019.
Familiarity with standard machine learning methods such as decision trees, random forests, and support vector machines. Basic knowledge of deep learning for images. Basic knowledge of probability and principal component analysis.
Thomas Dietterich (PhD Stanford, 1985) is Distinguished Professor (Emeritus) of Computer Science at Oregon State University. Dietterich is one of the pioneers of the field of Machine Learning and has authored more than 200 refereed publications and two books. His current research topics include robust artificial intelligence (calibration and anomaly detection), robust human-AI systems, and applications in sustainability. He is a former president of the International Machine Learning Society (the parent organization of ICML) and the Association for the Advancement of Artificial Intelligence. He is one of the moderators of the cs.LG category on arXiv.
Inference of functions from data is ubiquitous in Statistical Learning. This course deals with Gaussian process (GP) based approaches that not only learn over a class of nonlinear functions, but also quantify the associated uncertainty. To cope with the curse of dimensionality, random Fourier feature (RF) vectors lead to parametric GP-RF function models that offer scalable estimators. The course will next focus on online learning with ensembles (E) of GP-RF learners, each with a distinct kernel belonging to a prescribed dictionary, and jointly learning a much richer class of functions. Whether in batch or online forms, EGPs remain robust to dynamics captured by adaptive Kalman filters. Being able to cope with unknown dynamics and quantify uncertainty is critical, especially in adversarial settings. EGP performance can be refined online, and it is benchmarked using regret analysis. Further, the course will cross-fertilize ideas from Deep Gaussian Processes and EGPs in order to gain degrees of freedom. Broader applicability of EGPs will also be demonstrated for interactive optimization and policy evaluation in reinforcement learning.
Day 1: Online Scalable Learning Adaptive to Unknown Dynamics and Graphs – Part I: Multi-kernel Approaches
Day 2: Online Scalable Learning with Adaptivity and Robustness – Part II: Deep and Ensemble GPs
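To make the random-feature idea in the abstract above concrete, here is a small sketch of how random Fourier features replace a kernel by an explicit finite-dimensional feature map, which is what turns a GP into a parametric model. It assumes an RBF kernel with unit lengthscale; the helper names, the number of features and the test points are illustrative choices, not the lecturer's implementation.

```python
import math
import random

def make_rff(input_dim, num_features, lengthscale=1.0, seed=0):
    """Build a random Fourier feature map approximating an RBF kernel."""
    rng = random.Random(seed)
    # Frequencies are drawn from the RBF kernel's spectral density N(0, 1/lengthscale^2).
    freqs = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(input_dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]
    return features

def rbf_kernel(x, y, lengthscale=1.0):
    """Exact RBF kernel value, used here only as a reference."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))

features = make_rff(input_dim=2, num_features=4000)
x, y = [0.3, -0.2], [0.8, 0.1]
# The inner product of the feature vectors approximates the kernel value.
approx = sum(a * b for a, b in zip(features(x), features(y)))
exact = rbf_kernel(x, y)
```

With the kernel replaced by an inner product of fixed-length vectors, learning reduces to (Bayesian) linear regression on the features, whose cost no longer grows with the number of stored data points in the way exact GP inference does.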
Prof. Georgios B. Giannakis, ADC Chair in Wireless Telecommunications and McKnight Presidential Chair in ECE, University of Minnesota. Georgios B. Giannakis (Fellow’97) received his Diploma in Electrical Engr. (EE) from the Ntl. Tech. Univ. of Athens, Greece, 1981. From 1982 to 1986 he was with the U. of Southern California (USC), where he received his MSc. in EE, 1983, MSc. in Mathematics, 1986, and Ph.D. in EE, 1986. He was with the U. of Virginia from 1987 to 1998, and since 1999 he has been with the U. of Minnesota, where he holds a Chair in Wireless Communications, a U. of Minnesota McKnight Presidential Chair in ECE, and serves as director of the Digital Technology Center. His general interests span the areas of statistical learning, communications, and networking - subjects on which he has published more than 460 journal papers, 760 conference papers, 26 book chapters, two edited books and two research monographs. Current research focuses on Data Science with applications to brain, and power networks with renewables. He is the (co-) inventor of 33 patents issued, and the (co-) recipient of 9 best journal paper awards from the IEEE Signal Processing (SP) and Communications Societies, including the G. Marconi Prize Paper Award in Wireless Communications. He also received the IEEE-SPS Norbert Wiener Society Award (2019); Technical Achievement Awards from the SP Society (2000) and from EURASIP (2005); the IEEE ComSoc Education Award (2019); the G. W. Taylor Award for Distinguished Research from the University of Minnesota, and the IEEE Fourier Technical Field Award (inaugural recipient in 2015). He is a Fellow of the National Academy of Inventors, IEEE and EURASIP, and has served the IEEE in a number of posts, including that of a Distinguished Lecturer for the IEEE-SPS.
Deep learning has become one of the most widely used tools in modern science and engineering, leading to breakthroughs in many areas and disciplines ranging from computer vision to natural language processing to physics and medicine. This mini-course will introduce the basics of machine learning and classification theory based on statistical learning and describe two classes of popular algorithms in depth: decision and rule-based methods (decision trees, decision rules, bagging and boosting, random forests) and deep neural network-based models of various types (fully-connected, convolutional, recurrent, recursive and graph neural networks). The course will focus on practical applications in analysis of large scientific data, interpretability, uncertainty estimation and how to best extract meaningful features, while implementing realtime deep learning in software and hardware. No previous machine learning background is required.
Sergei Gleyzer is a particle physicist and university professor, working at the interface of particle physics and machine learning towards more intelligent systems to extract meaningful information from the data collected by the Large Hadron Collider (LHC), the world’s highest-energy particle physics experiment, located at the CERN laboratory near Geneva, Switzerland. He is a co-discoverer of the Higgs boson and founder of several major machine learning initiatives such as the Inter-experimental Machine Learning Working Group and the Compact Muon Solenoid experiment’s Machine Learning Forum. Professor Gleyzer is working on applying advanced machine learning methods to searches for new physics, such as dark matter.
Reinforcement learning algorithms have achieved promising results on simulations and games such as Go, StarCraft, Dota, and the DeepMind Control Suite. However, random exploration in the real world can be dangerous, and as a result, evaluating and training RL policies in real-world environments can be costly and risky. On the other hand, real-world systems generate large amounts of logged data as they operate. Offline RL methods can learn policies from the logged data, potentially enabling a safe and efficient way of training policies for real-world applications such as healthcare and robotics. In this course, we will cover the following topics:
- Reinforcement Learning Introduction
- Offline RL Basics
- Conservative Q value estimations
- Representation Learning for Offline RL
- Off-policy Evaluation
- Offline Policy Selection
Machine Learning, Linear Algebra, Basic understanding of probabilities and statistics, Calculus, basic programming skills, and familiarity with some machine learning frameworks.
Caglar Gulcehre has been a senior research scientist at DeepMind since 2017. He completed his Ph.D. under Yoshua Bengio on representation learning, memory, and natural language understanding. During his Ph.D., he authored numerous influential papers in those areas. His current research topics include, but are not limited to, reinforcement learning, offline RL, and representation learning. He has served as an area chair and reviewer for major machine learning conferences such as ICML, NeurIPS, and ICLR, and for journals like Nature and JMLR.
I will introduce reinforcement learning from a model-based perspective. In this paradigm the core of the algorithm is the system model: a multivariate generative (probabilistic) time-series predictor. The system model is combined with online planning and/or learning model-free agents on the system model. The course is designed for students with basic classical machine learning knowledge. My goal is to open up an interesting perspective while also giving you useful tools to tackle practical applications.
The main motivation of the course is learning and improving policies to control engineering systems (autopilots or "self-driving" systems). Unlike popular game benchmarks, these systems are low-dimensional (~10s to ~100s of dimensions) with rewards coming continuously or with a short delay. On the other hand, they are physical, slow, and system access is usually extremely restricted. The focus and the main algorithmic challenge are thus not representation learning and handling sparse rewards (as in games), but rather learning robust system models on extremely small (100s or 1000s of time steps) non-iid data. The perspective is also an interesting extension of the classical supervised learning paradigm in which functions are learned in a single shot on data generated (sampled and labeled) by an imaginary oracle. In the real world, supervised models are usually re-learned often, on non-iid data generated by a process which we partially control. The questions that we will ponder (exploration, distribution shift, non-iid data) may thus also interest students planning to work on supervised machine learning in the real world.
Basic classical machine learning knowledge (e.g., Bishop: Pattern Recognition and Machine Learning)
Balázs Kégl is the Head of AI research at Huawei France, where his main focus is deep generative models and model-based reinforcement learning for telecommunications applications. He is on leave from the CNRS, where he has been a Senior Research Scientist since 2006, focusing on machine learning and experimental physics. He was the Head of the Center for Data Science of the Université Paris-Saclay between 2014 and 2019, with a mission to accelerate the design and development of machine learning pipelines in scientific applications. Prior to joining the CNRS, he was Assistant Professor at the University of Montreal. He has published more than a hundred papers on artificial intelligence and its application in particle physics, systems biology, and climate science. Balázs is co-creator of RAMP (www.ramp.studio), a code-submission platform to accelerate building predictive workflows and to promote collaboration between domain experts and data scientists.
AI and 3D Geometry for Self-Supervised 3D Scene Understanding
3D scene understanding is a fundamental problem in Computer Vision, where one wants not only to recognise the objects present in a scene from captured images, but also to retrieve their 3D properties, including their poses and shapes. With the development of deep learning approaches, this field has made remarkable progress. Unfortunately, all recent methods are trained in a supervised way on 3D annotated data. Such a supervised approach has several drawbacks: 3D manual annotations are particularly cumbersome to create, and creating realistic virtual 3D scenes also has a high cost; supervised methods also tend to generalize poorly to other datasets; even more importantly, they can only be as good as the training 3D annotations, and mistakes in manual annotations are actually common in existing datasets. If one wants to go further and consider more scenes without creating real or synthetic training datasets, it is important to consider new directions.
In this lecture, we will present and discuss self-supervised approaches, more exactly auto-labelling methods for automatically creating 3D annotations. In particular, we will review Monte Carlo Tree Search (MCTS), a general discrete AI algorithm for learning to play games, and show how it can be used for 3D scene understanding. For this, we will consider applications to hand and object pose estimation and indoor scene analysis.
Basic knowledge of Deep Learning applied to computer vision and 3D Geometry
Vincent Lepetit has been a director of research at ENPC ParisTech since 2019. Prior to being at ENPC, he was a full professor at the Institute for Computer Graphics and Vision, Graz University of Technology, Austria, and before that, a senior researcher at the Computer Vision Laboratory (CVLab) of EPFL, Switzerland. His research interests are at the interface between Machine Learning and 3D Computer Vision, and currently focus on 3D scene understanding from images. He often serves as an area chair for the major computer vision conferences (CVPR, ICCV, ECCV) and is an associate editor for PAMI, IJCV, and CVIU.
The field of graph signal processing extends classical signal processing tools to signals (data) with an irregular structure that can be characterized by means of a graph (e.g., network data). Graph filters, direct analogues of time-domain filters but intended for signals defined on graphs, are one of the cornerstones of this field. In this course, we introduce the field of graph signal processing and specifically give an overview of the graph filtering problem. We look at the family of finite impulse response (FIR) and infinite impulse response (IIR) graph filters and show how they can be implemented in a distributed manner. To further limit the communication and computational complexity of such a distributed implementation, we also generalize the state-of-the-art distributed graph filters to filters whose weights show a dependency on the nodes sharing information. These so-called edge-variant graph filters yield significant benefits in terms of filter order reduction and can be used for solving specific distributed optimization problems with an extremely fast convergence. Finally, we will give an overview of how graph filters can be used in deep learning applications involving data sets with an irregular structure. Different types of graph filters can be used in the convolution step of graph convolutional networks, leading to different trade-offs in performance and complexity. The numerical results presented in this talk illustrate the potential of graph filters in distributed optimization and deep learning.
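As a minimal sketch of the FIR graph filters mentioned above, the snippet below computes y = sum_k h_k S^k x by applying the graph shift operator S repeatedly, so that each step only needs values from direct neighbours. Using the adjacency matrix of a small path graph as S, and the particular filter taps, are assumptions chosen for illustration.

```python
def shift(S, x):
    """Apply the graph shift operator: each node combines its neighbours' values."""
    n = len(x)
    return [sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]

def fir_graph_filter(S, x, taps):
    """Compute y = sum_k taps[k] * S^k x using len(taps) - 1 successive shifts."""
    y = [0.0] * len(x)
    z = list(x)  # z holds S^k x, starting with k = 0
    for k, h in enumerate(taps):
        if k > 0:
            z = shift(S, z)  # one more round of neighbour-to-neighbour exchange
        y = [yi + h * zi for yi, zi in zip(y, z)]
    return y

# Adjacency matrix of a 4-node path graph 0-1-2-3 (illustrative choice of S).
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
x = [1.0, 0.0, 0.0, 0.0]  # impulse at node 0
y = fir_graph_filter(A, x, taps=[1.0, 0.5, 0.25])
# y == [1.25, 0.5, 0.25, 0.0]
```

Node 3 is three hops from the impulse, so an order-2 filter leaves it at zero. This locality is what allows a distributed implementation: a filter of order K requires K rounds of exchanges between neighbouring nodes.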
Basics in digital signal processing, linear algebra, optimization and machine learning.
Geert Leus received the M.Sc. and Ph.D. degrees in Electrical Engineering from the KU Leuven, Belgium, in June 1996 and May 2000, respectively. Geert Leus is now an "Antoni van Leeuwenhoek" Full Professor at the Faculty of Electrical Engineering, Mathematics and Computer Science of the Delft University of Technology, The Netherlands. His research interests are in the broad area of signal processing, with a specific focus on wireless communications, array processing, sensor networks, and graph signal processing. Geert Leus received a 2002 IEEE Signal Processing Society Young Author Best Paper Award and a 2005 IEEE Signal Processing Society Best Paper Award. He is a Fellow of the IEEE and a Fellow of EURASIP. Geert Leus was a Member-at-Large of the Board of Governors of the IEEE Signal Processing Society, the Chair of the IEEE Signal Processing for Communications and Networking Technical Committee, a Member of the IEEE Sensor Array and Multichannel Technical Committee, and the Editor in Chief of the EURASIP Journal on Advances in Signal Processing. He was also on the Editorial Boards of the IEEE Transactions on Signal Processing, the IEEE Transactions on Wireless Communications, the IEEE Signal Processing Letters, and the EURASIP Journal on Advances in Signal Processing. Currently, he is the Chair of the EURASIP Technical Area Committee on Signal Processing for Multisensor Systems, a Member of the IEEE Signal Processing Theory and Methods Technical Committee, a Member of the IEEE Big Data Special Interest Group, an Associate Editor of Foundations and Trends in Signal Processing, and the Editor in Chief of EURASIP Signal Processing.
Machine Learning and Statistics have many intersections, yet there are many distinct differences. In this course, we will examine the differences and similarities to better understand where each side is coming from and where they are going. Based on these understandings we will look at ways that machine learning tasks can be enhanced with statistical thinking as well as methods. Finally we will learn about how these methods and tools are used in real life with examples drawn from pharmaceutical research and development areas.
Introductory-level Machine Learning, Basic Statistics
Andy Liaw has been doing research and applying Statistics and Machine Learning methods to drug discovery areas such as high throughput screening, pharmacology, cheminformatics, proteomics, and biomarkers for the past 20 years. He is the author of the R package randomForest and has made several contributions to the open source R software for Statistics and Data Science. He is currently Senior Principal Scientist in Merck Research Laboratories. He received his Ph.D. in Statistics from Texas A&M University.
Smart speakers, phone assistants, and voice-activated IoT devices are becoming pervasive around the world. Research efforts in Automatic Speech Recognition (ASR) over the past 50 years enabled these successes. This course describes the theory and practice of building modern speech recognition systems with three widely adopted ASR training approaches. In addition, recent advances in speech representation learning for ultra low-resource ASR training will be discussed in detail.
Familiarity with linear algebra, probability and statistics, elementary machine learning.
Abdelrahman Mohamed is a research scientist at Facebook AI Research (FAIR). Before FAIR, he was a principal scientist/manager in Amazon Alexa and a researcher in Microsoft Research. Abdelrahman received his Ph.D. from the University of Toronto, working with Geoffrey Hinton and Gerald Penn as part of the team that started the Deep Learning revolution in Spoken Language Processing in 2009. He is the recipient of the IEEE Signal Processing Society Best Journal Paper Award for 2016. His research interests span Deep Learning, Spoken Language Processing, and Natural Language Understanding. Abdelrahman has been focusing lately on improving learned speech representations through weakly-, semi-, and self-supervised learning.
Webpage: https://research.fb.com/people/mohamed-abdelrahman/ Google Scholar: https://scholar.google.ca/citations?user=tJ_PrzgAAAAJ&hl=en
The last 40 years have seen dramatic progress in machine learning and statistical methods for speech and language processing tasks such as speech recognition, handwriting recognition and machine translation. Many of the key statistical concepts were originally developed for speech recognition and language translation. Examples of such key concepts are the Bayes decision rule for minimum error rate and sequence-to-sequence processing using approaches like the alignment mechanism based on hidden Markov models and the attention mechanism based on neural networks. Recently, the accuracy of speech recognition and machine translation has been improved significantly by the use of artificial neural networks and specific architectures, such as deep feedforward multi-layer perceptrons, recurrent neural networks, and attention and transformer architectures. We will discuss these approaches in detail and how they form part of the probabilistic approach.
Familiarity with linear algebra, numerical mathematics, probability and statistics, elementary machine learning.
Hermann Ney is a full professor of computer science at RWTH Aachen University, Germany. His main research interests lie in the area of statistical classification, machine learning, neural networks and human language technology and specific applications to speech recognition, machine translation and handwriting recognition.
In particular, he has worked on dynamic programming and discriminative training for speech recognition, on language modelling and on machine translation. His work has resulted in more than 700 conference and journal papers (h-index 102, 60000+ citations; estimated using Google scholar). He and his team contributed to a large number of European (e.g. TC-STAR, QUAERO, TRANSLECTURES, EU-BRIDGE) and American (e.g. GALE, BOLT, BABEL) large-scale joint projects.
Hermann Ney is a fellow of both IEEE and ISCA (Int. Speech Communication Association). In 2005, he was the recipient of the Technical Achievement Award of the IEEE Signal Processing Society. In 2010, he was awarded a senior DIGITEO chair at LIMIS/CNRS in Paris, France. In 2013, he received the award of honour of the International Association for Machine Translation. In 2016, he was awarded an advanced grant of the European Research Council (ERC).
In the 1980s, classical robotics had already reached a high level of maturity and was able to produce large factories. For example, car factories were completely automated. Despite these impressive achievements, unlike personal computers, modern service robots have still not left the factories to take their place at our side as robot companions. The reason is that it is still harder for us to program robots than computers. Usually, modern companion robots learn their duties by a mixture of imitation and trial-and-error. This new way of programming robots has a crucial consequence in the field of industry: the programming cost increases, making mass production impossible. However, in research, this approach has had a great influence, and over the last ten years all top universities in the world have conducted research in this area. The success of these new methods has been demonstrated in a variety of sample scenarios: autonomous helicopters learning complex maneuvers from teachers, walking robots learning impressive balancing skills, self-guided cars hurtling at high speed around racetracks, humanoid robots balancing a bar in their hand and anthropomorphic arms cooking pancakes. Accordingly, this class serves as an introduction to autonomous robot learning. The class focuses on approaches from the fields of robotics, machine learning, model learning, imitation learning, reinforcement learning and motor primitives. Application scenarios and major challenges in modern robotics will be presented as well. We pay particular attention to interactions with the participants of the lecture, asking multiple questions and appreciating enthusiastic students. We also offer a parallel project, the Robot Learning: Integrated Project. It is designed to enable participants to understand robot learning in its full depth by directly applying methods presented in this class to real or simulated robots. We encourage motivated students to attend it as well, either during or after the Robot Learning Class!
Contents: - Robot Model Learning - Imitation Learning - Optimal Control - Robot Reinforcement Learning - Policy Search - Robot Inverse Reinforcement Learning
Basic Machine Learning
Jan Peters is a full professor (W3) for Intelligent Autonomous Systems at the Computer Science Department of the Technische Universitaet Darmstadt. Jan Peters has received the Dick Volz Best 2007 US PhD Thesis Runner-Up Award, the Robotics: Science & Systems - Early Career Spotlight, the INNS Young Investigator Award, and the IEEE Robotics & Automation Society's Early Career Award as well as numerous best paper awards. In 2015, he received an ERC Starting Grant and in 2019, he was appointed as an IEEE Fellow. Despite being a faculty member at TU Darmstadt only since 2011, Jan Peters has already nurtured a series of outstanding young researchers into successful careers. These include new faculty members at leading universities in the USA, Japan, Germany and Holland, postdoctoral scholars at top computer science departments (including MIT, CMU, and Berkeley) and young leaders at top AI companies (including Amazon, Google and Facebook). Jan Peters has studied Computer Science, Electrical, Mechanical and Control Engineering at TU Munich and FernUni Hagen in Germany, at the National University of Singapore (NUS) and the University of Southern California (USC). He has received four Master's degrees in these disciplines as well as a Computer Science PhD from USC. Jan Peters has performed research in Germany at DLR, TU Munich and the Max Planck Institute for Biological Cybernetics (in addition to the institutions above), in Japan at the Advanced Telecommunication Research Center (ATR), at USC and at both NUS and Siemens Advanced Engineering in Singapore. He has led research groups on Machine Learning for Robotics at the Max Planck Institutes for Biological Cybernetics (2007-2010) and Intelligent Systems (2010-2021).
I-Requisites for a Cognitive Architecture (intermediate)
II- Putting it all together (intermediate)
III- Current work (advanced)
Jose C. Principe is a Distinguished Professor of Electrical and Computer Engineering at the University of Florida, where he teaches advanced signal processing, machine learning and artificial neural networks (ANNs). He is Eckis Professor and the Founder and Director of the University of Florida Computational NeuroEngineering Laboratory (CNEL), www.cnel.ufl.edu. The CNEL Lab innovated signal and pattern recognition principles based on information theoretic criteria, as well as filtering in functional spaces. His secondary area of interest focuses on applications to computational neuroscience, Brain Machine Interfaces and brain dynamics. Dr. Principe is a Fellow of the IEEE, AIMBE, and IAMBE. Dr. Principe received the Gabor Award from the INNS, the Career Achievement Award from the IEEE EMBS and the Neural Network Pioneer Award of the IEEE CIS. He has been awarded more than 38 patents and has over 800 publications in the areas of adaptive signal processing, control of nonlinear dynamical systems, machine learning and neural networks, and information theoretic learning, with applications to neurotechnology and brain computer interfaces. He directed 97 Ph.D. dissertations and 65 Master's theses. In 2000 he wrote an interactive electronic book entitled “Neural and Adaptive Systems”, published by John Wiley and Sons, and more recently he co-authored several books: “Brain Machine Interface Engineering”, Morgan and Claypool; “Information Theoretic Learning”, Springer; “Kernel Adaptive Filtering”, Wiley; and “System Parameter Adaption: Information Theoretic Criteria and Algorithms”, Elsevier. He has received four Honorary Doctor Degrees, from Finland, Italy, Brazil and Colombia, and routinely serves on international scientific advisory boards of Universities and Companies. He has received extensive funding from NSF, NIH and DOD (ONR, DARPA, AFOSR).
This course will deal with deep learning for unimodal, multimodal, and multisensorial signal analysis and synthesis. Modalities mainly include audio, video, text, or physiological signals. Methods shown will, however, be applicable to a broad range of further signal types. We will first deal with pre-processing for denoising, dereverberation, or packet loss concealment. This will be followed by representation learning, such as by convolutional neural networks or sequence-to-sequence encoder-decoder architectures, as the basis for end-to-end learning from raw signals or symbolic representations. Then, we shall discuss modelling for decision making, such as by recurrent neural networks with long short-term memory or gated recurrent units, including handling dynamics by connectionist temporal classification. This will also include discussion of the usage of attention on different levels. We will further elaborate on the impact of topologies, including multiple targets with shared layers, and on how to move towards self-shaping networks in the sense of Automatic Machine Learning. In the last part, we will deal with some practical questions. These include data efficiency, such as by weak supervision with the human in the loop, data augmentation, active and semi-supervised learning, transfer learning, self-learning, or generative adversarial networks. Further, we will have a glance at modelling efficiency, such as by squeezing networks. Privacy, trustability, and explainability enhancing solutions will include federated learning, confidence measurement, and diverse means of visualization. The content shown will be accompanied by open-source implementations of the corresponding toolkits available on GitHub. Application examples will mainly come from the domains of Affective Computing and mHealth.
The Handbook of Multimodal-Multisensor Interfaces. Vol. 2, S. Oviatt, B. Schuller, P.R. Cohen, D. Sonntag, G. Potamianos, A. Krüger (eds.), 2018
Basic Machine Learning and Signal Processing knowledge.
Björn W. Schuller received his diploma, doctoral degree, habilitation, and the title of Adjunct Teaching Professor in Machine Intelligence and Signal Processing, all in EE/IT from TUM in Munich/Germany. He is Full Professor of Artificial Intelligence and the Head of GLAM at Imperial College London/UK, Full Professor and Chair of Embedded Intelligence for Health Care and Wellbeing at the University of Augsburg/Germany, co-founding CEO and current CSO of audEERING, an Audio Intelligence company based near Munich and in Berlin/Germany, and permanent Visiting Professor at HIT/China, amongst other Professorships and Affiliations. Previous positions include Full Professor at the University of Passau/Germany, and Researcher at Joanneum Research in Graz/Austria and at the CNRS-LIMSI in Orsay/France. He is a Fellow of the IEEE and Golden Core Awardee of the IEEE Computer Society, Fellow of the ISCA, Fellow of the BCS, President-Emeritus of the AAAC, and Senior Member of the ACM. He (co-)authored 900+ publications (30k+ citations, h-index=83), is Field Chief Editor of Frontiers in Digital Health and was Editor in Chief of the IEEE Transactions on Affective Computing, amongst manifold further commitments and service to the community. His 30+ awards include having been honoured as one of 40 extraordinary scientists under the age of 40 by the WEF in 2015. He served as Coordinator/PI in 15+ European Projects, is an ERC Starting Grantee, and is a consultant to companies such as Barclays, GN, Huawei, and Samsung.
Generative Modeling (GM) refers to building a model of data, p(x), that we can sample from, e.g., where x is an image. It requires building a distribution over data and latent variables. Discriminative Modeling, on the other hand, refers to tasks such as regression and classification, which estimate conditional distributions such as p(class|x). Even for prediction, GMs are useful due to data efficiency and semi-supervised learning, model checking by sampling, and understanding. All GMs represent probability distributions: some allow the distribution to be evaluated explicitly, while others do not allow the distribution to be evaluated but do allow sampling. Generative Adversarial Networks (GANs) can learn high-dimensional, complex real data distributions without relying on explicit assumptions about the form of the distribution. They generate realistic samples from a latent space. This has led to various applications, such as image synthesis, image attribute editing, image translation, domain adaptation and others.
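The distinction between evaluating p(x), sampling from it, and deriving the discriminative quantity p(class|x) can be made concrete with a toy model. The following is a minimal sketch and is not taken from the course material: it assumes one-dimensional data and Gaussian class-conditional densities, and the class name GaussianGenerativeModel and the labels "a" and "b" are hypothetical. Unlike a GAN, this model is explicit, so its density can be evaluated as well as sampled.

```python
import math
import random


def fit_gaussian(xs):
    """Maximum-likelihood mean and standard deviation of 1-D data."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, math.sqrt(var)


def gaussian_pdf(x, mu, sigma):
    """Explicit density N(x | mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


class GaussianGenerativeModel:
    """Toy generative model p(x, c) = p(c) p(x | c) (illustrative assumption)."""

    def fit(self, data):
        # data maps a class label to a list of 1-D observations
        n = sum(len(xs) for xs in data.values())
        self.prior = {c: len(xs) / n for c, xs in data.items()}
        self.params = {c: fit_gaussian(xs) for c, xs in data.items()}
        return self

    def density(self, x):
        """Explicit evaluation of p(x) = sum_c p(c) p(x | c)."""
        return sum(self.prior[c] * gaussian_pdf(x, *self.params[c])
                   for c in self.prior)

    def sample(self, rng):
        """Ancestral sampling: draw the latent class c, then x given c."""
        classes = list(self.prior)
        weights = [self.prior[c] for c in classes]
        c = rng.choices(classes, weights=weights)[0]
        mu, sigma = self.params[c]
        return c, rng.gauss(mu, sigma)

    def posterior(self, x):
        """Discriminative quantity p(c | x), obtained via Bayes' rule."""
        joint = {c: self.prior[c] * gaussian_pdf(x, *self.params[c])
                 for c in self.prior}
        total = sum(joint.values())
        return {c: v / total for c, v in joint.items()}


rng = random.Random(0)
train = {
    "a": [rng.gauss(-2.0, 1.0) for _ in range(200)],
    "b": [rng.gauss(3.0, 1.0) for _ in range(200)],
}
model = GaussianGenerativeModel().fit(train)
post = model.posterior(2.5)        # class probabilities for a new point
label, x_new = model.sample(rng)   # a freshly generated (class, x) pair
```

Here model.density covers the "evaluate explicitly" case, model.sample covers generation, and model.posterior shows how a generative model can still be used for prediction.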
An introductory machine learning course covering the main topics, such as those described in https://cedar.buffalo.edu/~srihari/CSE574/index.html
Srihari is a SUNY Distinguished Professor in the Department of Computer Science and Engineering at the University at Buffalo, The State University of New York. He teaches a sequence of three courses in artificial intelligence and machine learning: (i) introduction to machine learning, (ii) probabilistic graphical models and (iii) deep learning. Srihari’s work led to the world’s first automated system for reading handwritten postal addresses. It was deployed by the United States Postal Service, saving hundreds of millions of dollars in labor costs. A side-effect was that the task of recognizing handwritten digits came to be considered the fruit fly of AI methods. Srihari also spent a decade developing AI and machine learning methods for forensic pattern evidence such as latent prints, handwriting and footwear impressions. In particular, he worked on quantifying the value of handwriting evidence, to allow such testimony to be presented in US courts.
Srihari's honors include: Fellow of the IEEE, Fellow of the International Association for Pattern Recognition and distinguished alumnus of the Ohio State University College of Engineering. Srihari received a B.Sc. from Bangalore University, a B.E. from the Indian Institute of Science and a Ph.D. in Computer and Information Science from the Ohio State University.
Neural networks & Deep learning and Support vector machines & Kernel methods are among the most powerful and successful techniques in machine learning and data driven modelling. Universal approximators and flexible models are available with neural networks and deep learning, while support vector machines and kernel methods have solid foundations in learning theory and optimization theory. In this course we will explain several synergies between neural networks, deep learning, least squares support vector machines and kernel methods. A key role at this point is played by primal and dual model representations and different duality principles. In this way the bigger and unifying picture will be obtained and future perspectives will be outlined.
A recent example is restricted kernel machines, which connect least squares support vector machines and kernel principal component analysis to restricted Boltzmann machines. New developments on this will be shown for deep learning, generative models, multi-view and tensor based models, latent space exploration, robustness and explainability. The framework also makes it possible to work with either explicit or implicit feature maps and to choose model representations that are tailored to the given problem characteristics, such as high dimensionality or large problem sizes.
The material is organized into 3 parts:
In Part I a basic introduction is given to support vector machines (SVM) and kernel methods with emphasis on their artificial neural networks (ANN) interpretations. The latter can be understood in view of primal and dual model representations, expressed in terms of the feature map and the kernel function, respectively. Feature maps may be chosen either explicitly or implicitly in connection to kernel functions. Related to least squares support vector machines (LS-SVM) such characterizations exist for supervised and unsupervised learning, including classification, regression, kernel principal component analysis (KPCA), kernel spectral clustering (KSC), kernel canonical correlation analysis (KCCA), and others. Primal and dual representations are also relevant in order to obtain efficient training algorithms, tailored to the nature of the given application (high dimensional input spaces versus large data sizes). Application examples are given, e.g., in black-box weather forecasting, pollution modelling, prediction of energy consumption, and community detection in networks.
In Part II we explain how to obtain a so-called restricted kernel machine (RKM) representation for least squares support vector machine related models. By using a principle of conjugate feature duality it is possible to obtain a similar representation as in restricted Boltzmann machines (RBM) (with visible and hidden units), which are used in deep belief networks (DBN) and deep Boltzmann machines (DBM). The principle is explained both for supervised and unsupervised learning. Related to kernel principal component analysis a generative model is obtained within the restricted kernel machine framework. Furthermore, we discuss Generative Restricted Kernel Machines (Gen-RKM), a framework for multi-view generation and disentangled feature learning, and compare with Generative Adversarial Networks (GAN) and Variational Autoencoders (VAE). The use of tensor-based models is also very natural within this new RKM framework, and either explicit feature maps (e.g. convolutional feature maps) or implicit feature maps in connection to kernel functions can be used. Latent space exploration with Gen-RKM and aspects of robustness and explainability will be explained.
In Part III deep restricted kernel machines (Deep RKM) are explained, which consist of restricted kernel machines taken in a deep architecture. In these models a distinction is made between depth in a layer sense and depth in a level sense. Links and differences between Deep RKM and stacked autoencoders and deep Boltzmann machines are given. The framework makes it possible to conceive both deep feedforward neural networks (DNN) and deep kernel machines, through primal and dual model representations. Feature maps and related kernel functions are then taken for each of the levels. By fusing the objectives of the different levels (e.g. several KPCA levels followed by an LS-SVM classifier) in the deep architecture, the training process becomes faster and gives improved solutions. Furthermore, deep kernel machines with the incorporation of orthogonality constraints for deep unsupervised learning are explained. Finally, future perspectives and challenges will be outlined.
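The primal and dual model representations that run through all three parts can be illustrated on a small regression problem. The following is a minimal sketch under simplifying assumptions and is not the course's own formulation: it uses plain ridge regression without a bias term as a stand-in for LS-SVM regression, a hand-picked explicit feature map phi(x) = (1, x, x^2), and hypothetical helper names (solve, feature_map, fit_primal, fit_dual). The primal form solves a d x d system in the feature space, the dual form solves an N x N system in terms of the kernel, and both give the same prediction.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def feature_map(x):
    """Explicit feature map phi(x) for a scalar input (illustrative choice)."""
    return [1.0, x, x * x]


def kernel(x, z):
    """Kernel induced by the feature map: k(x, z) = phi(x) . phi(z)."""
    return sum(a * b for a, b in zip(feature_map(x), feature_map(z)))


def fit_primal(xs, ys, lam):
    """Primal: solve the d x d system (Phi^T Phi + lam I) w = Phi^T y."""
    Phi = [feature_map(x) for x in xs]
    d = len(Phi[0])
    A = [[sum(p[i] * p[j] for p in Phi) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(p[i] * y for p, y in zip(Phi, ys)) for i in range(d)]
    return solve(A, b)


def fit_dual(xs, ys, lam):
    """Dual: solve the N x N system (K + lam I) alpha = y."""
    n = len(xs)
    K = [[kernel(xs[i], xs[j]) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, list(ys))


def predict_primal(w, x):
    """f(x) = w . phi(x)"""
    return sum(wi * fi for wi, fi in zip(w, feature_map(x)))


def predict_dual(alpha, xs, x):
    """f(x) = sum_i alpha_i k(x_i, x)"""
    return sum(a * kernel(xi, x) for a, xi in zip(alpha, xs))


xs = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0]
ys = [x * x - x + 0.5 for x in xs]
lam = 0.1
w = fit_primal(xs, ys, lam)       # 3 unknowns (feature dimension)
alpha = fit_dual(xs, ys, lam)     # 6 unknowns (number of data points)
x_test = 1.5
y_primal = predict_primal(w, x_test)
y_dual = predict_dual(alpha, xs, x_test)
```

The number of unknowns is the feature dimension in the primal and the number of data points in the dual. This is why the primal tends to suit large data sizes and the dual tends to suit high dimensional (or implicit) feature spaces.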
Basics of linear algebra
Johan A.K. Suykens was born in Willebroek, Belgium, on May 18, 1966. He received the master's degree in Electro-Mechanical Engineering and the PhD degree in Applied Sciences from the Katholieke Universiteit Leuven, in 1989 and 1995, respectively. In 1996 he was a Visiting Postdoctoral Researcher at the University of California, Berkeley. He has been a Postdoctoral Researcher with the Fund for Scientific Research FWO Flanders and is currently a full Professor with KU Leuven. He is author of the books "Artificial Neural Networks for Modelling and Control of Non-linear Systems" (Kluwer Academic Publishers) and "Least Squares Support Vector Machines" (World Scientific), co-author of the book "Cellular Neural Networks, Multi-Scroll Chaos and Synchronization" (World Scientific) and editor of the books "Nonlinear Modeling: Advanced Black-Box Techniques" (Kluwer Academic Publishers), "Advances in Learning Theory: Methods, Models and Applications" (IOS Press) and "Regularization, Optimization, Kernels, and Support Vector Machines" (Chapman & Hall/CRC). In 1998 he organized an International Workshop on Nonlinear Modelling with Time-series Prediction Competition. He has served as associate editor for the IEEE Transactions on Circuits and Systems (1997-1999 and 2004-2007), the IEEE Transactions on Neural Networks (1998-2009), the IEEE Transactions on Neural Networks and Learning Systems (from 2017) and the IEEE Transactions on Artificial Intelligence (from April 2020). He received an IEEE Signal Processing Society 1999 Best Paper Award, a 2019 Entropy Best Paper Award and several Best Paper Awards at International Conferences. He is a recipient of the International Neural Networks Society INNS 2000 Young Investigator Award for significant contributions in the field of neural networks.
He has served as a Director and Organizer of the NATO Advanced Study Institute on Learning Theory and Practice (Leuven 2002), as a program co-chair for the International Joint Conference on Neural Networks 2004 and the International Symposium on Nonlinear Theory and its Applications 2005, as an organizer of the International Symposium on Synchronization in Complex Networks 2007, as a co-organizer of the NIPS 2010 workshop on Tensors, Kernels and Machine Learning, and as chair of ROKS 2013. He was awarded ERC Advanced Grants in 2011 and 2017, was elevated to IEEE Fellow in 2015 for developing least squares support vector machines, and is an ELLIS Fellow. He is currently serving as program director of Master AI at KU Leuven.
The success of deep learning hinges on intermediate representations: transformations of the data on which statistical learning is easier. Deep architectures can extract very rich and powerful representations, but they need huge volumes of data. In this course, we will study the fundamentals of simple representations. Simple representations are interesting because they can be learned in limited data settings. We will also use them to provide didactic cases to understand how to build statistical models from data. The goal of the course is to provide the basic mathematical concepts that underlie successful representations extracted in limited data settings.
— Shallow representations: what and why? — Matrix factorizations and its variants: — From PCA to ICA — Sparse dictionary learning: formulation and efficient solvers — Word vectors demystified — Fisher kernels: vector representations from a data model — Theory: from likelihood to representation — Encoding strings and text — Encoding covariances
— General knowledge of statistical learning — Basic knowledge of probability — Basic knowledge of linear algebra
Gaël Varoquaux is a computer-science researcher at Inria. His research focuses on statistical learning tools for data science and scientific inference. He has pioneered the use of machine learning on brain images to map cognition and pathologies. More generally, he develops tools to make machine learning easier, with statistical models suited for real-life, uncurated data, and software for data science. He co-founded scikit-learn, one of the reference machine-learning toolboxes, and helped build various central tools for data analysis in Python. Varoquaux has contributed key methods for learning on spatial data, matrix factorizations, and modeling covariance matrices. He has a PhD in quantum physics and is a graduate of Ecole Normale Superieure, Paris.
The past few years have seen a dramatic increase in the performance of recognition systems thanks to the introduction of deep networks for representation learning. However, the mathematical reasons for this success remain elusive. For example, a key issue is that the neural network training problem is nonconvex, hence optimization algorithms are not guaranteed to return a global minimum. The first part of this tutorial will overview recent work on the theory of deep learning that aims to understand how to design the network architecture, how to regularize the network weights, and how to guarantee global optimality. The second part of this tutorial will present sufficient conditions to guarantee that local minima are globally optimal and that a local descent strategy can reach a global minimum from any initialization. Such conditions apply to problems in matrix factorization, tensor factorization and deep learning. The third part of this tutorial will present an analysis of dropout for matrix factorization, and establish connections
Basic understanding of sparse and low-rank representation and non-convex optimization.
Rene Vidal is a Professor of Biomedical Engineering and the Inaugural Director of the Mathematical Institute for Data Science at The Johns Hopkins University. His research focuses on the development of theory and algorithms for the analysis of complex high-dimensional datasets such as images, videos, time-series and biomedical data. Dr. Vidal has been Associate Editor of TPAMI and CVIU, Program Chair of ICCV and CVPR, co-author of the book 'Generalized Principal Component Analysis' (2016), and co-author of more than 200 articles in machine learning, computer vision, biomedical image analysis, hybrid systems, robotics and signal processing. He is a fellow of the IEEE, IAPR and Sloan Foundation, an ONR Young Investigator, and has received numerous awards for his work, including the 2012 J.K. Aggarwal Prize for "outstanding contributions to generalized principal component analysis (GPCA) and subspace clustering in computer vision and pattern recognition" as well as best paper awards in machine learning, computer vision, controls, and medical robotics.
Big data holds the potential to solve many challenging problems, and one of them is natural language understanding. As an example, big data has enabled the breakthrough in machine translation. However, natural language understanding still faces tremendous challenges. It has been shown that in areas such as question answering and conversation, domain knowledge is indispensable. Thus, how to acquire, represent, and apply domain knowledge for text understanding is of critical importance. In this short course, I will focus on understanding short text, which is crucial to many applications. First, short texts do not always observe the syntax of a written language. As a result, traditional natural language processing methods cannot be easily applied. Second, short texts usually do not contain sufficient statistical signals to support many state-of-the-art approaches for text processing such as topic modeling. Third, short texts are usually more ambiguous. I will go over various techniques in knowledge acquisition, representation, and inferencing that have been proposed for text understanding, and will describe massive structured and semi-structured data that have been made available in the recent decade and that directly or indirectly encode human knowledge, turning the knowledge representation problems into a computational grand challenge with feasible solutions in sight.
- Big data and statistical inference - The rise and fall of the semantic network - Knowledge of language - Conceptual knowledge for text understanding - Knowledge Extraction / Acquisition - Knowledge Reasoning / Modeling - Conclusion and Future work
- Kenneth Church, A Pendulum Swung Too Far, Linguistic Issues in Language Technology – LiLT Volume 2, Issue 4 May 2007 - Gregory Murphy, The Big Book of Concepts, MIT Press - George Lakoff, Women, Fire and Dangerous Things: What Categories Reveal About the Mind, University of Chicago Press (1990)
Haixun Wang is an IEEE fellow, Editor in Chief of the IEEE Data Engineering Bulletin, and a VP of Engineering and Distinguished Scientist at Instacart. Before that, he was a VP of Engineering and Distinguished Scientist at WeWork, where he led the Research and Applied Science division. He was Director of Natural Language Processing at Amazon. Before Amazon, he led the NLP Infra team at Facebook, working on Query and Document Understanding. From 2013 to 2015, he was with Google Research, working on natural language processing. From 2009 to 2013, he led research in semantic search, graph data processing systems, and distributed query processing at Microsoft Research Asia. His knowledge base project Probase has created a significant impact in industry and academia. He was a research staff member at IBM T. J. Watson Research Center from 2000 to 2009. He was Technical Assistant to Stuart Feldman (Vice President of Computer Science of IBM Research) from 2006 to 2007, and Technical Assistant to Mark Wegman (Head of Computer Science of IBM Research) from 2007 to 2009. He received the Ph.D. degree in Computer Science from the University of California, Los Angeles in 2000. He has published more than 150 research papers in refereed international journals and conference proceedings. He served as PC Chair of conferences such as SIGKDD 2021, and he is on the editorial board of journals such as IEEE Transactions on Knowledge and Data Engineering (TKDE) and Journal of Computer Science and Technology (JCST). He won the best paper award at ICDE 2015, the 10-year best paper award at ICDM 2013, and the best paper award of ER 2009.