Course Description

Keynotes

    Nello Cristianini
    (University of Bristol) [-]
    [Virtual] Data, Intelligence and Shortcuts

    Summary

    Many familiar dilemmas that we find in the application of data-driven AI have their origins in technical-mathematical choices that we have made along the way to this version of AI. Several of them might need to be reconsidered in order for the field to move forward. After reviewing some of the current problems related to AI, we trace their cultural, technical and economic origins, then we discuss possible solutions.

    Short Bio

    Nello Cristianini is Professor of Artificial Intelligence at the University of Bristol. His research covers machine learning methods and applications of AI to the analysis of media content, as well as the social and ethical implications of AI. Cristianini is the co-author of two widely known books in machine learning, as well as a book in bioinformatics. He is a recipient of the Royal Society Wolfson Research Merit Award and of a European Research Council Advanced Grant. Before joining the University of Bristol, he was a professor of statistics at the University of California, Davis. Currently he is working on the social and ethical implications of AI. His animated videos dealing with the social aspects of AI can be found here: https://www.youtube.com/seeapattern



    Petia Radeva
    (University of Barcelona) [-]
    Uncertainty Modeling and Deep Learning in Food Analysis

    Summary

    Recently, computer vision approaches, especially those assisted by deep learning techniques, have shown unexpected advances, practically solving problems that had never been imagined to be automatable, such as face recognition or automated driving. However, food image recognition, due to its high complexity and ambiguity, still remains far from solved. In this project, we focus on how to combine two challenging research lines: deep learning and uncertainty modeling (epistemic and aleatoric uncertainty). After discussing our methodology to advance in this direction, we comment on potential applications, as well as the social and economic impact of research on food image analysis.
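
    As a concrete illustration of epistemic uncertainty estimation, the sketch below uses Monte Carlo dropout, one standard technique for this purpose; it is an illustrative toy example in PyTorch under assumed layer sizes and names, not necessarily the talk's own methodology.

    ```python
    # Minimal sketch: Monte Carlo dropout for epistemic uncertainty (PyTorch).
    # Illustrative only -- not necessarily the method presented in the keynote.
    import torch
    import torch.nn as nn

    class MCDropoutClassifier(nn.Module):
        def __init__(self, n_features: int, n_classes: int, p: float = 0.5):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_features, 128), nn.ReLU(), nn.Dropout(p),
                nn.Linear(128, n_classes),
            )

        def forward(self, x):
            return self.net(x)

    def predict_with_uncertainty(model, x, n_samples: int = 30):
        """Keep dropout active at test time and average stochastic forward passes."""
        model.train()  # enables dropout (in practice, freeze batch-norm layers)
        with torch.no_grad():
            probs = torch.stack(
                [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
            )
        mean = probs.mean(dim=0)     # predictive distribution
        spread = probs.var(dim=0)    # variance across passes ~ epistemic uncertainty
        return mean, spread

    model = MCDropoutClassifier(n_features=16, n_classes=5)
    mean, spread = predict_with_uncertainty(model, torch.randn(4, 16))
    ```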

    Short Bio

    Prof. Petia Radeva is a Full Professor at the Universitat de Barcelona (UB), PI of the Consolidated Research Group "Computer Vision and Machine Learning" (CVUB) at UB (www.ub.edu/cvub) and a senior researcher at the Computer Vision Center (www.cvc.uab.es). She has been PI for UB in 4 European, 3 international and more than 20 national projects devoted to applying computer vision and machine learning to real problems such as food intake monitoring (e.g., for patients with kidney transplants and for older people). Petia Radeva has been a REA-FET-OPEN vice-chair since 2015 and an international mentor in the Wild Cards EIT program since 2017.

    She is an Associate Editor of the Pattern Recognition journal (Q1) and of the International Journal of Visual Communication and Image Representation (Q2).

    Petia Radeva has been an IAPR Fellow since 2015 and has held an ICREA Academia award (assigned to the 30 best scientists in Catalonia for their scientific merits) since 2014. She has received several international awards (the "Aurora Pons Porrata" award of CIARP, the "Antonio Caparrós" prize for the best technology transfer of UB, etc.).

    She has supervised 18 PhD students and published more than 100 SCI journal papers and 250 international chapters and proceedings papers; her Google Scholar h-index is 44, with more than 7600 citations.


    Courses


    Ignacio Arganda-Carreras
    (University of the Basque Country) [introductory/intermediate]
    Deep Learning for Bioimage Analysis

    Summary

    Deep learning, the latest extension of machine learning, has pushed the accuracy of algorithms to unseen limits, especially for perceptual problems such as the ones tackled by computer vision and image analysis. This workshop will cover the foundations of the field, the communities organized around it, some important tools and resources to get started with these techniques, and the latest applications of deep learning in the field of bioimage analysis. In particular, we will focus on the problems of semantic and instance segmentation of biological images, unsupervised image denoising and deep learning-based super-resolution. All classes will have a theoretical part followed by a hands-on practical session.
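
    To make the hands-on flavour concrete, here is a minimal sketch of one training step for binary semantic segmentation in PyTorch; the tiny network and the random tensors standing in for microscopy patches and masks are purely illustrative assumptions, not the course's actual pipeline (real tools such as ZeroCostDL4Mic use far deeper U-Net-style models).

    ```python
    # Minimal sketch: one training step for binary semantic segmentation.
    # Toy stand-in for the deep architectures used in bioimage analysis.
    import torch
    import torch.nn as nn

    tiny_segnet = nn.Sequential(            # toy fully convolutional network
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 1, 1),                # per-pixel foreground logit
    )
    optimizer = torch.optim.Adam(tiny_segnet.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()

    images = torch.randn(8, 1, 64, 64)      # fake microscopy patches
    masks = torch.randint(0, 2, (8, 1, 64, 64)).float()  # fake ground-truth masks

    logits = tiny_segnet(images)
    loss = loss_fn(logits, masks)           # pixel-wise binary cross-entropy
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ```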

    Syllabus

    • Introduction to deep learning in bioimage analysis.
    • Deep learning-based super-resolution of biomedical images.
    • Semantic and instance segmentation of microscopy image data.
    • Unsupervised image denoising of biological image data.

    References

    • Weigert, M., Schmidt, U., Boothe, T., Müller, A., Dibrov, A., Jain, A., Wilhelm, B., Schmidt, D., Broaddus, C., Culley, S. and Rocha-Martins, M., 2018. Content-aware image restoration: pushing the limits of fluorescence microscopy. Nature Methods, 15(12), pp. 1090-1097.
    • Schmidt, U., Weigert, M., Broaddus, C. and Myers, G., 2018, September. Cell detection with star-convex polygons. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 265-273). Springer, Cham.
    • Lehtinen, J., Munkberg, J., Hasselgren, J., Laine, S., Karras, T., Aittala, M. and Aila, T., 2018. Noise2Noise: Learning image restoration without clean data. arXiv preprint arXiv:1803.04189.
    • Krull, A., Buchholz, T.O. and Jug, F., 2019. Noise2Void - learning denoising from single noisy images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2129-2137).
    • Gómez-de-Mariscal, E., García-López-de-Haro, C., Donati, L., Unser, M., Muñoz-Barrutia, A. and Sage, D., 2019. DeepImageJ: A user-friendly plugin to run deep learning models in ImageJ. bioRxiv, p. 799270.
    • Von Chamier, L., Jukkala, J., Spahn, C., Lerche, M., Hernández-Pérez, S., Mattila, P., Karinou, E., Holden, S., Solak, A.C., Krull, A. and Buchholz, T.O., 2020. ZeroCostDL4Mic: an open platform to simplify access and use of deep learning in microscopy. bioRxiv.

    Pre-requisites

    Mathematics at the level of an undergraduate degree in computer science: basic multivariate calculus, probability theory, and linear algebra.

    Short Bio

    Ignacio Arganda-Carreras is an Ikerbasque Research Fellow at the Department of Computer Science and Artificial Intelligence of the UPV/EHU, also associated with the Donostia International Physics Center (DIPC). He is one of the founders of Fiji, one of the most popular open-source image processing packages in the world, widely used by the bio-image analysis community. His lab focuses on image processing and machine learning, especially on developing open-source computer vision methods for biomedical images. For publications, see https://scholar.google.com/citations?user=02VpQlGwa_kC&hl=en



    Rita Cucchiara
    (University of Modena and Reggio Emilia) [intermediate/advanced]
    Learning to Understand Humans and Their Behaviour

    Summary

    The course will focus on machine learning for computer vision in the specific area of human behaviour understanding (HBU). Here, supervised and self-supervised learning are often adopted, using fully or partially annotated datasets. Despite the apparent narrowness of the topic (just the human aspect!) and the long-standing research activity in people detection, tracking, action analysis, interactive behaviour and 3D crowd modeling, ultimate solutions with high robustness and accuracy are still far from being achieved. Moreover, these application fields are now classified as "high risk" according to the EU Artificial Intelligence Act, and thus must be re-thought with special care for the trustworthiness of datasets free of bias, for privacy awareness, and for the transparency and sustainability of the process. These requirements must be matched by specific technical solutions, and much research is now oriented towards sustainable machine learning for HBU. The course will address these concerns and technical challenges, more specifically in people detection by pose in 2D and 3D, in people appearance tracking and re-identification using synthetic and real datasets (e.g. JTA and the new large MOTSynth vs MOT dataset), and in human action analysis by graph-based spatio-temporal self-attentive architectures. Finally, some aspects of human-AI interaction with NLP generation will be discussed, to explain what the machine sees when it detects human behaviour. The talks will discuss theoretical and technical questions as well as possible applications in the fields of automotive, industry and smart cities, presenting some projects at AImagelab, UNIMORE.
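
    Since the course leans on self-attentive spatio-temporal architectures, the following minimal sketch shows plain scaled dot-product self-attention over per-frame features; it is a generic building block under illustrative shapes, not AImagelab's actual architecture.

    ```python
    # Minimal sketch: scaled dot-product self-attention across the frames of a
    # clip, the core operation behind self-attentive spatio-temporal models.
    import torch

    def self_attention(x):
        """x: (batch, time, dim) frame features; returns attended features."""
        d = x.shape[-1]
        scores = x @ x.transpose(1, 2) / d ** 0.5   # (batch, time, time)
        weights = torch.softmax(scores, dim=-1)     # each frame attends to all frames
        return weights @ x

    frames = torch.randn(2, 16, 64)   # 2 clips, 16 frames, 64-dim features each
    attended = self_attention(frames)
    ```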

    Syllabus

    1. People detection by pose in 2D and 3D, dealing with occlusions and attribute analysis (by supervised CNNs and GANs)
    2. Human action recognition (spatio-temporal graph-based NNs and transformer-based architectures)
    3. On sustainability, privacy and transparency: synthetic datasets and knowledge distillation approaches, using HPC resources
    4. Examples of applications in human attention assessment in vehicles, human-robot and human-machine interaction, and sport analysis

    References

    • M. Tomei, L. Baraldi, S. Calderara, S. Bronzin, R. Cucchiara. Video action detection by learning graph-based spatio-temporal interactions. Computer Vision and Image Understanding 206, 103187.
    • M. Tomei, L. Baraldi, S. Bronzin, R. Cucchiara. Estimating (and fixing) the effect of face obfuscation in video recognition. IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2021.
    • M. Cornia, L. Baraldi, R. Cucchiara. SMArT: training shallow memory-aware transformers for robotic explainability. IEEE International Conference on Robotics and Automation (ICRA), pp. 1128-1134, 2020.
    • A. D'Eusanio, A. Simoni, S. Pini, G. Borghi, R. Vezzani, R. Cucchiara. A transformer-based network for dynamic hand gesture recognition. International Conference on 3D Vision, 2020.
    • M. Fincato, F. Landi, M. Cornia, C. Fabio, R. Cucchiara. VITON-GT: an image-based virtual try-on model with geometric transformations. International Conference on Pattern Recognition, 2021.
    • M. Fabbri, F. Lanzi, S. Calderara, S. Alletto, R. Cucchiara. Compressed volumetric heatmaps for multi-person 3D pose estimation. IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.
    • F. Fulgeri, M. Fabbri, S. Alletto, S. Calderara, R. Cucchiara. Can adversarial networks hallucinate occluded people with a plausible aspect? Computer Vision and Image Understanding 182, pp. 71-80, 2020.
    • M. Cornia, L. Baraldi, G. Serra, R. Cucchiara. Predicting human eye fixations via an LSTM-based saliency attentive model. IEEE Transactions on Image Processing 27(10), pp. 5142-5154, 2019.

    Pre-requisites

    Basic Machine Learning and Computer Vision knowledge.

    Short Bio

    Rita Cucchiara (Laurea in Electronic Engineering, 1989, and PhD in Computer Engineering, 1993, University of Bologna) is Full Professor of Computer Engineering at the University of Modena and Reggio Emilia, Department of Engineering "Enzo Ferrari". In Modena she coordinates the AImagelab research lab, which gathers more than 35 researchers in AI, artificial vision, machine learning, pattern recognition and multimedia (www.aimagelab.unimore.it). She is in charge of the joint laboratory with Ferrari, Red-Vision, and since 2020 of the NVIDIA AI Technical Center (NVAITC@UNIMORE), and is the director of the Modena unit of the European network ELLIS. She coordinates several international, European and national projects on AI applied to human-AI interaction, video surveillance, automotive, industry 4.0 and cultural heritage. Since 2018, Rita Cucchiara has been Director of the CINI National Laboratory of Artificial Intelligence and Intelligent Systems (AIIS). She was President of the Italian Association for Computer Vision, Pattern Recognition and Machine Learning from 2016 to 2018, and since 2017 she has been a member of the Board of Directors of the Italian Institute of Technology. She is currently responsible for the working group on Artificial Intelligence of the National Research Plan PNR 2021-2027 of the Ministry of University and Research. She has more than 450 scientific publications in international journals and conferences. She is Associate Editor of T-PAMI and will be General Chair of ECCV 2022.



    Thomas G. Dietterich
    (Oregon State University) [introductory]
    Machine Learning Methods for Robust Artificial Intelligence

    Summary

    How can we develop machine learning methods that we can trust in high-risk applications? We need ML methods that know their own range of competence so that they can detect when input queries lie outside that range. This class will present several related areas of ML research that seek to achieve this goal, including (a) classification with a "reject" option, (b) calibrated confidence estimation, (c) out-of-distribution detection, and (d) open category detection. The first two topics focus on problems where the training set and test set come from the same distribution, and the classifier must assess its own competence on each test instance. The last two topics can be viewed as applications of anomaly detection, so we will study anomaly detection methods both for featurized data and for signal data (e.g., images) where a good feature space must be learned. Our discussion of anomaly detection will be complementary to Peter Rousseeuw's course (which makes a good companion).
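
    As a toy illustration of the first topic, the sketch below abstains whenever the top-class probability falls below a threshold; this simple confidence-thresholding rule and the synthetic data are assumptions of the example (the course covers principled variants such as learning with rejection, Cortes et al. in the references).

    ```python
    # Minimal sketch: classification with a "reject" option via a confidence
    # threshold (illustrative; see Cortes et al. 2016 for principled variants).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X_train, y_train = rng.normal(size=(200, 4)), rng.integers(0, 3, 200)
    X_test = rng.normal(size=(20, 4))

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    probs = clf.predict_proba(X_test)

    threshold = 0.6                    # abstain below this top-class probability
    top = probs.max(axis=1)
    preds = np.where(top >= threshold, probs.argmax(axis=1), -1)  # -1 = reject
    ```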

    Syllabus

      • Obtaining calibrated probabilities from supervised classifiers
      • Achieving high accuracy classification by abstaining on test queries
      • Obtaining calibrated prediction intervals for regression problems
      • Anomaly detection methods for feature-vector data
      • Anomaly detection methods for images
      • Open category detection: detecting when a test instance belongs to a class not seen during training.

    References

    • Niculescu-Mizil, A., & Caruana, R. (2005). Predicting good probabilities with supervised learning. Proceedings of the 22nd International Conference on Machine Learning ICML ’05, (2005), 625–632. http://doi.org/10.1145/1102351.1102430

    • Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On Calibration of Modern Neural Networks. http://arxiv.org/abs/1706.04599

    • Romano, Y., Patterson, E., & Candès, E. J. (2019). Conformalized Quantile Regression. http://arxiv.org/abs/1905.03222

    • Shafer, G., & Vovk, V. (2008). A tutorial on conformal prediction. Journal of Machine Learning Research, 9, 371–421. Retrieved from http://arxiv.org/abs/0706.3188

    • Cortes, C., DeSalvo, G., & Mohri, M. (2016). Learning with rejection. Lecture Notes in Artificial Intelligence, 9925 LNAI, 67–82. http://doi.org/10.1007/978-3-319-46379-7_5

    • Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-Based Anomaly Detection. ACM Transactions on Knowledge Discovery from Data, 6(1), 1–39. http://doi.org/10.1145/2133360.2133363

    • Emmott, A., Das, S., Dietterich, T., Fern, A., & Wong, W.-K. (2015). Systematic construction of anomaly detection benchmarks from real data. https://arxiv.org/abs/1503.01158

    • Siddiqui, A., Fern, A., Dietterich, T. G., & Das, S. (2016). Finite Sample Complexity of Rare Pattern Anomaly Detection. In Proceedings of UAI-2016 (p. 10). http://auai.org/uai2016/proceedings/papers/226.pdf

    • Bulusu, S., Kailkhura, B., Li, B., Varshney, P. K., & Song, D. (2020). Anomalous Instance Detection in Deep Learning: A Survey. ArXiv, 2003.06979(v1). http://arxiv.org/abs/2003.06979

    • Bendale, A., & Boult, T. (2016). Towards Open Set Deep Networks. In CVPR 2016 (pp. 1563–1572). http://doi.org/10.1109/CVPR.2016.173

    • Liu, S., Garrepalli, R., Dietterich, T. G., Fern, A., & Hendrycks, D. (2018). Open Category Detection with PAC Guarantees. Proceedings of the 35th International Conference on Machine Learning, PMLR, 80, 3169–3178. http://proceedings.mlr.press/v80/liu18e.html

    • Boult, T. E., Cruz, S., Dhamija, A., Gunther, M., Henrydoss, J., & Scheirer, W. (2019). Learning and the Unknown: Surveying Steps Toward Open World Recognition. AAAI 2019.

    Pre-requisites

    Familiarity with standard machine learning methods such as decision trees, random forests, and support vector machines. Basic knowledge of deep learning for images. Basic knowledge of probability and principal component analysis.

    Short Bio

    Thomas Dietterich (PhD Stanford, 1985) is Distinguished Professor (Emeritus) of Computer Science at Oregon State University. Dietterich is one of the pioneers of the field of Machine Learning and has authored more than 200 refereed publications and two books. His current research topics include robust artificial intelligence (calibration and anomaly detection), robust human-AI systems, and applications in sustainability. He is a former president of the International Machine Learning Society (the parent organization of ICML) and the Association for the Advancement of Artificial Intelligence. He is one of the moderators of the cs.LG category on arXiv.



    Georgios Giannakis
    (University of Minnesota) [advanced]
    Ensembles for Online, Interactive and Deep Learning Machines with Scalability and Adaptivity

    Summary

    Inference of functions from data is ubiquitous in statistical learning. This course deals with Gaussian process (GP) based approaches that not only learn over a class of nonlinear functions, but also quantify the associated uncertainty. To cope with the curse of dimensionality, random Fourier feature (RF) vectors lead to parametric GP-RF function models that offer scalable estimators. The course will next focus on online learning with ensembles (E) of GP-RF learners, each with a distinct kernel belonging to a prescribed dictionary, jointly learning a much richer class of functions. Whether in batch or online form, EGPs remain robust to dynamics captured by adaptive Kalman filters. The ability to cope with unknown dynamics and to quantify uncertainty is critical, especially in adversarial settings. EGP performance can be refined online, and it is benchmarked using regret analysis. Further, the course will cross-fertilize ideas from deep Gaussian processes and EGPs in order to gain degrees of freedom. Broader applicability of EGPs will also be demonstrated for interactive optimization and policy evaluation in reinforcement learning.
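
    To ground the GP-RF idea, here is a minimal NumPy sketch of the random Fourier feature approximation of a Gaussian (RBF) kernel (Rahimi & Recht, in the references); the dimensions and bandwidth are illustrative assumptions.

    ```python
    # Minimal sketch: random Fourier features approximating an RBF kernel.
    # phi(x) @ phi(y).T ~= exp(-||x - y||^2 / (2 sigma^2)) for large D.
    import numpy as np

    rng = np.random.default_rng(0)
    d, D, sigma = 5, 500, 1.0            # input dim, number of features, bandwidth

    W = rng.normal(scale=1.0 / sigma, size=(D, d))   # spectral frequencies
    b = rng.uniform(0.0, 2 * np.pi, size=D)          # random phases

    def phi(x):
        """Map x of shape (n, d) to an (n, D) random-feature representation."""
        return np.sqrt(2.0 / D) * np.cos(x @ W.T + b)

    x, y = rng.normal(size=(1, d)), rng.normal(size=(1, d))
    exact = np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))
    approx = float(phi(x) @ phi(y).T)
    print(exact, approx)                 # the two values agree closely
    ```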

    Syllabus

    • Day 1: Online Scalable Learning Adaptive to Unknown Dynamics and Graphs – Part I: Multi-kernel Approaches

      • Kernel based methods exhibit well-documented performance in various nonlinear learning tasks. Most of them rely on a preselected kernel, whose prudent choice presumes task-specific prior information. Especially when the latter is not available, multi-kernel learning has gained popularity thanks to its flexibility in choosing kernels from a prescribed kernel dictionary. Leveraging the random feature approximation, this talk will first introduce, for static setups, a scalable multi-kernel learning approach (termed Raker) to obtain the sought nonlinear learning function ‘on the fly,’ bypassing the curse of dimensionality associated with kernel methods. We will also present an adaptive multi-kernel learning scheme (termed AdaRaker) that relies on weighted combinations of advice from hierarchical ensembles of experts to boost performance in dynamic environments. The weights account not only for each kernel’s contribution to the learning process, but also for the unknown dynamics. Performance is analyzed in terms of both static and dynamic regrets. AdaRaker is uniquely capable of tracking nonlinear learning functions in environments with unknown dynamics, with analytic performance guarantees. The approach is further tailored for online graph-adaptive learning with scalability and privacy. Tests with synthetic and real datasets will showcase the effectiveness of the novel algorithms.
    • Day 2: Online Scalable Learning with Adaptivity and Robustness – Part II: Deep and Ensemble GPs

      • Approximation and inference of functions from data are ubiquitous tasks in statistical learning theory and applications. Among relevant approaches with growing popularity, this talk deals with Gaussian process (GP) based approaches that not only learn over a class of nonlinear functions, but also quantify the associated uncertainty. To cope with the curse of dimensionality in this context, random Fourier feature (RF) vectors lead to parametric GP-RF function models that offer scalable forms of Wiener’s minimum mean-square error approach. The talk will next touch upon deep GP architectures, and will further focus on a weighted ensemble (E) of GP-RF learners, each with a distinct covariance (kernel) belonging to a prescribed dictionary, jointly learning a much richer class of functions. In addition to robustness, these ensembles can operate in either batch or online form interactively, even for dynamic functions, along the lines of adaptive Kalman filters. The performance of EGP-based learning will be benchmarked using regret analysis. Broader applicability of EGPs will also be demonstrated for policy evaluation in reinforcement learning with the kernel(s) selected interactively on-the-fly. Case studies will highlight the merits of deep and ensemble GPs.

    References

    • G. B. Giannakis, Y. Shen, and G. V. Karanikolas, "Topology Identification and Learning over Graphs: Accounting for Nonlinearities and Dynamics," Proceedings of the IEEE, vol. 106, no. 5, pp. 787-807, May 2018.
    • Q. Lu, G. V. Karanikolas, Y. Shen, and G. B. Giannakis, "Ensemble Gaussian Processes with Spectral Features for Online Interactive Learning with Scalability," Proc. of 23rd Intl. Conf. on Artificial Intelligence and Statistics, Palermo, Italy, June 3-5, 2020.
    • A. Rahimi and B. Recht, "Random features for large scale kernel machines," Proc. Advances in Neural Info. Process. Syst., pp. 1177-1184, Canada, Dec. 2008.
    • C. Rasmussen and C. Williams, "Gaussian Processes for Machine Learning," MIT Press, Cambridge, 2006.
    • S. Shalev-Shwartz, "Online learning and online convex optimization," Foundations and Trends in Machine Learning, vol. 4, no. 2, pp. 107-194, 2011.
    • Y. Shen, T. Chen, and G. B. Giannakis, "Random Feature-based Online Multi-kernel Learning in Environments with Unknown Dynamics," Journal of Machine Learning Research, vol. 20, no. 22, pp. 1-36, February 2019.
    • Y. Shen, G. Leus, and G. B. Giannakis, "Online Graph-Adaptive Learning with Scalability and Privacy," IEEE Transactions on Signal Processing, vol. 67, no. 9, pp. 2471-2483, May 2019.

    Pre-requisites

    • Graduate-level courses in Random Processes, Linear Algebra, and Machine Learning

    Short Bio

    Prof. Georgios B. Giannakis, ADC Chair in Wireless Telecommunications and McKnight Presidential Chair in ECE, University of Minnesota. Georgios B. Giannakis (Fellow’97) received his Diploma in Electrical Engr. (EE) from the Ntl. Tech. Univ. of Athens, Greece, 1981. From 1982 to 1986 he was with the U. of Southern California (USC), where he received his MSc. in EE, 1983, MSc. in Mathematics, 1986, and Ph.D. in EE, 1986. He was with the U. of Virginia from 1987 to 1998, and since 1999 he has been with the U. of Minnesota, where he holds a Chair in Wireless Communications, a U. of Minnesota McKnight Presidential Chair in ECE, and serves as director of the Digital Technology Center. His general interests span the areas of statistical learning, communications, and networking - subjects on which he has published more than 460 journal papers, 760 conference papers, 26 book chapters, two edited books and two research monographs. Current research focuses on Data Science with applications to brain, and power networks with renewables. He is the (co-) inventor of 33 patents issued, and the (co-) recipient of 9 best journal paper awards from the IEEE Signal Processing (SP) and Communications Societies, including the G. Marconi Prize Paper Award in Wireless Communications. He also received the IEEE-SPS Norbert Wiener Society Award (2019); Technical Achievement Awards from the SP Society (2000) and from EURASIP (2005); the IEEE ComSoc Education Award (2019); the G. W. Taylor Award for Distinguished Research from the University of Minnesota, and the IEEE Fourier Technical Field Award (inaugural recipient in 2015). He is a Fellow of the National Academy of Inventors, IEEE and EURASIP, and has served the IEEE in a number of posts, including that of a Distinguished Lecturer for the IEEE-SPS.



    Sergei V. Gleyzer
    (University of Alabama) [introductory/intermediate]
    Machine Learning Fundamentals and Their Applications to Very Large Scientific Data: Rare Signal and Feature Extraction, End-to-end Deep Learning, Uncertainty Estimation and Realtime Machine Learning Applications in Software and Hardware

    Summary

    Deep learning has become one of the most widely used tools in modern science and engineering, leading to breakthroughs in many areas and disciplines ranging from computer vision to natural language processing to physics and medicine. This mini-course will introduce the basics of machine learning and classification theory based on statistical learning and describe two classes of popular algorithms in depth: decision and rule-based methods (decision trees, decision rules, bagging and boosting, random forests) and deep neural network-based models of various types (fully-connected, convolutional, recurrent, recursive and graph neural networks). The course will focus on practical applications in analysis of large scientific data, interpretability, uncertainty estimation and how to best extract meaningful features, while implementing realtime deep learning in software and hardware. No previous machine learning background is required.
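
    As a small, self-contained taste of the tree-based portion, the sketch below trains a random forest on a synthetic signal-versus-background task and inspects feature importances; the toy data and hyperparameters are assumptions for illustration, not an actual LHC analysis.

    ```python
    # Minimal sketch: a tree-based ensemble with feature importances, one of
    # the rule-based method families covered in the course (toy data only).
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(42)
    X = rng.normal(size=(1000, 6))
    y = (X[:, 0] + 0.5 * X[:, 1] ** 2 > 0.5).astype(int)  # toy "signal vs background"

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

    print("test accuracy:", forest.score(X_te, y_te))
    print("feature importances:", forest.feature_importances_)
    ```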

    Syllabus

    • Introduction to Machine Learning: Theoretical Foundation, Classification Theory
    • Practical Applications and Examples in Sciences and Engineering with Large Scientific Data (LHC/VRO)
    • Tree-based Algorithms: decision trees, rules, bagging, boosting, random forests
    • Deep Learning Methods: theory, fully-connected networks, convolutional, recurrent and recursive networks, graph networks and geometric deep learning
    • Fundamentals of Feature Extraction and End-to-end Deep Learning
    • Uncertainty Estimation and Machine Learning Model Interpretations
    • Realtime Implementation of Deep Learning in Software and Hardware

    References

    • I. Goodfellow, Y. Bengio and A. Courville, “Deep Learning” MIT Press 2016
    • G. James et al., “Introduction to Statistical Learning” Springer 2013
    • C.M. Bishop “Pattern Recognition and Machine Learning” Springer 2006
    • J. R. Quinlan “C4.5: Programs for Machine Learning” Morgan Kaufmann 1992

    Pre-requisites

    None

    Short Bio

    Sergei Gleyzer is a particle physicist and university professor, working at the interface of particle physics and machine learning towards more intelligent systems that extract meaningful information from the data collected by the Large Hadron Collider (LHC), the world’s highest-energy particle physics experiment, located at the CERN laboratory near Geneva, Switzerland. He is a co-discoverer of the Higgs boson and founder of several major machine learning initiatives such as the Inter-experimental Machine Learning Working Group and the Compact Muon Solenoid experiment’s Machine Learning Forum. Professor Gleyzer is working on applying advanced machine learning methods to searches for new physics, such as dark matter.



    Çağlar Gülçehre
    (DeepMind) [intermediate/advanced]
    [VIRTUAL] Deep Reinforcement Learning in the Real World: Offline RL

    Summary

    Reinforcement learning algorithms have achieved promising results in simulations and games such as Go [1], StarCraft [2], Dota [3], and the DeepMind Control Suite. However, random exploration in the real world can be dangerous, and as a result, evaluating and training RL policies in real-world environments can be costly and risky. On the other hand, real-world systems generate large amounts of logged data as they operate. Offline RL methods can learn policies from this logged data, potentially enabling a safe and efficient way of training policies for real-world applications such as healthcare and robotics. This course introduces offline RL, from the basics through conservative value estimation to off-policy evaluation and policy selection.
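
    To make the offline setting concrete, here is a minimal tabular sketch of fitted Q iteration on a fixed buffer of logged transitions; the MDP sizes, random data, and learning rate are illustrative assumptions, and practical offline methods additionally add, for example, conservative penalties on actions unsupported by the data.

    ```python
    # Minimal sketch: tabular fitted Q iteration from logged data only.
    # No new environment interaction happens anywhere below.
    import numpy as np

    n_states, n_actions, gamma = 5, 2, 0.9
    rng = np.random.default_rng(0)

    # Logged dataset of (state, action, reward, next_state) transitions.
    buffer = [(rng.integers(n_states), rng.integers(n_actions),
               rng.normal(), rng.integers(n_states)) for _ in range(1000)]

    Q = np.zeros((n_states, n_actions))
    for _ in range(100):                  # sweep repeatedly over the fixed buffer
        for s, a, r, s_next in buffer:
            target = r + gamma * Q[s_next].max()
            Q[s, a] += 0.1 * (target - Q[s, a])

    policy = Q.argmax(axis=1)             # greedy policy learned purely offline
    ```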

    Syllabus

    • Reinforcement Learning Introduction
    • Offline RL Basics
    • Conservative Q value estimations
    • Representation Learning for Offline RL
    • Off-policy Evaluation
    • Offline Policy Selection

    References

    • [1] Silver, David, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot et al. "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play." Science 362, no. 6419 (2018): 1140-1144.
    • [2] Vinyals, Oriol, Igor Babuschkin, Wojciech M. Czarnecki, Michaël Mathieu, Andrew Dudzik, Junyoung Chung, David H. Choi et al. "Grandmaster level in StarCraft II using multi-agent reinforcement learning." Nature 575, no. 7782 (2019): 350-354.
    • [3] Berner, Christopher, Greg Brockman, Brooke Chan, Vicki Cheung, Przemysław Dębiak, Christy Dennison, David Farhi et al. "Dota 2 with large scale deep reinforcement learning." arXiv preprint arXiv:1912.06680 (2019).

    Pre-requisites

    Machine Learning, Linear Algebra, Basic understanding of probabilities and statistics, Calculus, basic programming skills, and familiarity with some machine learning frameworks.

    Short Bio

    Caglar Gulcehre has been a senior research scientist at DeepMind since 2017. He did his Ph.D. under Yoshua Bengio on representation learning, memory, and natural language understanding, authoring numerous influential papers in those areas. His current research topics include, but are not limited to, reinforcement learning, offline RL, and representation learning. He has served as an area chair and reviewer for major machine learning conferences such as ICML, NeurIPS, and ICLR, and for journals such as Nature and JMLR.



    Balázs Kégl
    (Huawei Technologies) [introductory]
    Deep Model-based Reinforcement Learning

    Summary

    I will introduce reinforcement learning from a model-based perspective. In this paradigm, the core of the algorithm is the system model: a multivariate generative (probabilistic) time-series predictor. The system model is combined with online planning and/or with model-free agents learned on the system model. The course is designed for students with basic classical machine learning knowledge. My goal is to open up an interesting perspective while also giving you useful tools to tackle practical applications.

    The main motivation of the course is learning and improving policies to control engineering systems (autopilots or "self-driving" systems). Unlike popular game benchmarks, these systems are low-dimensional (~10s to ~100s of dimensions), with rewards coming continuously or with a short delay. On the other hand, they are physical and slow, and system access is usually extremely restricted. The focus and the main algorithmic challenge are thus not representation learning and handling sparse rewards (as in games), but rather learning robust system models on extremely small (100s or 1000s of time steps) non-iid data. The perspective is also an interesting extension of the classical supervised learning paradigm, in which functions are learned in a single shot on data generated (sampled and labeled) by an imaginary oracle. In the real world, supervised models are usually re-learned often, on non-iid data generated by a process which we partially control. The questions that we will ponder (exploration, distribution shift, non-iid data) may thus also interest students planning to work on supervised machine learning in the real world.
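
    The following minimal sketch shows the model-based loop in its simplest form: fit a one-step dynamics model to logged transitions, then plan by random shooting through the model, MPC-style. The toy point-mass system, the deterministic regressor, and the quadratic cost are all illustrative assumptions; the course itself works with probabilistic (generative) system models.

    ```python
    # Minimal sketch: model-based control via a learned dynamics model plus
    # random-shooting planning (illustrative; not the course's exact method).
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    def true_dynamics(s, a):              # unknown system (noisy point mass)
        return s + 0.1 * a + rng.normal(scale=0.01, size=s.shape)

    # 1) Collect a small log and fit a system model s' = f(s, a).
    S = rng.normal(size=(500, 2)); A = rng.uniform(-1, 1, size=(500, 1))
    S_next = np.array([true_dynamics(s, a) for s, a in zip(S, A)])
    model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000).fit(
        np.hstack([S, A]), S_next)

    # 2) Plan: sample action sequences, roll them out in the model, keep the best.
    def plan(s0, horizon=5, n_candidates=100):
        best_a, best_cost = None, np.inf
        for _ in range(n_candidates):
            actions = rng.uniform(-1, 1, size=(horizon, 1))
            s, cost = s0.copy(), 0.0
            for a in actions:
                s = model.predict(np.hstack([s, a]).reshape(1, -1))[0]
                cost += float(np.sum(s ** 2))    # drive the state to the origin
            if cost < best_cost:
                best_a, best_cost = actions[0], cost
        return best_a                            # execute first action, MPC-style

    action = plan(np.array([1.0, -0.5]))
    ```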

    Syllabus

    • Motivation: supervised and reinforcement learning in real-world industrial applications.
    • The nuts and bolts of learning multivariate generative time-series models of dynamic systems (deep autoregressive neural ensemble, mixture density net, Gaussian process).
    • Planning and learning on system models.

    References

    Pre-requisites

    Basic classical machine learning knowledge (e.g., Bishop: Pattern Recognition and Machine Learning)

    Short Bio

    https://scholar.google.com/citations?user=s0njcGgAAAAJ https://www.linkedin.com/in/balazskegl https://twitter.com/balazskegl https://medium.com/@balazskegl

    Balázs Kégl is the Head of AI research at Huawei France, where his main focus is deep generative models and model-based reinforcement learning for telecommunications applications. He is on leave from the CNRS, where he has been a Senior Research Scientist since 2006, focusing on machine learning and experimental physics. He was the Head of the Center for Data Science of the Université Paris-Saclay between 2014 and 2019, with a mission to accelerate the design and development of machine learning pipelines in scientific applications. Prior to joining the CNRS, he was Assistant Professor at the University of Montreal. He has published more than a hundred papers on artificial intelligence and its applications in particle physics, systems biology, and climate science. Balázs is the co-creator of RAMP (www.ramp.studio), a code-submission platform to accelerate building predictive workflows and to promote collaboration between domain experts and data scientists.



    Vincent Lepetit
    (ENPC ParisTech) [intermediate]
    AI and 3D Geometry for Self-Supervised 3D Scene Understanding

    Summary

    3D scene understanding is a fundamental problem in computer vision: one wants not only to recognise the objects present in a scene from captured images, but also to retrieve their 3D properties, including their poses and shapes. With the development of deep learning approaches, this field has made remarkable progress. Unfortunately, all recent methods are trained in a supervised way on 3D annotated data. Such a supervised approach has several drawbacks: 3D manual annotations are particularly cumbersome to create, and creating realistic virtual 3D scenes also has a high cost; supervised methods tend to generalize poorly to other datasets; and, even more importantly, they can only be as good as the training 3D annotations, and mistakes in manual annotations are actually common in existing datasets. If one wants to go further and consider more scenes without creating real or synthetic training datasets, it is important to consider new directions.

    In this lecture, we will present and discuss self-supervised approaches, more exactly auto-labelling methods for automatically creating 3D annotations. In particular, we will review Monte Carlo Tree Search (MCTS), a general discrete AI algorithm for learning to play games, and show how it can be used for 3D scene understanding. For this, we will consider applications to hand and object pose estimation and indoor scene analysis.
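
    For readers unfamiliar with MCTS, here is a minimal generic sketch of its four phases with UCB1 selection (see Coulom 2006 and Browne et al. 2012 in the references); the dummy `simulate` score and action set are placeholder assumptions, and the lecture's actual scene-search formulation is more involved.

    ```python
    # Minimal sketch: the four MCTS phases (selection via UCB1, expansion,
    # rollout, backpropagation). Generic and illustrative only.
    import math
    import random

    class Node:
        def __init__(self, parent=None):
            self.parent, self.children = parent, {}   # action -> Node
            self.visits, self.value_sum = 0, 0.0

        def ucb_child(self, c=1.4):
            """Pick the child maximizing exploitation + exploration."""
            return max(
                self.children.values(),
                key=lambda n: n.value_sum / (n.visits + 1e-9)
                + c * math.sqrt(math.log(self.visits + 1) / (n.visits + 1e-9)),
            )

    def mcts_iteration(root, actions, simulate):
        node = root
        while node.children:                 # 1) selection down to a leaf
            node = node.ucb_child()
        for a in actions:                    # 2) expansion
            node.children[a] = Node(parent=node)
        reward = simulate()                  # 3) rollout (problem-specific score)
        while node is not None:              # 4) backpropagation
            node.visits += 1
            node.value_sum += reward
            node = node.parent

    root = Node()
    for _ in range(100):
        mcts_iteration(root, actions=[0, 1, 2], simulate=lambda: random.random())
    ```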

    Syllabus

    • 3D object pose estimation;
    • 3D hand pose estimation;
    • 3D scene understanding;
    • self-supervised learning;
    • auto-labelling;
    • Monte Carlo Tree Search.

    References

    • Monte Carlo Scene Search for 3D Scene Understanding. Shreyas Hampali, Sinisa Stekovic, Sayan Deb Sarkar, Chetan Srinivasa Kumar, Friedrich Fraundorfer, and Vincent Lepetit. CVPR 2021.
    • General 3D Room Layout from a Single View by Render-And-Compare. Sinisa Stekovic, Shreyas Hampali, Mahdi Rad, Sayan Deb Sarkar, Friedrich Fraundorfer, and Vincent Lepetit. ECCV 2020.
    • Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in The Wild. Alexander Grabner, Yaming Wang, Peizhao Zhang, Peihong Guo, Tong Xiao, Peter Vajda, Peter M. Roth, and Vincent Lepetit. ECCV 2020.
    • HOnnotate: A Method for 3D Annotation of Hand and Object Poses. Shreyas Hampali, Mahdi Rad, Markus Oberweger, and Vincent Lepetit. CVPR 2020.
    • Remi Coulom. Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search. International Conference on Computers and Games, 2006.
    • Cameron Browne, Edward Powley, Daniel Whitehouse, Simon Lucas, Peter Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez Liebana, Spyridon Samothrakis, and Simon Colton. A Survey of Monte Carlo Tree Search Methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1):1-43, 2012.

    Pre-requisites

    Basic knowledge of Deep Learning applied to computer vision and 3D Geometry

    Short Bio

    Vincent Lepetit has been a director of research at ENPC ParisTech since 2019. Prior to ENPC, he was a full professor at the Institute for Computer Graphics and Vision, Graz University of Technology, Austria, and before that, a senior researcher at the Computer Vision Laboratory (CVLab) of EPFL, Switzerland. His research interests lie at the interface between machine learning and 3D computer vision, and currently focus on 3D scene understanding from images. He often serves as an area chair for the major computer vision conferences (CVPR, ICCV, ECCV) and is an associate editor for PAMI, IJCV, and CVIU.



    Geert Leus
    (Delft University of Technology) [introductory/intermediate]
    Graph Signal Processing: Introduction and Connections to Distributed Optimization and Deep Learning

    Summary

    The field of graph signal processing extends classical signal processing tools to signals (data) with an irregular structure that can be characterized by means of a graph (e.g., network data). One of the cornerstones of this field is the graph filter, the direct analogue of a time-domain filter, but intended for signals defined on graphs. In this course, we introduce the field of graph signal processing and specifically give an overview of the graph filtering problem. We look at the families of finite impulse response (FIR) and infinite impulse response (IIR) graph filters and show how they can be implemented in a distributed manner. To further limit the communication and computational complexity of such a distributed implementation, we also generalize state-of-the-art distributed graph filters to filters whose weights depend on the nodes sharing information. These so-called edge-variant graph filters yield significant benefits in terms of filter order reduction and can be used for solving specific distributed optimization problems with extremely fast convergence. Finally, we will overview how graph filters can be used in deep learning applications involving data sets with an irregular structure. Different types of graph filters can be used in the convolution step of graph convolutional networks, leading to different trade-offs in performance and complexity. The numerical results presented in this course illustrate the potential of graph filters in distributed optimization and deep learning.
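
    As a concrete anchor, the sketch below applies an order-K FIR graph filter, y = sum_k h_k S^k x, with the adjacency matrix as shift operator S; the tiny graph and coefficients are illustrative assumptions. Note how each extra power of S only requires one more round of neighbor-to-neighbor exchange, which is what enables the distributed implementations discussed in the course.

    ```python
    # Minimal sketch: an FIR graph filter y = sum_k h[k] * S^k x, applied
    # through repeated local shifts of the graph signal (illustrative only).
    import numpy as np

    A = np.array([[0, 1, 0, 1],          # adjacency of a small 4-node ring graph
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)

    def fir_graph_filter(S, x, h):
        """Apply the filter by accumulating successively shifted signals."""
        y = np.zeros_like(x)
        shifted = x.copy()               # S^0 x
        for hk in h:
            y += hk * shifted
            shifted = S @ shifted        # one more hop of local exchange
        return y

    x = np.array([1.0, -1.0, 2.0, 0.0])  # graph signal (one value per node)
    y = fir_graph_filter(A, x, h=[0.5, 0.3, 0.2])
    ```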

    Syllabus

    • Introduction to graph signal processing
    • Graph filters and their extensions
    • Connections to distributed optimization as well as related applications
    • Connections to deep learning as well as related applications

    References

    • D. I. Shuman, P. Vandergheynst, and P. Frossard, “Chebyshev polynomial approximation for distributed signal processing,” IEEE International Conference on Distributed Computing in Sensor Systems and Workshops (DCOSS), 2011, pp. 1-8.
    • D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, “The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains,” IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 83-98, 2013.
    • A. Sandryhaila and J. M. Moura, “Discrete signal processing on graphs,” IEEE Trans. on Signal Processing, vol. 61, no. 7, pp. 1644-1656, 2013.
    • M. Defferrard, X. Bresson, and P. Vandergheynst, “Convolutional neural networks on graphs with fast localized spectral filtering,” in 30th Conf. Neural Inform. Process. Syst., Barcelona, Spain, 5-10 Dec. 2016, pp. 3844-3858.
    • E. Isufi, A. Loukas, A. Simonetto, and G. Leus, “Autoregressive moving average graph filtering,” IEEE Trans. on Signal Processing, vol. 65, no. 2, pp. 274-288, Jan. 2017.
    • S. Segarra, A. Marques, and A. Ribeiro, “Optimal graph-filter design and applications to distributed linear network operators,” IEEE Trans. on Signal Processing, vol. 65, no. 15, pp. 4117-4131, 1 Aug. 2017.
    • E. Isufi, A. Loukas, A. Simonetto, and G. Leus, “Filtering random graph processes over random time-varying graphs,” IEEE Trans. on Signal Processing, vol. 65, no. 16, pp. 4406-4421, Aug. 2017.
    • A. Ortega, P. Frossard, J. Kovacevic, J. M. F. Moura, and P. Vandergheynst, “Graph signal processing: Overview, challenges and applications,” Proc. IEEE, vol. 106, no. 5, pp. 808-828, May 2018.
    • F. Gama, A. G. Marques, G. Leus, and A. Ribeiro, “Convolutional neural network architectures for signals supported on graphs,” IEEE Trans. on Signal Processing, vol. 67, no. 4, pp. 1034-1049, Feb. 2019.
    • J. Liu, E. Isufi, and G. Leus, “Filter design for autoregressive moving average graph filters,” IEEE Trans. on Signal and Information Processing over Networks, vol. 5, no. 1, pp. 47-60, Mar. 2019.
    • M. Coutino, E. Isufi, and G. Leus, “Advances in distributed graph filtering,” IEEE Trans. on Signal Processing, vol. 67, no. 9, pp. 2320-2333, May 2019.
    • E. Isufi, F. Gama, and A. Ribeiro, “EdgeNets: edge varying graph neural networks,” arXiv:2001.07620v1 [cs.LG], 21 Jan. 2020. [Online]. Available: http://arxiv.org/abs/2001.07620

    Pre-requisites

    Basics in digital signal processing, linear algebra, optimization and machine learning.

    Short Bio

    Geert Leus received the M.Sc. and Ph.D. degree in Electrical Engineering from the KU Leuven, Belgium, in June 1996 and May 2000, respectively. Geert Leus is now an "Antoni van Leeuwenhoek" Full Professor at the Faculty of Electrical Engineering, Mathematics and Computer Science of the Delft University of Technology, The Netherlands. His research interests are in the broad area of signal processing, with a specific focus on wireless communications, array processing, sensor networks, and graph signal processing. Geert Leus received a 2002 IEEE Signal Processing Society Young Author Best Paper Award and a 2005 IEEE Signal Processing Society Best Paper Award. He is a Fellow of the IEEE and a Fellow of EURASIP. Geert Leus was a Member-at-Large of the Board of Governors of the IEEE Signal Processing Society, the Chair of the IEEE Signal Processing for Communications and Networking Technical Committee, a Member of the IEEE Sensor Array and Multichannel Technical Committee, and the Editor in Chief of the EURASIP Journal on Advances in Signal Processing. He was also on the Editorial Boards of the IEEE Transactions on Signal Processing, the IEEE Transactions on Wireless Communications, the IEEE Signal Processing Letters, and the EURASIP Journal on Advances in Signal Processing. Currently, he is the Chair of the EURASIP Technical Area Committee on Signal Processing for Multisensor Systems, a Member of the IEEE Signal Processing Theory and Methods Technical Committee, a Member of the IEEE Big Data Special Interest Group, an Associate Editor of Foundations and Trends in Signal Processing, and the Editor in Chief of EURASIP Signal Processing.



    Andy Liaw
    (Merck Research Labs) [introductory]
    Machine Learning and Statistics: Better together

    Summary

    Machine Learning and Statistics have many intersections, yet also many distinct differences. In this course, we will examine the differences and similarities to better understand where each side is coming from and where it is going. Based on this understanding, we will look at ways that machine learning tasks can be enhanced with statistical thinking as well as statistical methods. Finally, we will see how these methods and tools are used in real life, with examples drawn from pharmaceutical research and development.

    Syllabus

    • ML and Statistics: similarities and differences
    • Beyond prediction: estimating uncertainty
    • Beyond prediction: interpreting models and predictions
    • Example applications in pharmaceutical research and development

    References

    Pre-requisites

    Introductory-level Machine Learning, Basic Statistics

    Short Bio

    Andy Liaw has been doing research and applying statistics and machine learning methods to drug discovery areas such as high throughput screening, pharmacology, cheminformatics, proteomics, and biomarkers for the past 20 years. He is the author of the R package randomForest and has made several contributions to the open-source R software for statistics and data science. He is currently Senior Principal Scientist at Merck Research Laboratories. He received his Ph.D. in Statistics from Texas A&M University.



    Abdelrahman Mohamed
    (Facebook AI Research) [introductory/advanced]
    Recent Advances in Automatic Speech Recognition [VIRTUAL]

    Summary

    Smart speakers, phone assistants, and voice-activated IoT devices are becoming pervasive around the world. Research efforts in Automatic Speech Recognition (ASR) over the past 50 years enabled these successes. This course describes the theory and practice of building modern speech recognition systems, covering three widely adopted ASR training approaches. In addition, recent advances in speech representation learning for ultra-low-resource ASR training will be discussed in detail.
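
    One of the training approaches covered is CTC (see the Hannun and Graves references below); the sketch shows how a CTC loss is computed over toy model outputs using PyTorch's built-in nn.CTCLoss. All shapes and the blank index are illustrative assumptions, not a complete ASR system.

    ```python
    # Minimal sketch: computing a CTC loss over toy acoustic-model outputs.
    import torch
    import torch.nn as nn

    T, N, C = 50, 2, 28                   # time steps, batch, characters (+blank)
    log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(dim=-1)

    targets = torch.randint(1, C, (N, 10))          # labels; 0 reserved for blank
    input_lengths = torch.full((N,), T, dtype=torch.long)
    target_lengths = torch.full((N,), 10, dtype=torch.long)

    ctc = nn.CTCLoss(blank=0)
    loss = ctc(log_probs, targets, input_lengths, target_lengths)
    loss.backward()                       # would drive a real encoder's training
    ```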

    Syllabus

    • Part 1: Anatomy of a speech recognition system
    • Part 2: Three approaches for ASR training
    • Part 3: Self-supervised speech representation learning

    References

    • Jurafsky and Martin, "Speech and Language Processing", 3rd edition, Chapter 26, 2020.
    • Hannun, "Sequence Modeling with CTC", Distill, 2017.
    • Graves, "Sequence Transduction with Recurrent Neural Networks", 2012.
    • van den Oord et al., "Representation Learning with Contrastive Predictive Coding".
    • Baevski et al., "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations".
    • Hsu et al., "HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units".

    Pre-requisites

    Familiarity with linear algebra, probability and statistics, elementary machine learning.

    Short Bio

    Abdelrahman Mohamed is a research scientist at Facebook AI Research (FAIR). Before FAIR, he was a principal scientist/manager in Amazon Alexa and a researcher in Microsoft Research. Abdelrahman received his Ph.D. from the University of Toronto, working with Geoffrey Hinton and Gerald Penn as part of the team that started the Deep Learning revolution in Spoken Language Processing in 2009. He is the recipient of the IEEE Signal Processing Society Best Journal Paper Award for 2016. His research interests span Deep Learning, Spoken Language Processing, and Natural Language Understanding. Abdelrahman has been focusing lately on improving learned speech representations through weakly-, semi-, and self-supervised learning.
    Webpage: https://research.fb.com/people/mohamed-abdelrahman/ Google Scholar: https://scholar.google.ca/citations?user=tJ_PrzgAAAAJ&hl=en



    Hermann Ney
    (RWTH Aachen University) [intermediate/advanced]
    Speech Recognition and Machine Translation: From Statistical Decision Theory to Machine Learning and Deep Neural Networks

    Summary

    The last 40 years have seen dramatic progress in machine learning and statistical methods for speech and language processing tasks like speech recognition, handwriting recognition and machine translation. Many of the key statistical concepts were originally developed for speech recognition and language translation. Examples of such key concepts are the Bayes decision rule for minimum error rate and sequence-to-sequence processing using approaches like the alignment mechanism based on hidden Markov models and the attention mechanism based on neural networks. Recently, the accuracy of speech recognition and machine translation has been improved significantly by the use of artificial neural networks and specific architectures, such as deep feedforward multi-layer perceptrons, recurrent neural networks, and attention and transformer architectures. We will discuss these approaches in detail and show how they form part of the probabilistic approach.
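
    The probabilistic backbone of the course is the Bayes decision rule; written out for speech recognition in standard notation (word sequence w_1^N, acoustic observations x_1^T), it reads:

    ```latex
    % Bayes decision rule: choose the word sequence with maximum posterior
    % probability, factored into a language model Pr(w_1^N) and an acoustic
    % model Pr(x_1^T | w_1^N).
    \hat{w}_1^N
      = \operatorname*{arg\,max}_{w_1^N} \Pr(w_1^N \mid x_1^T)
      = \operatorname*{arg\,max}_{w_1^N} \bigl[ \Pr(w_1^N) \cdot \Pr(x_1^T \mid w_1^N) \bigr]
    ```

    The same factorization underlies statistical machine translation, with the acoustic model replaced by a translation model over sentence pairs.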

    Syllabus

    • Part 1: Statistical Decision Theory, Machine Learning and Neural Networks.
    • Part 2: Speech Recognition (Time Alignment, Hidden Markov models, sequence-to-sequence processing, neural nets, attention models).
    • Part 3: Machine Translation (Word Alignment, Hidden Markov models, sequence-to-sequence processing, neural nets, attention models).

    References

      • Bourlard, H. and Morgan, N., Connectionist Speech Recognition - A Hybrid Approach, Kluwer Academic Publishers, ISBN 0-7923-9396-1, 1994.
      • L. Deng, D. Yu: Deep Learning: Methods and Applications. Foundations and Trends in Signal Processing, Vol. 7, No. 3-4, pp. 197-387, 2014.
      • D. Jurafsky, J. H. Martin: Speech and Language Processing. Third edition draft, August 28, 2017.
      • Y. Goldberg: Neural Network Methods in Natural Language Processing. Morgan & Claypool Publishers, draft, August 2016.
      • P. Koehn: Statistical Machine Translation. Cambridge University Press, 2010. In addition: draft of Chapter 13, Neural Machine Translation, September 22, 2017.

    Pre-requisites

    Familiarity with linear algebra, numerical mathematics, probability and statistics, elementary machine learning.

    Short Bio

    Hermann Ney is a full professor of computer science at RWTH Aachen University, Germany. His main research interests lie in the area of statistical classification, machine learning, neural networks and human language technology and specific applications to speech recognition, machine translation and handwriting recognition.

    In particular, he has worked on dynamic programming and discriminative training for speech recognition, on language modelling and on machine translation. His work has resulted in more than 700 conference and journal papers (h-index 102, 60000+ citations; estimated using Google scholar). He and his team contributed to a large number of European (e.g. TC-STAR, QUAERO, TRANSLECTURES, EU-BRIDGE) and American (e.g. GALE, BOLT, BABEL) large-scale joint projects.

    Hermann Ney is a fellow of both IEEE and ISCA (Int. Speech Communication Association). In 2005, he was the recipient of the Technical Achievement Award of the IEEE Signal Processing Society. In 2010, he was awarded a senior DIGITEO chair at LIMSI/CNRS in Paris, France. In 2013, he received the award of honour of the International Association for Machine Translation. In 2016, he was awarded an advanced grant of the European Research Council (ERC).



    Jan Peters
    (Technical University of Darmstadt) [intermediate]
    [VIRTUAL] Robot Learning

    Summary

    In the 1980s, classical robotics had already reached a high level of maturity and made large-scale factory automation possible: car factories, for example, were completely automated. Despite these impressive achievements, and unlike personal computers, modern service robots still have not left the factories to take a seat as companions at our side. The reason is that it is still harder for us to program robots than computers. Usually, modern companion robots learn their duties by a mixture of imitation and trial-and-error. This new way of programming robots has a crucial consequence for industry: the programming cost increases, making mass production impossible. In research, however, this approach has been highly influential, and over the last ten years all top universities in the world have conducted research in this area. The success of these new methods has been demonstrated in a variety of sample scenarios: autonomous helicopters learning complex maneuvers from teachers, walking robots learning impressive balancing skills, self-driving cars hurtling at high speed around racetracks, humanoid robots balancing a bar in their hand, and anthropomorphic arms cooking pancakes. Accordingly, this class serves as an introduction to autonomous robot learning. The class focuses on approaches from the fields of robotics, machine learning, model learning, imitation learning, reinforcement learning and motor primitives. Application scenarios and major challenges in modern robotics will be presented as well. We pay particular attention to interaction with the participants of the lecture, asking multiple questions and appreciating enthusiastic students. We also offer a parallel project, the Robot Learning: Integrated Project, designed to enable participants to understand robot learning in its full depth by directly applying methods presented in this class to real or simulated robots. We suggest motivated students attend it as well, either during or after the Robot Learning class!

    Syllabus

    • Robot Model Learning
    • Imitation Learning
    • Optimal Control
    • Robot Reinforcement Learning
    • Policy Search
    • Robot Inverse Reinforcement Learning

    References

    N.A.

    Pre-requisites

    Basic Machine Learning

    Short Bio

    Jan Peters is a full professor (W3) for Intelligent Autonomous Systems at the Computer Science Department of the Technische Universitaet Darmstadt. Jan Peters has received the Dick Volz Best 2007 US PhD Thesis Runner-Up Award, the Robotics: Science & Systems - Early Career Spotlight, the INNS Young Investigator Award, and the IEEE Robotics & Automation Society's Early Career Award as well as numerous best paper awards. In 2015, he received an ERC Starting Grant and in 2019, he was appointed as an IEEE Fellow. Despite being a faculty member at TU Darmstadt only since 2011, Jan Peters has already nurtured a series of outstanding young researchers into successful careers. These include new faculty members at leading universities in the USA, Japan, Germany and Holland, postdoctoral scholars at top computer science departments (including MIT, CMU, and Berkeley) and young leaders at top AI companies (including Amazon, Google and Facebook). Jan Peters has studied Computer Science, Electrical, Mechanical and Control Engineering at TU Munich and FernUni Hagen in Germany, at the National University of Singapore (NUS) and the University of Southern California (USC). He has received four Master's degrees in these disciplines as well as a Computer Science PhD from USC. Jan Peters has performed research in Germany at DLR, TU Munich and the Max Planck Institute for Biological Cybernetics (in addition to the institutions above), in Japan at the Advanced Telecommunication Research Center (ATR), at USC and at both NUS and Siemens Advanced Engineering in Singapore. He has led research groups on Machine Learning for Robotics at the Max Planck Institutes for Biological Cybernetics (2007-2010) and Intelligent Systems (2010-2021).



    José C. Príncipe
    (University of Florida) [intermediate/advanced]
    Cognitive Architectures for Object Recognition in Video

    Summary

    • I - Requisites for a Cognitive Architecture (intermediate)

      • Processing in space
      • Processing in time with memory
      • Top-down and bottom-up processing
      • Extraction of information from data with generative models
      • Attention
    • II - Putting it all together (intermediate)

      • Empirical Bayes with generative models
      • Clustering of time series with linear state models
    • III - Current work (advanced)

      • Information-theoretic autoencoders
      • Attention-based video recognition
      • Augmenting deep learning with memory

    Short Bio

    Jose C. Principe is a Distinguished Professor of Electrical and Computer Engineering at the University of Florida, where he teaches advanced signal processing, machine learning and artificial neural networks (ANNs). He is Eckis Professor and the Founder and Director of the University of Florida Computational NeuroEngineering Laboratory (CNEL), www.cnel.ufl.edu. The CNEL lab has innovated signal and pattern recognition principles based on information-theoretic criteria, as well as filtering in functional spaces. His secondary area of interest is applications to computational neuroscience, brain-machine interfaces and brain dynamics. Dr. Principe is a Fellow of the IEEE, AIMBE, and IAMBE. He received the Gabor Award from the INNS, the Career Achievement Award from the IEEE EMBS, and the Neural Network Pioneer Award of the IEEE CIS. He has more than 38 patents awarded and over 800 publications in the areas of adaptive signal processing, control of nonlinear dynamical systems, machine learning and neural networks, and information theoretic learning, with applications to neurotechnology and brain computer interfaces. He has directed 97 Ph.D. dissertations and 65 Master's theses. In 2000 he wrote an interactive electronic book entitled "Neural and Adaptive Systems", published by John Wiley and Sons, and more recently co-authored several books: "Brain Machine Interface Engineering" (Morgan and Claypool), "Information Theoretic Learning" (Springer), "Kernel Adaptive Filtering" (Wiley) and "System Parameter Adaption: Information Theoretic Criteria and Algorithms" (Elsevier). He has received four Honorary Doctor degrees, from Finland, Italy, Brazil and Colombia, and routinely serves on international scientific advisory boards of universities and companies. He has received extensive funding from NSF, NIH and DOD (ONR, DARPA, AFOSR).



    Björn W. Schuller
    (Imperial College London) [introductory/intermediate]
    Deep Signal Processing

    Summary

    This course will deal with deep learning for unimodal, multimodal, and multisensorial signal analysis and synthesis. Modalities mainly include audio, video, text, or physiological signals. The methods shown will, however, be applicable to a broad range of further signal types. We will first deal with pre-processing for denoising, dereverberation, or packet loss concealment. This will be followed by representation learning, such as by convolutional neural networks or sequence-to-sequence encoder-decoder architectures, as a basis for end-to-end learning from raw signals or symbolic representations. Then, we shall discuss modelling for decision making, such as by recurrent neural networks with long short-term memory or gated recurrent units, including handling dynamics by connectionist temporal classification. This will also include a discussion of the use of attention on different levels. We will further elaborate on the impact of topologies, including multiple targets with shared layers, and how to move towards self-shaping networks in the sense of Automatic Machine Learning. In a last part, we will deal with some practical questions. These include data efficiency, such as by weak supervision with the human in the loop, data augmentation, active and semi-supervised learning, transfer learning, self-learning, or generative adversarial networks. Further, we will have a glance at modelling efficiency, such as by squeezing networks. Privacy, trustability, and explainability enhancing solutions will include federated learning, confidence measurement, and diverse means of visualization. The content shown will be accompanied by open-source implementations of the corresponding toolkits available on GitHub. Application examples will mainly come from the domains of Affective Computing and mHealth.
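    To give a flavour of the kind of end-to-end pipelines covered here, the following is a minimal sketch, assuming PyTorch and log-mel spectrogram input (an illustration only, not the course's reference toolkit code), of a convolutional front-end followed by a recurrent layer with attention pooling:

```python
# Minimal sketch: CNN front-end -> bidirectional GRU -> attention pooling.
# Assumed input shape [batch, time, n_mels]; all sizes are placeholders.
import torch
import torch.nn as nn

class AttentiveRNNClassifier(nn.Module):
    def __init__(self, n_mels=64, n_classes=4):
        super().__init__()
        self.conv = nn.Conv1d(n_mels, 128, kernel_size=5, padding=2)
        self.gru = nn.GRU(128, 128, batch_first=True, bidirectional=True)
        self.att = nn.Linear(256, 1)           # per-frame attention scores
        self.out = nn.Linear(256, n_classes)

    def forward(self, x):                      # x: [B, T, n_mels]
        h = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        h, _ = self.gru(h)                     # [B, T, 256]
        w = torch.softmax(self.att(h), dim=1)  # attention over time frames
        pooled = (w * h).sum(dim=1)            # weighted average of frames
        return self.out(pooled)

logits = AttentiveRNNClassifier()(torch.randn(2, 100, 64))  # -> shape [2, 4]
```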

    Syllabus

      1. Pre-Processing and Representation Learning (Signal Enhancement, Packet Loss Concealment, CNNs, S2S, end-to-end)
      2. Modelling for Decision Making (Attention, Feature Space Optimisation, RNNs, LSTM, GRUs, CTC, AutoML)
      3. Data and Model Efficiency (GANs, Transfer Learning, Data Augmentation, Weak Supervision, Cooperative Learning, Self-Learning, Squeezing)
      4. Privacy, Trustability, Explainability (e.g., Federated Learning, Confidence Measurement, Visualization)

    References

    The Handbook of Multimodal-Multisensor Interfaces. Vol. 2, S. Oviatt, B. Schuller, P.R. Cohen, D. Sonntag, G. Potamianos, A. Krüger (eds.), 2018

    Pre-requisites

    Basic Machine Learning and Signal Processing knowledge.

    Short Bio

    Björn W. Schuller received his diploma, doctoral degree, habilitation, and Adjunct Teaching Professorship in Machine Intelligence and Signal Processing, all in EE/IT, from TUM in Munich/Germany. He is Full Professor of Artificial Intelligence and Head of GLAM at Imperial College London/UK, Full Professor and Chair of Embedded Intelligence for Health Care and Wellbeing at the University of Augsburg/Germany, co-founding CEO and current CSO of audEERING, an audio intelligence company based near Munich and in Berlin/Germany, and permanent Visiting Professor at HIT/China, amongst other professorships and affiliations. Previous positions include Full Professor at the University of Passau/Germany and Researcher at Joanneum Research in Graz/Austria and at CNRS-LIMSI in Orsay/France. He is a Fellow of the IEEE and Golden Core Awardee of the IEEE Computer Society, Fellow of the ISCA, Fellow of the BCS, President Emeritus of the AAAC, and Senior Member of the ACM. He has (co-)authored 900+ publications (30k+ citations, h-index = 83), is Field Chief Editor of Frontiers in Digital Health, and was Editor in Chief of the IEEE Transactions on Affective Computing, amongst manifold further commitments and service to the community. His 30+ awards include having been honoured as one of 40 extraordinary scientists under the age of 40 by the WEF in 2015. He has served as Coordinator/PI in 15+ European projects, is an ERC Starting Grantee, and is a consultant for companies such as Barclays, GN, Huawei, and Samsung.



    Sargur N. Srihari
    (University at Buffalo) [introductory]
    Generative Models in Deep Learning

    Summary

    Generative Modeling (GM) refers to building a model of the data, p(x), that we can sample from; e.g., x may be an image. It requires building a joint distribution over data and latent variables. Discriminative Modeling, on the other hand, refers to tasks such as regression and classification, which estimate conditional distributions such as p(class|x). Even for prediction, GMs are useful because of data efficiency and semi-supervised learning, model checking by sampling, and understanding. All GMs represent probability distributions: some allow the distribution to be evaluated explicitly, while others do not but still allow sampling. Generative Adversarial Networks (GANs) can learn high-dimensional, complex real-data distributions without relying on strong distributional assumptions, and can generate realistic samples from a latent space. This has led to various applications, such as image synthesis, image attribute editing, image translation, and domain adaptation.
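    To make the sampling view concrete, here is a minimal GAN sketch, written as a toy illustration in PyTorch under assumed hyperparameters (not course material), that learns to sample a one-dimensional Gaussian:

```python
# Toy GAN: generator maps noise z -> samples x; discriminator scores
# real vs. generated samples. Target distribution: N(2.0, 0.5^2).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))  # z -> x
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # x -> logit

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = 2.0 + 0.5 * torch.randn(64, 1)      # samples from the true p(x)
    fake = G(torch.randn(64, 8))               # samples from the generator

    # Discriminator update: push real -> 1, fake -> 0
    d_loss = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: fool the discriminator (non-saturating loss)
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(G(torch.randn(1000, 8)).mean().item())   # should approach 2.0
```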

    Syllabus

      1. Definitions of Generative Models
      2. Boltzmann Machines and variants, e.g., RBMs, Deep Belief Nets
      3. Variational Autoencoders
      4. Generative Adversarial Networks (GANs)
      5. GAN variants: Wasserstein, DCGAN, Conditional GAN, f-GAN

    References

    Pre-requisites

    An introductory machine learning course covering the main topics described in https://cedar.buffalo.edu/~srihari/CSE574/index.html

    Short Bio

    Srihari is a SUNY Distinguished Professor in the Department of Computer Science and Engineering at the University at Buffalo, The State University of New York. He teaches a sequence of three courses in artificial intelligence and machine learning: (i) introduction to machine learning, (ii) probabilistic graphical models and (iii) deep learning. Srihari’s work led to the world’s first automated system for reading handwritten postal addresses. It was deployed by the United States Postal Service, saving hundreds of millions of dollars in labor costs. A side effect was that the task of recognizing handwritten digits came to be considered the fruit fly of AI methods. Srihari also spent a decade developing AI and machine learning methods for forensic pattern evidence such as latent prints, handwriting and footwear impressions, in particular quantifying the value of handwriting evidence so that such testimony can be presented in US courts.

    Srihari's honors include: Fellow of the IEEE, Fellow of the International Association for Pattern Recognition, and distinguished alumnus of the Ohio State University College of Engineering. Srihari received a B.Sc. from Bangalore University, a B.E. from the Indian Institute of Science, and a Ph.D. in Computer and Information Science from the Ohio State University.



    Johan Suykens
    (KU Leuven) [introductory/intermediate]
    [Virtual] Deep Learning, Neural Networks and Kernel Machines

    Summary

    Neural networks and deep learning, on the one hand, and support vector machines and kernel methods, on the other, are among the most powerful and successful techniques in machine learning and data-driven modelling. Neural networks and deep learning provide universal approximators and flexible models, while support vector machines and kernel methods have solid foundations in learning theory and optimization theory. In this course we will explain several synergies between neural networks, deep learning, least squares support vector machines and kernel methods. A key role is played by primal and dual model representations and different duality principles. In this way a bigger, unifying picture is obtained, and future perspectives will be outlined.

    A recent example is restricted kernel machines (RKMs), which connect least squares support vector machines and kernel principal component analysis to restricted Boltzmann machines. New developments on this will be shown for deep learning, generative models, multi-view and tensor-based models, latent space exploration, robustness and explainability. The framework also makes it possible to work with either explicit or implicit feature maps and to choose model representations tailored to the characteristics of the given problem, such as high dimensionality or large problem sizes.

    Syllabus

    The material is organized into 3 parts:

    • Part I Neural networks, Support vector machines and Kernel methods
    • Part II Restricted Boltzmann machines, Generative restricted kernel machines and Deep learning
    • Part III Deep kernel machines and future perspectives

    In Part I a basic introduction is given to support vector machines (SVMs) and kernel methods, with emphasis on their artificial neural network (ANN) interpretations. The latter can be understood in view of primal and dual model representations, expressed in terms of the feature map and the kernel function, respectively. Feature maps may be chosen either explicitly or implicitly, in connection with kernel functions. Related to least squares support vector machines (LS-SVMs), such characterizations exist for supervised and unsupervised learning, including classification, regression, kernel principal component analysis (KPCA), kernel spectral clustering (KSC), kernel canonical correlation analysis (KCCA), and others. Primal and dual representations are also relevant for obtaining efficient training algorithms, tailored to the nature of the given application (high-dimensional input spaces versus large data sizes). Application examples include black-box weather forecasting, pollution modelling, prediction of energy consumption, and community detection in networks.
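    As a concrete illustration of a dual model representation, the following sketch implements the standard LS-SVM classifier formulation in NumPy (the toy data and hyperparameters are assumptions for illustration): training solves the linear system [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1] with Omega_ij = y_i y_j k(x_i, x_j), and prediction is sign(sum_i alpha_i y_i k(x_i, x) + b).

```python
# LS-SVM classifier in the dual, with an RBF kernel (a minimal sketch).
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    n = len(y)
    Omega = np.outer(y, y) * rbf(X, X, sigma)
    A = np.block([[np.zeros((1, 1)), y[None, :]],
                  [y[:, None], Omega + np.eye(n) / gamma]])
    sol = np.linalg.solve(A, np.r_[0.0, np.ones(n)])
    return sol[0], sol[1:]                       # bias b, dual variables alpha

# Toy problem: two Gaussian blobs with labels in {-1, +1}
rng = np.random.default_rng(0)
X = np.r_[rng.normal(-1, 0.3, (20, 2)), rng.normal(1, 0.3, (20, 2))]
y = np.r_[-np.ones(20), np.ones(20)]
b, alpha = lssvm_fit(X, y)
pred = np.sign(rbf(X, X) @ (alpha * y) + b)      # prediction via the kernel
print((pred == y).mean())                        # training accuracy
```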

    In Part II we explain how to obtain a so-called restricted kernel machine (RKM) representation for least squares support vector machine related models. By using a principle of conjugate feature duality, it is possible to obtain a representation similar to that of restricted Boltzmann machines (RBMs) (with visible and hidden units), which are used in deep belief networks (DBNs) and deep Boltzmann machines (DBMs). The principle is explained both for supervised and unsupervised learning. Related to kernel principal component analysis, a generative model is obtained within the restricted kernel machine framework. Furthermore, we discuss Generative Restricted Kernel Machines (Gen-RKM), a framework for multi-view generation and disentangled feature learning, and compare it with Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). The use of tensor-based models is also very natural within this RKM framework, and either explicit feature maps (e.g. convolutional feature maps) or implicit feature maps in connection with kernel functions can be used. Latent space exploration with Gen-RKM and aspects of robustness and explainability will be explained.
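    A loose analogue of such generative use of kernel PCA can be sketched with scikit-learn's KernelPCA and pre-image reconstruction. This is only an illustration of the idea of decoding points from a kernel-induced latent space, not the Gen-RKM algorithm itself:

```python
# Kernel PCA with pre-image reconstruction: encode data into a latent
# space, perturb latent codes, and decode them back to input space.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA

X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10.0,
                 fit_inverse_transform=True).fit(X)

Z = kpca.transform(X)                           # latent codes of the data
rng = np.random.default_rng(0)
z_new = Z.mean(axis=0) + 0.1 * rng.standard_normal((5, 2))
X_gen = kpca.inverse_transform(z_new)           # pre-images of latent samples
print(X_gen.shape)                              # (5, 2): "generated" points
```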

    In Part III deep restricted kernel machines (Deep RKM) are explained, which consist of restricted kernel machines taken in a deep architecture. In these models a distinction is made between depth in a layer sense and depth in a level sense. Links and differences between Deep RKM, stacked autoencoders and deep Boltzmann machines are given. The framework makes it possible to conceive both deep feedforward neural networks (DNNs) and deep kernel machines, through primal and dual model representations. Feature maps and related kernel functions are then taken for each of the levels. By fusing the objectives of the different levels (e.g. several KPCA levels followed by an LS-SVM classifier) in the deep architecture, the training process becomes faster and gives improved solutions. Furthermore, deep kernel machines incorporating orthogonality constraints for deep unsupervised learning are explained. Finally, future perspectives and challenges will be outlined.
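    For intuition about levels, the following sketch chains two kernel PCA levels and a kernel ridge classifier standing in for the LS-SVM level (kernel ridge regression on +/-1 labels is closely related to LS-SVM classification). This is a sequential scikit-learn analogue; unlike Deep RKM, it does not fuse the objectives of the levels during training:

```python
# Sequential "levels": KPCA -> KPCA -> kernel ridge classifier.
import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA
from sklearn.kernel_ridge import KernelRidge
from sklearn.pipeline import make_pipeline

X, y = make_circles(n_samples=300, noise=0.05, factor=0.4, random_state=0)
y = 2 * y - 1                                   # labels in {-1, +1}

model = make_pipeline(
    KernelPCA(n_components=10, kernel="rbf", gamma=2.0),   # level 1
    KernelPCA(n_components=5, kernel="rbf", gamma=2.0),    # level 2
    KernelRidge(kernel="rbf", alpha=0.1),                  # LS-SVM-like level
)
model.fit(X, y)
print((np.sign(model.predict(X)) == y).mean())  # training accuracy
```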

    References

    • Belkin M., Ma S., Mandal S., To understand deep learning we need to understand kernel learning, Proceedings of Machine Learning Research, 80:541-549, 2018.
    • Bengio Y., Learning deep architectures for AI, Boston: Now, 2009.
    • Bietti A., Mialon G., Chen D., Mairal J., A Kernel Perspective for Regularizing Deep Neural Networks, Proceedings of the 36th International Conference on Machine Learning, PMLR 97:664-674, 2019.
    • Binkowski M., Sutherland D.J., Arbel M., Gretton A., Demystifying MMD GANs, ICLR 2018.
    • Eastwood C., Williams C.K.I., A framework for the quantitative evaluation of disentangled representations, International Conference on Learning Representations (ICLR), 2018.
    • Goodfellow I., Pouget-Abadie J., Mirza M., Xu B., Warde-Farley D., Ozair S., Courville A., Bengio Y., Generative Adversarial Networks, pp. 2672-2680, NIPS 2014.
    • Goodfellow I., Bengio Y., Courville A., Deep learning, Cambridge, MA: MIT Press, 2016.
    • Hinton G.E., What kind of graphical model is the brain?, In Proc. 19th International Joint Conference on Artificial Intelligence, pp. 1765-1775, 2005.
    • Hinton G.E., Osindero S., Teh Y.-W., A fast learning algorithm for deep belief nets, Neural Computation, 18, 1527-1554, 2006.
    • Houthuys L., Suykens J.A.K., Tensor Learning in Multi-View Kernel PCA, in Proc. of the 27th International Conference on Artificial Neural Networks (ICANN), Rhodes, Greece, pp. 205-215, Oct. 2018.
    • LeCun Y., Bengio Y., Hinton G., Deep learning, Nature, 521, 436-444, 2015.
    • Liu F., Liao Z., Suykens J.A.K., Kernel regression in high dimensions: Refined analysis beyond double descent, International Conference on Artificial Intelligence and Statistics (AISTATS), 649-657, 2021
    • Mall R., Langone R., Suykens J.A.K., Multilevel Hierarchical Kernel Spectral Clustering for Real-Life Large Scale Complex Networks, PLOS ONE, e99966, 9(6), 1-18, 2014.
    • Mehrkanoon S., Suykens J.A.K., Deep hybrid neural-kernel networks using random Fourier features, Neurocomputing, Vol. 298, pp. 46-54, July 2018.
    • Montavon G., Muller K.-R., Cuturi M., Wasserstein Training of Restricted Boltzmann Machines, pp. 3718-3726, NIPS 2016.
    • Pandey A., Schreurs J., Suykens J.A.K., Generative restricted kernel machines: A framework for multi-view generation and disentangled feature learning, Neural Networks, Vol. 135, pp. 177-191, March 2021
    • Pandey A., Schreurs J., Suykens J.A.K., Robust Generative Restricted Kernel Machines using Weighted Conjugate Feature Duality, in Lecture Notes in Computer Science, Proc. of the Machine Learning, Optimization, and Data Science, LOD 2020, Siena, Italy., vol. 12565 of LNCS, Springer, Cham., 2020, pp. 613-624.
    • Pandey A., Fanuel M., Schreurs J., Suykens J.A.K., Disentangled Representation Learning and Generation with Manifold Optimization, arXiv preprint arXiv:2006.07046
    • Salakhutdinov R., Hinton G.E., Deep Boltzmann machines, Proceedings of Machine Learning Research, 5:448-455, 2009.
    • Salakhutdinov R., Learning deep generative models, Annu. Rev. Stat. Appl., 2, 361-385, 2015.
    • Scholkopf B., Smola A., Muller K.-R., Nonlinear component analysis as a kernel eigenvalue problem, Neural Computation, 10:1299-1319, 1998.
    • Scholkopf B., Smola A., Learning with kernels, Cambridge, MA: MIT Press, 2002.
    • Schreurs J., Suykens J.A.K., Generative Kernel PCA, ESANN 2018.
    • Suykens J.A.K., Vandewalle J., Training multilayer perceptron classifiers based on a modified support vector method, IEEE Transactions on Neural Networks, vol. 10, no. 4, pp. 907-911, Jul. 1999.
    • Suykens J.A.K., Vandewalle J., Least squares support vector machine classifiers, Neural Processing Letters, vol. 9, no. 3, pp. 293-300, Jun. 1999
    • Suykens J.A.K., Van Gestel T., De Brabanter J., De Moor B., Vandewalle J., Least squares support vector machines, Singapore: World Scientific, 2002.
    • Suykens J.A.K., Alzate C., Pelckmans K., Primal and dual model representations in kernel-based learning, Statistics Surveys, vol. 4, pp. 148-183, Aug. 2010.
    • Suykens J.A.K., Deep Restricted Kernel Machines using Conjugate Feature Duality, Neural Computation, vol. 29, no. 8, pp. 2123-2163, Aug. 2017.
    • Tonin F., Pandey A., Patrinos P., Suykens J.A.K., Unsupervised Energy-based Out-of-distribution Detection using Stiefel-Restricted Kernel Machine, arXiv preprint arXiv:2102.08443, to appear IJCNN 2021
    • Tonin F., Patrinos P., Suykens J.A.K., Unsupervised learning of disentangled representations in deep restricted kernel machines with orthogonality constraints, arXiv preprint arXiv:2011.12659
    • Vapnik V., Statistical learning theory, New York: Wiley, 1998.
    • Winant D., Schreurs J., Suykens J.A.K., Latent Space Exploration Using Generative Kernel PCA, Communications in Computer and Information Science, vol 1196. Springer, Cham. (BNAIC 2019, BENELEARN 2019), Brussels, Belgium, Nov. 2019, pp. 70-82.
    • Zhang C., Bengio S., Hardt M., Recht B., Vinyals O., Understanding deep learning requires rethinking generalization, ICLR 2017.

    Pre-requisites

    Basics of linear algebra

    Short Bio

    Visit: https://www.esat.kuleuven.be/stadius/person.php?id=16

    Johan A.K. Suykens was born in Willebroek, Belgium, on May 18, 1966. He received the master's degree in Electro-Mechanical Engineering and the PhD degree in Applied Sciences from the Katholieke Universiteit Leuven in 1989 and 1995, respectively. In 1996 he was a Visiting Postdoctoral Researcher at the University of California, Berkeley. He has been a Postdoctoral Researcher with the Fund for Scientific Research FWO Flanders and is currently a full Professor with KU Leuven. He is author of the books "Artificial Neural Networks for Modelling and Control of Non-linear Systems" (Kluwer Academic Publishers) and "Least Squares Support Vector Machines" (World Scientific), co-author of the book "Cellular Neural Networks, Multi-Scroll Chaos and Synchronization" (World Scientific), and editor of the books "Nonlinear Modeling: Advanced Black-Box Techniques" (Kluwer Academic Publishers), "Advances in Learning Theory: Methods, Models and Applications" (IOS Press) and "Regularization, Optimization, Kernels, and Support Vector Machines" (Chapman & Hall/CRC). In 1998 he organized an International Workshop on Nonlinear Modelling with a Time-Series Prediction Competition. He has served as associate editor for the IEEE Transactions on Circuits and Systems (1997-1999 and 2004-2007), the IEEE Transactions on Neural Networks (1998-2009), the IEEE Transactions on Neural Networks and Learning Systems (from 2017) and the IEEE Transactions on Artificial Intelligence (from April 2020). He received an IEEE Signal Processing Society 1999 Best Paper Award, a 2019 Entropy Best Paper Award and several Best Paper Awards at international conferences. He is a recipient of the International Neural Networks Society (INNS) 2000 Young Investigator Award for significant contributions in the field of neural networks. He has served as Director and Organizer of the NATO Advanced Study Institute on Learning Theory and Practice (Leuven, 2002), as program co-chair for the International Joint Conference on Neural Networks 2004 and the International Symposium on Nonlinear Theory and its Applications 2005, as organizer of the International Symposium on Synchronization in Complex Networks 2007, as co-organizer of the NIPS 2010 workshop on Tensors, Kernels and Machine Learning, and as chair of ROKS 2013. He was awarded ERC Advanced Grants in 2011 and 2017, was elevated to IEEE Fellow in 2015 for developing least squares support vector machines, and is an ELLIS Fellow. He is currently serving as program director of the Master AI at KU Leuven.



    Gaël Varoquaux
    (INRIA) [intermediate]
    Representation Learning in Limited Data Settings

    Summary

    The success of deep learning hinges on intermediate representations: transformations of the data on which statistical learning is easier. Deep architectures can extract very rich and powerful representations, but they need huge volumes of data. In this course, we will study the fundamentals of simple representations. Simple representations are interesting because they can be learned in limited data settings. We will also use them to provide didactic cases for understanding how to build statistical models from data. The goal of the course is to provide the basic mathematical concepts that underlie successful representations extracted in limited data settings.

    Syllabus

    • Shallow representations: what and why?
    • Matrix factorizations and their variants (see the sketch after this list):
      • From PCA to ICA
      • Sparse dictionary learning: formulation and efficient solvers
      • Word vectors demystified
    • Fisher kernels: vector representations from a data model
    • Theory: from likelihood to representation
    • Encoding strings and text
    • Encoding covariances
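    The following short sketch, assuming scikit-learn (the data and settings are placeholders), contrasts two such shallow representations learned from a few hundred samples: dense PCA components and sparse dictionary codes:

```python
# Two shallow representations of the same data: PCA vs. sparse coding.
import numpy as np
from sklearn.decomposition import PCA, MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 20))            # stand-in for real signals

pca = PCA(n_components=5).fit(X)
Z_pca = pca.transform(X)                      # dense, orthogonal components

dico = MiniBatchDictionaryLearning(n_components=5, alpha=1.0,
                                   random_state=0).fit(X)
Z_sparse = dico.transform(X)                  # sparse codes on learned atoms

print(Z_pca.shape, (Z_sparse != 0).mean())    # shape and sparsity of codes
```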

    References

    • [1] Hyvärinen, A., & Oja, E. (2000). Independent component analysis: algorithms and applications. Neural networks, 13(4-5), 411-430.
    • [2] Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2010). Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 11(Jan), 19-60.
    • [3] Levy, O., & Goldberg, Y. (2014). Neural word embedding as implicit matrix factorization. In Advances in neural information processing systems (pp. 2177-2185).
    • [4] Jaakkola, T., & Haussler, D. (1999). Exploiting generative models in discriminative classifiers. In Advances in neural information processing systems (pp. 487-493).

    Pre-requisites

    • General knowledge of statistical learning
    • Basic knowledge of probability
    • Basic knowledge of linear algebra

    Short Bio

    Gaël Varoquaux is a computer-science researcher at Inria. His research focuses on statistical-learning tools for data science and scientific inference. He has pioneered the use of machine learning on brain images to map cognition and pathologies. More generally, he develops tools to make machine learning easier, with statistical models suited for real-life, uncurated data, and software for data science. He co-founded scikit-learn, one of the reference machine-learning toolboxes, and helped build various central tools for data analysis in Python. Varoquaux has contributed key methods for learning on spatial data, matrix factorizations, and modeling covariance matrices. He has a PhD in quantum physics and is a graduate of École Normale Supérieure, Paris.



    René Vidal
    (Johns Hopkins University) [intermediate/advanced]
    Mathematics of Deep Learning

    Summary

    The past few years have seen a dramatic increase in the performance of recognition systems thanks to the introduction of deep networks for representation learning. However, the mathematical reasons for this success remain elusive. For example, a key issue is that the neural network training problem is nonconvex, hence optimization algorithms are not guaranteed to return a global minimum. The first part of this tutorial will overview recent work on the theory of deep learning that aims to understand how to design the network architecture, how to regularize the network weights, and how to guarantee global optimality. The second part of this tutorial will present sufficient conditions to guarantee that local minima are globally optimal and that a local descent strategy can reach a global minimum from any initialization. Such conditions apply to problems in matrix factorization, tensor factorization and deep learning. The third part of this tutorial will present an analysis of dropout for matrix factorization, and establish connections with low-rank regularization.
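    As a toy illustration of the third part's theme, the following sketch (an assumed setup in PyTorch, not the tutorial's code) applies dropout to the latent dimensions of a matrix factorization and inspects the spectrum of the learned product, which dropout tends to concentrate on a few directions, much like an explicit low-rank regularizer:

```python
# Factorize X ~ U V^T by SGD, dropping latent dimensions at random.
import torch

torch.manual_seed(0)
X = torch.randn(50, 5) @ torch.randn(5, 40)        # rank-5 target matrix
U = (0.1 * torch.randn(50, 10)).requires_grad_()   # over-parameterized factors
V = (0.1 * torch.randn(40, 10)).requires_grad_()
opt = torch.optim.SGD([U, V], lr=0.05)

for step in range(3000):
    z = (torch.rand(10) < 0.5).float() / 0.5       # Bernoulli mask / keep-prob
    loss = ((X - (U * z) @ V.t()) ** 2).mean()     # dropped-out reconstruction
    opt.zero_grad(); loss.backward(); opt.step()

print(torch.linalg.svdvals(U @ V.t())[:8])         # spectrum of learned product
```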

    Syllabus

      1. Introduction to Deep Learning Theory: Optimization, Regularization and Architecture Design
      2. Global Optimality in Matrix Factorization
      3. Global Optimality in Tensor Factorization and Deep Learning
      4. Dropout as a Low-Rank Regularizer for Matrix Factorization

    References

    Pre-requisites

    Basic understanding of sparse and low-rank representation and non-convex optimization.

    Short Bio

    Rene Vidal is a Professor of Biomedical Engineering and the Inaugural Director of the Mathematical Institute for Data Science at The Johns Hopkins University. His research focuses on the development of theory and algorithms for the analysis of complex high-dimensional datasets such as images, videos, time-series and biomedical data. Dr. Vidal has been Associate Editor of TPAMI and CVIU, Program Chair of ICCV and CVPR, co-author of the book 'Generalized Principal Component Analysis' (2016), and co-author of more than 200 articles in machine learning, computer vision, biomedical image analysis, hybrid systems, robotics and signal processing. He is a fellow of the IEEE and IAPR, a Sloan Fellow, and an ONR Young Investigator, and has received numerous awards for his work, including the 2012 J.K. Aggarwal Prize for 'outstanding contributions to generalized principal component analysis (GPCA) and subspace clustering in computer vision and pattern recognition', as well as best paper awards in machine learning, computer vision, controls, and medical robotics.



    Haixun Wang
    (Instacart) [introductory/intermediate]
    Abstractions, Concepts, and Machine Learning

    Summary

    Big data holds the potential to solve many challenging problems, and one of them is natural language understanding. As an example, big data has enabled the breakthrough in machine translation. However, natural language understanding still faces tremendous challenges. It has been shown that in areas such as question answering and conversation, domain knowledge is indispensable. Thus, how to acquire, represent, and apply domain knowledge for text understanding is of critical importance. In this short course, I will focus on understanding short text, which is crucial to many applications. First, short texts do not always observe the syntax of a written language; as a result, traditional natural language processing methods cannot be easily applied. Second, short texts usually do not contain sufficient statistical signals to support many state-of-the-art approaches for text processing, such as topic modeling. Third, short texts are usually more ambiguous. I will go over various techniques in knowledge acquisition, representation, and inference that have been proposed for text understanding, and will describe the massive structured and semi-structured data made available in the past decade that directly or indirectly encode human knowledge, turning the knowledge representation problem into a computational grand challenge with feasible solutions in sight.
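    As a hint of how conceptual knowledge can disambiguate short text, here is a toy sketch of Probase-style conceptualization with naive Bayes; all tables and probabilities below are made up for illustration, whereas a real system would estimate them from a large is-a knowledge base:

```python
# Score candidate concepts for a set of terms:
# p(concept | terms) proportional to p(concept) * product_t p(term | concept).
from math import log

P_TERM = {   # hypothetical p(term | concept) tables
    "company": {"apple": 0.05, "microsoft": 0.08, "banana": 1e-6},
    "fruit":   {"apple": 0.20, "microsoft": 1e-6, "banana": 0.15},
}
P_CONCEPT = {"company": 0.5, "fruit": 0.5}

def conceptualize(terms):
    scores = {
        c: log(P_CONCEPT[c]) + sum(log(P_TERM[c].get(t, 1e-9)) for t in terms)
        for c in P_CONCEPT
    }
    return max(scores, key=scores.get)

print(conceptualize(["apple", "microsoft"]))  # -> company
print(conceptualize(["apple", "banana"]))     # -> fruit
```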

    Syllabus

    -    Big data and statistical inference
    -    The rise and fall of the semantic network 
    -    Knowledge of language
    -    Conceptual knowledge for text understanding
    -    Knowledge Extraction / Acquisition
    -    Knowledge Reasoning / Modeling
    -    Conclusion and Future work

    References

    - Kenneth Church, A Pendulum Swung Too Far, Linguistic Issues in Language Technology – LiLT Volume 2, Issue 4 May 2007
    - Gregory Murphy, The Big Book of Concepts, MIT Press
    - George Lakoff, Women, Fire and Dangerous Things: What Categories Reveal About the Mind, University of Chicago Press (1990)

    Pre-requisites

    Nothing.

    Short Bio

    Haixun Wang is an IEEE Fellow, Editor in Chief of the IEEE Data Engineering Bulletin, and a VP of Engineering and Distinguished Scientist at Instacart. Before that, he was a VP of Engineering and Distinguished Scientist at WeWork, where he led the Research and Applied Science division, and Director of Natural Language Processing at Amazon. Before Amazon, he led the NLP Infra team at Facebook, working on query and document understanding. From 2013 to 2015, he was with Google Research, working on natural language processing. From 2009 to 2013, he led research in semantic search, graph data processing systems, and distributed query processing at Microsoft Research Asia. His knowledge base project Probase has created a significant impact in industry and academia. He was a research staff member at IBM T. J. Watson Research Center from 2000 to 2009, serving as Technical Assistant to Stuart Feldman (Vice President of Computer Science of IBM Research) from 2006 to 2007 and to Mark Wegman (Head of Computer Science of IBM Research) from 2007 to 2009. He received the Ph.D. degree in Computer Science from the University of California, Los Angeles in 2000. He has published more than 150 research papers in refereed international journals and conference proceedings. He served as PC Chair of conferences such as SIGKDD 2021, and is on the editorial boards of journals such as IEEE Transactions on Knowledge and Data Engineering (TKDE) and Journal of Computer Science and Technology (JCST). He won the best paper award at ICDE 2015, the 10-year best paper award at ICDM 2013, and the best paper award at ER 2009.