[intermediate/advanced] Deep Learning Language Models and Causal Inference
Deep learning is a branch of machine learning built on neural networks with many layers (hence the name). Deep learning models have achieved notable success in many areas, including self-driving cars and natural language processing (NLP). NLP aims to process and analyze large amounts of natural language data so that a computer can understand the contents of documents. This course introduces deep learning-based language models and presents their applications to NLP and to causal inference, the process of identifying the causes of particular effects described in documents.
- Introduction to language models.
- Deep learning language models.
- Deep learning-based NLP and causal inference.
Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Lukasz; Polosukhin, Illia (2017-06-12). “Attention Is All You Need”.
Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (2018-10-10). “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”.
Brown, Tom B.; Mann, Benjamin; Ryder, Nick; Subbiah, Melanie; Kaplan, Jared; Dhariwal, Prafulla; Neelakantan, Arvind; Shyam, Pranav; Sastry, Girish; Askell, Amanda; Agarwal, Sandhini; Herbert-Voss, Ariel; Krueger, Gretchen; Henighan, Tom; Child, Rewon; Ramesh, Aditya; Ziegler, Daniel M.; Wu, Jeffrey; Winter, Clemens; Hesse, Christopher; Chen, Mark; Sigler, Eric; Litwin, Mateusz; Gray, Scott; Chess, Benjamin; Clark, Jack; Berner, Christopher; McCandlish, Sam; Radford, Alec; Sutskever, Ilya; Amodei, Dario (2020-07-22). “Language Models are Few-Shot Learners”.
Pearl, Judea (2012-08-17). “The Do-Calculus Revisited”. Keynote lecture, UAI 2012.
Mathematics and machine learning at the level of an undergraduate degree in computer science: basic multivariate calculus, probability theory, linear algebra, probabilistic graphical models, and neural networks.
Xiaowei Xu, a professor of Information Science at the University of Arkansas at Little Rock (UALR), received his Ph.D. in Computer Science from the University of Munich in 1998. Before his appointment at UALR, he was a senior research scientist at Siemens in Munich, Germany. His research spans data mining, machine learning, and artificial intelligence. Dr. Xu is a recipient of the 2014 ACM SIGKDD Test of Time Award for his contribution to the density-based clustering algorithm DBSCAN, one of the most commonly used clustering algorithms.