
Zhi Tian
[intermediate] Communication-Efficient and Robust Distributed Learning
Summary
Many traditional AI applications are trained on data gathered in one place. Nowadays, data collection is increasingly carried out by massive numbers of sensing devices or by distributed data centers. Transmitting data from all the distributed nodes to a central node not only incurs high communication costs, but also raises major concerns about data privacy and security during transmission and centralized storage. To circumvent these drawbacks, researchers in both academia and industry are actively pursuing a distributed learning approach known as federated learning, in which a number of distributed nodes collaboratively carry out a common learning task on a shared deep learning model without sharing their local private raw data. This lecture starts with an introduction to the basic federated learning paradigm, as well as some advanced implementations, including federated learning over the air and fully decentralized federated learning in the absence of any central coordinator. It then delves into recent advances in improving the communication efficiency and robustness of federated learning with provable convergence.
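As a brief illustration of the basic paradigm described above, the following minimal sketch shows one federated-averaging-style round: each node takes a gradient step on its private data, and only the resulting model parameters are sent to the server for weighted averaging. The least-squares task, node data, model dimension, and learning rate are all illustrative assumptions, not part of the lecture material.

```python
# Minimal sketch of one federated-averaging-style round.
# Assumptions: synthetic least-squares loss, one local gradient step per round,
# illustrative learning rate and data sizes.
import numpy as np

def local_sgd_step(w, X, y, lr=0.1):
    """One gradient step on a node's private least-squares loss; raw data never leaves the node."""
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def fedavg_round(w_global, nodes, lr=0.1):
    """Each node updates the shared model locally; the server averages the returned models."""
    local_models = [local_sgd_step(w_global.copy(), X, y, lr) for X, y in nodes]
    sizes = np.array([len(y) for _, y in nodes], dtype=float)
    weights = sizes / sizes.sum()          # weight each node by its local sample count
    return sum(wk * mk for wk, mk in zip(weights, local_models))

# Toy usage: 3 nodes, 5-dimensional linear model, synthetic private data.
rng = np.random.default_rng(0)
nodes = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
w = np.zeros(5)
for _ in range(50):
    w = fedavg_round(w, nodes)
```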
Syllabus
- Introduction to distributed federated learning
- Federated learning over the air
- Decentralized federated learning
- Communication-efficient strategies for federated learning
- Robust measures against Byzantine attacks (see the illustrative sketch after this list)
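To give a flavor of the robustness theme in the last syllabus item, the sketch below swaps the server's plain mean for a coordinate-wise median, a standard robust aggregation rule, so that a minority of Byzantine (arbitrarily corrupted) updates cannot drag the aggregate arbitrarily far. It is an illustrative assumption-laden example, not the specific schemes covered in the lecture or in the references.

```python
# Hedged sketch of robust aggregation via a coordinate-wise median.
# The attack model (a few nodes sending huge vectors) is purely illustrative.
import numpy as np

def robust_aggregate(updates):
    """Coordinate-wise median of the nodes' model updates."""
    return np.median(np.stack(updates, axis=0), axis=0)

# Toy example: 8 honest nodes send updates near the true value, 2 Byzantine nodes attack.
rng = np.random.default_rng(1)
honest = [np.ones(4) + 0.1 * rng.normal(size=4) for _ in range(8)]
byzantine = [1e6 * np.ones(4) for _ in range(2)]
updates = honest + byzantine

print(np.mean(np.stack(updates), axis=0))  # the mean is ruined by the attackers
print(robust_aggregate(updates))           # the median stays close to the honest updates
```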
References
H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Agüera y Arcas, “Communication-efficient learning of deep networks from decentralized data,” Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR, pp. 1273–1282, April 2017.
Weiyu Li, Yaohua Liu, Zhi Tian, and Qing Ling, “Communication-censored linearized ADMM for decentralized consensus optimization,” IEEE Transactions on Signal and Information Processing over Networks, vol. 6, no. 1, pp. 18–34, December 2019.
P. Blanchard, E. M. El Mhamdi, R. Guerraoui, and J. Stainer, “Machine learning with adversaries: Byzantine tolerant gradient descent,” Advances in Neural Information Processing Systems, 2017.
Pre-requisites
Elementary concepts of linear algebra (vectors, matrices, norms). Basic concepts of machine learning (empirical risk minimization, loss functions, gradient descent).
Short bio
Dr. Zhi Tian has been a Professor in the Electrical and Computer Engineering Department of George Mason University, USA, since 2015. Prior to that, she was on the faculty of Michigan Technological University from 2000 to 2014. She served as a Program Director at the US National Science Foundation from 2012 to 2014. Her research interests lie in the areas of distributed machine learning, wireless communications, and statistical signal processing. She was an IEEE Distinguished Lecturer for both the IEEE Communications Society and the IEEE Vehicular Technology Society. She served as an Associate Editor for the IEEE Transactions on Wireless Communications and the IEEE Transactions on Signal Processing. She was General Co-Chair of the 2016 IEEE GlobalSIP Conference. She was a Member-at-Large of the Board of Governors of the IEEE Signal Processing Society for the 2019–2021 term. She received the IEEE Communications Society TCCN Publication Award in 2018.