Few-shot classification, the task of adapting a classifier to unseen classes given a small labeled dataset, is an important step on the path toward human-like machine learning. I will present what I think are some of the key advances and open questions in this area. I will then focus on the fundamental issue of overfitting in the few-shot scenario. Bayesian methods are well-suited to tackling this issue because they allow practitioners to specify prior beliefs and update those beliefs in light of observed data.
Large-scale machine learning training, in particular, distributed stochastic gradient descent (SGD), needs to be robust to inherent system variability such as unpredictable computation and communication delays. This work considers a distributed SGD framework where each worker node is allowed to perform local model updates and the resulting models are averaged periodically. Our goal is to analyze and improve the true speed of error convergence with respect to wall-clock time (instead of the number of iterations).
Machine learning today bears resemblance to the field of aviation soon after the Wright Brothers’ pioneering flights in the early 1900s. It took half a century of aeronautical engineering advances for the ‘Jet Age’ (i.e., commercial aviation) to become a reality. Similarly, machine learning (ML) is currently experiencing a renaissance, yet fundamental barriers must be overcome to fully unlock the potential of ML-powered technology. In this talk, I describe our work to help democratize ML by tackling barriers related to scalability, privacy, and safety.