Large-scale machine learning training, particularly distributed stochastic gradient descent (SGD), needs to be robust to inherent system variability such as unpredictable computation and communication delays. This work considers a distributed SGD framework in which each worker node performs local model updates and the resulting models are averaged periodically. Our goal is to analyze and improve the true speed of error convergence with respect to wall-clock time (rather than the number of iterations).
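
The local-update-with-periodic-averaging scheme described above can be sketched roughly as follows. This is a minimal illustration, not the talk's actual algorithm: the function name, the quadratic toy objective, and all hyperparameters (`K` workers, `tau` local steps, noise scale) are assumptions made for the example.

```python
import numpy as np

def local_sgd(x_star, K=4, tau=8, rounds=20, lr=0.1, seed=0):
    """Toy local SGD: K workers each take tau noisy gradient steps on
    f(x) = 0.5 * ||x - x_star||^2, then all models are averaged."""
    rng = np.random.default_rng(seed)
    d = x_star.shape[0]
    x = np.zeros((K, d))              # one model copy per worker
    for _ in range(rounds):
        for k in range(K):
            for _ in range(tau):      # tau local steps between averages
                grad = (x[k] - x_star) + 0.01 * rng.standard_normal(d)
                x[k] -= lr * grad
        x[:] = x.mean(axis=0)         # periodic model averaging
    return x[0]

x_star = np.array([1.0, -2.0, 3.0])
x_final = local_sgd(x_star)
print(np.allclose(x_final, x_star, atol=0.1))  # prints True
```

Increasing `tau` reduces how often workers synchronize (cheaper communication per iteration, so faster wall-clock progress), at the cost of letting the local models drift apart between averaging steps.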

# Seminars

## Recent Seminars

Machine learning today bears resemblance to the field of aviation soon after the Wright Brothers’ pioneering flights in the early 1900s. It took half a century of aeronautical engineering advances for the ‘Jet Age’ (i.e., commercial aviation) to become a reality. Similarly, machine learning (ML) is currently experiencing a renaissance, yet fundamental barriers must be overcome to fully unlock the potential of ML-powered technology. In this talk, I describe our work to help democratize ML by tackling barriers related to scalability, privacy, and safety.

Much of the prior work on scheduling algorithms for wireless networks focuses on maximizing throughput. However, for many real-time applications, delay and deadline guarantees on packet delivery can be more important than long-term throughput. In this talk, we consider the problem of scheduling deadline-constrained packets in wireless networks under a conflict-graph interference model. The objective is to guarantee that at least a certain fraction of packets of each link are delivered within their deadlines; this fraction is referred to as the delivery ratio.
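
The delivery-ratio metric from the abstract can be made concrete with a small sketch. This is an illustrative assumption about the bookkeeping, not the talk's formulation: the packet-record format (`link`, `deadline`, `delivered` time slot, or `None` for a dropped packet) is invented for the example.

```python
from collections import defaultdict

def delivery_ratios(packets):
    """Per-link fraction of packets delivered on or before their deadline.

    packets: iterable of (link, deadline_slot, delivered_slot_or_None).
    """
    total = defaultdict(int)
    on_time = defaultdict(int)
    for link, deadline, delivered in packets:
        total[link] += 1
        if delivered is not None and delivered <= deadline:
            on_time[link] += 1
    return {link: on_time[link] / total[link] for link in total}

packets = [
    ("A", 3, 2),     # delivered before its deadline
    ("A", 5, 6),     # delivered, but after the deadline
    ("B", 4, 4),     # delivered exactly at the deadline (counts)
    ("B", 2, None),  # dropped, never delivered
]
print(delivery_ratios(packets))  # {'A': 0.5, 'B': 0.5}
```

A scheduler meeting a delivery-ratio guarantee of, say, 0.9 for link A would need at least 90% of A's packets to fall into the on-time category over the long run.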
