Submodular functions model the intuitive notion of diminishing returns. Due to their far-reaching applications, they have been rediscovered in many fields such as information theory, operations research, statistical physics, economics, and machine learning. They also enjoy computational tractability as they can be minimized exactly or maximized approximately. The goal of this talk is simple. We see how a little bit of randomness, a little bit of greediness, and the right combination can lead to pretty good methods for offline, streaming, and distributed solutions.
08 Nov 2019
I will talk about finite sample expressivity, aka memorization power of ReLU networks. Recent results showed (unsurprisingly) that arbitrary input data could be perfectly memorized using a shallow ReLU network with one hidden layer having N hidden nodes. I will describe a more careful construction that trades of width with depth to show that a ReLU network with 2 hidden layers, each with 2*sqrt(N) hidden nodes, can perfectly memorize arbitrary datasets. Moreover, we prove that width of Θ(sqrt(N)) is necessary and sufficient for having perfect memorization power.
06 Sep 2019
- 1 of 62
- next ›