Abstract: The fast Johnson-Lindenstrauss transform has triggered a large amount of research into fast randomized transforms that reduce data dimensionality while approximately preserving geometry. We discuss uses of these fast transforms in three situations. In the first, we use the transform to precondition a data matrix before subsampling, and show how for very large data sets this yields substantial acceleration in algorithms such as PCA and k-means clustering. The second situation reconsiders the common problem of sketching for regression.
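The dimensionality-reduction idea underlying the abstract can be illustrated with a plain dense Gaussian Johnson-Lindenstrauss projection (a minimal sketch; the *fast* JL transform replaces the dense matrix with a subsampled randomized Hadamard transform to cut the cost, which is not shown here — all sizes below are illustrative choices, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)

# n points in d dimensions, projected down to k dimensions (arbitrary sizes).
n, d, k = 200, 1000, 300
X = rng.standard_normal((n, d))

# Dense Gaussian JL map, scaled so squared lengths are preserved in expectation.
S = rng.standard_normal((k, d)) / np.sqrt(k)
Y = X @ S.T

# Pairwise squared distances should be preserved up to small relative error.
orig = np.sum((X[0] - X[1]) ** 2)
proj = np.sum((Y[0] - Y[1]) ** 2)
ratio = proj / orig  # close to 1.0 with high probability
```

For fixed target dimension k, the distortion of any one distance concentrates around 1 with standard deviation on the order of 1/sqrt(k), which is why a modest k suffices even for very high-dimensional data.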
Abstract: The Intensive Care Unit (ICU) is playing an expanding role in acute hospital care, but the value of many treatments and interventions in the ICU is unproven, and high-quality data supporting or discouraging specific practices are sparse. Much prior work in clinical modeling has focused on building discriminating models to detect specific coded outcomes (e.g., hospital mortality) under specific settings, or understanding the predictive value of various types of clinical information without taking interventions into account.
Abstract: Ubiquitous sensors generate prohibitively large data sets. Large volumes of such data are nowadays generated by a variety of applications such as imaging platforms, mobile devices, surveillance cameras, social networks, and power networks, to name a few. In this era of data deluge, it is of paramount importance to gather only the data that is informative for a specific task, in order to limit the required sensing cost as well as the related costs of storing, processing, or communicating the data.
Abstract: Submodular functions capture a wide spectrum of discrete problems in machine learning, signal processing and computer vision. They are characterized by intuitive notions of diminishing returns and economies of scale, and often lead to practical algorithms with theoretical guarantees.
In the first part of this talk, I will give a general introduction to the concept of submodular functions, their optimization and example applications in machine learning.
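The diminishing-returns property and the "practical algorithms with theoretical guarantees" the abstract alludes to can be sketched with a classic example: maximizing a coverage function (monotone submodular) by greedy selection, which is guaranteed to achieve at least a (1 - 1/e) fraction of the optimum. The specific sets and budget below are made up for illustration:

```python
def coverage(chosen, universe_sets):
    # f(S) = |union of the chosen sets|: a canonical monotone submodular function,
    # since adding a set to a larger collection can only cover fewer *new* elements.
    covered = set()
    for i in chosen:
        covered |= universe_sets[i]
    return len(covered)

def greedy_max(universe_sets, k):
    # Greedy: repeatedly add the set with the largest marginal gain.
    chosen = []
    for _ in range(k):
        gains = {i: coverage(chosen + [i], universe_sets) - coverage(chosen, universe_sets)
                 for i in range(len(universe_sets)) if i not in chosen}
        chosen.append(max(gains, key=gains.get))
    return chosen

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]  # toy ground set
picked = greedy_max(sets, k=2)
```

Here greedy first takes the size-4 set, then the set covering 3 new elements, ending with 7 of the 7 elements covered; the same template applies whenever the objective exhibits diminishing returns.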
Abstract: Many machine learning tasks can be posed as structured prediction, where the goal is to predict a labeling or structured object. For example, the input may be an image or a sentence, and the output is a labeling such as an assignment of each pixel in the image to foreground or background, or the parse tree for the sentence. Although marginal and MAP inference are NP-hard in the worst case for many of these models, approximate inference algorithms are remarkably successful, and as a result structured prediction is widely used.
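The MAP inference problem the abstract mentions can be made concrete with a tiny pairwise model over three "pixels", each labeled foreground (1) or background (0). Exhaustive search, as below, is exponential in the number of variables, which is exactly why the approximate inference algorithms discussed in the talk matter; all scores here are invented for illustration:

```python
import itertools

# Hypothetical per-pixel evidence: (score for label 0, score for label 1).
unary = [(0.2, 0.8), (0.6, 0.4), (0.1, 0.9)]
smoothness = 0.5  # pairwise reward when neighboring pixels agree

def score(labels):
    # Total model score: unary terms plus agreement bonuses along the chain.
    s = sum(unary[i][labels[i]] for i in range(len(labels)))
    s += sum(smoothness for i in range(len(labels) - 1)
             if labels[i] == labels[i + 1])
    return s

# Brute-force MAP: enumerate all 2^3 labelings and keep the best.
best = max(itertools.product([0, 1], repeat=3), key=score)
```

The smoothness term flips the middle pixel to foreground despite its unary preference, showing how structured prediction trades off local evidence against global consistency.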