Upcoming Events

Dec 04
11:00 AM
UTA 7.532
Abstract: A fundamental problem for all data-parallel applications is data locality. There are multiple levels of data locality, since a data block can be accessed in local memory or on local disk, within a rack or a data center, or across data centers. Scheduling with data locality is an affinity scheduling problem with an explosive number of task types and unknown task arrival rates. As a result, existing algorithms do not apply, and the recently proposed JSQ-MaxWeight algorithm (Wang et al., 2014) for two-level locality is delay-optimal only in a special heavy-traffic scenario.
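
For intuition only, here is a toy Python sketch of locality-aware join-shortest-queue routing. It is not the JSQ-MaxWeight algorithm from the talk; the server count and locality discount are invented purely to illustrate the tension between short queues and data locality.

```python
# Toy sketch of locality-aware join-shortest-queue (JSQ) routing.
# NOT the JSQ-MaxWeight algorithm discussed in the talk; it only illustrates
# the trade-off between routing to a short queue and routing to a server
# that holds the task's data locally. All constants are illustrative.
import random

NUM_SERVERS = 4
LOCAL_BONUS = 2          # assumed: a local task "costs" fewer queue slots
queues = [0] * NUM_SERVERS

def route(task_local_servers):
    """Pick a server for a task whose data lives on `task_local_servers`."""
    def effective_load(s):
        # Prefer servers that hold the data by discounting their queue length.
        discount = LOCAL_BONUS if s in task_local_servers else 0
        return queues[s] - discount
    best = min(range(NUM_SERVERS), key=effective_load)
    queues[best] += 1
    return best

# Simulate a few arrivals, each with data replicated on two random servers.
for _ in range(10):
    replicas = random.sample(range(NUM_SERVERS), 2)
    print("task with replicas", replicas, "-> server", route(replicas))
```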

Recent Events

20 Nov 2015

Abstract: The fast Johnson-Lindenstrauss transform has triggered a large amount of research into fast randomized transforms that reduce data dimensionality while approximately preserving geometry. We discuss uses of these fast transforms in three situations. In the first, we use the transform to precondition a data matrix before subsampling, and show how, for huge data sets, this leads to substantial acceleration in algorithms such as PCA and K-means clustering. The second situation reconsiders the common problem of sketching for regression.
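
As a rough illustration of the first use case, the sketch below projects a tall data matrix to a low dimension before running K-means. A dense Gaussian projection stands in for the structured fast transforms the talk discusses, and the matrix sizes and cluster count are assumptions.

```python
# Minimal sketch: random projection as a dimensionality-reduction step
# before clustering. A dense Gaussian projection is used here in place of
# a structured "fast" Johnson-Lindenstrauss transform; all sizes are assumed.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 1_000))        # tall data matrix (assumed sizes)

proj = GaussianRandomProjection(n_components=50, random_state=0)
X_low = proj.fit_transform(X)                   # pairwise geometry approximately preserved

labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X_low)
print(labels[:20])
```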

18 Nov 2015

Abstract: The Intensive Care Unit (ICU) is playing an expanding role in acute hospital care, but the value of many treatments and interventions in the ICU is unproven, and high-quality data supporting or discouraging specific practices are sparse. Much prior work in clinical modeling has focused on building discriminating models to detect specific coded outcomes (e.g., hospital mortality) under specific settings, or understanding the predictive value of various types of clinical information without taking interventions into account.

16 Nov 2015

Abstract: Ubiquitous sensors generate prohibitively large data sets. Large volumes of such data are nowadays generated by a variety of applications such as imaging platforms, mobile devices, surveillance cameras, social networks, and power networks, to name a few. In this era of data deluge, it is of paramount importance to gather only the data that is informative for a specific task, in order to limit the required sensing cost as well as the related costs of storing, processing, or communicating the data.
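
One illustrative way to "gather only the informative data", not necessarily the speaker's approach, is greedy measurement selection for a linear estimation task, scoring candidate sensors by the log-determinant of the accumulated information matrix. The sizes and data below are made up.

```python
# Illustrative sketch (not the speaker's method): greedily pick a small
# subset of candidate measurements that is most informative for a linear
# estimation task, scoring by the log-determinant of the information matrix
# (a D-optimal design criterion). All sizes and data are assumptions.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))   # 200 candidate sensor rows, 10 unknowns
budget = 15                          # how many measurements we can afford

selected = []
info = 1e-6 * np.eye(A.shape[1])     # small ridge so the log-det is defined
for _ in range(budget):
    gains = []
    for i in range(A.shape[0]):
        if i in selected:
            gains.append(-np.inf)
            continue
        a = A[i:i+1]
        sign, logdet = np.linalg.slogdet(info + a.T @ a)
        gains.append(logdet)
    best = int(np.argmax(gains))
    selected.append(best)
    info += A[best:best+1].T @ A[best:best+1]

print("chosen measurement indices:", selected)
```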

13 Nov 2015

Abstract: Submodular functions capture a wide spectrum of discrete problems in machine learning, signal processing and computer vision. They are characterized by intuitive notions of diminishing returns and economies of scale, and often lead to practical algorithms with theoretical guarantees.

In the first part of this talk, I will give a general introduction to the concept of submodular functions, their optimization and example applications in machine learning.
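
As a concrete example of such a practical algorithm with a guarantee, the sketch below runs the classic greedy method for maximizing a monotone submodular function under a cardinality constraint, instantiated on a made-up maximum-coverage problem; in this setting greedy achieves a (1 - 1/e) approximation.

```python
# Minimal sketch of the classic greedy algorithm for maximizing a monotone
# submodular function under a cardinality constraint, here instantiated as
# maximum coverage (coverage exhibits diminishing returns, hence submodular).
# The ground sets below are invented for illustration.
def coverage(selected_sets):
    """f(S) = number of elements covered by the chosen sets."""
    return len(set().union(*selected_sets)) if selected_sets else 0

def greedy(candidate_sets, k):
    """Greedy gives a (1 - 1/e) approximation for monotone submodular f."""
    chosen = []
    for _ in range(k):
        best = max(
            (s for s in candidate_sets if s not in chosen),
            key=lambda s: coverage(chosen + [s]) - coverage(chosen),
        )
        chosen.append(best)
    return chosen

sets = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({4, 5, 6}), frozenset({1, 6})]
picked = greedy(sets, k=2)
print(picked, "covers", coverage(picked), "elements")
```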

06 Nov 2015

Abstract: Many machine learning tasks can be posed as structured prediction, where the goal is to predict a labeling or structured object. For example, the input may be an image or a sentence, and the output is a labeling such as an assignment of each pixel in the image to foreground or background, or the parse tree for the sentence. Although marginal and MAP inference are NP-hard in the worst case for many of these models, approximate inference algorithms are remarkably successful, and as a result structured prediction is widely used.
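
To make the pixel-labeling example concrete, here is a sketch of one simple approximate MAP routine, iterated conditional modes (ICM), on a grid-structured model; the unary scores and smoothness weight are invented, and the talk does not prescribe this particular algorithm.

```python
# Hedged sketch: iterated conditional modes (ICM) as a simple approximate
# MAP inference routine on a grid model labeling each pixel as
# foreground (1) or background (0). Unary scores and the smoothness weight
# are invented for illustration only.
import numpy as np

rng = np.random.default_rng(0)
H, W = 30, 30
unary = rng.standard_normal((H, W))      # >0 favors foreground, <0 background (assumed)
smooth = 0.5                             # pairwise Potts weight encouraging agreement

labels = (unary > 0).astype(int)         # initialize from unaries alone
for _ in range(10):                      # a few ICM sweeps
    for i in range(H):
        for j in range(W):
            neighbors = [labels[x, y]
                         for x, y in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                         if 0 <= x < H and 0 <= y < W]
            # Score labeling pixel (i, j) as foreground vs. background.
            fg = unary[i, j] + smooth * sum(1 for n in neighbors if n == 1)
            bg = -unary[i, j] + smooth * sum(1 for n in neighbors if n == 0)
            labels[i, j] = 1 if fg >= bg else 0

print("foreground pixels:", int(labels.sum()), "of", H * W)
```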