Multimedia

Abstract: Sampling is a standard approach to big graph analytics. But a good sample need to represent graph properties of interest with a known degree of accuracy. This talk describes a generic tunable sampling framework, graph sample and hold, that applies to graph stream sampling in which edges are presented one at a time, and from which unbiased estimators of graph properties can be produced in post-processing. The talk also describes the performance of the method on various types of graph, including social graphs, amongst others.

Speaker Bio: Nick Duffield is a Professor in the Department of Electrical and Computer Engineering at Texas A&M University. From 2013 until 2014, he was a Research Professor at DIMACS (the Center for Discrete Mathematics and Theoretical Computer Science) at Rutgers University, New Jersey, USA. From 1995 until 2013, he worked at AT&T Labs-Research where he was a Distinguished Member of Technical Staff and an AT&T Fellow.

Prof. Duffield works on the acquisition, analysis and applications of Big Data to communication networks and beyond.

We consider the task of summing (integrating) a non-negative function over a discrete domain, e.g., to compute the partition function of a graphical model. It is known that in a probabilistic approximate sense, summation can be reduced to maximization over random subsets of the domain defined by systems of linear equations modulo 2 (parity constraints). Unfortunately, random parity constraints with many variables are computationally intractable, while random parity constraints with few variables have poor statistical performance.

Speaker Bio: Dimitris Achlioptas joined the Department of Computer Science of UC Santa Cruz in 2006, following his Ph.D. from the University of Toronto and a 7 year stint at Microsoft Research, Redmond. In theory, his expertise lies in the interaction between randomness and computation and his work on that topic has appeared in journals including Nature, Science, and the Annals of Mathematics. For that work has received an NSF CAREER award, a Sloan Fellowship, and an ERC Starting Grant. In practice, he likes to think about scalability questions and holds over twenty US patents on topics ranging from load balancing and cache optimization to web search personalization.

Abstract: Fitting a low-rank matrix to data is a fundamental and widely used primitive in machine learning. For most problems beyond the very basic PCA, theoretically sound methods have overwhelmingly combined statistical models of the data with convex optimization. As the size and dimensionality of data increases, this approach is overly computationally wasteful, not least because it represents an nr dimensional object with n^2 parameters. 
 
In this talk we present several of our recent results in understanding and designing much faster non-convex algorithms, and characterizing their statistical performance.
 
Prof. Sujay Sanghavi Bio: Sujay Sanghavi is an Associate Professor in the Department of Electrical and Computer Engineering at The University of Texas at Austin. Dr. Sanghavi joined UT ECE in July 2009. He got his PhD from UIUC, and a postdoc from MIT. His research interests lie at the intersection of two central challenges in modern systems: large-scale networking and high-dimensional data analysis, with a focus on algorithm design and evaluation. He received the NSF CAREER award in January 2010.

Anonymous messaging platforms, such as Secret, Whisper and Yik Yak, have emerged as important social media for sharing one's thoughts without the fear of being judged by friends, family, or the public. Further, such anonymous platforms are crucial in nations with authoritarian governments, where the right to free expression and sometimes the personal safety of the message author depends on anonymity. Whether for fear of judgment or personal endangerment, it is crucial to keep anonymous the identity of the users who initially posted sensitive messages in these platforms.  

In this talk, we consider two types of adversaries, one who has a snapshot of the spread of the messages at a certain time and another with collaborating spies among the users tracking all the messages that they receive. We pose the problem of designing a messaging protocol that spreads the message fast while keeping the identity of the source hidden from the adversary. We present an anonymous messaging protocol, which we call adaptive diffusion, and show that it spreads fast and achieves (near) optimal performance in obfuscating the source. In the process, we discover interesting properties of the Polya's urn processes for enhancing privacy, and prove a concentration result of Galton-Watson processes to analyze the performance of the proposed protocol. 

Bio:  Sewoong Oh is an Assistant Professor of Industrial and Enterprise Systems Engineering at UIUC. He received his PhD from the department of Electrical Engineering at Stanford University in 2011. Following his PhD, he worked as a postdoctoral researcher at Laboratory for Information and Decision Systems (LIDS) at MIT. He was co-awarded the Kenneth C. Sevcik outstanding student paper award at the Sigmetrics 2010 and the best paper award at the SIGMETRICS 2015. 

Pages