Statistical inference of information measures beyond i.i.d.

Monday, October 08, 2018
EER 3.646 - Blaschke Conference Room

Recent years have witnessed significant progress in entropy estimation, in particular in the large alphabet regime. Concretely, there exist efficiently computable information theoretically optimal estimators whose performance with n samples is essentially that of the maximum likelihood estimator with n log(n) samples, a phenomenon termed ``effective sample size boosting''. Generalizations to processes with memory (estimation of the entropy rate) and continuous distributions (estimation of the differential entropy) have remained largely open. This talk is about the challenges behind those generalizations and recent progress in this direction. For estimating the entropy rate of a Markov chain, we show that when the mixing time is not too slow, the information theoretic limit is S^2/log(S) samples, where S is the size of the state space. In contrast, the empirical entropy rate requires S^2 samples to achieve consistency even if the Markov chain is memoryless. We propose a general approach to achieve the S^2/log(S) sample complexity, and illustrate our results through estimating the entropy rate of the English language from the Google 1 Billion Word Dataset. For differential entropy estimation, we characterize the minimax behavior over Lipschitz balls in both the compactly supported case and sub-Gaussian case without assuming the usual assumption that the density is bounded away from zero, and show that a fixed-k nearest neighbor estimator adaptively achieves the minimax rates up to logarithmic factors without knowing the smoothness of the density. The "effective sample size boosting" phenomenon holds in both the Markov case and the case of continuous distributions.


Assistant Professor
University of California, Berkeley

Jiantao Jiao is an Assistant Professor in the Department of Electrical Engineering and Computer Sciences at University of California, Berkeley. He received his B.Eng. degree in Electronic Engineering from Tsinghua University, Beijing, China in 2012, and his M.Sc. and Ph.D. degrees in Electrical Engineering from Stanford University in 2014 and 2018, respectively. His research interests are in statistical machine learning, high-dimensional and nonparametric statistics, mathematical programming, applied probability, information theory, and their applications. He is a co-founder of EduBrain (, building AI systems for education.