High-dimensional Models for Climate Data Analysis

Friday, April 20, 2012

Climate data presents unique challenges for machine learning due to
its spatiotemporal nature and high dimensionality. In this talk, I
will discuss two applications of high-dimensional modeling for
climate data analysis. The first application is abrupt change
detection, with emphasis on detecting significant droughts in the
past century. The problem is formalized as a graph-structured
linear program (GSLP) and solved using KL-ADM, a novel parallel
inexact alternating directions method with Bethe-entropy-based
augmentation. KL-ADM provably solves GSLPs and is
efficient in practice. When applied to precipitation data over the
past century, it detects all major droughts worldwide. The second
application is predictive modeling of land variables based on
ocean variables. The problem is formalized as a high-dimensional
regression problem with hierarchical sparse regularization.
Consistency and rates of convergence of the estimator will be
discussed. When applied to land temperature and precipitation
regression problems from nine regions worldwide, the
proposed methodology is shown to consistently outperform baseline