High Dimension Datasets

Succinct Representations of Big Data: Binary Embeddings

Low-distortion embeddings, which map high-dimensional points into a low-dimensional space, play an important role in storage, information retrieval, and machine learning for modern large-scale datasets. Xinyang Yi and Profs. Constantine Caramanis and Eric Price develop novel algorithms with the best-known results for this problem.
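
To make the idea of a binary embedding concrete, here is a minimal sketch of one textbook construction: take the signs of random Gaussian projections. It is an illustration only, not the algorithm developed in this project, and the names `binary_embed` and `estimate_angle` are hypothetical helpers introduced for the example.

```python
import numpy as np

def binary_embed(X, n_bits, seed=0):
    """Map each row of X (n_points x dim) to n_bits sign bits.

    Illustrative assumption: bits are signs of random Gaussian
    projections (random hyperplanes), a standard baseline.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[1], n_bits))  # one random hyperplane per bit
    return (X @ A) >= 0                            # boolean array, n_points x n_bits

def estimate_angle(b1, b2):
    """Estimate the angle (radians) between two points from their bit codes.

    For random hyperplanes, the probability that a bit differs
    equals angle / pi, so we rescale the fraction of differing bits.
    """
    return np.pi * np.mean(b1 != b2)

# Two unit vectors in 256 dimensions separated by an angle of pi/3.
theta = np.pi / 3
x = np.zeros(256); x[0] = 1.0
y = np.zeros(256); y[0] = np.cos(theta); y[1] = np.sin(theta)

codes = binary_embed(np.vstack([x, y]), n_bits=4096)
print("true angle:", theta, "estimated:", estimate_angle(codes[0], codes[1]))
```

Each point is stored as 4096 bits instead of 256 floating-point numbers, and the normalized Hamming distance between two codes approximates the angle between the original points. The estimate tightens as the number of bits grows.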

Bayesian Sparse Principal Component Analysis

Many real-world high-dimensional datasets can be reasonably represented as linear combinations of a few sparse vectors. In such cases, a succinct representation that uses only a few selected variables is highly desirable. A Bayesian setup is useful because well-designed, domain-specific priors can compensate for having only a limited number of high-dimensional data points.
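
One way to write down the kind of model described above is sketched here. All symbols are assumptions introduced for illustration, and the spike-and-slab prior is just one common way to encode sparsity, not necessarily the prior used in this project.

```latex
% Each data point x_i in R^p is a noisy linear combination
% of k sparse component vectors v_1, ..., v_k.
x_i = \sum_{j=1}^{k} z_{ij}\, v_j + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2 I_p),
\qquad i = 1, \dots, n

% Illustrative sparsity prior (spike-and-slab) on each entry v_{jl}:
% with probability 1 - \theta the entry is exactly zero.
v_{jl} \mid \gamma_{jl} \sim \gamma_{jl}\, \mathcal{N}(0, \tau^2)
  + (1 - \gamma_{jl})\, \delta_0,
\qquad \gamma_{jl} \sim \mathrm{Bernoulli}(\theta)
```

Here n is the number of data points and p the ambient dimension, with n typically much smaller than p. A domain-specific prior would replace the generic Bernoulli inclusion probability with knowledge about which variables are likely to be active.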
