Understanding Sparse JL for Feature Hashing
Speaker: Meena
Title: Understanding Sparse JL for Feature Hashing
Date: 18 Nov 2019 5:30pm-7:00pm
Location: Maxwell-Dworkin 123
Food: Indian
Audience background: Basic probability would be helpful
Talk style: Whiteboard talk
Abstract: Johnson-Lindenstrauss (JL) transforms are a standard dimensionality-reduction scheme for projecting high-dimensional vectors into a lower-dimensional space while approximately preserving Euclidean distances. In fact, “feature hashing”, a technique commonly used in machine learning applications, boils down to a JL transform with specific parameter settings. Interestingly, there is a gap between the performance predicted by traditional theoretical results and the performance achievable in practice on feature vectors. In this talk, I will give an overview of classical results on the performance of JL transforms in the theory literature, as well as finer-grained results on the performance of feature hashing in the ML literature. I will then discuss my recent work on analyzing the performance of JL transforms on feature vectors, which generalizes and unifies these results.
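For concreteness, here is a minimal NumPy sketch of feature hashing viewed as a sparse JL transform: the projection matrix has a single random ±1 entry per column, chosen by a bucket hash h and a sign hash s. The function name feature_hash, the target dimension m, and the seed are illustrative choices, not part of the talk.

import numpy as np

def feature_hash(x, m, seed=0):
    """Project x in R^n down to R^m via feature hashing.

    Coordinate i is sent to a random bucket h(i) with a random
    sign s(i); equivalently, this applies a sparse JL matrix with
    exactly one +/-1 entry per column.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    buckets = rng.integers(0, m, size=n)      # h: [n] -> [m]
    signs = rng.choice([-1.0, 1.0], size=n)   # s: [n] -> {-1, +1}
    y = np.zeros(m)
    np.add.at(y, buckets, signs * x)          # y[h(i)] += s(i) * x[i]
    return y

# Squared norms are preserved in expectation, E[||y||^2] = ||x||^2,
# so the printed values should be approximately equal.
x = np.random.default_rng(1).normal(size=10_000)
y = feature_hash(x, m=256)
print(np.linalg.norm(x)**2, np.linalg.norm(y)**2)

In practice, feature hashing computes h(i) and s(i) by hashing the feature name rather than storing random arrays, but the resulting linear map is the same; how well it concentrates around the expectation, as a function of m and the shape of x, is exactly the performance question the talk addresses.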