Using the SVD to find the needle in the haystack

Sitting on a data set with too many variables? The SVD can be a valuable tool when you're trying to sift through a large group of continuous variables. The image is CC by Jonas in China.

It can feel like a daunting task to find the few variables that you actually “need” when you have more than 20 to choose from. In this article I describe how the singular value decomposition (SVD) can be applied to this problem. While the traditional approach to using the SVD isn’t that applicable in my research, I recently attended Jeff Leek’s Coursera class on Data Analysis, which introduced me to a new way of using it. In this post I expand somewhat on his ideas, provide a simulation, and hopefully give you an additional tool for exploring data.