I find it amusing that the average daily precipitation in Kitsap County from 2006 through 2009 correlates well with the number of lawyers in the Northern Mariana Islands.
There is also a strong correlation between the per capita consumption of cheese in the United States and the number of people who died by becoming tangled in their bedsheets — at least for the years 1999 to 2009.
As a science writer, I’m constantly reading reports that mention correlations, such as the correlation between smoking and lung cancer or the correlation between global warming and atmospheric carbon dioxide concentration. Finding such correlations is often a key step to explaining important observations, whether close to home or across the universe.
Now Tyler Vigen has flipped the idea of correlations around, looking for correlations between anything and everything, all for the sake of amusement. He calls his new website “Spurious Correlations.”
The examples above are taken from Vigen’s website, which includes a “discover” page that allows you to search out and graph your own correlations from a long list of unrelated variables. Try it; it’s fun. You can also sign up for an RSS feed to check out a new spurious correlation each day.
Vigen, a geospatial intelligence analyst for the Army National Guard and a graduate student at Harvard Law School, is not a mathematician.
As he tells NPR’s Scott Simon, it is easy to find correlations when the number of data points is quite small. The question becomes whether the correlations are statistically significant, and that’s where Vigen’s spurious correlations become nothing more than a chuckle.
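For readers who want to see the small-sample effect for themselves, here is a minimal Python sketch. It is my own illustration, not Vigen’s method: the helper names (`pearson`, `strong_pairs`), the 40 random series, the series lengths of 10 and 200, and the 0.6 cutoff are all assumptions chosen for the demonstration. It generates unrelated random series, compares every pair, and counts how many pairs look strongly correlated purely by chance.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strong_pairs(num_series, length, threshold, rng):
    """Build unrelated random series and compare every pair.

    Returns (number of pairs with |r| above threshold, largest |r| seen).
    All parameters are illustrative assumptions, not values from the article.
    """
    series = [[rng.gauss(0, 1) for _ in range(length)]
              for _ in range(num_series)]
    rs = [abs(pearson(a, b)) for a, b in itertools.combinations(series, 2)]
    return sum(r > threshold for r in rs), max(rs)


rng = random.Random(0)  # fixed seed so the run is repeatable

# 40 series give 780 pairs; only the number of data points per series changes.
short_count, short_max = strong_pairs(40, 10, 0.6, rng)
long_count, long_max = strong_pairs(40, 200, 0.6, rng)

print(f"10 points per series:  {short_count} pairs with |r| > 0.6, max |r| = {short_max:.2f}")
print(f"200 points per series: {long_count} pairs with |r| > 0.6, max |r| = {long_max:.2f}")
```

With only 10 points per series, a fair number of pairs should clear the cutoff even though every series is pure noise; with 200 points, few or none should. That gap is roughly what a significance test is meant to account for.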
In a YouTube video, Vigen states:
“The purpose of the Spurious Correlations I show is not to say the data is ambiguous and you can interpret it however you want. No! Statistical data can show correlations, and then it’s up to us rational thinkers to establish whether there is actually a connection between the variables or if it’s merely a coincidence.”
Since he is not a statistician, Vigen says he will leave it to others to produce a video to help people understand how to measure statistical significance. One book he recommends is Nate Silver’s “The Signal and the Noise: Why So Many Predictions Fail — but Some Don’t.” Silver, of course, is the guy who earned a reputation for baseball predictions using statistics before he moved into the political world, where he correctly predicted the 2008 presidential election results for 49 of the 50 states.