Monday, February 11, 2013

Anscombe's Quartet

We often rely on summary statistics such as "average", "mean", "variance", and "standard deviation" when measuring the performance of applications and services. Recently a friend of mine pointed out that relying only on calculated stats can be quite misleading. He pointed me to the following article on Wikipedia.

http://en.wikipedia.org/wiki/Anscombe's_quartet

Anscombe's quartet comprises four datasets that have nearly identical simple statistical properties, yet appear very different when graphed.
By just looking at the data sets, it's impossible to predict that the graphs would be so different. Only when we plot the data points on a graph can we see how the data behaves. Another testimony to the power of data visualization!
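
To see how close the numbers really are, here is a small sketch in Python that computes the summary statistics for each of the four datasets. The data values are the commonly published ones for the quartet; the names QUARTET, pearson_r and summarize are just helpers made up for this illustration, and only the standard library is used.

```python
from math import sqrt
from statistics import mean, variance

# Commonly published values for Anscombe's quartet.
# Datasets I, II and III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
X4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (X4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Return (mean x, sample variance x, mean y, sample variance y, correlation)."""
    return (mean(xs), variance(xs), mean(ys), variance(ys), pearson_r(xs, ys))


for name, (xs, ys) in QUARTET.items():
    mx, vx, my, vy, r = summarize(xs, ys)
    print(f"{name:>3}: mean_x={mx:.2f} var_x={vx:.2f} "
          f"mean_y={my:.2f} var_y={vy:.2f} r={r:.3f}")
```

All four rows should come out almost the same: a mean of 9 and variance of 11 for x, a mean of about 7.50 and variance of about 4.12 for y, and a correlation of roughly 0.816. Nothing in those numbers hints that one dataset is a curve, one is a straight line with a single outlier, and one is a vertical stack of points with one far-off value. For that you have to plot them.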