What is Anscombe’s Quartet?

Purpose of Anscombe’s Quartet

Anscombe’s quartet is a set of four datasets that share nearly identical descriptive statistics (means, variances, correlation coefficients, R-squared values, and fitted linear regression lines) yet look strikingly different when drawn as scatter plots.

The datasets were created by the statistician Francis Anscombe in 1973 to demonstrate the importance of visualizing data and to show that summary statistics alone can be misleading.

Each of the four datasets in Anscombe’s quartet consists of 11 (x, y) pairs. When plotted, each dataset shows a different relationship between x and y, with its own pattern of variability and a different apparent strength of association. Despite these visual differences, the datasets have nearly the same summary statistics: the same mean and variance of x and of y, the same correlation coefficient between x and y, and the same linear regression line.
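The claim above can be checked numerically. The following is a minimal sketch, using only the Python standard library, that computes the summary statistics for each dataset. The data values are those published in Anscombe's 1973 paper; the names `X_123`, `QUARTET`, and `summarize` are illustrative choices made for this example, not part of any library.

```python
from statistics import mean

# Values as published in Anscombe (1973). Datasets I-III share one x column.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]

QUARTET = {
    "I": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, sample variances, Pearson r, and the least-squares line."""
    n = len(x)
    mean_x, mean_y = mean(x), mean(y)
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),  # sample variance
        "var_y": syy / (n - 1),
        "r": sxy / (sxx * syy) ** 0.5,  # Pearson correlation coefficient
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


for name, (x, y) in QUARTET.items():
    s = summarize(x, y)
    print(
        f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
        f"var_x={s['var_x']:.2f} var_y={s['var_y']:.2f} r={s['r']:.3f} "
        f"fit: y = {s['intercept']:.2f} + {s['slope']:.3f}x"
    )
```

Every row should report an x mean of 9, a y mean of about 7.50, a correlation of about 0.816, and a fitted line close to y = 3 + 0.5x. The numbers alone give no hint that the four scatter plots look completely different.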

Anscombe’s quartet

Anscombe’s Quartet, comprising four datasets with nearly identical summary statistics, underscores the limitations of relying solely on numerical metrics.

This article explores the quartet’s datasets, emphasizing the importance of visualizing data for a comprehensive understanding.

