When I studied at the University of Chicago Booth School of Business, one of my favorite teachers was my statistics lecturer. He gave us some invaluable advice for putting data into a graph: the eyeball test.
Analysis is about comparison. But if you’ve ever tried to look at a list of numbers, you’ll know that comparing them is not as easy as it sounds. It’s easy to get lost in a sea of data, to feel adrift in numbers. The reason for this is simple: we tend to trust our brains over our own eyes. We overthink when we really need to step back, take a deep breath, and get a fresh look.
There’s a simple secret for how to do this.
Graphs help guide our eyes to the key points in data. Together, our eyes and our graphs are the strongest tools we have for comparison. Graphs help differences and relationships in data stand out visually. And one of the best graphs at our disposal is the scatter plot.
Let’s see how graphs work by asking a question.
Does being rich make you live longer?
What does wealth have to do with life expectancy? Do wealthy people live longer because of improved hygiene and nutrition? Or does an abundance of money inevitably lead to a shorter, lazier life in the lap of luxury? Before collecting any data, it’s actually a good idea to predict the kind of graph we expect.
Without a prediction (or hypothesis), it’s easy to fall into the trap of trusting data too much, believing it’s smarter than we are. It’s easy to just look at a bunch of data and think, “That sounds about right!” But if we set out with certain expectations and the graph turns out differently, we can ask why. This may lead to an unexpected discovery.
Let’s look at the relationship between wealth and lifespan. The graph below uses GDP per capita* for wealth and average life expectancy at birth for lifespan. The size of each bubble represents the size of each country’s population.
We can clearly see here that, for most countries, there is a linear relationship: average lifespan increases with wealth. In the world of statistics, two variables that move together like this are said to be correlated (covariance is a closely related measure of the same tendency). The wealthier the country, the longer its people tend to live.
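To put a number on how closely two variables move together, statisticians often compute the Pearson correlation coefficient, which ranges from -1 (a perfect downward line) to +1 (a perfect upward line). The sketch below is only an illustration: the function name `pearson_r` is my own, and the figures are made-up placeholders, not the real country data behind the graph.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations (the covariance, up to a constant factor)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (the variances, up to the same factor)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")

    return cov / sqrt(var_x * var_y)


# Hypothetical, made-up values for illustration only (NOT real statistics)
gdp_per_capita = [1_000, 5_000, 10_000, 20_000, 40_000]
life_expectancy = [55.0, 65.0, 70.0, 75.0, 80.0]

r = pearson_r(gdp_per_capita, life_expectancy)
print(f"correlation between wealth and lifespan: r = {r:.2f}")
```

A value close to +1 would back up what the eyeball test already suggests: the points rise together along something like a straight line. A value near 0 would mean the eye is seeing a pattern that the numbers do not support.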
The scatter plot is the king of graphs.
Everyone knows line graphs, pie charts, and bar graphs, which typically show how a single variable changes or breaks down. The scatter plot is different. It shows the relationship between two variables – a truly revolutionary invention in the history of graphs.
If we just stare at numbers and ask “Why?”, we’ll rarely see how one variable might drive another. A scatter plot makes such relationships visible because it lets us compare two variables graphically, though a pattern on the page is a clue to cause and effect, not proof of it. This is why scatter plots are so popular in science (by some estimates, they account for 70–80% of graphs). They also work nicely for concepts such as economies of scale (the larger the scale, the lower the cost) or the experience curve (the more we make, the lower the cost).
The next time you find yourself adrift in data, try putting it into a scatter plot (an easy task with Excel). Then use your built-in secret weapon to interpret the data: your own eyes. Soon, you will no doubt catch sight of the shore.
*GDP per capita is used here as a measure of the average income per citizen.