Monday, 3 October 2016

Avoiding y-axis issues

The issues with a y-axis that does not start at zero are well known, so I won't go into detail on them here. To summarise:
  • A y-axis that doesn't start at zero is problematic with any chart type where the height above the axis encodes data, i.e. bar, column and area charts.
  • Line charts can be used with a non-zero axis, because the reader follows the slope of the line rather than its height above the axis. The key is to choose a scale that highlights the variation in the data.
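The two rules above can be sketched as a small helper that picks y-axis limits by chart type. This is only an illustration: the function name `y_axis_limits`, the `chart_type` strings and the `padding` parameter are my own assumptions, not part of any charting library.

```python
def y_axis_limits(values, chart_type, padding=0.1):
    """Return (low, high) y-axis limits for a chart.

    chart_type and padding are hypothetical parameters for this sketch.
    """
    lo, hi = min(values), max(values)
    if chart_type in ("bar", "column", "area"):
        # Height above the axis encodes the value, so zero must be on the axis.
        return (min(0, lo), max(0, hi))
    # Line chart: frame the variation, with a little room above and below.
    # Fall back to a margin of 1 when the data is completely flat.
    margin = (hi - lo) * padding or 1
    return (lo - margin, hi + margin)


# Hypothetical values, purely for illustration.
print(y_axis_limits([950, 1000], "column"))
print(y_axis_limits([950, 1000], "line"))
```

The limits returned could then be passed to whatever charting tool is in use.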
Sometimes we will want to plot data that breaks this rule. What do we do when we run into a y-axis problem? My answer is that it is often possible to reshape the data to work around the issue. Let me explain by way of a (very) made-up example.

The Haggis is a noble beast*. It spends its life walking around the mountain it was born on in one direction. For that reason the legs on one side of its body are shorter than on the other - perfect for life on a slope. The Clockwise Haggis, with short right legs and long left legs, is native to Scotland and until recently has been thriving. However, the introduction of the continental Anti-clockwise Haggis has started to lead to a decline in numbers.

To illustrate the threat, naturalists produced the following chart:

Whilst the trend is stark - the native species is being replaced by the invader - the scale, with a non-zero minimum, is misleading, suggesting a much greater proportion of Anti-clockwise Haggis than is really the case.

Correcting the scale, however, gives us a problem. The trend is all but lost in the greater population size of the Clockwise Haggis.



My solution to this problem is to change the metric that we use to tell the story. What we're really interested in is the relative change in the population of the two species. What happens if we chart that?


Taking this approach makes the worrying trend clear. Each quarter since Q2 2014 has seen the native population decline roughly in proportion to the growth of the introduced species. By changing the metric we chart we're able to tell a clear story without being open to any accusations of creating a misleading picture.
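One way to read "relative change" is the change in each population since a baseline quarter. A minimal sketch of that reshaping follows; the counts are invented for illustration and are not the figures behind the charts in this post, and `change_from_baseline` is a hypothetical helper name.

```python
def change_from_baseline(counts):
    """Return each value's change relative to the first value in the series."""
    baseline = counts[0]
    return [count - baseline for count in counts]


# Hypothetical quarterly counts, made up for this sketch.
clockwise = [10000, 9950, 9890, 9820]
anticlockwise = [0, 40, 110, 170]

clockwise_change = change_from_baseline(clockwise)
anticlockwise_change = change_from_baseline(anticlockwise)

print(clockwise_change)      # [0, -50, -110, -180]
print(anticlockwise_change)  # [0, 40, 110, 170]
```

Both series now start at zero, so a zero-based axis shows the two trends at a comparable scale, which is what makes the story readable.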

Inevitably, changing the metric will come at the cost of some information, in this case the size of the decline relative to the total population. This is unavoidable. If we feel that the additional information is important then we can consider using a second line chart to show this information.
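If a second chart is wanted, one candidate metric is the introduced species' share of the combined population, which puts the change back in the context of the total. The sketch below assumes that interpretation; the counts are again invented and `share_of_total` is a hypothetical helper name.

```python
def share_of_total(native, introduced):
    """Return the introduced species' share of the combined population per period."""
    shares = []
    for n, i in zip(native, introduced):
        total = n + i
        if total == 0:
            raise ValueError("combined population is zero; share is undefined")
        shares.append(i / total)
    return shares


# Hypothetical counts, made up for this sketch.
native = [9000, 8900, 8750]
introduced = [1000, 1100, 1250]

for share in share_of_total(native, introduced):
    print(f"{share:.1%}")
```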
