In statistics, the correlation coefficient or Pearson’s correlation is a measure the strength and direction of association that exists between two continuous variables. The value of correlation, r, is always between +1 and –1. To interpret its value, see which of the following values your correlation r is closest to:
- Exactly –1. A perfect downhill (negative) linear relationship
- –0.70. A strong downhill (negative) linear relationship
- –0.50. A moderate downhill (negative) relationship
- –0.30. A weak downhill (negative) linear relationship
- 0. No linear relationship
- +0.30. A weak uphill (positive) linear relationship
- +0.50. A moderate uphill (positive) relationship
- +0.70. A strong uphill (positive) linear relationship
- Exactly +1. A perfect uphill (positive) linear relationship
Don’t make the mistake of thinking that a correlation of –1 is a bad thing, indicating no relationship. Just the opposite is true! A correlation of –1 means the data are lined up in a perfect straight line, the strongest negative linear relationship you can get. The “–” (minus) sign just happens to indicate a negative relationship, a downhill line. Most statisticians like to see correlations beyond at least +0.5 or –0.5 before drawing any conclusions. Additional information on Pearson’s correlation can be found here.
Putting it All to Use
This is all great knowledge, but how can we apply this in a business environment. When I worked for a casino, we kept a daily record of Coin In (gross sales), High Temp, Low Temp and Fuel price in an Excel spreadsheet and tried to find a correlation between these data points. Try find the correlation with this data.
With Power BI and R, we can make this association all that much faster and with slicer for interactivity to drill into the data points. I first downloaded the R Correlation Plot custom visual located here and then Excel spreadsheet from above imported into Power BI. I was then able easily identify the correlations in a matter of minutes.
Positive correlations are identified with increasingly dark blue circles, negative correlation is red. The greater the bigger and darker the circle. Selecting “Nov” from the calendar slicer, you will notice that the intersection between “Low Temp” has a positive correlation (blue circle), so the lower the temperature the less you will realize in coin in for the that month.
Selecting the “Jul” from the calendar slicer, and notice that the correlation is different.
In this case, the correlation is zero between temperature and coin in.
Hope this helps you quickly identify business insights in your organization.