When we calculate expected energy consumption, we don’t expect actual consumption to be an exact match. For any given stream of consumption there will always be differences between expected and actual, and the magnitudes of the errors will vary, with a few bigger than the rest. The degree of dispersion will differ from one case to another. I know for example that if I were modelling electricity used in an air compressor using a formula based on air throughput, I would expect relatively little spread between expected and actual values; whereas when estimating expected heating fuel consumption with a degree-day-based model I can expect much more random error. This is because compressing air is a simple and deterministic process, whereas the relationship between weather and heating-fuel use is complicated by factors like occupant behaviour, changeable patterns of use, unmeasured influences (especially wind), and the proximity or otherwise of the weather station used.

We use the term ‘standard deviation’ (SD) to quantify the dispersion or degree of error in our estimates. I don’t need to go into detail here about how SD is calculated. The key thing is that when we look at our history of expected and actual consumptions, normally about two-thirds of the expected values will have been in error by less than one SD, and about 95% will be within +/- 2 SD. To put it another way, when you see a new actual consumption that exceeds its expected value by 2 SD or more, it’s significant because only about 2.5% of results will fall outside the +2 SD boundary purely by chance.
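As a minimal sketch of that test, the Python below estimates the SD from a history of expected and actual values and checks a new reading against the +2 SD boundary. The function names and the figures are made up for illustration, and it assumes the SD is taken as the sample standard deviation of the differences between actual and expected; a particular monitoring package may calculate it differently.

```python
from statistics import stdev

def residual_sd(expected, actual):
    """Sample standard deviation of (actual - expected) over the history."""
    residuals = [a - e for e, a in zip(expected, actual)]
    return stdev(residuals)

def is_significant_excess(expected, actual, sd, threshold=2.0):
    """True when actual exceeds expected by at least `threshold` SDs."""
    return (actual - expected) >= threshold * sd

# Illustrative history only (say, weekly kWh); not real data.
expected_history = [100.0, 120.0, 90.0, 110.0, 105.0, 95.0]
actual_history = [104.0, 117.0, 92.0, 108.0, 109.0, 92.0]

sd = residual_sd(expected_history, actual_history)

# A new week with expected 100 kWh: is 108 kWh a significant excess? Is 105?
big_excess = is_significant_excess(100.0, 108.0, sd)
small_excess = is_significant_excess(100.0, 105.0, sd)
print(f"SD = {sd:.2f}; 108 flagged: {big_excess}; 105 flagged: {small_excess}")
```

With these invented figures the SD comes out a little under 3.4 kWh, so an excess of 8 kWh is flagged while an excess of 5 kWh is treated as normal scatter.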

I use this as a refinement to the overspend league table. Any overspend that falls within two standard deviations of expected can usually be disregarded, at least on the first occurrence. Of course, a persistent lower-level excess might still warrant investigation.
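One way to picture that refinement is sketched below: streams whose excess is within 2 SD are dropped before the rest are ranked by the cost of the excess. The field names (`expected`, `actual`, `sd`, `unit_cost`) and all the figures are hypothetical, and the sketch does not attempt to track persistent lower-level excesses from one period to the next.

```python
def significant_overspends(streams, threshold=2.0):
    """Keep streams whose excess is at least `threshold` SDs, ranked by cost."""
    flagged = []
    for s in streams:
        excess = s["actual"] - s["expected"]
        if excess >= threshold * s["sd"]:
            flagged.append({**s, "excess_cost": excess * s["unit_cost"]})
    # Costliest significant overspend first.
    return sorted(flagged, key=lambda s: s["excess_cost"], reverse=True)

# Illustrative streams only; quantities in kWh, unit_cost in currency per kWh.
streams = [
    {"name": "Compressor", "expected": 1000.0, "actual": 1060.0,
     "sd": 20.0, "unit_cost": 0.15},
    {"name": "Heating gas", "expected": 5000.0, "actual": 5400.0,
     "sd": 300.0, "unit_cost": 0.04},
    {"name": "Chiller", "expected": 2000.0, "actual": 2300.0,
     "sd": 100.0, "unit_cost": 0.15},
]

league_table = significant_overspends(streams)
for row in league_table:
    print(row["name"], round(row["excess_cost"], 2))
```

In this made-up case the heating gas overspend costs more than the compressor's, but it is within 2 SD of expected and so drops out of the table, whereas the compressor's smaller excess is well outside its normal scatter and stays in.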

This brings me to the question of how accurate my expected-consumption model is. Specifically, there’s a tendency to reject regression models that do not meet a certain threshold in terms of “R-squared” value. If that means nothing to you, don’t worry, because it’s a wrong-headed approach anyway. As far as I am concerned, an expected-consumption model is sound if it uses an appropriate driving factor (or factors) and has a plausible physical form (such as a straight-line relationship when there’s a single driving factor). If a fundamentally sound model proves a bit inaccurate in use, that simply means it can detect only relatively large faults. But that’s better than nothing, and by working to improve the model you will progressively reduce its SD and make it better at detecting smaller problems.
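To illustrate how a model's SD sets the size of fault it can detect, here is a rough sketch that fits a straight line to consumption against a single driving factor by ordinary least squares and reports 2 SD as the smallest excess worth flagging. The data are invented, the function names are hypothetical, and dividing by n - 2 when computing the SD is an assumption (the usual convention for a fitted straight line), not something prescribed above.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def model_sd(x, y, slope, intercept):
    """SD of actual minus expected, using n - 2 degrees of freedom."""
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return sqrt(sum(r * r for r in residuals) / (len(x) - 2))

# Illustrative data only: weekly degree days and heating fuel in kWh.
degree_days = [50.0, 100.0, 150.0, 200.0, 250.0]
fuel_kwh = [730.0, 1140.0, 1760.0, 2140.0, 2730.0]

slope, intercept = fit_line(degree_days, fuel_kwh)
sd = model_sd(degree_days, fuel_kwh, slope, intercept)
smallest_detectable_excess = 2.0 * sd
print(f"expected = {intercept:.0f} + {slope:.1f} x degree days; "
      f"SD = {sd:.1f} kWh; flag excesses above {smallest_detectable_excess:.0f} kWh")
```

Anything that tightens the fit, such as a better-matched weather station or correcting for changes in occupancy, reduces the SD and therefore lowers the excess at which a fault becomes visible.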