After over a year of blogging and conference presentations on the topic - much of which has focused on the technical rather than practical explanations - I want to draw these to a close here with some straightforward summaries for managers of agile teams. While I think most managers will benefit from looking more deeply at why this advice applies (and its limits based on the assumptions underlying it), I'm also clear that a summary of what to do is the most helpful.
My thinking has evolved over the year, since the preparation and publication of the guide to Kanban - Essential Kanban Condensed (downloadable here). I started this series of blogs in April 2016 to provide more detail on the subject than it was possible to include in the condensed guide, and since that time I've had the good fortune to discuss Cost of Delay in some depth with some key thinkers on the subject. For this I'm particularly grateful to Don Reinertsen [1,8], David Anderson, Joshua Arnold [4,7], Chris Matts and Dave Snowden [9], who have taken the time at various points this year to teach, cajole, contradict or endorse various things that I was saying. While it is certainly not possible to reconcile all the thoughts and published works of the authors who have used and modified Don Reinertsen's original work on the subject, it is now at least possible to summarise my own (interim) conclusions!
Part 1: Understanding Cost of Delay and its Use in Kanban
Part 2: Delay Cost and Urgency Profiles
Part 3: How to Calculate WSJF
Part 4: WSJF - Should you divide by Lead Time or Size?
Part 5: A "Qualitative" Formula for WSJF?
Part 6: Time is an Asset - Delay is a Cost (this article)
In this article I look first at some key terms we have used. I then ask, and hopefully answer, "why is Cost of Delay important?"; "can Cost of Delay be quantified?"; "when could we use WSJF?"; and finally "what next?".
Terminology
Three useful terms:
Unfortunately there is not unanimity on terminology, but these terms are important. The first two terms below follow Don Reinertsen's work. The third, a term Joshua Arnold used when correcting the dimensionality of SAFe's use of Cost of Delay, was discussed in the previous blog [10]. (Other terms are introduced in the previous blogs and are available in the glossary of Essential Kanban Condensed.)
Cost of Delay (CoD) - the rate of decay of value per period of delay. Units for example could be dollars per week. Due to the confusion possible with the next term in this list, I frequently use Urgency (U) as a synonym for CoD.
Delay Cost - the total loss of value due to a delay of known duration. For example, "The release was delayed by 7 weeks, which resulted in a Delay Cost of $150,000".
Time Criticality - a relative measure of how quickly all the value of an item would be lost. Units are the reciprocal of time. Usually this is used as an informal and relative term. It is useful though to compare the concepts of Time Criticality (which is independent of value) with CoD/Urgency, which quantifies value lost per time. For example eating the lettuce approaching its use-by date in the fridge may have the same Time Criticality, but very different Urgency, compared to paying the final demand on the mortgage on the house!
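The relationship between these three terms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: CoD is treated as constant over the delay, Time Criticality is taken to be CoD divided by the total value at stake (one plausible reading of "how quickly all the value would be lost"), and the lettuce and mortgage figures are invented for the example.

```python
def delay_cost(cod_per_week, delay_weeks):
    """Delay Cost for a delay of known duration, assuming CoD stays constant."""
    return cod_per_week * delay_weeks


def time_criticality(cod_per_day, total_value):
    """Fraction of the total value lost per day (units: 1/day).

    Assumption: Time Criticality = CoD / total value at stake.
    """
    return cod_per_day / total_value


# The release example from the text: 7 weeks of delay cost $150,000,
# which implies an average CoD of about $21,429 per week.
implied_cod_per_week = 150_000 / 7
print(f"Implied CoD: ${implied_cod_per_week:,.0f} per week")

# Hypothetical figures: both items lose all their value in 2 days.
lettuce = {"value": 2.0, "cod_per_day": 1.0}
mortgage = {"value": 300_000.0, "cod_per_day": 150_000.0}

for name, item in (("lettuce", lettuce), ("mortgage", mortgage)):
    tc = time_criticality(item["cod_per_day"], item["value"])
    print(f"{name}: Time Criticality {tc}/day, Urgency ${item['cod_per_day']:,.0f}/day")
```

Both items have the same Time Criticality (half the value is lost each day), while their Urgency differs by several orders of magnitude.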
Three useful graphs:
Benefit Profiles - these show the benefit accrual rate expected from a defined piece of work, plotted against date, for example net pre-tax profits expected per week. There is not unanimity concerning this term. David Anderson frequently refers to these plots as "adoption curves" since for product releases the benefit accrual rate follows the adoption of the product by customers. Joshua Arnold often calls these graphs "urgency profiles", which unfortunately clashes with the definition below. (Since the graphs do not actually reveal urgency - this can only be determined by comparing one benefit profile with a subsequent one incorporating a delay - I would discourage this usage.)
Delay Cost Profiles - these are plots of delay cost against the delay (or release date, if you prefer). The gradient (first derivative) of these curves shows the CoD or Urgency.
Urgency (or CoD) Profiles - these plots show how the Cost of Delay is expected to vary over time. Of particular importance are: spikes in CoD (as occur, for example, when there is an external deadline); continuing changes in CoD (as occur, for example, when there is expected loss of market share as well as loss of earning period due to the delay); and step changes (as happen at the start and end of expected earning periods).
See the first blog in this series for examples of these three types of graph.
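As a rough sketch of how the three graphs relate, the Python below derives a delay cost profile and an urgency profile from a benefit profile. The benefit profile itself is hypothetical: a four-week linear adoption ramp to $10,000 per week, earning until a fixed market window closes at week 52. All function names and parameters are assumptions made for the illustration.

```python
def weekly_benefit(week, release_week, peak=10_000.0, ramp_weeks=4, window_end=52):
    """Hypothetical benefit profile: linear adoption ramp, then flat
    until a fixed market window closes at week `window_end`."""
    if week < release_week or week >= window_end:
        return 0.0
    weeks_live = week - release_week + 1
    return peak * min(1.0, weeks_live / ramp_weeks)


def lifetime_benefit(delay_weeks, horizon=52):
    """Total benefit if the release is delayed by `delay_weeks`."""
    return sum(weekly_benefit(w, release_week=delay_weeks) for w in range(horizon))


def delay_cost(delay_weeks):
    """Delay cost profile: benefit lost compared with releasing on time."""
    return lifetime_benefit(0) - lifetime_benefit(delay_weeks)


def cost_of_delay(delay_weeks):
    """Urgency profile: gradient of the delay cost profile,
    approximated by a one-week difference."""
    return delay_cost(delay_weeks + 1) - delay_cost(delay_weeks)


for d in (0, 1, 50, 51, 52):
    print(f"delay {d:2d} wk: delay cost ${delay_cost(d):>9,.0f}, "
          f"CoD ${cost_of_delay(d):>7,.0f}/wk")
```

In this particular profile each week of delay removes a week of peak earnings, so the CoD is constant at $10,000 per week until the remaining window is shorter than the adoption ramp, after which it falls to zero. Note that the delay cost is the difference between two estimated benefit profiles, only one of which can ever be observed.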
Why is Cost of Delay Important?
Traditional business cases for new work use estimates of cost and benefit to derive Return on Investment (yield) and a pay-back schedule (duration). However the rate at which value is lost due to delays is not taken into account. As a result we live with the consequences: arbitrary cost cutting by activity type (such as travel bans and contractor “holidays”), a failure to invest to reduce lead time, and an inability to trade off cost for time. It also means the choice of initiatives undertaken, and the order in which they are implemented when they cannot or should not be carried out in parallel, is poorly informed. These discussions need accountants and software managers to share vocabulary (and goals)… and ultimately to evolve policies that improve outcomes. Cost of Delay needs to become part of the everyday discussions of management.
Can Cost of Delay be Quantified?
My short answer to this is "yes, but it cannot be measured" (apologies to Douglas Hubbard, who would correct the definition of "measured" that I'm using here and say yes it can!). The point is that Cost of Delay is the difference between: something that cannot be measured until after the project has finished (life-time benefits for the work); and something that cannot be measured at all since it relates to something that will never occur (life-time benefits of the work if it had been released on a different date). Since we are estimating CoD it is worth having this in mind - not to prevent its use but to realise its limitations. Even after the fact we will not have a definitive answer about whether it was right.
However, the estimates of CoD do not have to be perfect, they just have to be better than the alternative. That is why it is useful. The alternatives are very often even poorer quantifications or merely vague assertions.
One further word of caution. Don Reinertsen applied CoD to sizeable initiatives such as new product launches and significant projects. Applying the same technique (Weighted Shortest Job First, or WSJF) to small items such as "epics" or even user stories may be a stretch. The uncertainties involved in estimating value and the rate of loss of value, when considering the life-time benefits of these small items, are likely to result in inaccuracies that invalidate the results.
So when could we use WSJF?
If you have read the preceding articles to this blog carefully, you will have noted a number of key assumptions that relate to the validity of the WSJF formula (Cost of Delay Divided by Duration, or CD3). Some of these make WSJF inapplicable for some processes. An example might be where, by continual monitoring of expected delivery dates against the delay cost profile, we can readily and dynamically reorder work schedules to preserve maximum value-delivery. Balancing the right types of work and managing risk by monitoring "last responsible moments" is a way of using CoD without using the WSJF formula, which does not account for variable CoD. Other aspects (such as difficulty in predicting value realization) make the technique inapplicable at smaller scales (for example small work items).
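To make the formula concrete, here is a minimal Python sketch of WSJF ordering. It rests on the same assumptions the formula does: each item has a constant CoD, durations are known, and items are worked one at a time in sequence. The three items and their figures are invented for the illustration.

```python
from itertools import permutations

# Hypothetical items: (CoD in dollars per week, duration in weeks).
items = {
    "A": (10_000, 4),
    "B": (3_000, 1),
    "C": (8_000, 8),
}


def wsjf(cod, duration):
    """WSJF score: Cost of Delay Divided by Duration (CD3)."""
    return cod / duration


def total_delay_cost(order):
    """Total delay cost when items are delivered one after another.

    Each item keeps incurring its (constant) CoD until it is finished.
    """
    elapsed = 0
    total = 0
    for name in order:
        cod, duration = items[name]
        elapsed += duration
        total += cod * elapsed
    return total


# Highest WSJF score first.
wsjf_order = sorted(items, key=lambda name: wsjf(*items[name]), reverse=True)
print("WSJF order:", wsjf_order, "total delay cost:", total_delay_cost(wsjf_order))

# Brute-force check over every possible ordering.
for order in permutations(items):
    print(order, total_delay_cost(order))
```

With these figures the WSJF order is B, A, C, and the brute-force comparison shows no other sequence has a lower total delay cost. That result holds only while the assumptions do; if the CoD of any item varies over time, the ordering is no longer guaranteed to be the best one.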
Another aspect which affects the applicability of WSJF is the nature of the domain where we are seeking to order work. Chris Matts has certainly influenced my thinking in this area. Referencing Dave Snowden's Cynefin model [11], Chris points out that in "complex" domains results are not predictable or plannable. Safe-to-fail experiments lead to better outcomes than pre-planned actions (though not necessarily the "best" outcome, which is unknowable in such domains). So rather than preceding delivery with long periods of analysis and estimation, it is better to deliver smaller items and then choose subsequent items based on customer outcomes. This is an important observation, and coincides of course with many other ways of looking at the problem. Reducing the size of value-bearing work items, and reducing the lead times, so that feedback and response can happen quickly, is the right way to proceed in such domains. To summarise this advice with regard to WSJF - don't use it in complex (or unplannable) domains! This is not just applicable to WSJF of course. If you are doing lots of up-front planning anyway, maybe your domain is not "complex" in this sense - or maybe you should question other practices as well!
So when can we use WSJF? Well, not all domains are "unplannable", even domains where continuous feedback and adjustment are required. Not all work items are so small that estimating value and the decay of value is futile or too time-consuming. The most obvious place to look is where Don Reinertsen originally proposed the technique: for sequencing larger plannable items which, due to resource constraints or other reasons, follow each other in sequence. Not only is the WSJF formula useful in such discussions, it provides a means to discuss and manage portfolios of work using financial criteria that our accountants can understand and validate. We need accountants to be involved in this discussion, not least because it is the surest way to move management decision making in the direction of greater business agility.
Where next?
WSJF of course is only one aspect of the application of Cost of Delay. The wider use of Cost of Delay in business, including the involvement of accountants in developing sound ways to apply its quantification, will in the end result in better business decisions and improved value delivery. While the minutiae of formulae and assumption validity, and the difficulty of estimating value-delivery and its decay, are certainly roadblocks in improving our understanding and good business practice in this area, the stakes are too high not to persevere towards robust solutions.
If you have persevered through the machinations of this series of blogs, I thank you - and look forward to hearing your feedback and experience reports. The next generation of managers needs clearer explanations of these concepts so they become as well understood as pre-tax profits and balance sheets in driving correct business actions. Accountants and agile managers need to join together to develop the mechanisms and standards that will augment current accounting practice and produce the environment for better business decision making.
References
[1] Donald G. Reinertsen. The Principles of Product Development Flow, (United States: Celeritas Publishing. 2009)
[2] David J. Anderson and Andy Carmichael, Essential Kanban Condensed. (United States: Lean Kanban University Press. 2016)
[3] David J. Anderson. Kanban: Successful Evolutionary Change for Your Technology Business (United States: Blue Hole Press, 2010)
[4] Joshua Arnold and Özlem Yüce. “Using Cost of Delay: Experience Report – Maersk Line.” Black Swan Farming. (2013)
[5] Preston G. Smith and Donald G. Reinertsen. Developing Products in Half the Time. (United States: John Wiley and Sons. 1998)
[6] Ian Carroll. “No Correlation Between Estimated Size and Actual Time Taken.” IanCarroll.com. (2016)
[7] Joshua Arnold. "Qualitative Cost of Delay." Black Swan Farming. (2016)
[8] Donald G. Reinertsen, David J. Anderson and Andy Carmichael. Meeting Minute. (September, 2016)
[9] Kanban Leadership Retreat Dubai, www.leankanban.com (2017)
[10] Magennis, Troy. 2016. Better Backlog Prioritization (from random to lifetime cost of delay). https://github.com/FocusedObjective/FocusedObjective.Resources/raw/master/Canvas%20and%20Forms/Better%20Backlog%20Prioritization.pdf
[11] Greg Brougham. The Cynefin Mini-Book: An Introduction to Complexity and the Cynefin Framework, C4Media for InfoQ. (2015)