Skip to main content

Improving Waterfall Processes (a thought experiment)

Some recent discussions about whether Kanban is an agile method or not seems to me to miss the point somewhat. The Kanban method is about improving your process, so whether you end up with an agile process depends on two things:
  1. whether your process was agile to start with
  2. whether "more effective" maps to "more agile"
The first point started me thinking whether I could design a thought experiment by using Little's Law on that anathema of all non-agile processes - Waterfall! (Just say "no" I hear +Karl Scotland say. :-) More on that below, but let's deal with the second point first. 

Agility is the ability to change direction quickly in response to changing circumstances. A few businesses operate in very stable conditions and can optimise their processes to just these conditions. For them increasing throughput is usually the highest priority - agility may not be high on their priority list. However most of us working in knowledge-based disciplines find that the needs of our stakeholders change very rapidly. The ability to track this moving target and deliver new ideas faster is the key to being "more effective". For us agility does map to effectiveness.

So if we've decided being more agile is desirable in our context, how to go about it? Here are two possible approaches:
  1. Look for a checklist of "agile practices" and adopt them (you may need a consultant who's cleverer than me to advise you on which ones must be adopted together and in which order!)
  2. Look for a way to measure your "agility" then see if these measures improve as you make incremental changes to your current practices.
There is value in both approaches of course, since they are not mutually exclusive. However I know I am not alone in seeing failures in agile adoption initiatives where dependent practices are introduced in the wrong order (for example Scrum for team organisation, without automated build and test for regression testing). 

The second approach relies on having some measures of agility. Here are three from a recent David Anderson blog. They are cadences all measured in units of time:
  1. how often can an organization interact with its customers to make selection and prioritization decisions about what to work on next? (queue replenishment period)
  2. how quickly can the work be completed from making a commitment to do it? (lead time)
  3. how often can new completed work be delivered and operational? (release period)
Now let's go back to whether your process was agile to start with. Let's take a fairly "waterfall" process, typical of many larger organisations, and let's see if focusing on just these metrics could provide significant improvements in agility. This thought experiment relies on Little's Law, which we can express as:

Lead Time =  WIP / Delivery Rate, where

Lead Time is the time a work item spends in the process (TIP or time in process is an alternative term)
Delivery Rate is the number of items delivered per unit of time (also called Throughput)
WIP is the number of work items are in the process at any point in time.

The initial waterfall(-ish) process

Our initial scenario is a product development company releasing a new version of a product roughly every 6 months, say 120 working days. It's not as extreme as some waterfall projects where no customer deliveries are made until years after the requirements are defined, but nor is it very agile. 

Let's say this project releases 24 new "Features" (independently releasable functionality) which are selected at the start of the analysis phase for each 6-monthly release. Each Feature can be further broken down to around  10 User Stories. (Or whatever the waterfall equivalent is! Development tasks say.) There are 3 stages in our (very simplified) process: Analysis, Development, and Release. Testing is included in the Development phase (something they must have learned from a passing agile team!).
  • Analysis of an Feature takes a Business Analyst an average of 20 working days. The team has 4 Analysts so 24 Features will take 120 working days to specify.
  • Development of a User Story takes 2 developers (possibly a Programmer and a Tester) 5 elapsed days to complete. With a team of 20 developers, this phase will take around 120 working days to complete the 240 stories.
  • Release typically takes 10 days elapsed regardless of these size of the release. We assume however that development and analysis work may continue in parallel with this phase.
A View of the Features in Process
How do the 3 agility metrics look with this process?
  • queue replenishment period: this is around 120 working days. When a release comes out and the development team start on the next release, the analysts start on the release after that, so its 120 days between each of these requirement selection meetings.
  • lead time: from selection of the 24 Features, through their analysis (120), development (120) and release (10), the wait is 250 working days
  • release cadence: this is the 120 working days between releases
The average Delivery Rate for this process is 0.2 Features per working day.

Let's compare this with the ideal continuous process assuming the same productivity as above but releasing as frequently as possible. Now such a change cannot be made immediately - it's a major mistake to increase the frequency of releases without improving automated build and test for example. It's more important to select smaller incremental steps where the risk of breaking the system is lower but effectiveness (including agility) can be shown to improve. But for the purpose of our thought experiment let's jump forward, assuming the basic numbers are the same, but we've reached a stage where the process can run continuously. Let's also assume the 4 analysts working together could deliver one Feature spec in 5 days rather than the 20 taken by one lone analyst.

WIP Limits for the New Process
A new Feature will be started on average every 5 working days. Dividing our 20 developers into 4 teams will allow each team to deliver separate Features, averaging 20 development days per Feature. The average arrival rate of completed Features will be 1 every 5 days. Unless we can reduce the 10 days required for each release, we would have to release 2 Features at a time, but even so our agility metrics will show massive improvement.
  • queue replenishment cadence: as a new Feature is started every 5 days, we can select Features one at a time every 5 days (down from 120). If the business selected 2 at a time the period would be 10 days.
  • lead time: from selection of each Features, through its analysis (5), development (20) and release (10) the wait is 35 working days (down from 250!)
  • release cadence: this is now down to 10 working days between releases (down from 120). This is the minimum without addressing the release process itself - an obvious place to look for further improvement.
The average Delivery Rate is still 0.2 Features per working day (this is inherent in our assumptions above). 

The conclusion from this experiment? Agility can be measured and it can be improved radically by changing batch processes into continuous ones (or making batches as small as possible). Agile practices are needed to enable effectiveness and improved productivity at low batch sizes, but the pay-off is seen in the reduction in the times between queue replenishments and releases... and most especially in the reduction in lead time.

Footnote on Little's Law calculations:

The improvements in the 3 agile cadences in this thought experiment are based simply on adding up the times suggested by our scenario. We can derive these results using Little's Law instead, which is easier if the numbers don't drop out quite so easily as our simple scenario. First let's re-run the scenarios above.

Here's the initial waterfall process:

Average Delivery Rate = 0.2 Features per day
Average WIP = 24 Features in Analysis + 24 Features in Development + (24 in Release) * 10 days / 120 days
Thus Average WIP = 50 Features

This results in an Average Lead Time = 50 / 0.2 = 250 days

Here's the more continuous process:

Average Delivery Rate = 0.2 Features per day (still)
Average WIP = 1 Feature in Analysis + 4 Features in Development + 2 in Release
Thus Average WIP = 7 Features

This results in an Average Lead Time = 7 / 0.2 = 35 days

What would be the impact if we could reduce the 10 days of the release process down to 1 day (maybe by introducing techniques from Continuous Delivery). What would the new Lead Time be? Is it just 9 days less? Let's see.

With a 1 day release process, we can deliver Features one by one with a release on average every 2.5 days. So the figures become:

Average Delivery Rate = 0.2 Features per day (still our assumption)
Average WIP = 1 Feature in Analysis + 4 Features in Development + (1 in Release) * 1 day / 2.5 days
Thus Average WIP = 5.4 Features

This results in an Average Lead Time = 5.4 / 0.2 = 27 days

The reduction of 8 rather than 9 days is not easy to derive intuitively, proving the worth of Little's Law to continue the thought experiment and gauge the impact of improvements to other aspects of the process.

Why not try a few scenarios yourself?


Popular posts from this blog

"Plan of Intent" and "Plan of Record"

Ron Lichty is well known in the Software Engineering community on the West Coast as a practitioner, as a seasoned project manager of many successful ventures and in a number of SIGs and conferences in which he is active. In spite of knowing Ron by correspondence over a long period of time it was only at JavaOne this year that we finally got together and I'm very glad we did.

Ron wrote to me after our meeting:

I told a number of people later at JavaOne, and even later that evening at the Software Engineering Management SIG, about xProcess. It really looks good. A question came up: It's a common technique in large organizations to keep a "Plan of Intent" and a "Plan of Record" - to have two project plans, one for the business partners and boss, one you actually execute to. Any support for that in xProcess?

Good question! Here's my reply...

There is support in xProcess for an arbitrary number of target levels through what we call (in the process definitions) P…

Does your Definition of Done allow known defects?

Is it just me or do you also find it odd that some teams have clauses like this in their definition of done (DoD)?
... the Story will contain defects of level 3 severity or less only ... Of course they don't mean you have to put minor bugs in your code - that really would be mad - but it does mean you can sign the Story off as "Done"if the bugs you discover in it are only minor (like spelling mistakes, graphical misalignment, faults with easy workarounds, etc.). I saw DoDs like this some time ago and was seriously puzzled by the madness of it. I was reminded of it again at a meet-up discussion recently - it's clearly a practice that's not uncommon.

Let's look at the consequences of this policy. 

Potentially for every User Story that is signed off as "Done" there could be several additional Defect Stories (of low priority) that will be created. It's possible that finishing a Story (with no additional user requirements) will result in an increase in…

Understanding Cost of Delay and its Use in Kanban

Cost of Delay (CoD) is a vital concept to understand in product development. It should be a guide to the ordering of work items, even if - as is often the case - estimating it quantitatively may be difficult or even impossible. Analysing Cost of Delay (even if done qualitatively) is important because it focuses on the business value of work items and how that value changes over time. An understanding of Cost of Delay is essential if you want to maximise the flow of value to your customers.

Don Reinertsen in his book Flow [1] has shown that, if you want to deliver the maximum business value with a given size team, you give the highest priority, not to the most valuable work items in your "pool of ideas," not even to the most urgent items (those whose business value decays at the fastest rate), nor to your smallest items. Rather you should prioritise those items with the highest value of urgency (or CoD) divided by the time taken to implement them. Reinertsen called this appro…