Wednesday, July 29, 2015

Beyond Control Charts and Cumulative Flow Diagrams

Control Charts (CCs) and Cumulative Flow Diagrams (CFDs) are powerful ways to display information about a flow system, such as a Scrum or Kanban development process. Unfortunately the very fact that the charts display so much information means that it is often difficult to extract specific information from them. That is why it's useful to also plot some of the key attributes of the systems on their own - this allows us to look at these aspects specifically, alongside the rawer view of the data that you get from CCs and CFDs.

The graphic on the right shows a number of diagrams all of which were derived from very simple data about each item that flowed through this system:
  • when it arrived into the system; 
  • when it departed the system; and
  • whether the item was "delivered" or "discarded".
Note: I use the term "discard" here as a general term to include an exit from the system at any point in the system and for any reason. It includes aborting/abandoning the item after commitment, as well as postponing the item by moving it back to a part of the process upstream from the system under study. For the definition of this and other terms used here please see this Glossary.
The first diagram in the graphic is the Control Chart - actually this is simply a scatter plot of the time each item stays in the system under study. I refer to this as "Time in Process" (TiP), or alternatively "Time in _______", where the blank stands for whatever process or part of the process is under study. For example it could be the Time in Preparation, Time in Development, Time in Acceptance, etc. The scatter plot highlights (in orange) the items which were not "delivered".
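As a rough sketch of how the scatter plot data can be derived from the three inputs listed above, the Python below computes a TiP value for each item. The `Item` class, its field names and the sample dates are all hypothetical, and two conventions are assumptions rather than anything fixed by this post: TiP is counted as whole days between arrival and departure, and each point is plotted against the departure date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    arrived: date     # when the item arrived into the system
    departed: date    # when the item departed the system
    delivered: bool   # True = "delivered", False = "discarded"


def time_in_process_days(item: Item) -> int:
    # Assumption: TiP is the number of whole days between arrival and departure.
    return (item.departed - item.arrived).days


# Illustrative data only.
items = [
    Item(date(2015, 7, 1), date(2015, 7, 8), True),
    Item(date(2015, 7, 2), date(2015, 7, 6), False),
    Item(date(2015, 7, 3), date(2015, 7, 17), True),
]

# One scatter point per item: (x = departure date, y = TiP, status for colouring).
points = [
    (it.departed, time_in_process_days(it), "delivered" if it.delivered else "discarded")
    for it in items
]

for x, tip, status in points:
    print(x, tip, status)
```

The status field is what would drive the orange highlighting of items that were not delivered.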

Below it is the CFD. Unlike some very stripy versions, this one has only 3 bands (as limited by the input data), corresponding to arrivals, all departures (including discards), and deliveries.

The remaining diagrams all highlight one or more aspects of the same data. Firstly the terms from Little's Law:
  1. Average Delivery Rate. This is measured in items per week, and the average is taken over 1 week. Note this only shows actually delivered items. Alternatively a plot of "Throughput" could have been used which includes all items that have passed through the system.
  2. Average Time in Process (TiP). This is measured in weeks and again the average is taken over 1 week.
  3. Average Work in Progress (WiP). This is measured in number of items, again averaged over one week. Care must be taken when calculating average WiP for a day, particularly on days when an item arrives in or departs from the system, to ensure that it is consistent with the calculations of average TiP.
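One possible way to calculate these three weekly quantities is sketched below. The function names and the tuple layout `(arrived, departed, delivered)` are hypothetical. To keep average WiP consistent with average TiP, the sketch assumes an item counts as in the system from its arrival day up to, but not including, its departure day; other conventions are equally defensible provided they are applied to both calculations.

```python
from datetime import date, timedelta


def days_in_window(arrived, departed, start, end):
    # Days the item spent in the system within [start, end).
    # Assumption: present from arrival day up to, but excluding, departure day.
    lo = max(arrived, start)
    hi = end if departed is None else min(departed, end)
    return max((hi - lo).days, 0)


def weekly_metrics(items, start):
    # items: iterable of (arrived, departed, delivered); departed may be None
    # for items still in progress.
    end = start + timedelta(days=7)
    delivered = [
        (a, d) for a, d, ok in items
        if ok and d is not None and start <= d < end
    ]
    delivery_rate = len(delivered)  # items per week (delivered items only)
    avg_tip_weeks = (
        sum((d - a).days for a, d in delivered) / len(delivered) / 7
        if delivered else None
    )
    # Average WiP = total item-days in the window / number of days in the window.
    avg_wip = sum(days_in_window(a, d, start, end) for a, d, _ in items) / 7
    return delivery_rate, avg_tip_weeks, avg_wip


# Illustrative data only.
sample = [
    (date(2015, 7, 6), date(2015, 7, 10), True),
    (date(2015, 7, 1), date(2015, 7, 8), True),
    (date(2015, 7, 9), date(2015, 7, 20), False),
]
rate, tip, wip = weekly_metrics(sample, date(2015, 7, 6))
print(rate, tip, wip)
```

Because TiP and WiP both use the same day-counting rule, an item that arrives and departs within the window contributes the same number of days to each.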
In addition to these standard quantities from Little's Law a number of flow balance metrics are shown. These are:
  1. Net Flow. Simply the difference between the number arriving and departing over the previous week.
  2. Delivery Bias. This is a measure of the degree to which Delivery Rate is higher or lower than would be predicted by Little's Law for the given period (1 week in this case). A non-zero value indicates that the system is moving away from stability. Further discussion of this quantity is found here.
  3. Flow Debt/Credit. This is a measure of the degree to which the average TiP varies from that predicted by the CFD. This also indicates a degree of instability if it varies significantly from zero. See Dan Vacanti's book [vaca] for further discussion.
  4. Age of WiP Indicator. This compares the average age of the WiP with half the average TiP. It is another indicator of imbalance.
Recently I have been discussing these four quantities with colleagues and with Troy Magennis and Dan Vacanti as they show promise for predicting significant changes in the TiP, a very important aspect of the effectiveness of the system.
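Two of these flow balance metrics, Net Flow and the Age of WiP Indicator, can be sketched directly from the arrival and departure dates. Everything named here is hypothetical, and two choices are assumptions on my part: Net Flow is taken as departures minus arrivals (so it is negative when more is started than finished), and the Age of WiP Indicator is expressed as a simple difference, since the description above only says the two values are compared.

```python
from datetime import date, timedelta


def net_flow(items, start, days=7):
    # items: iterable of (arrived, departed, delivered); departed may be None.
    # Assumption: Net Flow = departures - arrivals within [start, start + days).
    end = start + timedelta(days=days)
    arrivals = sum(1 for a, _, _ in items if start <= a < end)
    departures = sum(1 for _, d, _ in items if d is not None and start <= d < end)
    return departures - arrivals


def average_wip_age_days(items, today):
    # Average age, in days, of items that have arrived but not yet departed.
    ages = [
        (today - a).days
        for a, d, _ in items
        if a <= today and (d is None or d > today)
    ]
    return sum(ages) / len(ages) if ages else 0.0


def age_of_wip_indicator(avg_wip_age, avg_tip):
    # Assumption: expressed as a difference; positive means the WiP is older
    # than half the average TiP.
    return avg_wip_age - avg_tip / 2


# Illustrative data only.
sample = [
    (date(2015, 7, 1), date(2015, 7, 8), True),
    (date(2015, 7, 6), None, None),
    (date(2015, 7, 7), None, None),
    (date(2015, 7, 9), date(2015, 7, 11), False),
]
flow = net_flow(sample, date(2015, 7, 6))
wip_age = average_wip_age_days(sample, date(2015, 7, 12))
print(flow, wip_age, age_of_wip_indicator(wip_age, 7.0))
```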

A spreadsheet containing the means to generate these diagrams from your data will shortly be made available from GitHub. Watch this space!

References
  • [vaca] Vacanti, Daniel S. "Actionable Agile Metrics for Predictability: An Introduction". LeanPub. (2015)

Friday, July 17, 2015

Postscript on Throughput and Delivery Rate

Most people use Throughput and Delivery Rate in Kanban systems as synonyms - including myself up to this point. I've changed my view however.

The canonical form of Little's Law in Kanban is as follows:
Delivery Rate = WiP / Lead Time    [ande] 
even though these days I more frequently express it this way:
Throughput = WiP / TiP 
Surely these two ways of writing the equation are entirely equivalent aren't they? Well maybe not.
Note: All these terms are defined in my Glossary Proposal (which has recently been updated to include the definition of Throughput). Feedback, comments and references to publications using the terms defined are welcomed and encouraged.
Throughput is the term Daniel Vacanti uses (among many others), particularly in his excellent new book Actionable Agile Metrics [vaca], and it got me thinking about one of the problems with using Delivery Rate: what about the items which are not delivered but are discarded? If there are a significant number of these, Little's Law as expressed in the first equation will not apply, unless we exclude discarded items from calculations of the historical averages for WiP and TiP. All very well, except that the WiP limits - a crucial control for Kanban systems - necessarily apply to items that may be discarded in the future.

Without trying to solve this problem, but rather clarify the terminology we use to describe it, I think it is useful to have differing definitions for the 2 terms:
Delivery Rate is the rate at which work items exit the system in a "complete" state (i.e. just delivered items)
Throughput is the rate at which work items exit the system whether discarded (this includes those which move back in the process to a state prior to the system under consideration) or completed (i.e. both delivered and discarded items)
Let me know if you find this distinction useful. Your feedback is essential in honing the Glossary Proposal to one that everyone finds helpful and acceptable.
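To make the distinction concrete, here is a minimal sketch in Python. The function names and the sample data are hypothetical; the only thing taken from the definitions above is that Delivery Rate counts delivered items and Throughput counts every exit.

```python
def delivery_rate(exits, weeks):
    # Rate at which items exit in a "complete" state (delivered items only).
    return sum(1 for state in exits if state == "delivered") / weeks


def throughput(exits, weeks):
    # Rate at which items exit for any reason (delivered and discarded).
    return len(exits) / weeks


# Illustrative data only: six exits observed over two weeks.
exits = ["delivered", "discarded", "delivered", "delivered", "discarded", "delivered"]

print(delivery_rate(exits, 2))  # items per week
print(throughput(exits, 2))     # items per week
```

With no discards the two values are identical, which is why the terms are so often treated as synonyms.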

References

  • [ande] Anderson, David J. Kanban, Blue Hole Press. (2010)
  • [vaca] Vacanti, Daniel S. "Actionable Agile Metrics for Predictability: An Introduction". LeanPub. (2015)

Thursday, July 16, 2015

Little's Inequality

The interesting thing about Little's Law is that, in spite of its very general applicability and, in certain cases, mathematical certainty, there are many cases where it simply does not hold true. In those cases it's not so much Little's Law as Little's Inequality. Could the fact that specific instances of data do not follow Little's Law actually give us useful information about our Kanban and Scrum systems and point to the right kind of interventions to manage and improve their flow? That's what this blog is about.

Flow Metrics for a Kanban System over time
(WiP, TiP, DR, Delivery Bias, Net Flow)

In 1961 John Little published his proof of this general queuing theory equation [litt]:
L=λW 
L is the average number of items in the queue, λ is the average arrival rate, and W the average wait time
Since that time Little's Law has found numerous applications in the study of general flow systems from telecommunications to manufacturing, including in Kanban systems. Because throughput or delivery rate is the more significant attribute for management of such systems (and on average it is approximately equal to arrival rate), it is often expressed as follows:
$\overline{\text{Delivery Rate}} = \overline{\text{WiP}} / \overline{\text{TiP}}$
The overline indicates the arithmetic mean, WiP is the number of items in the system, and TiP the "time in process" from entering to leaving the system under consideration. See this Glossary for further explanation of the meaning of TiP, versus Lead Time or Cycle Time (and why I don't use Cycle Time!).
However when we look at data from actual Kanban systems, where the averages are over relatively short periods (say a week or a month), or where average arrival rate does not equal average delivery rate, it is Little's Inequality that applies, not his Law. Juggling the terms above we can express it in this way:
$\overline{\text{Delivery Rate}} - \left( \overline{\text{WiP}} / \overline{\text{TiP}} \right) \neq 0$    ("Little's Inequality")
This is because if the system itself is trending in some way (technically the system is not stationary), or if the scope of the averages is not so wide that every item that entered has left the system - usually both these conditions are true for the periods we wish to analyse - then Little's Law does not apply exactly.

That might seem to imply Little's Law is not useful to us. However the degree to which the law is not true is very relevant and does give us important management information:
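$\overline{\text{Delivery Rate}} - \left( \overline{\text{WiP}} / \overline{\text{TiP}} \right) < 0$    More work is being taken on than is being delivered
$\overline{\text{Delivery Rate}} - \left( \overline{\text{WiP}} / \overline{\text{TiP}} \right) = 0$    The system is balanced
$\overline{\text{Delivery Rate}} - \left( \overline{\text{WiP}} / \overline{\text{TiP}} \right) > 0$    More is being delivered than new work being taken on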
This looks like actionable data for managing flow in Kanban Systems - a number that shows bias in the system towards (or away from) delivering. Let's look at the set of graphs in the figure above that demonstrate this. The data set is from Troy Magennis's SimResources website and uses his data, spreadsheet and some of the graphs he provides to assist with his forecasting models [mage]. You can find more about Troy's work at FocusedObjective.com and also download these spreadsheets to explore what your data reveals about your Kanban or Scrum systems.

Firstly the figure shows graphs of the 3 main variables in Little's Law: WiP, TiP and Delivery Rate. The next graph is a plot of Little's Inequality - labelled Delivery Bias, showing whether it is greater or less than zero at any point in time. Note that the formula above is normalised relative to the overall average delivery rate for the whole dataset (AvDR) so that the range (in this case between -1 and +1) is comparable with other datasets. The final graph in the set shows Net Flow, the difference between items completed and started - it is one of the metrics included in Troy's spreadsheet and it again provides a view of how balanced the system is.
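A minimal sketch of the Delivery Bias calculation follows. The function name, argument names and weekly figures are hypothetical, and I am assuming here that "normalised relative to AvDR" means dividing by AvDR; this is not a copy of the formulas in Troy's spreadsheet.

```python
def delivery_bias(delivery_rate, avg_wip, avg_tip, overall_avg_delivery_rate):
    # Little's Inequality: Delivery Rate - (WiP / TiP), for one period.
    # Assumption: normalisation is a division by the dataset-wide average
    # delivery rate (AvDR).
    return (delivery_rate - avg_wip / avg_tip) / overall_avg_delivery_rate


# Illustrative weekly averages only: (delivery rate, WiP, TiP in weeks).
weeks = [
    (4, 12, 2.0),
    (6, 12, 2.0),
    (8, 12, 2.0),
]
av_dr = sum(dr for dr, _, _ in weeks) / len(weeks)
biases = [delivery_bias(dr, wip, tip, av_dr) for dr, wip, tip in weeks]

for (dr, wip, tip), bias in zip(weeks, biases):
    print(dr, wip, tip, round(bias, 3))
```

In this made-up data WiP / TiP is 6 items per week throughout, so the first week shows a negative bias (more taken on than delivered), the second is balanced and the third is positive.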

As expected the graphs show a strong correlation between Net Flow and Little's Inequality for most of the range. Clearly if we're starting more than we're finishing we should expect both the inequality and the Net Flow to be negative. What's interesting is where they don't correspond, and why. Look at weeks 2015-11 and 2015-12. Why in week 11 are we finishing more than starting and yet we still have a negative value for Little's Inequality? The clue is in the TiP for these weeks. In week 11 the average TiP is much lower than in week 12. Perhaps this indicates the items closed that week were smaller in size - or maybe they were "expedited" at the expense of other items in progress. When the items that had been in process longer are closed the following week, Little's Inequality indicates more strongly that balance is being restored.

Little's Inequality as expressed above focuses on Delivery Rate, hence the label Delivery Bias. It could be re-expressed to focus on Time In Process as follows:
$\overline{\text{TiP}} - \overline{\text{WiP}} / \overline{\text{Delivery Rate}}$
This metric might be labelled Time in Process Bias. We want TiP values to be as low as possible, but a negative value of this metric is likely to indicate an issue to address, since it would indicate that the TiP of the work in progress is likely to be longer than the items recently delivered.

The time an item stays in the process is also the focus of a new metric from Daniel Vacanti, recently published in his Actionable Agile Metrics [vaca] (see also ActionableAgile.com). He also looks at the degree to which a given system follows an ideal flow through a metric he calls "Flow Debt" (roughly translated as delivering more quickly now at the cost of slower times later). Dan prefers the term Cycle Time to Time In Process and so defines Flow Debt as the difference between the "Approximate Average Cycle Time" (AACT) as observed on a Cumulative Flow Diagram and the "Average Cycle Time" (ACT). Comparing these 2 items gives an idea of whether Flow Debt is being created or not. Flow Debt is accumulating when AACT>ACT. You can calculate AACT by looking at the time since the cumulative arrivals into the system equalled the current cumulative deliveries. ACT is calculated from the arithmetic mean of the actual times for delivered items in the period. Again the degree to which these quantities do not match indicates the degree to which the system is out of balance.
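The AACT reading from a CFD can be sketched as follows. This is my own approximation of the description above, not code from Dan's book: the function names are hypothetical, the cumulative counts are assumed to be sampled once per day, and no interpolation is done between days.

```python
def approximate_average_cycle_time(cum_arrivals, cum_deliveries, day):
    # Horizontal distance on the CFD: days since cumulative arrivals first
    # reached the current cumulative deliveries.
    # Assumption: one sample per day, no interpolation between samples.
    target = cum_deliveries[day]
    if target == 0:
        return None  # nothing delivered yet, so AACT is undefined
    for s, arrived in enumerate(cum_arrivals[: day + 1]):
        if arrived >= target:
            return day - s
    return None


def flow_debt(aact, act):
    # Positive when AACT > ACT, i.e. Flow Debt is accumulating.
    return aact - act


# Illustrative cumulative counts only, one entry per day.
cum_arrivals = [2, 4, 5, 7, 9, 10]
cum_deliveries = [0, 0, 1, 3, 4, 6]

aact = approximate_average_cycle_time(cum_arrivals, cum_deliveries, 5)
print(aact, flow_debt(aact, 1.5))
```

Here the ACT value of 1.5 days passed to `flow_debt` is simply a stand-in for the arithmetic mean of the actual times of the items delivered in the period.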

All these metrics - Little's Inequality (or Delivery Bias), Net Flow and Flow Debt - provide insight into the behaviour of Kanban systems based on the degree to which the system follows Little's Law over the period of study. Further experimentation and experience will show the best ways to use them in concert and the best ways to visualise the flow characteristics of the systems and how to intervene to improve them.

If you have data which you would like to analyse using these metrics do let me know. I'm happy to share spreadsheets and advice with anyone who contacts me. Equally check out Troy Magennis's and Dan Vacanti's web sites referenced above for more tools and insights.

References
  • [litt] Little, J. D. C. "A Proof of the Queuing Formula: L = λW," Operations Research, 9, (3) 383-387. (1961) 
  • [mage] Magennis, T. "Forecasting and Simulating Software Development Projects: Effective Modeling of Kanban & Scrum Projects using Monte-carlo Simulation", FocusedObjective.com. (2011)
  • [vaca] Vacanti, Daniel S. "Actionable Agile Metrics for Predictability: An Introduction". LeanPub. (2015)