Does Your Kanban System Produce Reliable Probabilistic Forecasts? Thin-Tailed vs Fat-Tailed Cycle Time Distributions
In product management, the most pressing question always seems to be “When will this be done?” Probabilistic forecasting is one of the most reliable ways to answer it: instead of a single date, it gives a range of delivery times with the probability attached to each, and it does so at a low cost. However, it doesn’t come without its challenges.
One of the prerequisites to making accurate probabilistic forecasts is maintaining a stable system with low variability in the delivery times. Whether you have achieved that stability is directly exposed by the shape of your cycle time probability distribution.
Let’s dig deeper into what your cycle time distribution can tell you, and identify the most common traps you should avoid.
Cycle Time Analysis
When it comes to cycle time analysis, the Cycle Time Histogram is one of the most powerful tools at your disposal. The chart shows the frequency distribution of your delivery times. The horizontal axis displays your cycle times and the vertical axis shows the number of work items with the same cycle time.
In the histogram above, we can see that this team has completed 29 items in 1 day, 11 items in 2 days, 3 items in 3 days, and so forth.
By analyzing the frequency distribution of your cycle times, you’ll be able to determine whether there is too much variability in your process. A wide spread indicates that your cycle time varies significantly and your workflow is inconsistent. The histogram above displays a fat-tailed distribution. Systems with fat-tailed distributions are unstable and unpredictable.
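To make the idea of a frequency distribution concrete, here is a minimal sketch of how the histogram counts could be built from raw data. The sample cycle times and the function name frequency_distribution are hypothetical, chosen only for illustration. They are not the data behind the chart.

```python
from collections import Counter

# Hypothetical cycle times in days, one entry per completed work item.
cycle_times = [1, 1, 1, 2, 2, 3, 5, 8, 8, 13]

def frequency_distribution(times):
    """Return {cycle_time_in_days: number_of_items}, ordered by cycle time."""
    counts = Counter(times)
    return dict(sorted(counts.items()))

histogram = frequency_distribution(cycle_times)

# Print a simple text histogram: one '#' per completed item.
for days, items in histogram.items():
    print(f"{days:>3} day(s): {'#' * items} ({items})")
```

The wider the range of keys and the longer the run of small counts on the right, the more variability there is in the process.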
Reading the Cycle Time Histogram
Let’s dig further into the characteristics of the Cycle Time Histogram.
Cycle Time Averages
Using the histogram, you can read the mean, median and mode average cycle times.
The Mode is the easiest average to calculate – this is the number that appears most often. In this case, a cycle time of 1 day is the mode. Since that’s the most commonly occurring cycle time, if you ask this team how much time they usually need to complete a task, the answer would be 1 day.
The Median is the middle value of the sorted data set. Here the median is 8 days, which means that half of the tasks completed so far were finished in 8 days or less, and the other half took 8 days or more.
The Mean is the average that you are most likely to be familiar with. It is calculated by adding up all of the values and dividing the sum by the number of values in the data set. Here the mean is 16 days.
If there is too much variability in your system, the mean, the median and the mode values will significantly differ from each other. In fat-tailed distributions, the mode is unlikely to move at all, the median will only be affected a little bit and the mean will move considerably to the right, as the tail continues to grow.
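The effect of a growing tail on the three averages can be sketched with Python's standard statistics module. The sample data below is hypothetical, and taking the smallest value when several modes tie is an assumption made for this example.

```python
from statistics import mean, median, multimode

def cycle_time_averages(times):
    """Return the mode, median and mean of a list of cycle times (in days)."""
    return {
        # multimode returns every most-common value; taking the smallest
        # one on a tie is an assumption made for this sketch.
        "mode": min(multimode(times)),
        "median": median(times),
        "mean": mean(times),
    }

# Hypothetical sample: mostly quick items plus a few slow ones.
cycle_times = [1, 1, 1, 1, 2, 2, 3, 4, 6, 9, 15, 40]
before = cycle_time_averages(cycle_times)

# The same sample after one very slow item stretches the tail.
after = cycle_time_averages(cycle_times + [100])

for name in ("mode", "median", "mean"):
    print(f"{name:>6}: {before[name]:.1f} -> {after[name]:.1f}")
```

In this sample the mode stays at 1 day, the median moves from 2.5 to 3 days, and the mean roughly doubles from about 7.1 to about 14.2 days.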
Using the averages of your cycle time distribution to make future predictions is a fragile approach.
Firstly, the longer the tail of the distribution, the greater the difference between the averages, so the answer depends heavily on which average you pick. Furthermore, an average on its own says nothing about the odds of hitting it: you need additional analysis to find the probability that comes with the mean or the mode. You may have a 30%, 50% or 80% chance of meeting that commitment. Even though it feels intuitive, would you commit to the most common delivery time (the mode) if it only comes with a 30% chance of meeting your commitment? Probably not.
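One way to find the probability behind an average is to count how many past items finished within that many days. The sketch below assumes this empirical approach. The data and the function name chance_of_meeting are hypothetical.

```python
from statistics import mean, multimode

def chance_of_meeting(times, commitment_days):
    """Fraction of completed items that finished within commitment_days."""
    if not times:
        raise ValueError("need at least one completed item")
    within = sum(1 for t in times if t <= commitment_days)
    return within / len(times)

# Hypothetical cycle times in days.
cycle_times = [1, 1, 1, 2, 3, 5, 8, 13, 21, 34]

mode_days = min(multimode(cycle_times))  # most common cycle time
mean_days = mean(cycle_times)            # arithmetic mean

print(f"mode {mode_days} days: {chance_of_meeting(cycle_times, mode_days):.0%}")
print(f"mean {mean_days} days: {chance_of_meeting(cycle_times, mean_days):.0%}")
```

In this sample, committing to the mode of 1 day would have been met by only 3 of 10 items, while the mean of 8.9 days would have been met by 7 of 10.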
Making Probabilistic Forecasts
The dotted vertical lines stretching across the graph are called percentile lines. We use percentiles to establish service level agreements and define the probability of meeting our commitments.
Using the percentiles on your Cycle Time Histogram, you can perform a probabilistic forecast. Essentially you define a range of cycle times and the probability that comes with each of them.
Here is what the probability forecast for this team would look like:
Now, what happens if a customer asks “When will this be done?” The answer will be “there is a 50/50 chance of finishing it within 8 days”. And if you want to give a truly confident answer, based on the 98th percentile, it should be “within 86 days”.
If your commitment is 86 times bigger than the typical time of 1 day and more than 10 times bigger than the 50th percentile, do you think your stakeholders will be happy with your time to market?
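The percentile forecast described above can be sketched in a few lines. The article does not say how the percentile lines are calculated, so the nearest-rank method used here is an assumption, as are the sample data and the function names percentile and forecast.

```python
import math

def percentile(times, pct):
    """Nearest-rank percentile: the smallest observed cycle time such that
    at least pct percent of the items finished within it."""
    if not times:
        raise ValueError("need at least one completed item")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(times)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based position
    return ordered[rank - 1]

def forecast(times, pcts=(50, 98)):
    """Map each percentile to the cycle time (in days) that goes with it."""
    return {pct: percentile(times, pct) for pct in pcts}

# Hypothetical cycle times in days.
cycle_times = [1, 1, 2, 2, 3, 3, 4, 5, 8, 30]

for pct, days in forecast(cycle_times).items():
    print(f"{pct}% chance of finishing within {days} day(s)")
```

Other percentiles can be passed through the pcts argument if a different confidence level suits the commitment better.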
Is Your Probability Forecast Reliable?
The accuracy of your forecast strongly depends on the shape of your distribution. In order to decide whether you can rely on your probability forecast, you should determine whether your distribution is thin-tailed or fat-tailed. To do that, divide your 98th percentile by your 50th percentile. If the result is greater than or equal to 5.6, your frequency distribution is fat-tailed. If the result is less than 5.6, it is a candidate for a thin-tailed distribution.
A second check is required to confirm a thin-tailed distribution: calculate the ratio between the 98th percentile and the mode. If that result is less than 16, it is a thin-tailed distribution.
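The two checks can be combined into a small helper, sketched below. The thresholds 5.6 and 16 come from the text. The function name classify_tail and the "unconfirmed" label are assumptions, since the text does not say what to conclude when the first check passes and the second one fails.

```python
FAT_TAIL_RATIO = 5.6   # threshold for 98th percentile / 50th percentile
MODE_RATIO_LIMIT = 16  # threshold for 98th percentile / mode

def classify_tail(p50, p98, mode):
    """Classify a cycle time distribution from its 50th percentile,
    98th percentile and mode (all in days)."""
    if p50 <= 0 or mode <= 0:
        raise ValueError("p50 and mode must be positive")
    if p98 / p50 >= FAT_TAIL_RATIO:
        return "fat-tailed"
    if p98 / mode < MODE_RATIO_LIMIT:
        return "thin-tailed"
    # Assumption: this case is not covered by the text, so it is only flagged.
    return "unconfirmed"

# The first team in this article: 50th percentile 8 days,
# 98th percentile 86 days, mode 1 day.
print(classify_tail(p50=8, p98=86, mode=1))

# Hypothetical values for a more stable team.
print(classify_tail(p50=5, p98=15, mode=4))
```

Taking the three statistics as inputs keeps the helper independent of how the percentiles were calculated.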
Knowing which probability distribution you have in your Kanban system makes a vital difference in planning and risk management. It exposes the likelihood of meeting your customer’s expectations and building a reputation as a trustworthy service provider.
Thin-Tailed or Fat-Tailed Distribution?
Looking back at our initial example, let’s divide the 98th percentile (86 days) by the 50th percentile (8 days). The result is 10.75, well above the 5.6 threshold. This is a fat-tailed distribution, so the forecasts it produces are unreliable.
Let’s analyze the cycle time histogram above. The three averages (the mode, the median and the mean) are very close to each other, all falling between 5 and 7 days, and the tail runs to 24 days. The ratio between the 98th percentile and the 50th percentile is 3.7, below the 5.6 threshold. The 98th percentile divided by the most common value (the mode) is 3.14, well below 16. This is a thin-tailed distribution. This means that there is a low level of variability in the delivery workflow of this team and their system produces reliable probability forecasts.
Probability forecasts provide the transparency that will help you increase your credibility. Nevertheless, the reliability of your prediction will always depend on the stability of your system. Transforming your fat-tailed distribution to a thin-tailed one is a real challenge but it’s essential to meeting your customer’s expectations. In order to improve your predictability, you need to manage the flow of work effectively.
In our Sustainable Predictability digital course, we go deeper into the work management practices that enable stable systems and predictable delivery of customer value. By stabilizing your system, you improve the efficiency and consistency of your workflows which ultimately results in higher customer satisfaction.
Meet the Author
Sonya Siderova is a passionate product manager and a driving force behind Nave, a Kanban analytics suite that helps teams improve their delivery speed through data-driven decision making. When she's not catering to her two little ones, you might find Sonya absorbed in a good heavyweight boxing match or behind a screen crafting a new blog post.