How to Make Reliable Probabilistic Forecasts Without Sizing Your Stories Into Even Pieces
Splitting stories into even pieces is a widespread practice, often treated as a prerequisite for making reliable predictions. But the idea that you have to artificially split your work items into even pieces to produce an accurate delivery forecast is not valid. In fact, resizing your stories is not only irrelevant to forecasting, it can also work against the goals you’re trying to achieve.
Why Sizing Your Stories Artificially Is Bad Advice
Let’s explore the main reasons why you don’t need to split your items into even parts.
The Importance of Maintaining User Stories that Define Customer Value
Every story should represent a piece of customer value. Naturally, these items will come in different sizes and with varying degrees of complexity. When you trim a story down artificially, the concept behind it is lost and the story becomes a measure of hours instead. The customer’s value has been split into units of time and has thus become meaningless to the customer.
Slicing Down Your Story Unnaturally Introduces Unnecessary Dependencies
Sizing your stories into even pieces creates unnecessary complexity and dependencies. If you break units of value into unnatural segments, there will inevitably be dependencies between these segments.
Dependencies lead to bottlenecks in your workflow. The more dependencies you introduce, the harder it will become to manage your process effectively.
Every Story Should Be Potentially Shippable
Each developed user story should produce a potentially shippable product increment.
Once you trim your stories down into even pieces, they are no longer potentially releasable, so they can’t be delivered to the customer when that is required.
Splitting Your Stories Only Makes Sense From a Customer Perspective
When you do cut your items into smaller pieces, do it from the customer’s perspective, and accept that the new releasable increments won’t be even in size.
Splitting the scope of a large user story into multiple releasable pieces must be intentional. If your user story doesn’t make sense from a customer point of view and doesn’t clearly communicate the goal it has to achieve, you’d better not start it. It will only add to the rest of the work in progress and get stuck in the workflow. Instead, analyze it properly and extract the potentially shippable part from it.
The most effective way to do that is to bridge the gap between your customers and your delivery team. You should work with your client to clearly specify the what and the why, and everyone on the team should understand it. It is the team that figures out the how. Bring in everyone’s expertise to brainstorm and come up with the most feasible option that solves your customer’s problem.
That way, you still define your new stories in terms of value, and the items remain potentially shippable, without artificially introducing dependencies into your system.
The Difference Between Effort Time and Delivery Time
You’re probably wondering how you would make future predictions without slicing your stories into items with the same size. The answer is simple – making accurate delivery forecasts has nothing to do with the size of your items.
Let’s sort out the math problem together. The delivery time of your work depends on way more variables than the time your team actually needs to work on it.
Delivery Time Does Not Equate to Effort Time
When a work item enters your backlog, it will sit there for some time before it gets started. Once in progress, it will have to go through all the process steps. Since your team works on more than one item at a time, your work will wait until the people responsible for each activity in the workflow have the capacity to start working on it. Furthermore, your item will accumulate more waiting time due to any additional work that comes in between, any bottlenecks, any external blockers, and any defects moving back and forth in the process.
The Negative Effect of Waiting Time
Here at Nave, we’ve analyzed about 10 000 workflows and it turned out that on average 70% of the time, the work is just sitting and waiting in your workflow. It’s not the performance of the team that causes the delays, it’s their inability to move the work down the funnel because of internal or external dependencies.
Based on that research, our conclusion is that in a low flow efficiency environment, the diversity of your work item sizes has no impact on your delivery times. Improving your delivery speed boils down to how efficiently you manage your workflows so that you can reduce the waiting time to an optimal level.
In a higher flow efficiency environment, you’d have to pay attention to keeping the working practices, and the skills and expertise of the individuals, fairly similar in order to make reliable delivery predictions. The size of your work items is not a criterion that affects the accuracy of your prediction.
Predicting Your Delivery Times
Performing probabilistic forecasts using your past performance data is one of the most reliable approaches to making future predictions because it takes into account all the components that make up your delivery times, including the effort needed to complete your items as well as the waiting time in your system.
The Realm of Probability Forecasting
Let’s explore the approach of making reliable future predictions without trimming down your user stories into even pieces. The trick is to analyze what has happened in the past and base your prediction on your historical performance data.
You don’t have to split your stories into similar sizes to produce a reliable forecast. What you need is a clear classification of your items by priority, and items that follow the established process policies.
Your past performance is captured in your Cycle Time Histogram. The chart shows the frequency distribution of the delivery times of the tasks in your workflow. The power of this diagram is that it represents the variability in your delivery system.
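To make the idea of a frequency distribution concrete, here is a minimal Python sketch of how such a histogram could be built from raw start and finish dates. The sample items, the function names and the convention of counting the start day as day one are all assumptions made for illustration; they are not taken from Nave or from any particular tool.

```python
from collections import Counter
from datetime import date


def cycle_time_days(started: date, finished: date) -> int:
    """Cycle time in calendar days, counting the start day (assumed convention)."""
    return (finished - started).days + 1


def cycle_time_histogram(items):
    """Map each cycle time (in days) to the number of items that took that long."""
    counts = Counter(cycle_time_days(start, end) for start, end in items)
    return dict(sorted(counts.items()))


# Hypothetical completed items: (start date, finish date)
completed = [
    (date(2023, 3, 1), date(2023, 3, 1)),
    (date(2023, 3, 1), date(2023, 3, 2)),
    (date(2023, 3, 2), date(2023, 3, 2)),
    (date(2023, 3, 3), date(2023, 3, 7)),
    (date(2023, 3, 6), date(2023, 3, 6)),
]

histogram = cycle_time_histogram(completed)

# Print a rough text version of the chart, one row per cycle time
for days, count in histogram.items():
    print(f"{days:>3} d | {'#' * count}")
```

Note that nothing in this calculation looks at the size of an item; only the elapsed time from start to finish goes into the distribution.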
In order to identify whether your distribution is thin-tailed or fat-tailed, simply divide your 98th percentile by your 50th percentile. If the result is greater than or equal to 6, your frequency distribution is fat-tailed. If the result is less than 6, it’s a thin-tailed distribution.
Let’s analyze the cycle time histogram above. The different averages (the mode, the mean and the median) are very close to each other, falling between 1 and 3 days, and the tail runs to about 11 days. Dividing the 98th percentile of about 11 days by the 50th percentile gives a ratio of 5.5, which is below 6. This is a thin-tailed distribution. This means that there is a low level of variability in the delivery workflow of this team. Thin-tailed distributions depict good predictability and shorter or no delays.
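The ratio check can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the way any specific tool computes it: percentiles use the nearest-rank method (analytics tools may interpolate differently), the function names are made up, and both data sets are hypothetical samples shaped to resemble a stable and an unstable workflow.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("need at least one cycle time")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def tail_ratio(cycle_times):
    """98th percentile divided by the 50th percentile."""
    return percentile(cycle_times, 98) / percentile(cycle_times, 50)


def is_fat_tailed(cycle_times, threshold=6):
    """Apply the rule of thumb: a ratio of 6 or more means a fat tail."""
    return tail_ratio(cycle_times) >= threshold


# Hypothetical cycle times in days, 50 finished items each
stable = [1] * 20 + [2] * 15 + [3] * 8 + [4] * 3 + [6] * 2 + [11] * 2
unstable = [1] * 12 + [5] * 6 + [9] * 7 + [15] * 10 + [30] * 8 + [60] * 5 + [98, 120]

for name, data in (("stable", stable), ("unstable", unstable)):
    label = "fat-tailed" if is_fat_tailed(data) else "thin-tailed"
    print(f"{name}: ratio {tail_ratio(data):.1f} -> {label}")
```

With these made-up samples, the first team lands at a ratio of 5.5 (11 days over 2 days) and the second at roughly 10.9 (98 days over 9 days).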
The dotted vertical lines stretching across the graph are called percentile lines. We use percentiles to establish service level agreements and define the probability of different commitment points being met.
The priority of your items will be represented by classes of service (CoS). You should filter your data by CoS. It is highly likely that the 85th percentile for Standard tasks comes with a different cycle time than the 85th percentile for Expedites, for example. That way you can provide different SLAs for the different work items you’re committing to.
By looking at the histogram, we now can say that we can deliver any item with a Standard priority in LESS than 6 days with an 85% certainty and LESS than 11 days with a 98% certainty.
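As a rough sketch of how such statements could be derived, the Python below groups finished items by class of service and reads off the 50th, 85th and 98th percentiles for each group. The data is hypothetical, the function names are illustrative, and the nearest-rank percentile is an assumption, so each result reads as "this many days or less" and may differ slightly from what a charting tool reports.

```python
import math
from collections import defaultdict


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("need at least one cycle time")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def service_levels(items, levels=(50, 85, 98)):
    """Return {class of service: {percentile: cycle time in days}}."""
    by_class = defaultdict(list)
    for class_of_service, days in items:
        by_class[class_of_service].append(days)
    return {
        cos: {p: percentile(times, p) for p in levels}
        for cos, times in by_class.items()
    }


# Hypothetical history: (class of service, cycle time in days)
standard = [1] * 6 + [2] * 5 + [3] * 3 + [4] * 2 + [6] * 2 + [8, 11]
expedite = [1] * 6 + [2] * 3 + [3]
history = [("Standard", d) for d in standard] + [("Expedite", d) for d in expedite]

slas = service_levels(history)
for cos, levels in slas.items():
    for p, days in levels.items():
        print(f"{cos}: {p}% of items finished in {days} days or less")
```

Filtering by class of service before taking percentiles is the important design choice here: mixing Expedites into the Standard data would pull the percentiles down and make the Standard commitment look better than it is.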
If you look into the cycle time breakdown of a team for Standard items, you will see that the effort time tracked in the active states in the workflow represents about 60% of their delivery times and even though their stories have different sizes, they manage a stable system and make their commitments with high confidence.
The Challenge to Making Accurate Delivery Predictions
Now let’s explore the cycle time histogram below, showing the frequency distribution for items with Standard priority.
The first line here points to 1 day. That’s the mode in this cycle time distribution, and it represents what’s happening in the most common scenario. This means that if you had that distribution and someone asked you when something would be done, the most popular response would be in less than a day. The 50th percentile points to 9 days. So half of the time, you actually delivered in less than 9 days.
However, the mean or the average of the data is 22 days. The tail of that frequency distribution runs to 98 days. In other words, the longest time that was needed to finish a ticket (excluding the outliers) is about 100 times bigger than the typical time of 1 day. And it’s more than 10 times bigger than the 50th percentile, well above the threshold of 6.
This is a fat-tailed distribution. Fat-tailed distributions mean poor predictability and potentially high impact from long delays. Fat-tailed distributions are fragile. If you’ve been asked “When will this be done” and you want to be truly confident, your answer should be in less than 98 days.
If you have a fat-tailed distribution and you’re maintaining an unstable system, any approach to making predictions will be unreliable.
Looking into the cycle time breakdown for this team, we can see that their work spent around 60% of its total delivery time in the active states, but for 45% of the total delivery time it was blocked while in those states (the red sections on the chart). This means that their actual effort time represents only 15% of their total delivery time.
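The arithmetic behind that breakdown can be sketched as follows. The workflow states, the day counts and the `StateTime` structure are hypothetical, chosen so that the totals reproduce the 60%, 45% and 15% figures; real tools may track blocked time differently.

```python
from dataclasses import dataclass


@dataclass
class StateTime:
    state: str
    is_active: bool            # True when someone is meant to be working on the item
    days: float                # total time spent in this state
    blocked_days: float = 0.0  # part of `days` during which the item was blocked


def breakdown(states):
    """Express active, blocked, effort and waiting time as percentages of total delivery time."""
    total = sum(s.days for s in states)
    if total <= 0:
        raise ValueError("total delivery time must be positive")
    active = sum(s.days for s in states if s.is_active)
    blocked = sum(s.blocked_days for s in states if s.is_active)
    return {
        "active_pct": round(100 * active / total, 1),
        "blocked_pct": round(100 * blocked / total, 1),
        "effort_pct": round(100 * (active - blocked) / total, 1),
        "waiting_pct": round(100 * (total - active) / total, 1),
    }


# Hypothetical aggregate for Standard items, in days
workflow = [
    StateTime("Ready to start", is_active=False, days=25),
    StateTime("Development", is_active=True, days=40, blocked_days=30),
    StateTime("Waiting for review", is_active=False, days=15),
    StateTime("Review", is_active=True, days=20, blocked_days=15),
]

result = breakdown(workflow)
print(result)
```

In this sample, only the `effort_pct` share is time the team actually spent working; everything else is waiting or blocked time that item size has no influence on.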
The accuracy of your probability forecast doesn’t depend on the size of your work items. It depends on the stability of your delivery workflow.
In a stable system, the most important factor will be the priority of the items. If your system is optimized for predictability, and multiple stories with different sizes are started, they will strictly follow their priority order. Smaller low-priority items won’t be able to borrow time from bigger, more complex tasks with a higher priority. The smaller items will have to wait in the workflow until the more urgent items are completed.
If your team is not able to start new work as the WIP limit has been reached, they will have to collaborate with each other and “swarm” outstanding tasks to complete them faster. The focus is moved to the impediments in the system and their prompt resolution to enable a smooth flow of work.
In our Sustainable Predictability digital course, we go deeper into the approaches to optimize your system for predictability and we explore the methods and the tools to perform accurate delivery predictions in great detail.
Evaluating the size of your stories is a great approach to spark a conversation around the goal a certain item should achieve. Nevertheless, this approach is irrelevant to performing future predictions. Making reliable commitments and keeping these commitments is tightly coupled with how efficiently you manage the flow of work and ultimately how predictable your system is.
Meet the Author
Sonya Siderova is a passionate product manager and a driving force behind Nave, a Kanban analytics suite that helps teams improve their delivery speed through data-driven decision making. When she's not catering to her two little ones, you might find Sonya absorbed in a good heavyweight boxing match or behind a screen crafting a new blog post.