Story point estimation is a widely known technique that Scrum teams use to plan their work. However, this approach is not objective and has the potential to hinder your performance. There’s a very simple explanation that we can use to break down what seems to be a paradox.
Whatever you’re using to estimate your work – be it story points, hours or T-shirt sizes – and plan ahead for the next sprint, you should stop doing it once and for all.
Story point estimation is commonly misused by managers, so much so that even the inventor of story points states that using this method to guesstimate delivery times is ‘a weak idea’.
What Makes Story Point Estimation So Flawed
Let’s dig deeper into the problem.
Whether you’re evaluating the size, the complexity or the time needed to finish your work, you’re still performing an assessment of effort within the scope of the work item. However, in reality, the time required to finish your work will depend entirely on the efficiency of your system. The contribution of your effort in isolation does not dramatically alter your overall delivery time.
Even if your team has plenty of experience and the necessary expertise to understand what is needed to solve a certain problem and predict the effort they need to finish it, you still can’t use their estimation to properly determine the scope of your sprint. This is due to the simple fact that the effort time is just one part of the whole picture. Using story points to estimate your work is irrelevant because effort time does not equate to delivery time.
The main problem with using that approach to plan your sprint is that story points are a measurement of effort. And, in general, the effort that your team makes to finish something only represents between 5% and 40% of your total delivery time.
If we break down the elements that contribute towards the time needed to deliver your work, at least 60% of it consists of waiting time in the system due to dependencies, bottlenecks, expedite requests and plenty of other sources of inefficiency. This is a realm that you can’t predict by intuition.
So, the massive flaw that makes story point estimation an unreliable approach to use during your Sprint Planning meetings is that it doesn’t take into account the waiting time in your workflow.
Let’s look into a couple of examples.
This is the Cycle Time Breakdown chart of a development team in a high maturity organization, with established management practices focused on continuous improvement.
The chart represents how much time their work spent in the different process states in their workflow. This team has adopted a WIP-limited pull system and their board has been designed with active and queue states for each process step. This practice enables them to measure the amount of time their work is staying and waiting in the system.
We can see that the effort time – which is being tracked in the active states (Development, Testing and Deployment) in their workflow – represents an exceptional 60% of their delivery time. Even though they maintain a high flow-efficiency system, they still report 40% of waiting time. This means that, if this team uses story point estimation to evaluate the work they can handle during the sprint, the estimate will miss the 40% of delivery time spent waiting, and the work will take roughly 1.7 times as long as the estimated effort (100% ÷ 60%).
The image above is not something that you’d often come across though. Few businesses have achieved such a high level of flow efficiency.
This team has invested plenty of time and energy into improving the way they manage their work. As a result, they have been able to achieve a predictable and consistent system, where they use their own past performance data to plan their work and make sure they keep their commitments.
Let’s look into this example:
By evaluating the cycle time breakdown for this team, we can see that the time their work spent in the active states is around 60% of the total. However, blocked time (indicated by the red sections on the chart) accounts for 45% of the total delivery time, and it falls within those active states.
This means that their actual effort time represents just 15% of their total delivery time. There is an 85% waiting time in this system, which means that, if they use story points to estimate their work, their actual delivery time will end up being more than 6 times as long as the estimate (100% ÷ 15% ≈ 6.7).
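The arithmetic behind both examples can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the first team's blocked time is assumed to be zero because the article does not report any.

```python
def flow_efficiency(active_share: float, blocked_share: float = 0.0) -> float:
    """Fraction of total delivery time spent on actual effort.

    Both arguments are fractions of the total delivery time (0.0 to 1.0).
    Blocked time is counted inside the active states, so it is subtracted.
    """
    return active_share - blocked_share


def delivery_multiplier(efficiency: float) -> float:
    """How many times longer delivery takes than the effort alone."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    return 1.0 / efficiency


# First team: 60% of the time in active states, no blocked time assumed.
team_a = flow_efficiency(0.60)
# Second team: 60% in active states, 45% of the total spent blocked.
team_b = flow_efficiency(0.60, 0.45)

print(round(delivery_multiplier(team_a), 2))  # 1.67
print(round(delivery_multiplier(team_b), 2))  # 6.67
```

The lower the flow efficiency, the further an effort-only estimate drifts from the real delivery time.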
How to Plan Your Work So You Actually Meet Your Commitments
The first step to make an objective decision on how much work your team can handle during the Sprint is to understand that waiting time exists and embrace the variability in your system. Instead of working with story point estimations and relying on pushing for more effort from your team to meet already unrealistic expectations, you may want to explore your capability.
Look into your past performance data to assess how many items you have completed per sprint in the past 3 to 6 months and use this information to determine how much work you can actually deliver.
The Throughput Histogram shows the number of your completed items in a certain period. We can see that the minimum number of items that were being delivered in a sprint was 8 items. The average is 16 items and, in the best case, this team managed to deliver 23 items in a sprint.
You can use this analysis in your next Sprint Planning meeting to look into the amount of work you actually finished and better understand your capacity. In the histogram above, we can see that this team can schedule at least 8 items. This is the minimum they have delivered in any sprint in the period, so they can commit to 8 items with at least 95% certainty.
Then, the difference between the average (mean) and min throughput is another 8 items (16 items – 8 items). The chance that another 8 items will be completed has dropped down to about 50%.
And finally, the difference between the max and the mean is another 7 items. These additional 7 slots are left with a likelihood of 5% that these will all be finished in the current sprint. Most probably, they will be rescheduled for the next iteration.
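The three planning tiers above can be derived from past sprint data with a short script. The sketch below uses a made-up throughput history chosen to match the minimum (8), mean (16) and maximum (23) from the histogram; the function name and the tier labels are hypothetical.

```python
from statistics import mean


def planning_tiers(throughput_per_sprint):
    """Split past throughput into the three tiers described in the text."""
    lowest = min(throughput_per_sprint)
    average = round(mean(throughput_per_sprint))
    highest = max(throughput_per_sprint)
    return {
        "safe": lowest,                # delivered in every past sprint
        "likely": average - lowest,    # extra items up to the average
        "stretch": highest - average,  # extra items up to the best sprint
    }


# Hypothetical history: items completed in each of the last 8 sprints.
history = [8, 12, 15, 16, 17, 18, 19, 23]
tiers = planning_tiers(history)
print(tiers)  # {'safe': 8, 'likely': 8, 'stretch': 7}
```

The probabilities attached to each tier (95%, 50%, 5%) come from the article and are not computed here.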
Use this information as a guide to determine the PBIs necessary to achieve your sprint goal. The greatest benefit of this approach is that it represents your actual capability to deliver: it depicts how many items you’ve managed to finish during your sprints in the past, and those numbers already include all the waiting time in your system.
The Best Way to Improve Your Capability
Now, as a little insider’s tip, if you don’t feel completely delighted by the numbers that you see in the histogram, chances are you’re dealing with a low flow-efficiency system.
Your response should never be to push your team to work harder. That would only marginally improve the effort time in your workflow which would have a tiny impact on your overall performance. Last but not least, it will ultimately reduce your team’s motivation and engagement.
The easiest and cheapest area to focus on is reducing the waiting time in your system. Delve deeper into these sources of inefficiency, and target the obstacles that are slowing you down the most.
In our Sustainable Predictability digital course, we have developed a fully actionable 7-week program to help you identify and eliminate the sources of waiting time. As a result, you will see a drastic increase in your productivity.
Story point estimation won’t enable you to make accurate predictions. It’s as simple as that. Using your past performance data is a much more reliable and objective approach to plan your work and meet your commitments.
Furthermore, by using your past throughput to make accurate data-driven decisions, you’ve already removed some of the inefficiencies hindering your system. Your team no longer has to spend time during Sprint Planning meetings evaluating the effort required to finish each work item. Instead, they will be able to spend their time doing what matters the most – delivering customer value.
Story Points Have a Positive Side as Well
The main benefit of using story points to evaluate the effort of your work is being able to spark a conversation around the scope and the complexity of the work that needs to be done. Effort could become a valid risk dimension when it comes to assessing what work you should prioritize next.
If two new tasks have the same urgency, market opportunity and profitability for example, then the effort of your work items could potentially be a valid consideration when deciding how to sequence your backlog items. You can use the effort of the work as a risk factor so that, if everything else is the same, you can just start with the least complex item. However, beyond this, using story point estimation as a commitment approach is not objective.
If you have the data, use the data. If you don’t have the data, collect the data and start using the data.
We hope that the directions we’ve provided offer a more reliable alternative for planning your sprints, one that helps you improve not only your performance but also the reliability of your commitments.
5 Comments
Meet the Author

Sonya Siderova is a passionate product manager and a driving force behind Nave, a Kanban analytics suite that helps teams improve their delivery speed through data-driven decision making. When she's not catering to her two little ones, you might find Sonya absorbed in a good heavyweight boxing match or behind a screen crafting a new blog post.
I read the article with interest and have to agree that in Kanban story-point estimation does not add much value.
In Scrum though, it can become part of a very accurate and predictable team-velocity measure.
It should however not be used in isolation and is a mere part of a bigger process of getting data and eventual insights.
I found that proper grooming along with story-point estimation and time-based burn-down give you that predictability.
An important note on burn-down though is that it should never be used as an individual performance measure, as that introduces a “fear of failure” which immediately removes the honesty and subsequent value of the process.
Conrad, there is no point in looking for a solution to a problem you don’t have. I’ve seen teams that make reliable delivery commitments using their intuition: they just trust their gut feeling and they manage to hit their targets consistently.
My point is, if something works well for you, don’t touch it! If it doesn’t though, it’s worth considering an alternative.
I appreciate this approach and think I agree except for one nagging issue I can’t get past with Kanban, task sizing. I have read a number of articles here and elsewhere but haven’t found the answer. Do you assume each task/item is roughly the same size? My team is DevSecOps and that is not a valid assumption for us and the disparate requests that our team fields. Thanks for any insight!
Paul, you don’t have to slice your tasks into even pieces to make this work. The size of your tasks doesn’t affect the accuracy of your forecast.
We use probabilistic forecasting to make reliable data-driven commitments. And each commitment comes with a number and the probability to hit that target.
Here is an example. Let’s say you want to know how much time you need to deliver a work item, regardless of its size, and let’s say the 85th percentile on your Cycle Time Scatterplot points to 10 days.
Are we saying that the effort needed to complete any task is exactly 10 days? No, that’s not what we are saying.
Are we saying that we will deliver any type of work in exactly 10 days? No, that’s not what we are saying.
What we’re saying is that regardless of which work item we take, we’ll most probably finish it in LESS than 10 days. We commit that it won’t take more than 10 days because we have an 85% certainty that we’ll meet that goal!
Now, if you’re not delighted by the numbers your percentiles point to, this means that your delivery system is not optimized for predictability. If that’s the case, what you need to do is to introduce management practices and process policies that will enable you to manage your work effectively.
This is the only prerequisite to make probabilistic forecasting (and any other type of estimation) work.
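The 85th percentile reading described above can be sketched as follows. This is a minimal illustration, assuming the nearest-rank percentile method and a made-up list of cycle times; it is not necessarily how a Cycle Time Scatterplot computes its percentile lines, and the function name is hypothetical.

```python
import math


def percentile_nearest_rank(values, pct):
    """Smallest observed value that at least pct% of the values do not exceed."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be greater than 0 and at most 100")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based position
    return ordered[rank - 1]


# Hypothetical cycle times in days for 20 completed items of mixed size.
cycle_times_days = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6,
                    6, 7, 7, 8, 8, 9, 10, 12, 15, 21]

p85 = percentile_nearest_rank(cycle_times_days, 85)
within = sum(1 for d in cycle_times_days if d <= p85) / len(cycle_times_days)

print(p85)     # 10
print(within)  # 0.85
```

Items ranging from 1 day to 21 days feed the same forecast, which is why the sizes do not need to be equal for the percentile to be usable.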
Thanks for the response Sonya! Ok, so you are basically never fully ‘committing’ because every estimate has some % chance that the task in question will be an outlier. If we want tighter estimates, then managing task creation to at least the same order of magnitude as what we have seen in the past will help. If we are seeing 10-days @ 85% but some tasks that are 3 hours and others that are 3 weeks we won’t get good estimates for those. However, in aggregate, say for the next 10 stories, we may still be close. Sound right? Thanks!