Splitting stories into evenly sized pieces is a widespread practice, often treated as a prerequisite for making reliable future predictions. However, the idea that you must artificially split your work items into even pieces to produce an accurate delivery forecast is not valid. In fact, resizing your stories is not only irrelevant to forecasting, it can also work against the goals you’re trying to achieve.

Why Sizing Your Stories Artificially Is Bad Advice

Let’s explore the main reasons why you don’t need to split your items into even parts.

The Importance of Maintaining User Stories that Define Customer Value

Every story should represent a piece of customer value. Naturally, these items will come in different sizes and with varying degrees of complexity. When you trim down a story artificially, the concept behind it is lost and the story becomes a measure of hours instead. The customer’s value has been split into units of time and has thus become meaningless to the customer.

Slicing Down Your Story Unnaturally Introduces Unnecessary Dependencies

Sizing your stories into even pieces creates unnecessary complexity and dependencies. If you break units of value into unnatural segments, there will inevitably be dependencies between these segments. 

Dependencies lead to bottlenecks in your workflow. The more dependencies you introduce, the harder it will become to manage your process effectively.

Every Story Should Be Potentially Shippable

Each developed user story should produce a potentially shippable product increment.

Once you trim your stories down to even sizes, they are no longer potentially releasable, so they can’t be delivered to the customer when that is required.

Splitting Your Stories Only Makes Sense From a Customer Perspective

When you do cut your items into smaller pieces, always do it from the customer’s perspective. The resulting releasable increments will not be even in size.

Taking a large user story and splitting its scope into multiple releasable pieces must be a deliberate decision. If your user story doesn’t make sense from a customer point of view and doesn’t clearly communicate the goal it has to achieve, you’d better not start it. It will only pile onto the rest of the work in progress and get stuck in the workflow. Instead, analyze it properly and extract the potentially shippable part out of it.

The most effective way to do that is to bridge the gap between your customers and your delivery team. Work with your client to clearly specify the what and the why, and make sure everyone in the team understands them. It is the team that figures out the how. Bring in everyone’s expertise to brainstorm and come up with the most feasible option that solves your customer’s problem.

That way, your new stories are still defined in terms of value and remain potentially shippable, without artificially introducing dependencies into your system.

The Difference Between Effort Time and Delivery Time

You’re probably wondering how you would make future predictions without slicing your stories into items of the same size. The answer is simple: making accurate delivery forecasts has nothing to do with the size of your items.

Let’s sort out the math problem together. The delivery time of your work depends on way more variables than the time your team actually needs to work on it.

Delivery Time Does Not Equate to Effort Time

When a work item enters your backlog, it will spend some time there before it gets started. Once in progress, it will have to go through all the process steps. Since your team works on more than one item at a time, your work will wait until the people responsible for each activity in the workflow have the capacity to start working on it. Furthermore, your item will accumulate more waiting time due to any additional work that comes in between, any bottlenecks, any external blockers, and any defects moving back and forth in the process.

[Figure: effort time vs delivery time]

The Negative Effect of Waiting Time

Here at Nave, we’ve analyzed about 10 000 workflows and found that, on average, work spends 70% of its time just sitting and waiting in the workflow. It’s not the performance of the team that causes the delays; it’s their inability to move the work down the funnel because of internal or external dependencies.

Based on that research, our conclusion is that in a low flow efficiency environment, the diversity of your work item sizes has no impact on your delivery times. Improving your delivery speed boils down to how efficiently you manage your workflows so that you can reduce the waiting time to an optimal level.

In a higher flow efficiency environment, you’d have to pay attention to keeping the working practices and the skills and expertise of the individuals fairly similar in order to perform reliable delivery predictions. The size of your work items is not a criterion that affects the accuracy of your prediction.
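
To make the term concrete, flow efficiency is commonly calculated as the active (effort) time divided by the total delivery time. The snippet below is a minimal sketch of that ratio; the function name and the sample numbers are illustrative assumptions, not Nave data.

```python
def flow_efficiency(active_days: float, waiting_days: float) -> float:
    """Share of the total delivery time spent on active work (0.0 to 1.0)."""
    total_days = active_days + waiting_days
    if total_days <= 0:
        raise ValueError("total delivery time must be positive")
    return active_days / total_days


# Hypothetical item: 3 days of hands-on work, 7 days waiting in queues.
efficiency = flow_efficiency(active_days=3, waiting_days=7)
print(f"Flow efficiency: {efficiency:.0%}")  # Flow efficiency: 30%
```

An item like this one spends 70% of its delivery time waiting, so shaving a day off the effort changes the delivery date far less than removing a day of queueing would.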

Predicting Your Delivery Times

Performing probabilistic forecasts using your past performance data is one of the most reliable approaches to making future predictions because it takes into account all the components that make up your delivery times, including the effort needed to complete your items as well as the waiting time in your system.

The Realm of Probability Forecasting

Let’s explore the approach of making reliable future predictions without trimming down your user stories into even pieces. The trick is to analyze what has happened in the past and base your prediction on your historical performance data.

You don’t have to split your stories into similar sizes to produce a reliable forecast. What you need is a clear classification of your items by their priority and assurance that they follow the established process policies.

Your past performance is laid out on your Cycle Time Histogram. The chart shows the frequency distribution of the delivery times of the tasks in your workflow. The power of this diagram is that it represents the variability in your delivery system.

[Figure: cycle time histogram, thin-tailed distribution]

To identify whether your distribution is thin-tailed or fat-tailed, simply divide your 98th percentile by your 50th percentile. If the result is greater than or equal to 6, your frequency distribution is fat-tailed. If the result is less than 6, it’s a thin-tailed distribution.
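
The rule above can be sketched in a few lines. This is a minimal illustration, assuming cycle times are available as a plain list of days and using the nearest-rank definition of a percentile (other definitions interpolate and can give slightly different values). The function names and both data sets are hypothetical.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("need at least one cycle time")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def tail_ratio(cycle_times):
    """98th percentile divided by the 50th percentile."""
    return percentile(cycle_times, 98) / percentile(cycle_times, 50)


def is_fat_tailed(cycle_times, threshold=6):
    """Apply the rule from the text: a ratio of 6 or more means fat-tailed."""
    return tail_ratio(cycle_times) >= threshold


# Hypothetical cycle times in days (not real team data).
stable_team = [1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 6, 8, 11]
unstable_team = [1, 1, 1, 3, 9, 12, 20, 30, 45, 98]

print(f"{tail_ratio(stable_team):.1f}", is_fat_tailed(stable_team))      # 5.5 False
print(f"{tail_ratio(unstable_team):.1f}", is_fat_tailed(unstable_team))  # 10.9 True
```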

Let’s analyze the cycle time histogram above. The different averages (the mode, the mean and the median) are very close to each other, at 1 day, 2 days and 3 days, and the tail runs to about 11 days. Dividing the 98th percentile (11 days) by the 50th percentile (2 days) gives a ratio of 5.5. This is a thin-tailed distribution. It means that there is a low level of variability in the delivery workflow of this team. Thin-tailed distributions indicate good predictability and short or no delays.

The dotted vertical lines stretching across the graph are called percentile lines. We use percentiles to establish service level agreements and define the probability of different commitment points being met. 

The priority of your items will be represented by classes of service (CoS). You should filter your data by CoS. It is highly likely that the 85th percentile for Standard tasks comes with a different cycle time than the 85th percentile for Expedites, for example. That way you can provide different SLAs for the different work items you’re committing to.

Looking at the histogram, we can now say that we can deliver any item with a Standard priority in LESS than 6 days with 85% certainty and in LESS than 11 days with 98% certainty.
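
As a rough sketch of how such commitment points could be derived, the snippet below groups historical cycle times by class of service and reads off the 50th, 85th and 98th percentiles. It assumes nearest-rank percentiles and a simple list of (class of service, days) pairs; the function names and the history are hypothetical, not taken from the histogram.

```python
import math
from collections import defaultdict


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def service_levels(items, percentiles=(50, 85, 98)):
    """Map each class of service to its cycle time (in days) at each percentile."""
    by_cos = defaultdict(list)
    for cos, days in items:
        by_cos[cos].append(days)
    return {
        cos: {p: percentile(days, p) for p in percentiles}
        for cos, days in by_cos.items()
    }


# Hypothetical history of completed items: (class of service, cycle time in days).
standard_days = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 9, 11]
expedite_days = [1, 1, 1, 2, 2]
history = [("Standard", d) for d in standard_days] + [("Expedite", d) for d in expedite_days]

levels = service_levels(history)
print(levels["Standard"])  # {50: 2, 85: 6, 98: 11}
print(levels["Expedite"])  # {50: 1, 85: 2, 98: 2}
```

With nearest-rank percentiles, each value reads as "this share of items finished in that many days or fewer", so the sample Standard items support a commitment of 6 days at 85% and 11 days at 98%.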

[Figure: thin-tailed distribution, cycle time breakdown]

If you look into the cycle time breakdown of the team for Standard items, you will see that the effort time tracked in the active states of the workflow represents about 60% of their delivery times. Even though their stories have different sizes, they manage a stable system and make their commitments with high confidence.

The Challenge to Making Accurate Delivery Predictions

Now let’s explore the cycle time histogram below, which shows the frequency distribution for items with Standard priority.

[Figure: cycle time histogram, fat-tailed distribution]

The first line here points to 1 day. That’s the mode of this cycle time distribution; it represents what happens in the most common scenario. If you had that distribution and someone asked you when something would be done, the most common answer would be in less than a day. The 50th percentile points to 9 days. So half of the time, you actually delivered in less than 9 days.

However, the mean, or the average of the data, is 22 days. The tail of that frequency distribution runs to 98 days. In other words, the longest time that was needed to finish a ticket (excluding the outliers) is about 100 times longer than the typical time of 1 day. And it’s more than 10 times longer than the 50th percentile.

This is a fat-tailed distribution. Fat-tailed distributions mean poor predictability and potentially high impact from long delays. Fat-tailed distributions are fragile. If you’ve been asked “When will this be done?” and you want to be truly confident, your answer should be in less than 98 days.

If you have a fat-tailed distribution and you’re maintaining an unstable system, any approach to making predictions will be unreliable. 

[Figure: fat-tailed distribution, cycle time breakdown]

Looking into the cycle time breakdown for this team, we can see that the time their work spent in the active states is around 60% of the total delivery time. However, blocked time (the red sections on the chart) accounts for 45 of those 60 percentage points. This means that their actual effort time represents only 15% of their total delivery time.
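
The arithmetic behind that 15% can be spelled out as follows. This is a small sketch that assumes all three figures are shares of the total delivery time, which is the reading under which 60% and 45% yield 15%. The variable names are illustrative.

```python
# Shares of total delivery time, in percent, as quoted for this team.
active_states_pct = 60  # time spent in the active workflow states
blocked_pct = 45        # blocked time inside those active states (assumed reading)

effort_pct = active_states_pct - blocked_pct  # time actually spent working
queue_pct = 100 - active_states_pct           # time spent in waiting states
non_effort_pct = queue_pct + blocked_pct      # everything that is not effort

print(effort_pct, queue_pct, non_effort_pct)  # 15 40 85
```

Under that reading, 85% of the delivery time has nothing to do with how big the story is.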

The accuracy of your probability forecast doesn’t depend on the size of your work items. It depends on the stability of your delivery workflow.

In a stable system, the most important factor will be the priority of the items. If your system is optimized for predictability and multiple stories with different sizes are started, they will strictly follow their priority order. Smaller, low-priority items won’t be able to borrow time from bigger, more complex tasks with a higher priority. The smaller items will have to wait in the workflow until the more urgent items are completed.

If your team is not able to start new work as the WIP limit has been reached, they will have to collaborate with each other and “swarm” outstanding tasks to complete them faster. The focus is moved to the impediments in the system and their prompt resolution to enable a smooth flow of work. 

In our Sustainable Predictability digital course, we go deeper into the approaches to optimize your system for predictability and we explore the methods and the tools to perform accurate delivery predictions in great detail.

Evaluating the size of your stories is a great approach to spark a conversation around the goal a certain item should achieve. Nevertheless, this approach is irrelevant to performing future predictions. Making reliable commitments and keeping these commitments is tightly coupled with how efficiently you manage the flow of work and ultimately how predictable your system is.
