Monte Carlo Simulation Explained: Everything You Need to Know to Make Accurate Delivery Forecasts
Here is everything you need to know to make reliable delivery commitments in just a couple of minutes: the Monte Carlo simulation explained!
Getting started with Monte Carlo simulations as an alternative approach to making delivery forecasts can be challenging, especially if you’ve been stuck estimating your work using story points (or hours) for quite some time.
Over the past few months, I’ve received a lot of questions from our audience about this topic and so I thought it would be useful to bundle the 10 most frequently asked ones together and answer them for you.
The Monte Carlo Simulation Explained: How to Leverage the Most Reliable Approach to Forecasting
Today, we’ll explore the top 10 frequently asked questions (and answers) about Monte Carlo. Let’s dive in!
What is Monte Carlo simulation?
In project management, the Monte Carlo method or Monte Carlo simulation is a mathematical forecasting technique that takes risk, uncertainty and variability into account.
It runs a large number of random trials using your past throughput data to predict the throughput for a future time frame.
You define the start date and the number of tasks, and the simulation provides a range of delivery dates and the probability that comes with each date. For any date in the future, it uses the throughput of a random day in the past to simulate how many work items are likely to get done.
For example, say that on February 14th you had a throughput of 3 tasks. The simulation takes this number and assumes that this is how many tickets will be completed on February 22nd. To project the probable throughput of February 23rd, it takes the throughput of another random day in the past, and so forth.
The simulation is repeated tens of thousands of times, and the results are presented as a probability distribution with percentiles increasing from left to right. The outcome is a probabilistic forecast based on your past performance data: a range of delivery dates and the probability that comes with each of them.
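To make the mechanics concrete, here is a minimal Python sketch of the resampling loop described above. It is an illustration under stated assumptions, not the implementation of any particular tool: the function name `simulate_days_to_finish`, the `history` values and the trial count are all made up for the example.

```python
import random

def simulate_days_to_finish(daily_throughput, backlog_size, trials=10_000, seed=None):
    """Return a sorted list with the number of days each trial needed to clear the backlog."""
    if not any(t > 0 for t in daily_throughput):
        raise ValueError("history needs at least one day with completed items")
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        remaining, days = backlog_size, 0
        while remaining > 0:
            # Each simulated future day borrows the throughput of a random past day.
            remaining -= rng.choice(daily_throughput)
            days += 1
        results.append(days)
    return sorted(results)

# Hypothetical history: items finished on each of the last 10 days.
history = [0, 3, 1, 0, 2, 4, 0, 1, 2, 3]
outcomes = simulate_days_to_finish(history, backlog_size=40, trials=5_000, seed=1)
print("fastest trial:", outcomes[0], "days; slowest trial:", outcomes[-1], "days")
```

Each entry in `outcomes` is one possible future. Sorting them is what lets you read off percentiles afterwards.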
When to Use Monte Carlo Simulations?
With regard to forecasting, Monte Carlo simulations come in two forms: calculating the delivery date for a given number of items, or the number of tasks that can be finished in a given period.
We use this technique to answer the two most challenging questions in project management:
- “When can we finish X number of tasks?” Monte Carlo will give you the delivery date of your project and the level of certainty that this will happen. Let’s say that, to the best of your knowledge, the scope of the project is about 100 tasks. You can use the Monte Carlo simulation to give your client a probable delivery date and the confidence level of hitting that target.
- “How many tasks can we finish in X number of days?” With Monte Carlo, you will be able to see how many items can be completed within a certain timeframe. For example, say you know your next release is planned for June 15th and you want to know how many new features will be ready by then. You input your start date and end date and the simulation will give you a range of outcomes and the probability that comes with each of them.
No guesswork or subjective estimating is involved – just data-driven probability-based future predictions calculated using your own historical performance.
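The second form, “how many in X days?”, can be sketched the same way. The snippet below is a rough Python illustration, with hypothetical names (`simulate_items_in_period`, `items_at_confidence`) and made-up history. Note the assumption about how confidence is read here: being 85% sure of delivering at least N items means N sits near the low end of the simulated outcomes, not the high end.

```python
import random

def simulate_items_in_period(daily_throughput, num_days, trials=10_000, seed=None):
    """Return a sorted list with the number of items each trial finished in num_days."""
    rng = random.Random(seed)
    return sorted(
        sum(rng.choice(daily_throughput) for _ in range(num_days))
        for _ in range(trials)
    )

def items_at_confidence(outcomes, confidence_pct):
    """Largest simulated item count that at least confidence_pct% of trials reached."""
    ordered = sorted(outcomes)
    index = len(ordered) * (100 - confidence_pct) // 100
    return ordered[min(index, len(ordered) - 1)]

# Hypothetical history: items finished on each of the last 10 days.
history = [0, 3, 1, 0, 2, 4, 0, 1, 2, 3]
outcomes = simulate_items_in_period(history, num_days=30, trials=5_000, seed=1)
print("items we can promise with 85% confidence:", items_at_confidence(outcomes, 85))
```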
How to Interpret Monte Carlo Simulation Results?
Monte Carlo uses a computational algorithm to simulate the process thousands or even millions of times. The result is a histogram showing all the possible outcomes and the likelihood that each outcome will occur.
So, how do you read that histogram?
In this example, we set a backlog of 40 tasks and we want to start working on it on March 1st. The simulation tells us that there is an 85% probability that we can finish all the backlog items by July 5th. The further you go in time, the greater the certainty of completing all the tasks.
Are we saying that these exact 40 tasks in our backlog will be delivered by July 5th? No, we aren’t. What we are saying is that we can deliver any 40 work items by July 5th and there is an 85% chance that we can meet that goal.
Probabilistic forecasting enables you to make reliable delivery commitments using your own past performance data. The question, “When will this be done?” is not that interesting anymore. The charts already provide that answer. The question now becomes, “How much risk are you willing to take?”
Are you willing to commit to the delivery date that comes with the 50th percentile (which, by the way, comes with the same confidence level as flipping a coin)? Or, would you prefer to make a commitment with more confidence and go with the 85th, even the 95th percentile so that you increase the probability of delivering on time?
With Monte Carlo, you don’t have to estimate the relative complexity of your work anymore! The only thing you have to do is decide upon the level of risk you’re willing to manage.
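If you have the raw simulation outcomes at hand, turning a confidence level into a date is a small lookup. The sketch below assumes a list of “days needed” values, one per trial, and uses the nearest-rank percentile. The function name `forecast_date`, the start date and the evenly spread sample values are hypothetical, and the day count assumes the start date is the first working day of the simulation.

```python
from datetime import date, timedelta

def forecast_date(days_needed, start, confidence_pct):
    """Date by which confidence_pct% of the simulated trials had finished."""
    ordered = sorted(days_needed)
    # Nearest-rank percentile: smallest value covering confidence_pct% of trials.
    rank = max(1, (len(ordered) * confidence_pct + 99) // 100)
    # A trial that needs 1 day finishes on the start date itself.
    return start + timedelta(days=ordered[rank - 1] - 1)

# Hypothetical outcomes from 20 trials: 100, 101, ..., 119 days.
days_needed = [100 + i for i in range(20)]
start = date(2024, 3, 1)  # assumed start date, for illustration only
for pct in (50, 85, 95):
    print(f"{pct}th percentile:", forecast_date(days_needed, start, pct))
```

Moving from the 50th to the 95th percentile pushes the date out. That gap is the price of the extra certainty.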
Do You Need to Slice Your Items into Even Sizes for Monte Carlo to Work?
Slicing stories into evenly sized pieces is a widespread practice, which is often considered to be a prerequisite to making reliable future predictions. This is one of the biggest Monte Carlo misconceptions out there.
The size of your work items doesn’t affect the reliability of your forecast because your historical data (the basis of your forecasts) contains work items of different sizes.
The main prerequisite to making accurate delivery forecasts lies in maintaining a predictable workflow.
In a predictable system, we only choose the level of confidence we want to work with. If you know that the nature of the work is complex, there are plenty of unknowns and you have never done this kind of work before, then commit to a higher percentile (the 95th or 98th). That way, you have a high probability of meeting your goal.
If the work is easy, your delivery workflow is stable and you don’t expect any obstacles along the way, go with a lower percentile (the 70th). Remember, forecasting is all about managing risks effectively and it is up to you to decide what level of risk you’re willing to live with.
The size of your work can only be a prioritization criterion and it doesn’t impact the accuracy of your forecast in any way.
Does Monte Carlo Account For Story Splitting Altogether?
Let’s take a step back. Just because you have 100 stories in your backlog, this doesn’t mean that these exact 100 stories will be delivered on the date you’ve committed.
That’s not what the Monte Carlo simulation is telling you. What the simulation is telling you is “If you have 100 items, they will be done by date X and there is Y% certainty that you’ll hit that target”.
You’ll probably split your stories, some of them will drop off, more will be added, you’ll discover defects and additional work will inevitably come in between. You can take any 100 items you want, the result of the simulation will still be valid.
Story splitting is about determining whether something is more complex than we initially assumed. If you split your initial story into 4 other stories, that doesn’t necessarily mean that you’ll work on all 4 new stories.
What Monte Carlo is telling you is that you have 100 free slots to deliver on your commitment. Now, it’s up to you to decide, in a continuous manner, how to best fill these slots to meet your customer’s expectations.
Does the Method Consider Your Current Work in Progress?
Yes, absolutely. Monte Carlo doesn’t explicitly specify whether or not your 100 stories have been started yet. These 100 stories may include the work that’s currently in progress as well.
Remember, the Monte Carlo method doesn’t tell you which items will be delivered. It is up to you to prioritize the work effectively, ideally based on cost of delay.
Often, I see teams who work towards a release date use Monte Carlo to figure out how many work items in progress will make it by the deadline. They use this analysis to prioritize the work that brings the highest value and they deprioritize tasks that won’t be delivered in the current release anyway.
The number of items you add to the simulation accounts for both items already in progress and items that haven’t been started yet.
What Data Is Needed for Monte Carlo Simulation to Produce Reliable Outcomes?
The fact that probabilistic forecasts are based on your past performance doesn’t mean that you need a ton of data in order to come up with reliable delivery predictions. Whether you have been collecting data since the day your board was created or you are just getting started with new teams is beside the point.
If your delivery system is optimized for predictability, then you won’t actually need any more than 20 or 30 completed items to come up with accurate results. It’s not about quantity – it’s all about taking control of your management practices and ensuring you deliver results in a consistent manner.
More importantly, you need to use data from the past that reflects your current conditions. If you’ve recently changed your system design, introduced new process policies, or if there are new people joining or leaving the team, then you have to observe how these changes affect your performance.
And if the impact is significant, then it would be better to only work with the data collected after the changes have been implemented.
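As a rough illustration of what “only the data collected after the changes” means in practice, the Python sketch below builds a daily throughput series from completion dates and drops everything before a cutoff. The function name `daily_throughput_since` and the sample dates are hypothetical. It also assumes that days with no completed items should count as zeros, since the simulation can sample those days too.

```python
from collections import Counter
from datetime import date, timedelta

def daily_throughput_since(completion_dates, cutoff, last_day):
    """Items completed per day from cutoff to last_day inclusive, zero days included."""
    counts = Counter(d for d in completion_dates if cutoff <= d <= last_day)
    num_days = (last_day - cutoff).days + 1
    return [counts[cutoff + timedelta(days=i)] for i in range(num_days)]

# Hypothetical completion dates; the process changed on May 1st.
completions = [date(2024, 4, 20), date(2024, 5, 1), date(2024, 5, 3), date(2024, 5, 3)]
history = daily_throughput_since(completions, date(2024, 5, 1), date(2024, 5, 4))
print(history)
```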
How Does the Scaling Factor in Monte Carlo Work?
Now, what if you don’t have data that reflects your future conditions? Let’s say that you need to forecast the delivery date of a project that takes place in December when everyone is taking time off for the holidays, and you don’t have data that accounts for that situation.
If that’s the case, the best thing that you can do is scale down your daily throughput accordingly, so that the simulation accounts for the changes in your performance.
This is where the scale factor in Monte Carlo comes into play.
The scale factor is used for high uncertainty scenarios where you expect drastic changes in the performance of your team, but you don’t have data to account for that.
A scale factor of 0.5 means that you anticipate your daily throughput to be half the typical rate; 2.0 means twice the typical rate.
In the example above, we expect the throughput to decrease by 20%, so we set the scale factor to 0.8. The simulation now tells us that if we have a scope of 20 tasks and we initiate our project on December 1st, there is an 85% chance of delivering by February 7th.
Remember, only use the scale factor if you don’t have the data to work out your scenario. If you do and it represents your current setup, by all means, use that data instead.
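How a given tool applies the scale factor internally is not something this article specifies. One plausible approach, shown here purely as an assumption, is to multiply each sampled day’s throughput by the factor before subtracting it from the remaining scope. The function name `simulate_days_scaled` and the history values are hypothetical.

```python
import random

def simulate_days_scaled(daily_throughput, backlog_size, scale=1.0, trials=10_000, seed=None):
    """Like a plain 'when' simulation, but every sampled day is multiplied by scale."""
    if scale <= 0 or not any(t > 0 for t in daily_throughput):
        raise ValueError("need a positive scale and at least one productive day")
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        remaining, days = float(backlog_size), 0
        # The small tolerance guards against floating-point leftovers.
        while remaining > 1e-9:
            remaining -= rng.choice(daily_throughput) * scale  # assumed scaling rule
            days += 1
        results.append(days)
    return sorted(results)

# Hypothetical history: items finished on each of the last 10 days.
history = [0, 3, 1, 0, 2, 4, 0, 1, 2, 3]
normal = simulate_days_scaled(history, backlog_size=20, scale=1.0, trials=2_000, seed=1)
holiday = simulate_days_scaled(history, backlog_size=20, scale=0.8, trials=2_000, seed=1)
print("median days, normal:", normal[len(normal) // 2], "| scaled to 0.8:", holiday[len(holiday) // 2])
```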
Should We Filter Out Certain Data When Running the Simulation?
Here is a $700/hour consulting answer. It depends.
Let me explain. The main goal of the Monte Carlo simulation is to answer the question: “When will this be done?”. This is a customer’s question. In order to provide a reliable answer, you need to think about your work from a customer’s perspective.
How do you define the work that you’ll deliver to your client? If you only have tasks in your system, and that’s how you define the concept of customer value, then you shouldn’t filter out any data.
However, if your scope is defined in terms of stories, and your customer expects a commitment on a story level, you should filter your data by stories to generate the forecast and leave out all other work item types.
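In code terms, filtering simply means counting only the work item type your customer cares about before the throughput data reaches the simulation. A minimal Python sketch, assuming each item is a record with hypothetical `type` and `done` fields:

```python
from collections import Counter
from datetime import date

def throughput_by_day(items, item_type=None):
    """Count completed items per day, optionally keeping only one work item type."""
    return Counter(
        item["done"]
        for item in items
        if item["done"] is not None and (item_type is None or item["type"] == item_type)
    )

# Hypothetical board export; an unfinished item has done=None.
items = [
    {"type": "story", "done": date(2024, 5, 2)},
    {"type": "task", "done": date(2024, 5, 2)},
    {"type": "story", "done": date(2024, 5, 3)},
    {"type": "bug", "done": None},
]
print("all types:", dict(throughput_by_day(items)))
print("stories only:", dict(throughput_by_day(items, "story")))
```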
What Are the Assumptions Required to Be Made in Monte Carlo Simulations?
The size, nature or complexity of your work doesn’t affect the accuracy of your forecast. The amount of data you have collected is not a factor determining the dependability of the Monte Carlo method.
The only prerequisite to producing reliable delivery forecasts is to optimize your workflow for predictability.
The sole requirement for Monte Carlo (and any other approach to forecasting) to work and give you reliable answers is to use the data produced by a predictable delivery system.
It’s all about taking control of your management practices to ensure you deliver customer value in a consistent manner.
If your delivery system doesn’t produce the results you are hoping for and you’d like to explore the proven roadmap to optimize your workflows for predictability, I’d be thrilled to welcome you to our Sustainable Predictability program!
So, these were the questions that I’ve been asked most often about Monte Carlo simulations. I hope the answers will help you leverage this fantastic approach to making reliable delivery commitments.
And if you have any others, don’t hesitate to post them below. I’ll address each and every one of them. I wish you a productive day ahead!
Meet the Author
Sonya Siderova is a passionate product manager and a driving force behind Nave, a Kanban analytics suite that helps teams improve their delivery speed through data-driven decision making. When she's not catering to her two little ones, you might find Sonya absorbed in a good heavyweight boxing match or behind a screen crafting a new blog post.