Types of Frequency Distribution
Project forecasting is not a straightforward process: it’s a scientific and statistical exercise that deals with numerous interconnected variables. The values of these variables can have a significant impact on your final prediction. Knowing the frequency distribution of these values enables you to make data-driven decisions.
The type of frequency distribution is especially important when making predictions with a Monte Carlo simulation. In this article, we’ll explain the frequency distribution shapes that you will encounter most frequently.
The normal distribution, also known as a Gaussian distribution or “bell curve”, is the most common frequency distribution. This distribution is symmetrical, with most values falling towards the centre and tails that taper off evenly to the left and right. It is a continuous distribution, with no gaps between values.
Normal distributions are found everywhere, in both natural and man-made phenomena. Examples include the time taken to complete a task, IQ test results, or the heights of a group of people. In project management, when you are estimating and have no further information about the type of frequency distribution, it is usually best to assume a normal distribution.
When the curve leans to the left or right instead of being symmetrical, it is known as a skewed distribution. The location of the long tail, not the peak, is what gives this frequency distribution shape its name. A long tail on the right is referred to as right-skewed or positively skewed, while a long tail on the left is referred to as left-skewed or negatively skewed.
Positively skewed distributions are common in situations where there is a fixed lower boundary. Take the delivery time of a component: most deliveries might happen within 3 days and no delivery can take less than 0 days, but the long tail can stretch far to the right when some deliveries are late.
Negatively skewed distributions are less common in general, but still appear when fixed or near-fixed upper boundaries are in play. For example, a company that guarantees all orders will be delivered within 1 week will most likely see some faster deliveries, but with most values clustering close to the 1-week point.
One important fact about skewed distributions is that, unlike in a bell curve, the mode, median and mean are not the same value. The long tail pulls the mean, and to a lesser extent the median, in the direction of the tail, while the mode stays at the peak. A histogram of your data makes it easy to see how far apart the three averages are. If you rely on average values to make quick predictions, pay attention to which average you use!
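To make the difference between the three averages concrete, here is a minimal sketch using Python’s standard statistics module. The delivery times are made-up illustration data, not measurements from a real project.

```python
import statistics

# Hypothetical delivery times in days: most are quick, a few are very late,
# which gives the sample a long tail on the right.
delivery_days = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 9, 14, 21]

mode = statistics.mode(delivery_days)      # most frequent value (the peak)
median = statistics.median(delivery_days)  # middle value of the sorted data
mean = statistics.mean(delivery_days)      # arithmetic average

print(f"mode={mode}, median={median}, mean={mean:.2f}")
```

For this sample the mode is 3 days, the median is 3.5 days and the mean is roughly 5.7 days, so a forecast based on the mean would be noticeably more pessimistic than one based on the mode.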
All of the frequency distribution types that we’ve looked at so far have been unimodal: values cluster around a single peak. A bimodal distribution occurs when the group being measured mixes two unimodal distributions, producing two peaks. When more than two peaks occur, it’s known as a multimodal distribution.
This distribution shape happens frequently when the measured data can be split into two or more groups. One example would be the throughput of all of your team’s tasks. If your team are using Classes of Service to tackle emergency tasks faster than regular tasks, you will most probably see a bimodal distribution.
If you spot a bimodal frequency distribution, it’s worth checking if you can split the measured data into sub-groups to see the shape for each group.
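Splitting the data into sub-groups can be sketched in a few lines of Python. The class names ("expedite" and "standard") and the cycle times below are hypothetical examples, and split_by_group is an illustrative helper rather than part of any tool.

```python
from collections import defaultdict
import statistics

# Hypothetical (class of service, cycle time in days) records.
tasks = [
    ("expedite", 1), ("expedite", 2), ("expedite", 2), ("expedite", 3),
    ("standard", 8), ("standard", 9), ("standard", 10),
    ("standard", 10), ("standard", 12),
]

def split_by_group(records):
    """Collect the measured values of each group into its own list."""
    groups = defaultdict(list)
    for group, value in records:
        groups[group].append(value)
    return dict(groups)

groups = split_by_group(tasks)
medians = {name: statistics.median(values) for name, values in groups.items()}
overall_median = statistics.median(value for _, value in tasks)

print(medians, overall_median)
```

In this made-up sample the overall median (8 days) describes neither group well, while the per-group medians (2 and 10 days) each sit at one of the two peaks.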
In a uniform or rectangular distribution, every value between a minimum and a maximum has the same chance of occurring. The probability of rolling a certain number on a die or picking a certain card from the pack is described by this frequency distribution shape.
This frequency distribution appears at the start of every project. A uniform distribution assumes that all samples from its population are equally probable, just as every number on a die has an equal chance of coming up on each throw. Let’s say you have nineteen samples from a uniformly distributed population. The probability that the next sample falls between the minimum and the maximum of the previous nineteen is 18 in 20, or 90%. That means that you have a fairly good understanding of the range of your uniform distribution after having collected only twenty data points.
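The nineteen-sample claim can be checked with a small Monte Carlo experiment. The sketch below is only an illustration: the function name, the trial count and the seed are arbitrary choices, and the population is assumed to be uniform between 0 and 1.

```python
import random

def next_sample_inside_range(n_previous, trials, rng):
    """Estimate how often a new uniform sample lands between the
    minimum and maximum of n_previous earlier samples."""
    hits = 0
    for _ in range(trials):
        previous = [rng.random() for _ in range(n_previous)]
        new_sample = rng.random()
        if min(previous) <= new_sample <= max(previous):
            hits += 1
    return hits / trials

rng = random.Random(42)  # fixed seed so the run is repeatable
estimate = next_sample_inside_range(19, 50_000, rng)
exact = (19 - 1) / (19 + 1)  # (n - 1) / (n + 1) for n previous samples

print(f"simulated: {estimate:.3f}, exact: {exact:.3f}")
```

The simulated share should land close to the exact value of 0.9. The same (n - 1) / (n + 1) reasoning applies to samples from any continuous distribution, because it depends only on where the new sample ranks among the earlier ones.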
Some data sets have nearly all their frequency values clustered to one side of the graph. This frequency distribution shape is known as logarithmic. A common example of this in real life is found in distributions of wealth and income, with large numbers of people at the bottom but extreme outliers extending the tail to the right.
This distribution type is often known as a Pareto distribution, named after the Italian economist and sociologist Vilfredo Pareto. You’ve almost certainly heard of his 80-20 rule. For example, 80% of the wealth of a society is held by 20% of its members, 80% of revenue comes from 20% of clients and 80% of productivity comes from 20% of your team.
While the percentages are not always 80-20, this pattern appears mainly in financial estimation models.
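To see how close your own data is to an 80-20 split, you can measure the share of the total that comes from the top 20% of items. The helper top_share and the revenue figures below are hypothetical, chosen so that the arithmetic is easy to follow.

```python
def top_share(values, fraction=0.2):
    """Return the share of the total contributed by the largest
    `fraction` of the values."""
    ordered = sorted(values, reverse=True)
    top_count = max(1, round(len(ordered) * fraction))
    return sum(ordered[:top_count]) / sum(ordered)

# Hypothetical revenue per client, in any currency unit.
revenue_per_client = [500, 300, 50, 40, 30, 25, 20, 15, 12, 8]

share = top_share(revenue_per_client)
print(f"Top 20% of clients bring in {share:.0%} of revenue")
```

Here the two largest clients out of ten account for 800 of the 1000 units of revenue, which is exactly 80%.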
The PERT and triangular frequency distribution types are both modelled from the same 3 values: a minimum (a), a mode or most likely value (m) and a maximum (b). These distribution types are especially useful when only a small amount of past performance data is available.
While the triangular distribution is a simple shape made using straight lines between each of the 3 values, the PERT distribution is a smooth curve that assumes the values in the long tail appear less frequently. The frequency distribution shape generated from these three values is then used to estimate likely completion times.
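The difference between the two shapes can be sketched by sampling from both with the same a, m and b. This example assumes the commonly used beta-PERT formulation with a shape weight of 4; the function names, the sample size and the values 2, 5 and 14 days are illustrative assumptions.

```python
import random
import statistics

def sample_triangular(a, m, b, rng):
    """Draw one value from a triangular distribution with minimum a,
    mode m and maximum b."""
    # random.triangular takes its arguments as (low, high, mode).
    return rng.triangular(a, b, m)

def sample_pert(a, m, b, rng, weight=4.0):
    """Draw one value from a beta-PERT distribution (assumed formulation:
    a beta distribution rescaled from [0, 1] to [a, b])."""
    alpha = 1 + weight * (m - a) / (b - a)
    beta = 1 + weight * (b - m) / (b - a)
    return a + (b - a) * rng.betavariate(alpha, beta)

a, m, b = 2, 5, 14  # hypothetical minimum, mode and maximum in days
rng = random.Random(7)
n = 50_000

tri_samples = [sample_triangular(a, m, b, rng) for _ in range(n)]
pert_samples = [sample_pert(a, m, b, rng) for _ in range(n)]

tri_mean = statistics.mean(tri_samples)
pert_mean = statistics.mean(pert_samples)

print(f"triangular mean: {tri_mean:.2f}  (theory: {(a + m + b) / 3:.2f})")
print(f"PERT mean:       {pert_mean:.2f}  (theory: {(a + 4 * m + b) / 6:.2f})")
```

With these inputs the triangular mean is (a + m + b) / 3 = 7 days, while the PERT mean is (a + 4m + b) / 6 = 6 days. The PERT estimate sits closer to the mode because the tail values carry less weight.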
Understanding the frequency distribution of your data is important for both the inputs and the outputs of your forecasts. Realistic outputs are simply impossible without accurate inputs. Calculations that rely on subjective estimates are risky, so we recommend always drawing on your past performance data.
What is the frequency distribution of your data? Have you used histogram diagrams to analyse it? Do you use histograms to make your estimations? Tell us about your experience in the comments!
Meet the Author
Sonya Siderova is a passionate product manager and a driving force behind Nave, a Kanban analytics suite that helps teams improve their delivery speed through data-driven decision making. When she's not catering to her two little ones, you might find Sonya absorbed in a good heavyweight boxing match or behind a screen crafting a new blog post.