Frequency Distribution: Histogram Diagram
Histograms are a great way to visualise data and track key performance indicators because they are clear and simple to read. They are a popular method of presenting large amounts of data in a simple, straightforward manner. What is a histogram and how does it help you analyse your data?
What is a histogram diagram?
The histogram is a graph that is often used in mathematics and statistics. Histograms show how frequently values or value ranges appear in a set of data. The horizontal axis typically displays the measured value: either a continuous numerical variable, such as height, distance or time, or a discrete, countable value, such as a number of items. The vertical axis shows how frequently this value or value range appears.
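Histogram divisions can either be discrete number blocks (1, 2, 3) or, in the case of a range, class intervals or bins (0-10, 10-20, 20-30). When bins share a boundary like this, a boundary value such as 10 must be assigned to only one bin so that nothing is counted twice. The most important thing to remember is that there should be no gaps between the numbers or number ranges: every section of the value range is displayed along the horizontal axis, even if no values fall into it.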
For a continuous measured variable, class intervals can be a judgement call or worked out through trial and error. They should be chosen so that the shape of the graph resembles a smooth distribution curve: bins that are too wide hide the shape, while bins that are too narrow produce a jagged outline.
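As a rough illustration of class intervals, the sketch below groups a continuous variable into gap-free bins of equal width using only the Python standard library. The `histogram_counts` helper, the `heights_cm` values and the bin width of 10 are all assumptions made up for this example, and each bin is treated as half-open, so a value of 160 lands in 160-170 rather than 150-160.

```python
from collections import Counter

def histogram_counts(values, bin_width, start=0):
    """Count values per half-open bin [lower, lower + bin_width)."""
    counts = Counter()
    for v in values:
        # Floor division finds which bin the value belongs to.
        index = int((v - start) // bin_width)
        lower = start + index * bin_width
        counts[lower] += 1
    return counts

# Hypothetical sample: heights in centimetres.
heights_cm = [152, 158, 161, 163, 165, 167, 168, 170, 171, 174, 179, 186]
counts = histogram_counts(heights_cm, bin_width=10, start=150)

# Walk every bin in order so empty bins would still appear (no gaps).
for lower in range(150, 190, 10):
    print(f"{lower}-{lower + 10}: {'#' * counts[lower]}")
# 150-160: ##
# 160-170: #####
# 170-180: ####
# 180-190: #
```

Trying a different `bin_width` on the same data is a quick way to see how the choice of interval changes the apparent shape.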
Bar charts and histograms: categorical and quantitative
Histogram diagrams have characteristics in common with traditional bar charts – they both measure frequency and use a similar layout. However, there is a key difference:
- Bar charts measure categorical data: data that can be split into different categories or types
- Histograms measure quantitative data: numerical values that can be measured or counted
Bar charts are certainly a useful tool to visualise the size of each category, but histograms are a better way to display frequency distribution over a range. Histograms also make it easier to analyse the data set and see where its mean, median and mode fall.
How to read histogram diagrams
Averages: mean, median and mode
Averages can be calculated in three ways. The different methods can give the same or different values depending on the data set involved. Consider this simple dataset:
1, 2, 2, 3, 3, 4, 5, 5, 5, 8
- The mean is the sum of all of the values in the data set divided by the total number of values. For this dataset, the mean is 3.8. When an average is referred to without specifying if it is a mean, median or mode value, it is almost always the mean.
- The median refers to the middle value of the dataset once it is sorted. If there is an even number of values, the midpoint between the two middle values is taken. For this dataset, the two middle values are 3 and 4, so the median value is 3.5.
- The mode is simply the value that appears most often. For this dataset, the mode is 5.
Three methods of calculation, three different averages. The goal of an average is to work out the central tendency of your data – the value your data clusters around. Looking at the shape of your frequency distribution shows you which average best reflects this central tendency.
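The three averages for the example dataset can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module; the variable names are chosen only for illustration.

```python
import statistics

data = [1, 2, 2, 3, 3, 4, 5, 5, 5, 8]

mean = statistics.mean(data)      # sum of values (38) divided by the count (10)
median = statistics.median(data)  # midpoint of the two middle values, 3 and 4
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)  # 3.8 3.5 5
```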
Frequency Distribution Shapes
The most common frequency distribution type is the normal distribution (also known as a Gaussian distribution or bell curve). This symmetrical shape shows values clustering around the central peak with fewer instances further away. In a normal distribution, the mode, median and mean are the same value.
Datasets can also be skewed to the left (negative) or right (positive). Instead of clustering symmetrically around a central value, much higher or lower values skew the shape of the graph. In these cases, the mode, median and mean are different. For skewed data, the best reflection of the central tendency is usually the median, because a few extreme values pull the mean towards the tail while the median stays close to where most of the data sits.
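One rough way to spot skew without drawing the graph is to compare the mean with the median. The sketch below is only a rule of thumb, not a formal skewness test: the `skew_direction` helper, its `tolerance` parameter and the `cycle_times` sample are hypothetical, and the comparison can mislead on small or unusual datasets.

```python
import statistics

def skew_direction(values, tolerance=0.05):
    """Guess skew direction by comparing mean and median (heuristic only)."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    spread = statistics.pstdev(values)
    # Treat small differences, relative to the spread, as symmetric.
    if spread == 0 or abs(mean - median) <= tolerance * spread:
        return "roughly symmetric"
    return "right (positive)" if mean > median else "left (negative)"

# Hypothetical cycle times in days, with one very slow item.
cycle_times = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]

print(statistics.mean(cycle_times))    # 5.7
print(statistics.median(cycle_times))  # 3.0
print(skew_direction(cycle_times))     # right (positive)
```

Here a single outlier nearly doubles the mean while the median stays at 3, which is why the median is the safer summary for this sample.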
Some histograms will show two peaks. This is known as a bimodal distribution. It often indicates that there are two overlapping groups in your dataset. We recommend trying to separate the groups to get a clearer picture of the data.
One of the most important metrics in the Kanban method is the productivity of your team. It is measured by the number of work items delivered over a time period (day, week, month). This metric is known as throughput. An effective way to visualise how throughput varies over time is with a Throughput Histogram, which shows how often each throughput value occurred. Tracking your team's productivity over time will enable you to measure and improve your capacity to deliver.
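To make the idea concrete, the sketch below builds the data behind a daily throughput histogram from a list of completion dates. It is an illustration under stated assumptions, not how any particular analytics tool computes it: the `throughput_histogram` function and the `completed_on` dates are hypothetical, and every calendar day between the first and last delivery is counted, including days with nothing delivered.

```python
from collections import Counter
from datetime import date, timedelta

def throughput_histogram(completion_dates):
    """Map each daily throughput value to the number of days it occurred."""
    per_day = Counter(completion_dates)
    first, last = min(completion_dates), max(completion_dates)
    total_days = (last - first).days + 1
    # Include days with zero deliveries so the histogram has no gaps.
    daily = [per_day[first + timedelta(days=i)] for i in range(total_days)]
    return Counter(daily)

# Hypothetical completion dates for seven work items.
completed_on = [
    date(2024, 1, 1), date(2024, 1, 1),
    date(2024, 1, 2),
    date(2024, 1, 4), date(2024, 1, 4), date(2024, 1, 4),
    date(2024, 1, 5),
]

hist = throughput_histogram(completed_on)
for items in range(max(hist) + 1):
    print(f"{items} items/day: {'#' * hist[items]}")
# 0 items/day: #
# 1 items/day: ##
# 2 items/day: #
# 3 items/day: #
```

The horizontal axis is the number of items delivered in a day and the bar length is how many days had that throughput; the same approach works for weeks or months by grouping the dates differently.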
Do you use histograms to monitor KPIs? What patterns do you notice in your data? Tell us about your experience in the comments!
Meet the Author
Sonya Siderova is a passionate product manager and a driving force behind Nave, a Kanban analytics suite that helps teams improve their delivery speed through data-driven decision making. When she's not catering to her two little ones, you might find Sonya absorbed in a good heavyweight boxing match or behind a screen crafting a new blog post.