A histogram is a graphical way of presenting a frequency distribution of quantitative data organised into a number of equally spaced intervals or bins (e.g. 1-10, 11-20…).
The interval width is chosen to condense the data while still retaining enough detail to show the shape of the distribution.
The intervals are displayed on one axis (often the x-axis), and the frequency of observations falling within each interval is plotted against them (often on the y-axis). Histograms can also be constructed by plotting relative frequencies, percentages, or proportions on the y-axis. Unlike the column chart, a histogram is only appropriate for variables whose values are numerical and measured on an interval scale. In appearance, a histogram is essentially a column chart with no space between the columns (unlike the spaced columns that Excel produces by default).
Histograms generally make analysis of large data sets (>100 observations) easier than stem-and-leaf plots do, because of the preparation a stem-and-leaf plot requires for a large dataset. They can begin to show the central tendency and dispersion of a data set and can also help detect any unusual observations (outliers) or gaps. However, the histogram's utility is certainly not limited to large datasets.
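The construction described above (equal-width intervals on one axis, frequencies or relative frequencies on the other) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `histogram_counts` and `relative_frequencies`, the choice to include the upper edge in the last interval, and the height values are all assumptions made for the example, not data from the figure below.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall in each of n_bins equal-width intervals.

    Interval i covers [start + i*width, start + (i+1)*width); the last
    interval also includes its upper edge. Values outside the overall
    range are ignored.
    """
    counts = [0] * n_bins
    end = start + width * n_bins
    for v in values:
        if v < start or v > end:
            continue  # outside the plotted range
        i = min(int((v - start) // width), n_bins - 1)
        counts[i] += 1
    return counts


def relative_frequencies(counts):
    """Convert raw counts into proportions of the total."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Hypothetical heights in cm (made up for illustration).
heights = [150, 152, 155, 158, 160, 161, 163, 165, 165,
           168, 170, 172, 175, 178, 181]

counts = histogram_counts(heights, start=150, width=10, n_bins=4)
rel = relative_frequencies(counts)

# Print a simple text histogram: one '#' per observation.
for i, (c, r) in enumerate(zip(counts, rel)):
    low = 150 + 10 * i
    print(f"{low}-{low + 10}: {'#' * c} ({c} observations, {r:.0%})")
```

Changing `width` and `n_bins` and rerunning is a quick way to see how the interval choice alters the apparent shape of the distribution.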
Many Eyes example
In this Many Eyes example, the simple histogram is enhanced by the addition of blocks that show the breakdown of a third variable. This can be insightful in many ways, but be sure the extra dimension of data is needed to deliver the message.
Frequency Distribution of Height of 25 Students
Source: Centre of Rural Studies, UVM, 2004
Advice for choosing this method
Use a histogram only for continuous data grouped into intervals.
Only add additional elements if they assist in delivering the message of the visualisation.
Advice for using this method
How to group the data can be a challenge. Sometimes obvious groupings like 1-5, 6-10 and so on can give a misleading picture of the data. In those cases, you may want to look for natural breaks in the raw data. For example, if there is a large gap in your raw data between 6 and 10, you might choose to divide at that gap. Using natural breaks can also be tricky, however, because the groups become irregular (such as 1-6, 10-14) rather than continuous (such as 1-6, 7-13). So be sure to clearly label the beginning and end of each group.
Versions of Excel before Excel 2016 do not have a histogram chart option. See resources below on how to prepare a histogram in Excel.
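As a rough illustration of the natural-breaks advice above, the following Python sketch splits sorted data wherever two neighbouring values are far apart, then labels each group with its first and last value. The `min_gap` threshold, the function names, and the data are assumptions made for this example; this is a simple gap-based split, not a formal natural-breaks classification method.

```python
def split_at_gaps(values, min_gap):
    """Group sorted values, starting a new group wherever two
    consecutive values differ by at least min_gap."""
    ordered = sorted(values)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev >= min_gap:
            groups.append([cur])    # large gap: begin a new group
        else:
            groups[-1].append(cur)  # small gap: stay in current group
    return groups


def label_groups(groups):
    """Label each group with its lowest and highest value, plus its size."""
    return [(f"{g[0]}-{g[-1]}", len(g)) for g in groups]


# Hypothetical raw data with no observations between 6 and 10.
scores = [1, 2, 2, 3, 5, 6, 10, 11, 11, 12, 14]

labelled = label_groups(split_at_gaps(scores, min_gap=4))
for label, count in labelled:
    print(f"{label}: {count} observations")
```

Because the resulting groups are irregular, the explicit start and end labels matter: a reader cannot infer the group boundaries from a regular spacing.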
School of Psychology, University of New England (2000). Histograms and Bar charts. In: Chapter 4 Analysing the data (Research Options and Statistics course). http://www.une.edu.au/WebStat/unit_materials/c4_descriptive_statistics/histograms_barcharts.html (archived link)
Easton, V. J., & McColl, J. H. (1997). Statistics Glossary. http://www.stats.gla.ac.uk/steps/glossary/presenting_data.html#hist (archived link)