Histogram
What is it?
A histogram is a bar graph of raw data that creates a picture of the data distribution. The bars represent the frequency of occurrence by classes of data. A histogram shows basic information about the data set, such as central location, width of spread, and shape. Use histograms to assess the system’s current situation and to study results of improvement actions. The histogram’s shape and statistical information help you decide how to improve the system. If the system is stable, you can make predictions about the future performance of the system. After improvement action has been carried out, continue collecting data and making histograms to see if the theory has worked.
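The idea of counting the frequency of occurrence by classes can be sketched in a few lines of Python. This is a minimal illustration only: the function name `histogram_counts`, the choice of equal-width classes, and the sample measurements are assumptions made for the example, not the method used by any particular charting package.

```python
def histogram_counts(values, num_classes):
    """Count how many values fall into each of num_classes equal-width classes.

    Returns (edges, counts): edges has num_classes + 1 boundaries,
    counts has one frequency per class.
    """
    low, high = min(values), max(values)
    width = (high - low) / num_classes
    counts = [0] * num_classes
    for v in values:
        if width == 0:
            # All values are identical, so everything lands in the first class.
            index = 0
        else:
            # The maximum value is placed in the last class, not a new one.
            index = min(int((v - low) / width), num_classes - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(num_classes + 1)]
    return edges, counts


# Made-up measurements, used only to show the shape of the result.
measurements = [1, 2, 2, 3, 3, 3, 4, 4, 5]
edges, counts = histogram_counts(measurements, 4)

# Print a simple text histogram: one row per class, one '#' per occurrence.
for i, count in enumerate(counts):
    print(f"{edges[i]:.1f} to {edges[i + 1]:.1f}: {'#' * count}")
```

Each row of the printout corresponds to one bar; the number of classes is a choice, and a different choice can change how the shape appears.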
Descriptive statistics, such as chi-square, kurtosis, and skewness can help you interpret the histogram and can show you if the data distribution is normal.
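As a rough illustration of two of these statistics, the sketch below computes skewness and excess kurtosis from the central moments of the data. The function names and sample data are hypothetical, and the moment-based (population) formulas are an assumption for the example; statistical packages often apply sample-size corrections and may report slightly different values. Chi-square is left out because it requires choosing classes and expected frequencies.

```python
def central_moment(values, order):
    """Return the central moment of the given order (population form)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Moment-based skewness: 0 for symmetric data, positive for a long right tail."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    m2 = central_moment(values, 2)
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return central_moment(values, 4) / m2 ** 2 - 3


# Made-up data sets: one symmetric, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_tailed))
```

Values near zero for both statistics are consistent with a normal shape, but they do not prove it on their own.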
Histograms and descriptive statistics can be created easily with software like CHARTrunner and SQCpack.
What does it look like?
The vertical axis of the histogram below shows the frequency of occurrence. The horizontal axis shows the cell values. This chart also shows descriptive statistics used to describe the distribution, including skewness, kurtosis, and chi-square. Specification limits have also been drawn on the chart.
When is it used?
Histograms illustrate the process distribution and are used to make predictions about a stable process. If the system is unstable, the histogram will have little predictive value.
Use histograms when you can answer yes to these questions:
Getting the most
Histograms provide three very important pieces of information about distributions of data values: shape, central location (the middle), and spread (how different the values are from each other and from the middle). Getting the most from this tool means being able to apply these statistical concepts.
Histograms show how data can pile up; in any distribution of values, some values will occur more frequently than others. The peaks on the histogram show where there is similarity among the data. This is the central location, which is measured by mean, median, and mode. While these statistics provide valuable information about the process, central location alone does not provide a complete picture of the process. When you consider the spread of the data, you will see its extremes. The shape of the histogram can show if the system leans toward one extreme or the other, or if there are multiple peaks.
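The measures of central location and spread mentioned above can be computed with Python's standard `statistics` module. This is a small sketch on made-up data; the helper name `describe` is hypothetical, and the choice of range and sample standard deviation as the spread measures is an assumption for illustration.

```python
from statistics import mean, median, multimode, stdev


def describe(values):
    """Summarize central location and spread for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        # multimode returns every most-frequent value, so more than one
        # entry hints at multiple peaks in the histogram.
        "modes": multimode(values),
        "range": max(values) - min(values),
        "stdev": stdev(values),  # sample standard deviation
    }


# Made-up data with a single peak at 4.
single_peak = [2, 3, 3, 4, 4, 4, 5, 5, 6]
# Made-up data with two equally common values.
two_peaks = [1, 2, 2, 2, 5, 8, 8, 8, 9]

summary = describe(single_peak)
for name, value in summary.items():
    print(f"{name}: {value}")

print("modes of two-peaked data:", describe(two_peaks)["modes"])
```

When the mean, median, and mode agree, as they do for the first data set, the distribution is roughly symmetric around a single peak; when they diverge, the shape deserves a closer look.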
When you use a histogram for prediction, the system must be stable. If not, the central location, spread, and shape may vary dramatically in histograms created from data taken at different times and will not be an accurate reflection of the process. If you are not using histograms to make predictions, stability is not required.
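One informal way to see this effect is to summarize data from several time periods separately and compare the results. The sketch below is only a crude comparison on made-up data, not a formal stability test; the function names and the three periods are hypothetical.

```python
from statistics import mean, stdev


def summarize_periods(periods):
    """Return a (mean, sample standard deviation) pair for each period's data."""
    return [(mean(p), stdev(p)) for p in periods]


def location_shift(periods):
    """Return the gap between the largest and smallest period means."""
    means = [m for m, _ in summarize_periods(periods)]
    return max(means) - min(means)


# Made-up measurements from three time periods; the third has drifted upward.
periods = [
    [10, 11, 9, 10],
    [10, 9, 11, 10],
    [15, 16, 14, 15],
]

for number, (m, s) in enumerate(summarize_periods(periods), start=1):
    print(f"period {number}: mean={m:.2f}, stdev={s:.2f}")

print("shift in central location:", location_shift(periods))
```

Here the spread is the same in every period, but the central location of the third period has moved, so a histogram built from the first two periods would give a misleading prediction for the third.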