Conclusions based on this frequency distribution will be flawed because Lewis N. Clark has not determined appropriate intervals for his data. He has in fact used two different interval widths: 5 years for most categories (20 to less than 25, for example) and 10 years for others (e.g., 75 to less than 85). A 10-year group will naturally contain more users than either of the 5-year groups it would otherwise be split into, so its count cannot be compared directly with the counts of the 5-year groups. His conclusion is therefore not correct.
Setting up frequency distributions and histograms so that data can be analyzed accurately involves several steps.
Data for a frequency distribution can be collected for a histogram or can come from the data entry section of a control chart. Once the data is ready, the first step is to determine the number of classes, or subdivisions, in the distribution. This depends on the number of data points in the data set. If there are 50 or fewer, 5-7 classes are appropriate. For larger data sets: 50-100 data points should have 6-10 classes; 100-250, 7-12 classes; more than 250, 10-20 classes.
In the example, there are 500 data points, so we might decide to have 20 classes (this represents only a rough estimate at this point; it can be changed later).
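The rule of thumb above can be sketched as a small lookup. This is only an illustration: the function name suggested_class_range is hypothetical, and because the guideline's brackets share their endpoints (50, 100 and 250), the sketch assumes a boundary value belongs to the lower bracket.

```python
def suggested_class_range(n_points):
    """Return (min_classes, max_classes) suggested for n_points data points.

    Assumption: a count that sits exactly on a bracket boundary
    (50, 100 or 250) is placed in the lower bracket.
    """
    if n_points <= 0:
        raise ValueError("n_points must be positive")
    if n_points <= 50:
        return (5, 7)
    if n_points <= 100:
        return (6, 10)
    if n_points <= 250:
        return (7, 12)
    return (10, 20)


# The example data set has 500 points, so 10 to 20 classes are suggested.
print(suggested_class_range(500))
```

Picking 20 classes for the 500 data points is therefore at the upper end of the suggested range, which is consistent with treating it as a rough starting estimate.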
Class width and boundaries must then be determined. The class width is the span of values covered by each class, and is found by dividing the range of the data set by the number of classes (in this case, 20). The range is found by subtracting the smallest value in the data set (2) from the largest (98).
In this example, the highest value in the data set is 98, and the lowest is 2:
Range = 98 - 2 = 96
The class width for this example is calculated as follows:
Class width = range of data set / number of classes = 96 / 20 = 4.8
This number should be rounded to an easy-to-work-with number, such as 5.
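The calculation above, and the equal-width classes it leads to, might be sketched as follows. The function names and the sample ages are hypothetical and not taken from the original data set. The sketch also assumes that the raw width is rounded up to the next multiple of 5, and that the first class starts at the multiple of the width at or below the lowest value.

```python
import math


def class_width(lowest, highest, n_classes, round_to=5):
    """Divide the range by the number of classes, then round up to a
    multiple of round_to (assumed here to be the 'easy' number)."""
    data_range = highest - lowest          # 98 - 2 = 96 in the example
    raw_width = data_range / n_classes     # 96 / 20 = 4.8 in the example
    return math.ceil(raw_width / round_to) * round_to


def class_boundaries(lowest, highest, width):
    """Build equal-width classes of the form 'lower to less than upper'."""
    lower = math.floor(lowest / width) * width
    bounds = []
    while lower <= highest:
        bounds.append((lower, lower + width))
        lower += width
    return bounds


def frequency_counts(values, bounds):
    """Count how many values fall in each class (lower <= value < upper)."""
    counts = [0] * len(bounds)
    for value in values:
        for i, (lower, upper) in enumerate(bounds):
            if lower <= value < upper:
                counts[i] += 1
                break
    return counts


width = class_width(2, 98, 20)
bounds = class_boundaries(2, 98, width)

# Hypothetical ages, used only to show the counting step.
sample_ages = [2, 21, 24, 25, 77, 84, 98]
counts = frequency_counts(sample_ages, bounds)

for (lower, upper), count in zip(bounds, counts):
    print(f"{lower} to less than {upper}: {count}")
```

Because every class has the same width, the counts can be compared directly from one class to the next, which is exactly what the mixed 5-year and 10-year intervals prevent.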
This information generates the frequency distribution shown in the question above. For another example, see Practical Tools for Healthcare Quality (Sandra and O. Byron Murray).
