Vol. 5, No. 1
from Professor Cleary
Walker Runn is once again
applying the wrong tool to the data. This set of data should be analyzed
with ANOVA (Analysis of Variance), rather than t-test. This will compare
three or more processes to determine whether they are essentially the same
or not. Consider this data:
This data suggests that the means of the three production lines are different from one another. The question, however, is similar to last month’s: Are they different enough from one another to indicate that the processes themselves are different?
Runn’s method of comparing two lines at a time is flawed. ANOVA examines the three production lines as a group, and assumes that they are all coming from the same process. It estimates the variance of the data set in two ways, then compares these variances to each other.
The first estimate is called the “variance within.” It is essentially equal to the average variance of the three samples.
The second is called the “variance between”, and it is related to the central limit theorem.
Where nj = size of the jth sample
Now we are ready to test the hypothesis that the three production lines are essentially the same process:
(Interpretation: Lines 1, 2, and 3 are the same; this represents the null hypothesis.)
(Interpretation: One wants a 5% probability of rejecting the null if it is actually true. This is known as a Type I error.) At least one line is different from the others; this represents the alternate hypothesis.
Calculate the statistical F-value. This is done by taking a ratio of the variance between to the variance within, divided by the appropriate degrees of freedom.
Compare the above-calculated F value to the tabular F value in the back of a statistics textbook. In this case, the calculated F value (4.114) is greater than the tabular F value (3.68), so the null hypothesis is rejected. Thus, one can conclude that at least one of the three production lines is different from the others.
If you’re finding cobwebs in your head as you try to remember Statistics 201, there’s always DOEpack, which will do these calculations for you.
What do you think that “p = .038” in the printout means? (If you think you know the correct answer, e-mail me firstname.lastname@example.org. Your name will be included in a drawing of all the correct answers. If your name is drawn, you will win a copy of DOEpack.)
Copyright 2003 PQ Systems.
Please direct questions or problems regarding this web site to the Webmaster.