January 2003 Vol. 5, No. 1
Quality Quiz from Professor Cleary

Congratulations!
You're right!

Walker Runn is once again applying the wrong tool to the data. This set of data should be analyzed with ANOVA (Analysis of Variance) rather than a t-test. ANOVA compares three or more processes to determine whether they are essentially the same. Consider this data:

 Line 1    Line 2    Line 3
   12        14        10
   14        17         7
   10         9        11
   17        16         9
   11        15         5
    9        11        12
 S1=2.927  S2=3.077  S3=2.608

This data suggests that the means of the three production lines are different from one another. The question, however, is similar to last month’s: Are they different enough from one another to indicate that the processes themselves are different?

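Runn’s method of comparing two lines at a time is flawed: each pairwise t-test carries its own chance of a false alarm, so the chance of wrongly declaring a difference grows with every extra comparison. ANOVA instead examines the three production lines as a group and starts from the assumption that they all come from the same process. It estimates the variance of the data set in two ways, then compares these two estimates to each other.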

The first estimate is called the “variance within.” It is essentially equal to the average variance of the three samples.

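 Variance within = $\frac{S_1^2 + S_2^2 + \cdots + S_m^2}{m} = \frac{8.567 + 9.467 + 6.800}{3} \approx 8.278$

Where $S_j$ = standard deviation of the jth sample (2.927, 3.077, 2.608, in this case)

m = number of samples (3 in this case)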
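As a rough check on the "variance within," here is a minimal Python sketch that averages the three sample variances. It assumes the data from the table are typed in by hand; the variable names are illustrative only and are not part of any PQ Systems product.

```python
from statistics import variance

# Data for the three production lines, copied from the table in the article.
lines = {
    "Line 1": [12, 14, 10, 17, 11, 9],
    "Line 2": [14, 17, 9, 16, 15, 11],
    "Line 3": [10, 7, 11, 9, 5, 12],
}

# Sample variance (n - 1 in the denominator) of each line, i.e. S squared.
sample_variances = {name: variance(values) for name, values in lines.items()}

m = len(lines)  # number of samples
variance_within = sum(sample_variances.values()) / m

for name, s2 in sample_variances.items():
    print(f"{name}: S^2 = {s2:.3f}, S = {s2 ** 0.5:.3f}")
print(f"Variance within = {variance_within:.3f}")
```

The printed S values should match the S1, S2, and S3 figures listed under the data.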

The second is called the “variance between”, and it is related to the central limit theorem.

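 Variance between = $\frac{\sum_{j=1}^{m} n_j (\bar{x}_j - \bar{x})^2}{m - 1} = \frac{68.111}{2} \approx 34.056$

Where $n_j$ = size of the jth sample

$\bar{x}_j$ = mean of the jth sample (12.17, 13.67, 9, in this case)

n = total of all samples (18 in this case)

$\bar{x}$ = the average of all the data (11.61 in this case)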
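The "variance between" can be sketched the same way. This example assumes the usual one-way ANOVA definition: the squared distance of each sample mean from the overall average, weighted by the sample size, summed, and divided by m - 1. The variable names are illustrative.

```python
from statistics import mean

# Data for the three production lines, copied from the table in the article.
lines = [
    [12, 14, 10, 17, 11, 9],   # Line 1
    [14, 17, 9, 16, 15, 11],   # Line 2
    [10, 7, 11, 9, 5, 12],     # Line 3
]

m = len(lines)                                   # number of samples
all_values = [x for sample in lines for x in sample]
n = len(all_values)                              # total of all samples
grand_mean = mean(all_values)                    # average of all the data
sample_means = [mean(sample) for sample in lines]

# Weighted sum of squared deviations of the sample means from the grand mean.
ss_between = sum(
    len(sample) * (sample_mean - grand_mean) ** 2
    for sample, sample_mean in zip(lines, sample_means)
)
variance_between = ss_between / (m - 1)

print("Sample means:", [round(x, 2) for x in sample_means])
print(f"n = {n}, grand mean = {grand_mean:.2f}")
print(f"Variance between = {variance_between:.3f}")
```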

Now we are ready to test the hypothesis that the three production lines are essentially the same process:

Step #1:

$H_0: \mu_1 = \mu_2 = \mu_3$

$H_1$: otherwise

(Interpretation: Under $H_0$, Lines 1, 2, and 3 are the same; this represents the null hypothesis. Under $H_1$, at least one line is different from the others; this represents the alternate hypothesis.)

Step #2:

$\alpha = 0.05$

(Interpretation: One wants a 5% probability of rejecting the null if it is actually true. This is known as a Type I error.)

Step #3:

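Calculate the statistical F-value. This is the ratio of the variance between to the variance within, each based on its appropriate degrees of freedom (m - 1 = 2 for the variance between and n - m = 15 for the variance within).

 $F = \frac{\text{variance between}}{\text{variance within}} = \frac{34.056}{8.278} = 4.114$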
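For readers who want to reproduce the F-value from the raw data, here is a minimal sketch in Python. It assumes the standard one-way ANOVA sums of squares, with m - 1 degrees of freedom between samples and n - m within. Because all three samples have the same size here, the pooled "variance within" equals the simple average of the three sample variances.

```python
from statistics import mean, variance

# Data for the three production lines, copied from the table in the article.
lines = [
    [12, 14, 10, 17, 11, 9],   # Line 1
    [14, 17, 9, 16, 15, 11],   # Line 2
    [10, 7, 11, 9, 5, 12],     # Line 3
]

m = len(lines)                                  # number of samples
n = sum(len(sample) for sample in lines)        # total of all samples
grand_mean = mean([x for sample in lines for x in sample])

# Sums of squares between and within the samples.
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in lines)
ss_within = sum((len(s) - 1) * variance(s) for s in lines)

df_between = m - 1   # 2 in this case
df_within = n - m    # 15 in this case

f_value = (ss_between / df_between) / (ss_within / df_within)
print(f"F = {f_value:.3f} with {df_between} and {df_within} degrees of freedom")
```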

Step #4:

Compare the above-calculated F value to the tabular F value in the back of a statistics textbook, using 2 and 15 degrees of freedom at the 5% level. In this case, the calculated F value (4.114) is greater than the tabular F value (3.68), so the null hypothesis is rejected. Thus, one can conclude that at least one of the three production lines is different from the others.
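If no F table is at hand, the tabular value for this particular case can be sketched in a few lines of Python. This relies on a special case: when the numerator has exactly 2 degrees of freedom, the upper-tail probability of the F distribution has a closed form that can be solved for the critical value. The helper name below is hypothetical, and the formula does not apply to other numerator degrees of freedom.

```python
def f_critical_two_numerator_df(alpha, df_within):
    """Upper-tail critical value of the F distribution with 2 numerator
    degrees of freedom and df_within denominator degrees of freedom.

    Valid only for 2 numerator degrees of freedom, where
    P(F > f) = (1 + 2 * f / df_within) ** (-df_within / 2).
    """
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)


alpha = 0.05          # significance level from Step #2
df_within = 15        # n - m = 18 - 3
f_calculated = 4.114  # F-value from Step #3

f_tabular = f_critical_two_numerator_df(alpha, df_within)
reject_null = f_calculated > f_tabular

print(f"Tabular F = {f_tabular:.2f}")
print("Reject the null hypothesis" if reject_null else "Do not reject the null hypothesis")
```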

If you’re finding cobwebs in your head as you try to remember Statistics 201, there’s always DOEpack, which will do these calculations for you.

What do you think that “p = .038” in the printout means? (If you think you know the correct answer, e-mail me at mike@pqsystems.com. Your name will be included in a drawing of all the correct answers. If your name is drawn, you will win a copy of DOEpack.)