Quality Quiz from Professor Cleary
Congratulations:
"B" is correct.
Armand is confused about the way in which this problem should be addressed. You may remember from a previous quiz that there are four types of data:
- Nominal data allows for distinguishing between items (e.g., male/female).
- Ordinal data represents an ordering of items (e.g., small/medium/large).
- Interval data is a way to measure differences among items (e.g., weight of bottles).
- Proportional data allows measurement of proportion or ratio of items (e.g., 50% of class is female).
Armand has “interval” confused with “proportional” data. If 20 out of 1,000 bottles have missing labels, this represents 2% of the total. With this data, the best estimate of future missing labels would be 2% as well. This is called a point estimate.
But a single sample cannot establish that the true proportion of label errors is exactly 2%. It is possible, however, to build a confidence interval for the true proportion of bottles with missing labels.
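As a rough illustration of such an interval for the 20-out-of-1,000 sample, the sketch below uses the common normal-approximation formula. The 95% confidence level (z = 1.96) and the function name are assumptions made for this example; the quiz itself does not specify either.

```python
import math

def proportion_confidence_interval(defects, n, z=1.96):
    """Normal-approximation interval for a proportion.

    z = 1.96 corresponds to roughly 95% confidence (an assumed level).
    """
    p_hat = defects / n                                  # point estimate
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)  # spread of the estimate
    margin = z * standard_error
    return p_hat, p_hat - margin, p_hat + margin

# 20 bottles with missing labels out of 1,000 inspected
p_hat, lower, upper = proportion_confidence_interval(20, 1000)
print(f"point estimate: {p_hat:.3f}")
print(f"approximate 95% interval: {lower:.4f} to {upper:.4f}")
```

Under these assumptions the interval runs from a little over 1% to a little under 3%, which is a more honest statement than the single figure of 2%.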
To do this, one must enter the world of the Binomial Central Limit Theorem. I often use the following example to introduce the concept to my university students: The dean’s office has indicated to me that 50 percent of the students in the college are female, and I ask whether, in a sample of 100 students, I could expect exactly 50 to be female. The response, of course, is that this is possible, but not likely. Some samples would have more than 50 and others fewer. But the sample counts would center around 50 and tail off above and below that number.
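The "possible, but not likely" answer can be checked directly from the binomial formula. This is a small sketch; the helper name `binomial_probability` and the 45-to-55 range are chosen only for illustration.

```python
import math

n = 100   # students in the sample
p = 0.5   # proportion of female students reported by the dean's office

def binomial_probability(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

exactly_50 = binomial_probability(50, n, p)
between_45_and_55 = sum(binomial_probability(k, n, p) for k in range(45, 56))

print(f"P(exactly 50 female)    = {exactly_50:.4f}")
print(f"P(45 to 55 female)      = {between_45_and_55:.4f}")
```

Exactly 50 turns up in only about 8% of samples, while counts near 50 are far more common, which matches the students' intuition.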
My next question is what we might say about a distribution (or histogram) if we took 1,000 samples of size 100 (n = 100), recorded each sample proportion P (or $\hat{p}$), and then created a histogram from these 1,000 P’s ($P_1, P_2, P_3, \ldots, P_{1000}$).
Eventually, students articulate three discoveries:
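1. The mean of that distribution of sample proportions ($\mu_{\hat{p}}$) will be close to .50.
2. The shape of the distribution will be similar to a normal distribution.
3. There will be some level of variability in that distribution. At this point I remind them of the binomial distribution learned in an earlier lesson, when we found the standard deviation of a sample proportion to be equal to:
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$, where p is the population proportion and n is the sample size.
So for our case, it would be:
$\sigma_{\hat{p}} = \sqrt{\frac{(.50)(.50)}{100}} = .05$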
The parameters of this distribution:
mean: $\mu_{\hat{p}} = .50$
shape: normal
variability: $\sigma_{\hat{p}} = .05$
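The classroom thought experiment can be tried out with a short simulation. This is a minimal sketch, assuming 1,000 samples of 100 students each and a fixed random seed chosen arbitrarily so the run is repeatable; the variable names are illustrative only.

```python
import random
import statistics

random.seed(2007)    # arbitrary seed so the run is repeatable

p = 0.5              # population proportion (50 percent female)
n = 100              # size of each sample
num_samples = 1000   # number of samples drawn

def sample_proportion(n, p):
    """Draw one sample of n students and return the proportion of successes."""
    successes = sum(1 for _ in range(n) if random.random() < p)
    return successes / n

proportions = [sample_proportion(n, p) for _ in range(num_samples)]

mean_of_proportions = statistics.mean(proportions)
sd_of_proportions = statistics.pstdev(proportions)
theoretical_sd = (p * (1 - p) / n) ** 0.5   # square root of p(1-p)/n

print(f"mean of sample proportions: {mean_of_proportions:.4f}")
print(f"standard deviation:         {sd_of_proportions:.4f}")
print(f"theoretical value:          {theoretical_sd:.4f}")
```

The simulated mean should land near .50 and the simulated standard deviation near .05, though the exact figures vary with the seed.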
Let us now use the Quincunx simulation from Quality Gamebox (Figure 1) to validate point #2, that is, the normality of the binomial distribution.
Figure 1
As the balls fall down, each one has a 50% chance of going left and a 50% chance of going right. This is truly a binomial condition (Figure 2).
Figure 2
The next figure (Figure 3) shows all the balls being dropped, forming a normal distribution:
Figure 3
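For readers without the program at hand, the behavior of a quincunx can be imitated in a few lines. This is only a sketch of the idea, not the Quality Gamebox implementation; the number of pin rows (10), the number of balls (500), and the seed are assumptions chosen for illustration.

```python
import random

random.seed(1)  # arbitrary seed so the run is repeatable

def drop_balls(num_balls, num_rows):
    """Drop balls through rows of pins; each pin sends a ball left or right
    with equal chance. A ball's final bin is its number of moves to the right."""
    bins = [0] * (num_rows + 1)
    for _ in range(num_balls):
        rights = sum(random.random() < 0.5 for _ in range(num_rows))
        bins[rights] += 1
    return bins

bins = drop_balls(num_balls=500, num_rows=10)

# Text histogram: one asterisk per five balls in each bin
for position, count in enumerate(bins):
    print(f"{position:2d} | {'*' * (count // 5)}")
```

The printed bars pile up in the middle bins and thin out toward the edges, the same bell-like outline the dropped balls form in the figure.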
From this, we can summarize the binomial central limit theorem:
Three things can be said about a distribution formed by taking many samples from a binomial distribution and plotting their sample proportions (here p is the population proportion and n is the sample size):
1. Its mean will be close to the mean of the binomial distribution from which samples were selected: $\mu_{\hat{p}} \approx p$
2. The variability of that distribution will be: $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
3. The shape of the distribution will be normal.
Copyright 2007 PQ Systems.