Sorry -- I've not gotten your exams graded yet. You'll get them back on Tuesday next week.
We'll have a quiz next Thursday on the material we cover today.
Section 5.2: The Sampling Distribution of the Sample Proportion
Wouldn't it be wonderful if we could take the ideas
that we use for estimating and making inferences with
the sample mean and apply them to other statistics? It turns
out that we can!
We need to know
that the statistic is distributed approximately normally,
the mean of the statistic, and
the standard deviation of the statistic.
Then we can do the same kinds of normal table
calculations that we do for the sample mean.
Definition: Consider a sample of qualitative data for which
one category, or attribute, is of interest. To describe such a sample,
we will use the proportion of the sample having the attribute of
interest. This statistic is denoted by the letter p.
Properties of the sampling distribution of the sample proportion:
The mean of the sampling distribution of p, denoted
$\mu_p$, is equal to the true proportion of the population,
$\pi$.
The standard deviation of the sampling distribution of p, denoted
$\sigma_p$, is
$\sigma_p = \sqrt{\pi(1-\pi)/n}$,
where n is the sample size.
Here's the key thing: the sampling distribution for
p is approximately normal for a large sample size n ("large"
is generally taken as n greater than or equal to 30 at the very
least, but in this case it also depends on the value of
the population proportion $\pi$).
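Our choice of n to ensure normality is made so that
the interval reaching three standard errors to either side of the
estimated proportion p is completely contained inside the interval [0,1]:
$p - 3\sigma_p \ge 0$
and
$p + 3\sigma_p \le 1$.
Larger n makes the standard error
$\sigma_p$ smaller, so we take a value of n as large as
necessary.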
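The three-standard-error check can be sketched in a few lines of Python. This is only an illustration, not the book's strategy from p. 186: the function names are made up for this example, the standard error is computed as sqrt(p(1-p)/n) from the estimated proportion p, and the starting value n = 30 is the "at the very least" figure quoted above.

```python
import math

def proportion_se(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

def normal_ok(p, n):
    """True if p - 3*SE >= 0 and p + 3*SE <= 1 (the rule described above)."""
    se = proportion_se(p, n)
    return p - 3 * se >= 0 and p + 3 * se <= 1

def smallest_n(p, start=30):
    """Smallest n >= start for which the three-standard-error rule holds.

    Assumes 0 <= p <= 1; 'start' is the minimum sample size we accept.
    """
    n = start
    while not normal_ok(p, n):
        n += 1
    return n
```

For p = 0.5, a sample of 30 already passes the check. For a proportion near 0 or 1 the required n grows quickly: with p = 0.1 a sample of 30 fails, and the rule asks for n of roughly 81.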
See p. 186 for the strategy.
See Figures 5.7 and 5.8, pp. 182-183.
Calculating probabilities for p involves computing
z-scores, as usual.
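As a sketch of such a calculation, the snippet below standardizes an observed proportion with z = (p - pi)/sqrt(pi(1-pi)/n) and looks up the area to its left under the standard normal curve. The function name and the example numbers are invented for illustration; `statistics.NormalDist` from the Python standard library stands in for the z-table.

```python
import math
from statistics import NormalDist

def prob_p_at_most(p_obs, pi, n):
    """Approximate P(p <= p_obs) by the normal approximation.

    pi is the population proportion, n the sample size; the result is
    only trustworthy when n is large enough for approximate normality.
    """
    sigma_p = math.sqrt(pi * (1 - pi) / n)  # standard deviation of p
    z = (p_obs - pi) / sigma_p              # z-score of the observed proportion
    return NormalDist().cdf(z)              # area to the left of z

# Hypothetical example: pi = 0.5, n = 100, so sigma_p = 0.05 and z = 2.
print(round(prob_p_at_most(0.6, 0.5, 100), 4))
```

The probability of a sample proportion above 0.6 in the same setting would be one minus this value.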
Exercise #5, p. 202
Chapter 6 (Sections 6.1 and 6.2)
First of all, this chapter deals only with
the sample mean $\bar{x}$,
sampled from a population with known
standard deviation $\sigma$ and unknown
mean $\mu$.
What do we know about
$\bar{x}$?
We have two major objectives:
estimate
$\mu$ with given confidence (Section 6.1);
find out what the sample size should be to estimate
$\mu$ with given precision (Section 6.2).
How do we go about achieving these objectives?
Estimating $\mu$ with given confidence (Section 6.1)
What does it mean to estimate something "with confidence"?
"confidence interval" -- what should that mean?
z-interval for $\mu$: $\bar{x} \pm z\,\sigma/\sqrt{n}$ (see p. 214)
Here's a table of common choices for z:
Desired Confidence
90%
95%
99%
???
z-value
1.645
1.960
2.576
get it from the z-table!
See p. 221 for conditions under which we may do this.
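Here is a minimal sketch of the z-interval calculation, assuming the usual form xbar ± z·sigma/sqrt(n) with sigma known. The function names and the example numbers are invented for illustration, and `NormalDist().inv_cdf` plays the role of the z-table for confidence levels that are not in the table above.

```python
import math
from statistics import NormalDist

def z_value(confidence):
    """Two-sided critical value z for a confidence level such as 0.95."""
    # Leave (1 - confidence)/2 in each tail of the standard normal curve.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for mu when sigma is known: xbar +/- z*sigma/sqrt(n)."""
    half_width = z_value(confidence) * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Reproduce the table entries, then try a hypothetical sample.
for level in (0.90, 0.95, 0.99):
    print(level, round(z_value(level), 3))
print(z_interval(xbar=100, sigma=15, n=36))
```

The conditions on p. 221 still apply; the code does not check them.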
What if we require a 95% confidence interval with a known
precision?
That is, we want to know that
$\mu$ lies within an interval of a given width
with 95% confidence.
Then we will have to design our experiment well! In
particular we will have to select an appropriate sample size.
We know that the larger the sample size, the "tighter" the
normal curve of the distribution of
$\bar{x}$.
Hence how tight the curve needs to be determines n,
by the equation
$n = (z\sigma/E)^2$, where the value of z corresponds to the
desired confidence (in this case, 95%), and E
corresponds to the "half-width" of the interval.
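The sample-size calculation can be sketched as follows, assuming the standard formula n = (z·sigma/E)^2 and rounding up to the next whole number so that the half-width is no larger than E. The function name and the example numbers are made up for illustration.

```python
import math
from statistics import NormalDist

def sample_size(sigma, E, confidence=0.95):
    """Smallest n giving a z-interval half-width of at most E.

    sigma is the known population standard deviation, E the desired
    half-width (margin of error) of the interval.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / E) ** 2)  # round up: n must be an integer

# Hypothetical example: sigma = 15, half-width E = 2, 95% confidence.
print(sample_size(sigma=15, E=2))
```

Halving E roughly quadruples the required sample size, because E enters the formula squared.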
Example:
#4, p. 227
Notice that it's not necessary to know the mean
value in order to calculate the sample size!
Only the margin of error E, z,
and the known value of $\sigma$