So far we've pretty much focused on descriptive statistics, which are ways of describing distributions (e.g., means and standard deviations).
The goal of inferential statistics is to make claims about population parameters based on sample statistics.
Typically we can't measure the entire population of individuals that we're interested in. So instead we select a subset of individuals, which we call a sample. Then we measure our sample and use those measurements to estimate the corresponding population values (e.g., the population mean and standard deviation). Our best estimate of the population mean will be the mean of our sample.
It sounds simple and straightforward, but consider the following:
Suppose that you take 3 different samples from the same population. They are going to be different from one another. They will have different shapes, different means, and different variability. So how do you figure out what the best estimate of the population mean is?
How many possible samples can we take? Infinitely many (if we are sampling with replacement).
Luckily for us, the huge set of possible samples forms a simple, orderly, and predictable pattern (a sampling distribution). Because of this, we are able to base our predictions about sample characteristics on the distribution of sample means.
I consider this distribution to be virtual; that is, we don't actually compute all possible samples of size n, but instead generate the distribution based on properties of the population.
Consider a very small population of just four scores: 2, 4, 6, and 8. Because this population is so small, we actually can know the mean (and variability): μ = 5. But suppose that we didn't, and wanted to be able to make an estimate based on sampling.
step 1: pick a sample size: for this example we'll pick samples of n = 2
- we'll talk more about sample size a little later, but typically the bigger your sample size, the more likely it is that your samples will be similar to one another (and to the population as a whole)
step 2: now consider all of the possible samples that you could get, and look at their distribution
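
| sample | first score | second score | sample mean X̄ |
|---|---|---|---|
| 1 | 2 | 2 | 2 |
| 2 | 2 | 4 | 3 |
| 3 | 2 | 6 | 4 |
| 4 | 2 | 8 | 5 |
| 5 | 4 | 2 | 3 |
| 6 | 4 | 4 | 4 |
| 7 | 4 | 6 | 5 |
| 8 | 4 | 8 | 6 |
| 9 | 6 | 2 | 4 |
| 10 | 6 | 4 | 5 |
| 11 | 6 | 6 | 6 |
| 12 | 6 | 8 | 7 |
| 13 | 8 | 2 | 5 |
| 14 | 8 | 4 | 6 |
| 15 | 8 | 6 | 7 |
| 16 | 8 | 8 | 8 |
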
Does this make sense? It should. If you select a bunch of samples from the same population, most of the means should "pile up" near the population mean μ (if they don't, then you must have some kind of bias in your samples).
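mean: the mean of the distribution of sample means is called the expected value of X̄. It is "expected" because it should be a value near the population mean μ; in fact, the expected value of X̄ is equal to μ. (note: often symbolized as μ_X̄)
Look at our example: the expected value of X̄ (the mean of the sample means) is:
(2 + 3 + 4 + 5 + 3 + 4 + 5 + 6 + 4 + 5 + 6 + 7 + 5 + 6 + 7 + 8) / 16 = 80/16 = 5.0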
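As a rough check on the arithmetic above, here is a minimal Python sketch that lists every possible sample of n = 2 and averages the sample means. It assumes the population is the four scores 2, 4, 6, 8 (read off the table of samples) and that we sample with replacement, so order matters and there are 4 x 4 = 16 samples. The variable names are ours, chosen only for illustration.

```python
from itertools import product

population = [2, 4, 6, 8]  # assumed example population
n = 2                      # sample size

# Every ordered sample of size n drawn with replacement.
samples = list(product(population, repeat=n))

# The mean of each sample.
sample_means = [sum(sample) / n for sample in samples]

# Expected value of the sample mean = mean of all the sample means.
expected_value = sum(sample_means) / len(sample_means)

print(len(samples), expected_value)
```

The expected value comes out equal to the population mean of 5, which is the point of the example.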
variability: the standard deviation of the distribution of sample means is called the standard error of X̄:
standard error of X̄ = σ_X̄ = standard distance between X̄ and the population mean μ.
The major purpose of the standard error of X̄ is that it tells us how well the sample mean estimates the population mean. In other words, how big the sampling error is.
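Computing the standard error

| sample | first score | second score | sample mean X̄ | (X̄ - μ) | (X̄ - μ)² |
|---|---|---|---|---|---|
| 1 | 2 | 2 | 2 | -3 | 9 |
| 2 | 2 | 4 | 3 | -2 | 4 |
| 3 | 2 | 6 | 4 | -1 | 1 |
| 4 | 2 | 8 | 5 | 0 | 0 |
| 5 | 4 | 2 | 3 | -2 | 4 |
| 6 | 4 | 4 | 4 | -1 | 1 |
| 7 | 4 | 6 | 5 | 0 | 0 |
| 8 | 4 | 8 | 6 | 1 | 1 |
| 9 | 6 | 2 | 4 | -1 | 1 |
| 10 | 6 | 4 | 5 | 0 | 0 |
| 11 | 6 | 6 | 6 | 1 | 1 |
| 12 | 6 | 8 | 7 | 2 | 4 |
| 13 | 8 | 2 | 5 | 0 | 0 |
| 14 | 8 | 4 | 6 | 1 | 1 |
| 15 | 8 | 6 | 7 | 2 | 4 |
| 16 | 8 | 8 | 8 | 3 | 9 |

SS = 40
std dev = sqrt(SS/N) = sqrt(40/16) = sqrt(2.5) = 1.58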
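The table can be reproduced with a few lines of Python. This is only a sketch under the same assumptions as the example (a population of 2, 4, 6, 8 and samples of n = 2 drawn with replacement); the names `ss` and `standard_error` are ours.

```python
from itertools import product
from math import sqrt

population = [2, 4, 6, 8]            # assumed example population
mu = sum(population) / len(population)

# Means of all 16 possible samples of size 2.
sample_means = [sum(sample) / 2 for sample in product(population, repeat=2)]

# Sum of squared deviations of the sample means from the population mean.
ss = sum((m - mu) ** 2 for m in sample_means)

# Standard deviation of the distribution of sample means (the standard error).
standard_error = sqrt(ss / len(sample_means))

print(ss, round(standard_error, 2))
```

This is the "count everything" route; it is only practical because the population and the samples are so small.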
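the short way
The standard error depends on two characteristics:
1) the variability of the population (σ) - the more the scores in the population vary, the more the sample means will vary
- large σ: big differences from the pop mean
- small σ: small differences from the pop mean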
2) the size of the sample - the larger your sample size (n), the more accurately the sample represents the population. This is known as the law of large numbers.
- If I randomly selected 1 student, how accurately will that student's score predict the population's score?
- Suppose that I take 5 students. Are things more accurate?
- What about 100 students?
These two characteristics are combined in the formula for standard error:
standard error of X̄ = σ_X̄ = σ / sqrt(n)
Returning to our example: the standard deviation of our original population is σ = 2.24.
So: σ_X̄ = 2.24/sqrt(2) = 1.58
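The shortcut is easy to try out in code. The sketch below assumes the same four-score population (2, 4, 6, 8); `standard_error` is a small helper of our own, not a library function. It also plugs in the sample sizes from the student questions above (1, 5, and 100) to show how the standard error shrinks as n grows.

```python
from math import sqrt
from statistics import pstdev

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

population = [2, 4, 6, 8]    # assumed example population
sigma = pstdev(population)   # population standard deviation (divides by N)

# The example from the notes: n = 2.
print(round(sigma, 2), round(standard_error(sigma, 2), 2))

# Bigger samples give smaller standard errors.
for n in (1, 5, 100):
    print(n, round(standard_error(sigma, n), 2))
```

With n = 1 the standard error is just σ itself, which is why a single student is a poor guide to the population.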
In our example we've simplified things greatly. We have a really small population, and we took a pretty small sample. Most situations, however, will be much more complex. Lucky for us there are some properties of means of samples that will help us out.
All of these properties (shape, mean, variability) are covered in the Central Limit Theorem.
So, when n is large (at or above 30):
X̄ ~ N(μ, σ/sqrt(n))
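One way to get a feel for the Central Limit Theorem is a small simulation. The sketch below is an illustration under our own assumptions, not part of the original example: the population is an exponential distribution (strongly skewed, with μ = 1 and σ = 1), we draw 5000 samples of n = 30, and the seed is fixed only so that the run is repeatable.

```python
import random
from math import sqrt
from statistics import pstdev

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 30                   # sample size
num_samples = 5000       # number of samples to draw

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0

# Draw the samples and keep only their means.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = sum(sample_means) / num_samples
sd_of_means = pstdev(sample_means)

print("mean of sample means:", round(mean_of_means, 3), "theory:", mu)
print("sd of sample means:  ", round(sd_of_means, 3), "theory:", round(sigma / sqrt(n), 3))
```

Even though the population is far from normal, the sample means should cluster around μ with a spread close to σ/sqrt(n).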
The Normal distribution is a commonly found distribution that is symmetrical and unimodal. It is defined by the following equation:
Y = (1 / sqrt(2πσ²)) e^(-(X - μ)² / (2σ²))

A few things to note about Normal Distributions.
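This is sometimes referred to as the 68-95-99.7 rule. In the normal distribution with mean μ and standard deviation σ:
- about 68% of the scores fall within 1σ of the mean (34.13% on each side)
- about 95% of the scores fall within 2σ of the mean
- about 99.7% of the scores fall within 3σ of the mean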
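The 68-95-99.7 figures can be checked with Python's standard library. This is a minimal sketch using `statistics.NormalDist` (available from Python 3.8); the dictionary name `within` is ours.

```python
from statistics import NormalDist

unit_normal = NormalDist(mu=0, sigma=1)  # standard normal distribution

# Proportion of scores within k standard deviations of the mean.
within = {k: unit_normal.cdf(k) - unit_normal.cdf(-k) for k in (1, 2, 3)}

for k, proportion in within.items():
    print(k, round(proportion, 4))
```

The rule is a rounded version of these proportions, which is why the table gives 34.13% rather than exactly 34% for one side of the first interval.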
Using the unit normal table.
Notice that for z = 1.0 the proportion below is .8413 = .5000 + .3413 = the half of the distribution below the mean (which is also the median) + the 34.13% that we mentioned before. So by using the table, we can ask about different areas under the curve. And similar to last chapter, we can go in both directions. That is, from z-scores to probabilities and/or from probabilities to z-scores.
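If no table is at hand, both directions of the lookup can be sketched in Python. The code below stands in for the unit normal table rather than reproducing it: `cdf` goes from a z-score to the proportion below it, and `inv_cdf` goes from a proportion back to a z-score. The variable names are ours.

```python
from statistics import NormalDist

unit_normal = NormalDist(mu=0, sigma=1)

# z-score -> probability
p_below = unit_normal.cdf(1.0)   # proportion below z = 1.0 (the body)
p_above = 1 - p_below            # proportion above z = 1.0 (the tail)

# probability -> z-score
z_back = unit_normal.inv_cdf(p_below)

print(round(p_below, 4), round(p_above, 4), round(z_back, 2))
```

The body and tail proportions always add up to 1, so either one is enough to recover the other.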
Let's return briefly to our simplified example. What is the probability of selecting a sample with a mean greater than 7?
Looking at our distribution of sample means, we find that 1 out of 16 has a mean greater than 7. So that's our answer: 1/16 = .0625 = 6.25%
Here is the "best" way to find a probability from the table:
Examples:
What is the probability of having an IQ of 130 or above? p(X > 130)?
z = (130 - 100)/15 = 2.0 --look at the table--> need Column C: p = 0.0228
What is the probability of having an IQ of 85 or less? p(X < 85)?
z = (85 - 100)/15 = -1.0 --look at the table--> need Column C: p = 0.1587
What IQ score do you need to have to be in the top 5% of the population?
The upper tail is needed.
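The three IQ questions above can also be worked through in code. This sketch uses μ = 100 and σ = 15 for IQ scores, as in the notes, and lets `statistics.NormalDist` play the role of the table. Because the software value of z is not rounded the way a printed table is, the answer to the last question may differ slightly from a hand calculation.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)   # IQ scores

p_130_or_above = 1 - iq.cdf(130)    # upper tail beyond z = 2.0
p_85_or_below = iq.cdf(85)          # lower tail below z = -1.0
top_5_cutoff = iq.inv_cdf(0.95)     # score with 5% of the population above it

print(round(p_130_or_above, 4))
print(round(p_85_or_below, 4))
print(round(top_5_cutoff, 1))
```

The first two values should agree with the table answers of 0.0228 and 0.1587; the cutoff for the top 5% lands a little under 125.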
Sometimes we need to find the probability that X will fall between two scores rather than simply above a score or below a score.
What is the prob. of scoring between 300 and 650 on the SAT?
recall: μ = 500, σ = 100
p(X < 650) = p(z < (650 - 500)/100) = p(z < 1.5) = 0.9332
p(X < 300) = p(z < (300 - 500)/100) = p(z < -2.0) = 0.0228
The .9332 from 650 includes the lower tail, so we determine the proportion in the lower tail and subtract that:
p(300 < X < 650) = .9332 - .0228 = .9104
What is the prob. of scoring lower than 300 or higher than 650 on the SAT?
recall: μ = 500, σ = 100
p(X > 650) = p(z > (650 - 500)/100) = p(z > 1.5) = 0.0668
p(X < 300) = p(z < (300 - 500)/100) = p(z < -2.0) = 0.0228
The two numbers both reflect the proportions in the tails, so we just need to add them together:
p(X < 300 or X > 650) = .0668 + .0228 = .0896
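Both SAT questions come down to combining areas under the same curve, which the following sketch makes explicit. It uses μ = 500 and σ = 100, as in the notes; the variable names are ours.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)   # SAT scores

# Between 300 and 650: everything below 650 minus the lower tail below 300.
p_between = sat.cdf(650) - sat.cdf(300)

# Below 300 or above 650: the two tails added together.
p_outside = sat.cdf(300) + (1 - sat.cdf(650))

print(round(p_between, 4), round(p_outside, 4))
```

The two answers cover every possible score between them, so they should add up to 1.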
What is your percentile rank if you have an IQ of 130?
for IQ scores μ = 100, σ = 15
What is the interquartile range for the SAT?
recall: μ = 500, σ = 100
Note there is a shortcut for figuring out the IQR. Since the middle 50% of a normal distribution always falls within ±.67σ of the mean, you can compute the IQR as (2)(.67)(σ). For the SAT, that is (2)(.67)(100) = 134.
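The percentile rank and interquartile range questions can be sketched in Python as follows. The code uses the same parameters as the notes (IQ: μ = 100, σ = 15; SAT: μ = 500, σ = 100) and compares the exact quartiles with the ±.67σ shortcut; the variable names are ours.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)
sat = NormalDist(mu=500, sigma=100)

# Percentile rank: percentage of the population scoring at or below 130.
percentile_rank = 100 * iq.cdf(130)

# Interquartile range: distance between the 25th and 75th percentiles.
q1 = sat.inv_cdf(0.25)
q3 = sat.inv_cdf(0.75)
iqr = q3 - q1

# Shortcut from the notes, using the rounded z-score of .67.
shortcut_iqr = 2 * 0.67 * sat.stdev

print(round(percentile_rank, 2), round(iqr, 1), round(shortcut_iqr, 1))
```

The shortcut comes out slightly smaller than the exact IQR because .67 is a rounded z-score.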
Now let's bring Samples back into the picture.
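First we need to get the distribution of the sample means (note: we'll assume a normal distribution even though n is less than 30). With μ = 100 and σ = 15, a standard error of 5 corresponds to samples of n = 9, since 15/sqrt(9) = 5:
X̄ ~ N(μ, σ_X̄) = N(100, 5)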
Now we need to figure out the z-score that corresponds to this sample mean.
The z-score pretty much looks like what we've used before, except that it uses the sample mean and the standard error:
Z = (X̄ - μ) / σ_X̄
P(X̄ > 112) = P(Z > (112 - 100)/5) = P(Z > 2.4) = 0.0082
- At first it looks wrong: it seems like 112 should be less than z = 1, because 115 is where z should equal 1.
- However, we must remember that this isn't the correct distribution to be looking at; we need to look at the distribution of sample means.
- We know that the distribution of sample means has a standard error = 5 and a mean = 100.
- So 112 should have a z > 2.
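The contrast between the two distributions is easy to see in code. The sketch below assumes μ = 100, σ = 15, and n = 9 (the sample size implied by a standard error of 5), and computes the z-score and upper-tail probability for 112 twice: once as an individual score and once as a sample mean. The variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 15, 9         # n = 9 is inferred from the standard error of 5
standard_error = sigma / sqrt(n)  # 15 / 3 = 5

individuals = NormalDist(mu, sigma)            # distribution of single scores
sample_means = NormalDist(mu, standard_error)  # distribution of sample means

z_for_person = (112 - mu) / sigma          # 112 as one person's score
z_for_mean = (112 - mu) / standard_error   # 112 as the mean of 9 people

p_person_above_112 = 1 - individuals.cdf(112)
p_mean_above_112 = 1 - sample_means.cdf(112)

print(round(z_for_person, 2), round(p_person_above_112, 4))
print(round(z_for_mean, 2), round(p_mean_above_112, 4))
```

A single score of 112 is unremarkable, but a mean of 112 across nine people is rare, because sample means vary much less than individual scores.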
Example:
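X̄ ~ N(μ, σ_X̄) = N(100, 3)
(a standard error of 3 corresponds to n = 25, since 15/sqrt(25) = 3)
Now we need to figure out the mean that corresponds to this range:
X̄ = Z σ_X̄ + μ = Z (σ / sqrt(n)) + μ
step 1: look at unit normal table for 90% (z = 1.28)
step 2:
X̄ = 1.28 * σ_X̄ + 100 = (1.28)(3) + 100 = 103.84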
Suppose that we asked the same question for a smaller sample, n = 16? How does the answer change?
step 1: look at unit normal table for 90% (z = 1.28)
step 2:
X̄ = 1.28 * σ_X̄ + 100 = (1.28)(3.75) + 100 = 104.80
So a group of 16 people would have to have a mean of over 104.80 to be in the top 10%.
What about other sample sizes?
n = 9
X̄ = 1.28 * σ_X̄ + 100 = (1.28)(5) + 100 = 106.40
n = 4
X̄ = 1.28 * σ_X̄ + 100 = (1.28)(7.5) + 100 = 109.60
n = 1
X̄ = 1.28 * σ_X̄ + 100 = (1.28)(15) + 100 = 119.20
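The pattern across sample sizes can be laid out in one short loop. This sketch assumes μ = 100 and σ = 15 and uses the rounded table value z = 1.28 for the top 10%, so that the results match the hand calculations; `cutoff_mean` is a helper of our own.

```python
from math import sqrt

mu, sigma = 100, 15
z_top_10 = 1.28   # rounded table value for the top 10%

def cutoff_mean(n):
    """Sample mean needed for a sample of size n to be in the top 10%."""
    standard_error = sigma / sqrt(n)
    return z_top_10 * standard_error + mu

cutoffs = {n: cutoff_mean(n) for n in (25, 16, 9, 4, 1)}

for n, cutoff in cutoffs.items():
    print(n, round(cutoff, 2))
```

As n shrinks, the standard error grows and the cutoff moves further from 100; with n = 1 the cutoff is simply the cutoff for an individual score.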