The t-distribution
We're now going to move back up to the top
portion of our diagram to look at cases where we
are making comparisons and looking for differences.
We've already looked at one test that allows
us to do that: the one-sample z-test. Today, we'll
look at another test that allows us to compare a
treatment population (represented by a single
sample) with a known population mean. Let's take a
quick look at our diagram:
Which test?
We're going to consider in this
lab the case where we know the original
population mean, but we don't know the
population standard deviation (σ) so we have to
estimate it from our sample. We
do that in the 1-sample t-test. Find the string
of decisions that lead to a one-sample t-test.
Here are the relevant formulas.
z = (sample mean - μ) / (σ/sqroot(n))
t = (sample mean - μ) / (s/sqroot(n)), with df = n - 1
Rule: When you know
the value of σ, use a z-score. If σ is
unknown, use s to estimate σ and use the
t-statistic.
The t statistic is used to
test hypotheses about μ when the value of σ²
is not known. The formula for the t statistic
is similar in structure to that for the
z-score, except that the t statistic uses the
estimated standard error (computed from s) in
place of the standard error (computed from σ).
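The difference between the two formulas can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers plugged in at the bottom are made up for the sketch and are not taken from the lab problems.

```python
import math

def z_statistic(sample_mean, mu, sigma, n):
    """z-score for a sample mean: uses the KNOWN population sd (sigma)."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - mu) / standard_error

def t_statistic(sample_mean, mu, s, n):
    """t statistic: same structure, but the sample sd (s) estimates sigma."""
    estimated_standard_error = s / math.sqrt(n)
    return (sample_mean - mu) / estimated_standard_error

# Hypothetical numbers, chosen only to show the two calls side by side.
print(z_statistic(sample_mean=52, mu=50, sigma=10, n=25))  # sigma known
print(t_statistic(sample_mean=52, mu=50, s=8, n=25))       # sigma unknown
```

The only difference between the two functions is which standard deviation goes into the denominator, which is exactly the rule stated above.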
Consider the following
scenarios. For each, determine which formula (z
or t) is the appropriate one to use to answer
the question asked. (You don't need to do any
computations.)
(1)
Pat, a personal trainer, would like to
examine the effects of humidity on exercise
behavior. It is known that the average
person in the United States exercises an
average of μ = 21 minutes each day. The
personal trainer selects a random sample of
n = 100 people and places them in a
controlled atmosphere environment where the
relative humidity is maintained at 90%. The
daily amount of time spent exercising for
the sample averages 18.7 minutes
with s = 5.0. Pat wants to know
if humidity affects exercise behavior.
(2) In
an attempt to regulate the profession, the
US Department of Fitness has developed a
fitness test for personal trainers. The test
requires that the trainers perform a
series of exercises within a certain period
of time. Normative data, collected in a
nationwide test, reveal a normal
distribution with an average completion time
of μ = 92 minutes and σ = 11. Pat and
four other Hollywood personal trainers (so n
= 5) take the test. For these trainers, the
average time to complete the task is
115 minutes. Pat is worried that the Hollywood
personal trainers (in this sample) differ
significantly from the norm.
Using the t table
Because we are using the sample
standard deviation to estimate the
population standard deviation (σ), we need to
take the degrees of freedom into
account. Degrees of freedom describe
the number of scores in a sample that are free
to vary. Because the sample mean places a
restriction on the value of one score in the
sample, there are n - 1 degrees of
freedom for the sample.
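A small sketch may help make "free to vary" concrete. It uses the nine quiz scores from the worked examples later in this lab; the helper name forced_last_score is made up for this illustration. Once the sample mean is fixed, the first n - 1 scores can be anything, but the last one is forced.

```python
import math
import statistics

scores = [6, 7, 7, 8, 8, 8, 9, 9, 10]
n = len(scores)
sample_mean = statistics.mean(scores)

def forced_last_score(free_scores, sample_mean, n):
    """Given the mean and n - 1 scores, the remaining score is determined."""
    return sample_mean * n - sum(free_scores)

# The first 8 scores are "free"; the 9th has no freedom left.
print(forced_last_score(scores[:-1], sample_mean, n))

# This is why s divides the sum of squared deviations by df = n - 1.
df = n - 1
sum_of_squares = sum((x - sample_mean) ** 2 for x in scores)
s = math.sqrt(sum_of_squares / df)
print(df, round(s, 3))
```

The value of s computed this way agrees with statistics.stdev(scores), which also divides by n - 1.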
Notice that we're talking about
a new distribution here (or family of
distributions, the t-distributions), found in
the t distribution table. Part of it is shown
here, the same as the one appended to your
packet.
One-tail probability p |   0.25 |   0.10 |   0.05 |  0.025 |   0.01 |  0.005
Two-tail probability p |   0.50 |   0.20 |   0.10 |   0.05 |   0.02 |   0.01
df                     |        |        |        |        |        |
1                      |  1.000 |  3.078 |  6.314 | 12.706 | 31.821 | 63.657
2                      |  0.816 |  1.886 |  2.920 |  4.303 |  6.965 |  9.925
3                      |  0.765 |  1.638 |  2.353 |  3.182 |  4.541 |  5.841
4                      |  0.741 |  1.533 |  2.132 |  2.776 |  3.747 |  4.604
5                      |  0.727 |  1.476 |  2.015 |  2.571 |  3.365 |  4.032
6                      |  0.718 |  1.440 |  1.943 |  2.447 |  3.143 |  3.707
:                      |      : |      : |      : |      : |      : |      :
z*                     |  0.674 |  1.282 |  1.645 |  1.960 |  2.326 |  2.576
CI%                    |    50% |    80% |    90% |    95% |    98% |    99%
t-distribution table
Think back to the 1-sample
z-test. One of the ways that we would make our
decision about whether or not to reject the H0
was to figure out what z-score corresponded to
the critical region (e.g., 1.65 = critical z
for a 1-tailed test with α = 0.05), then look
at the z that we computed and see if it
was greater than (or equal to) that critical
z. If it was, then we rejected the H0;
if it wasn't, then we failed to reject the H0.
But keep in mind that for
z-scores we use the unit normal table, which
only describes one distribution. So, for all
1-tailed tests with α = 0.05, the critical
value of z will be 1.65. The logic is the same
here with the t-table. But now, the critical
values are going to change as a function of
which t-distribution we are looking at,
which in turn depends on df.
The t-table clearly shows the
relationship between a 1-tailed and a 2-tailed
p-value. For any given t-value, the 1-tailed
p-value is half the 2-tailed p-value. That
is why p = .025, 1-tailed, and p = .05,
2-tailed, are in the same column. If t (df = 6)
= 2.0 and α = .05, it is in the critical
region 1-tailed (critical t = 1.943) but not
2-tailed (critical t = 2.447).
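Looking up a critical value can be sketched as a small table lookup. The values below are copied from the df = 1 to 6 rows of the table above; the function name critical_t and its arguments are made up for this illustration, and the sketch only covers the rows and columns shown.

```python
import math

# Column headings: one-tail probabilities from the top row of the t table.
ONE_TAIL_P = [0.25, 0.10, 0.05, 0.025, 0.01, 0.005]

# Rows of the t table shown above (df 1 through 6 only).
T_TABLE = {
    1: [1.000, 3.078, 6.314, 12.706, 31.821, 63.657],
    2: [0.816, 1.886, 2.920, 4.303, 6.965, 9.925],
    3: [0.765, 1.638, 2.353, 3.182, 4.541, 5.841],
    4: [0.741, 1.533, 2.132, 2.776, 3.747, 4.604],
    5: [0.727, 1.476, 2.015, 2.571, 3.365, 4.032],
    6: [0.718, 1.440, 1.943, 2.447, 3.143, 3.707],
}

def critical_t(df, alpha, tails):
    """Return the tabled critical t for a 1-tailed or 2-tailed test."""
    # A 2-tailed test puts alpha / 2 in each tail, so it uses the
    # column whose one-tail probability is half of alpha.
    one_tail_p = alpha / tails
    for column, p in enumerate(ONE_TAIL_P):
        if math.isclose(p, one_tail_p):
            return T_TABLE[df][column]
    raise ValueError("alpha is not a column of this table")

t_obs = 2.0
print(t_obs >= critical_t(df=6, alpha=0.05, tails=1))  # in the critical region
print(t_obs >= critical_t(df=6, alpha=0.05, tails=2))  # not in the critical region
```

The same observed t of 2.0 leads to different decisions because the 2-tailed test reads its critical value one column further to the right.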
1-sample t-test
NOTE: We
will use the following notation:
tobs = observed t, computed from the sample
= (sample mean - μ) / (s/sqroot(n))
tcrit = critical t from the table
Example 1:
Suppose that your physics professor, Dr. M.
C. Squared, gives a 20 point true-false quiz
to 9 students and wants to know if they did
worse than guessing. Their scores were: 6, 7,
7, 8, 8, 8, 9, 9, 10. We'll assume a
significance level of α = 0.05.
step 1: state the hypotheses and α
one-tailed test (worse than guessing)
H0: μ ≥ 10
H1: μ < 10
α = 0.05
(note: the null is what they'd get if they
were guessing, which would be 10 out of 20).
step 2: compute sample stats
n = 9, sample mean = 8.0, s = 1.225
step 3: compute tobs
est standard error = s/sqroot(n) =
1.225/sqroot(9) = 0.41
tobs = (8 - 10) / 0.41
= -4.88
and df: n = 9, so df = 9
- 1 = 8
step 4: find the critical t from the
table to compare with tobs and make
decision:
df = 8, one-tailed test, α = 0.05, so
tcrit = -1.86
(keep in mind that this is worse than, so the
critical t is negative)
tobs = -4.88 < tcrit
= -1.86
reject the H0 - so it looks as
if the students would have been better off
guessing.
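The four steps of Example 1 can be checked with a short script. This is a sketch using only Python's standard library; the variable names are made up, and tcrit is typed in from the t table (df = 8, one-tailed, α = 0.05) rather than computed.

```python
import math
import statistics

scores = [6, 7, 7, 8, 8, 8, 9, 9, 10]
mu_null = 10       # step 1: expected score under guessing (H0)
t_crit = -1.860    # step 4: from the t table, df = 8, one-tailed, alpha = .05

# step 2: sample statistics
n = len(scores)
sample_mean = statistics.mean(scores)
s = statistics.stdev(scores)          # divides by n - 1

# step 3: estimated standard error, observed t, and df
est_standard_error = s / math.sqrt(n)
t_obs = (sample_mean - mu_null) / est_standard_error
df = n - 1

# step 4: lower-tail test, so reject when t_obs is at or below t_crit
reject_h0 = t_obs <= t_crit
print(sample_mean, round(s, 3), round(est_standard_error, 3))
print(round(t_obs, 3), df, reject_h0)
```

The script carries full precision for the standard error, so its t comes out slightly larger in size (about -4.90) than the hand value of -4.88, which used the rounded 0.41. The decision is the same.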
Example 2:
Suppose that your psychology professor, Dr.
I. D. Ego, gives a 20 point true-false quiz to
9 students and wants to know if they were
different from groups in the past who have
tended to have an average of 9.0. Their scores
from the current group were: 6, 7, 7, 8, 8, 8,
9, 9, 10. Did the current group perform
differently from those in the past? We'll
assume a significance level of α = 0.05.
step 1: state the hypotheses and α
two-tailed test (different from past groups)
H0: μ = 9
H1: μ ≠ 9
α = 0.05
step 2: compute sample stats
n = 9, sample mean = 8.0, s = 1.225
step 3: compute tobs
est standard error = s/sqroot(n) =
1.225/sqroot(9) = 0.41
tobs = (8 - 9) / 0.41
= -2.44
what is our df? n = 9, so df = 9
- 1 = 8
step 4: find the critical t from
the table to compare with tobs
and make decision:
df = 8, two-tailed test, α = 0.05, so
tcrit = ±2.306
tobs = -2.44 is beyond -2.306
(|tobs| = 2.44 > 2.306), so it falls in the
critical region
reject the H0 - so it looks
as if the current students are different
from past students (they are doing worse).
(3) OK, now let's try the
examples from above. First, decide whether to
use the one-sample z-test or t-test. Then do
the hypothesis test, showing all
steps.
(a) Pat, a personal
trainer, would like to examine the effects of
humidity on exercise behavior. It is known
that the average person in the United States
exercises an average of μ = 21 minutes each
day. The personal trainer selects a random
sample of n = 100 people and places them in
a controlled atmosphere environment where
the relative humidity is maintained at 90%.
The daily amount of time spent exercising
for the sample averages 18.7 minutes
with s = 5.0. Pat wants to know if
humidity affects exercise behavior at the
α = 0.05 level.
(b) In an attempt to
regulate the profession, the US Department
of Fitness has developed a fitness test for
personal trainers. The test requires that
the trainers perform a series of
exercises within a certain period of time.
Normative data, collected in a nationwide
test, reveal a normal distribution with an
average completion time of μ = 92 minutes
and σ = 11. Pat and four other Hollywood
personal trainers (so n = 5) take the test.
For these trainers, the average time to
complete the task is 115 minutes. Pat
is worried that the Hollywood personal
trainers (in this sample) differ
significantly from the norm.
Using SPSS to compute a one-sample t-test
Excel does not have a formula
for this test. (It is rarely used except for
teaching purposes. If, as in standard tests,
the population mean is known, so is the
population standard deviation.) We'll now go
through the procedure in SPSS to compute
one-sample t-tests.
Go to the Analyze menu
and select the submenu Compare Means.
In this submenu you'll see several
tests. The one that we're interested
in today is One-sample
t-test.
After
selecting One-sample t-test,
you'll get a dialog window. Here you
should select the
Test Variable from
your sample that you want to
analyze. Then you enter the Test
Value,
which is the mean for H0
(10 for the Dr. M. C. Squared
example; the population mean for
the other cases). This is very
important: You must always
enter the population mean on
this screen in a one-sample
t-test!
Here is what
the output will look like.
The output includes
the sample mean, the sample standard
deviation, the standard error, the tobs
(in the t column), the degrees of
freedom, the mean difference (the
sample mean - the test value), and a
p-value for a 2-tailed test
of significance. All of these
results should match what we got
above going through the problem by
hand. Check to see that they do.
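If you want to check the output columns without SPSS, the sketch below recomputes them for the Dr. I. D. Ego example (test value 9.0). It is not how SPSS computes its results; it is a standard-library approximation in which the 2-tailed p-value is found by numerically integrating the t-distribution's density with Simpson's rule. The function names t_pdf and two_tailed_p are made up for this illustration.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_obs, df, steps=2000):
    """Area in both tails beyond |t_obs| (Simpson's rule; steps must be even)."""
    upper = abs(t_obs)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # Half the distribution lies above 0; what is left past |t| is one tail.
    return 2 * (0.5 - area_0_to_t)

scores = [6, 7, 7, 8, 8, 8, 9, 9, 10]
test_value = 9.0

n = len(scores)
mean = statistics.mean(scores)
sd = statistics.stdev(scores)
se = sd / math.sqrt(n)
t_obs = (mean - test_value) / se
df = n - 1
p = two_tailed_p(t_obs, df)

print(f"N = {n}, mean = {mean:.3f}, sd = {sd:.3f}, se = {se:.3f}")
print(f"t = {t_obs:.3f}, df = {df}, p (2-tailed) = {p:.3f}, "
      f"mean difference = {mean - test_value:.3f}")
```

The p-value should land between .02 and .05, which agrees with the t table: for df = 8, a |t| of about 2.45 falls between the 2-tailed critical values 2.306 (p = .05) and 2.896 (p = .02).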
Note that SPSS does
not tell you to reject or fail to
reject the H0, nor does
it give you the tcrit. To
make your decision about the H0,
you must compare the
2-tailed p-value with
your α-level.
If you are
conducting a 1-tailed test, the
p-value is half of what is
reported. So if p (2-tailed) =
.001, then p (1-tailed) = .0005.
If the p-value
is equal to or smaller than
your α-level (it's in the critical
region), then you should reject
the H0; otherwise you
should fail to reject. So if α =
.05 and p = .0005, p < α;
therefore, reject the null
hypothesis.
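The decision rule can be written as a short function. This is a sketch: the name decide and its arguments are made up, and the halving step assumes (as the text above does) that the sample mean differs from the test value in the direction your 1-tailed hypothesis predicted.

```python
def decide(p_two_tailed, alpha, tails=2):
    """Turn the 2-tailed p-value reported by SPSS into a decision about H0."""
    # Assumption: for a 1-tailed test the effect is in the predicted
    # direction, so the 1-tailed p is half of the reported 2-tailed p.
    p = p_two_tailed / 2 if tails == 1 else p_two_tailed
    return "reject H0" if p <= alpha else "fail to reject H0"

print(decide(0.001, alpha=0.05, tails=1))  # p (1-tailed) = .0005
print(decide(0.08, alpha=0.05, tails=2))
print(decide(0.08, alpha=0.05, tails=1))   # p (1-tailed) = .04
```

The last two calls show why the number of tails matters: the same reported p of .08 fails to reach significance 2-tailed but does 1-tailed.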
(4) Use SPSS for the rest
of the questions. Enter the data into SPSS and
then conduct the one-sample t-test in order
to answer the questions that follow. An
example of how to address this question is
provided in the instructions above.
Suppose that your
psychology professor, Dr. I. D. Ego, gives a
20 point true-false quiz to 9 students and
wants to know if they were different from
groups in the past who have tended to have an
average of 9.0. Their scores from the current
group were: 6, 7, 7, 8, 8, 8, 9, 9, 10.
(a) Did the current group
perform differently from those in the past?
We'll assume a significance level of α = 0.05.
(5) Now try another problem
with SPSS and write out each step of
hypothesis testing, using the values on the
SPSS output for tobs, df, and the
significance of tobs.
The personnel department for a
major corporation in the Northeast reported
that the average number of absences during the
months of January and February last year was μ
= 7.4. In an attempt to reduce absences, the
company offered free flu shots to all
employees this year. For a sample of n = 10
people who took the flu shots, the number of
absences this year were 6, 8, 10, 3, 4, 6, 5,
4, 5, 6. Do these data indicate a significant
reduction in the number of absences?
Use α = .05.