Outline

  • t-tests to compare differences for two separate samples
  • Run tests in SPSS

Lab 18

Independent-Samples t-tests

Let's start with a brief review. In the last several labs we looked at ways to test samples.

    We used z-scores to compare a treatment sample with a population for which we knew the population mean μ and the population standard deviation σ.

    We used one-sample t-tests to examine the same situation, except that the population σ is unknown, so we need to use an estimate (the sample standard deviation s).

    The last lab used a different computational formula to calculate the observed t for two more situations:

    • repeated measures, in which there is one sample, but each individual is tested twice.
    • matched pairs, in which there are two samples, but they are related on a subject by subject basis.

The logic of today's lab should seem similar to the last few. The overall logic is the same: we still use the t-distribution to find our critical values. However, things get a little more complicated because of the situation that we are interested in. Now we are going to look at the potential difference between two different populations, where each population is represented by a separate (and unrelated) sample. And again, we'll deal with situations in which we don't know the μ or σ for either of these populations, so we'll have to use samples to estimate them.

An experiment that uses a separate independent sample for each treatment condition (or each population) is called an independent-measures research design. Often you'll also see it referred to as a between-subjects or between-groups design.

So we'll use the same logic and steps for hypothesis testing that we used in the previous labs, and fill in the details of the differences as we go.

    Step 1: State your H0 and H1 and choose your criterion: α

    Step 2: Collect the sample

    Step 3: Compute the tobs for your sample

    Step 4: Compare tobs to tcrit (or p to α) and make a decision

Let's start with Step 1:

    Choosing your criterion is exactly the same process as before: you pick the level of alpha (the chance of making a Type I error) that your field accepts. For our example, let's assume α = 0.05.

    The hypotheses are going to be a bit different, because the situation is different. Remember that we are now making hypotheses about two different populations, not just comparing a treatment to a known population.

    For example, suppose that you want to compare two different treatments (e.g., two ways of studying, two different drugs, etc.), or you want to compare two groups of people (e.g., men vs. women, young vs. old, etc.). Now the hypotheses are about population A (e.g., men) and population B (e.g., women), and how they differ from one another.

      Suppose that we are interested in how tall men and women are.


      Is this going to be a one-tailed test or a two-tailed test? In this case, we'll conduct a two-tailed test. We won't make a directional prediction.

      So the H0 hypothesis would be that men and women are the same height. That is,

        H0: μMen = μWomen

        - or -

        H0: μMen - μWomen = 0

      Our alternative hypothesis would be that men and women have different mean heights. That is,

        H1: μMen ≠ μWomen

        - or -

        H1: μMen - μWomen ≠ 0

      Note: What might the hypotheses be for a one-tailed test? Suppose we predict that men are taller than women.

        H0: μMen ≤ μWomen
        H1: μMen > μWomen

Step 2: Criterion for decision: α = .05

Step 3 & 4: Sample & test statistics

We are going to be using two samples, one to represent each population.

Men's heights in inches: 67, 73, 74, 70, 70, 75, 73, 68, 69
Women's heights in inches: 69, 63, 67, 64, 61, 66, 60, 63, 63

Remember that because we're using samples, we can only estimate the values of the population parameters, and so we need to take degrees of freedom into account. We've got two samples. How many values are free to vary?

      sample 1: nMen - 1 and sample 2: nWomen - 1
      so together there are nMen + nWomen - 2 = df

    Number of individuals in each sample: nA = 9 (men) and nB = 9 (women)

    So the df for our example is: nA + nB - 2 = 9 + 9 - 2 = 16

    Now comes what will look to be the big difference. We need to compute our observed t statistic. Basically, at the conceptual level, the formula is the same. However, at the practical level, it is more complex because we have two samples, which means that we have two estimates.

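    $$t_{obs} = \frac{(\bar{X}_A - \bar{X}_B) - (\mu_A - \mu_B)}{s_{\bar{X}_A - \bar{X}_B}}$$

    The numerator is straightforward:

      $\bar{X}_A - \bar{X}_B$ = the difference between the two sample means
      $(\mu_A - \mu_B) = 0$: this is H0, and that's what we're testing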

    The denominator is an estimate of the standard error of the difference between the two sample means. Recall that each sample will have some sampling error associated with it. What we need to do here is pool the error from the two samples, which gives a better estimate of the standard error. Basically, pooling increases the sample size that our estimate is based on, which increases the precision of the estimate.

    Because the two samples may differ in size (n), we need to weight each sample's estimate of variability (its variance) by its degrees of freedom.

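    Pooled variance:

      $$s_p^2 = \frac{df_A s_A^2 + df_B s_B^2}{df_A + df_B}$$

      but because $s^2 = SS/df$, we can simplify this as:

      $$s_p^2 = \frac{SS_A + SS_B}{df_A + df_B}$$

        The formula for our estimated standard error is:

          $$s_{\bar{X}_A - \bar{X}_B} = \sqrt{\frac{s_p^2}{n_A} + \frac{s_p^2}{n_B}}$$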
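To see why the weighting by degrees of freedom matters, here is a minimal Python sketch. The function names and the two small samples are made up for illustration (they are not lab data). It computes the pooled variance both ways, as df-weighted variances and as pooled SS over pooled df, and compares the result with a plain average of the two sample variances.

```python
from statistics import mean, variance

def sum_of_squares(scores):
    """SS: sum of squared deviations from the sample mean."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)

def pooled_variance_weighted(a, b):
    """Weight each sample variance by its degrees of freedom (n - 1)."""
    df_a, df_b = len(a) - 1, len(b) - 1
    return (df_a * variance(a) + df_b * variance(b)) / (df_a + df_b)

def pooled_variance_ss(a, b):
    """Simplified form: (SS_A + SS_B) / (df_A + df_B)."""
    return (sum_of_squares(a) + sum_of_squares(b)) / (len(a) + len(b) - 2)

# Hypothetical samples of unequal size (illustration only).
sample_a = [2, 4, 6]
sample_b = [1, 3, 5, 7, 9, 11, 13]

weighted = pooled_variance_weighted(sample_a, sample_b)
simplified = pooled_variance_ss(sample_a, sample_b)
simple_average = (variance(sample_a) + variance(sample_b)) / 2

print(weighted, simplified, simple_average)
```

With these made-up samples, both pooled forms agree, while the plain average of the two variances comes out smaller because it ignores that the larger sample carries more information.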

        So let's fill in the numbers from our example. To calculate by hand, we need to go back to the raw numbers and compute the SS's and the sample means. Here are the results of those computations:

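          $\bar{X}_A = 71.0$, $\bar{X}_B = 64.0$
          $SS_A = 64.0$, $SS_B = 66.0$
          $s_A = 2.83$, $s_B = 2.87$

            $$s_p^2 = \frac{64.0 + 66.0}{8 + 8} = \frac{130}{16} = 8.125$$

            $$s_{\bar{X}_A - \bar{X}_B} = \sqrt{\frac{8.125}{9} + \frac{8.125}{9}} = \sqrt{1.806} = 1.34$$

            $$t_{obs} = \frac{(71.0 - 64.0) - 0}{1.34} = \frac{7.0}{1.34} = 5.22$$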
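The hand calculation can be checked with a short Python sketch that starts from the raw heights. The function names are our own, and the code assumes H0: μA - μB = 0, as in the example. It uses only the standard library.

```python
from math import sqrt
from statistics import mean

men = [67, 73, 74, 70, 70, 75, 73, 68, 69]
women = [69, 63, 67, 64, 61, 66, 60, 63, 63]

def sum_of_squares(scores):
    """SS: sum of squared deviations from the sample mean."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)

def independent_t(a, b):
    """Pooled-variance t for two independent samples, with H0: mu_A - mu_B = 0."""
    df = len(a) + len(b) - 2
    pooled_var = (sum_of_squares(a) + sum_of_squares(b)) / df
    se = sqrt(pooled_var / len(a) + pooled_var / len(b))
    t_obs = ((mean(a) - mean(b)) - 0) / se
    return t_obs, df, pooled_var, se

t_obs, df, pooled_var, se = independent_t(men, women)
print(f"df = {df}, pooled variance = {pooled_var:.3f}, SE = {se:.2f}, t = {t_obs:.2f}")
```

Carried at full precision, the observed t comes out near 5.21. The 5.22 in the hand calculation comes from dividing by the standard error after rounding it to 1.34. The conclusion is the same either way.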

 

Step 5: Compare observed to critical value and make a decision about the null hypothesis

    What is our critical t?

    Go to the table and look up the value for: 2-tailed, α = 0.05, df = 16.

      tcrit = 2.12

So now we compare the two t statistics.

        tobs = 5.22
        tcrit = 2.12

        Our observed (computed) t statistic is much greater than the critical t statistic; in fact, it is greater than the largest critical t reported in the t-table for df = 16 (α = .01, 2-tailed). Look up that value in the t-table. So we feel confident in rejecting the H0. There does seem to be a difference between the mean heights of men and women.
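The two-tailed decision rule used here can be written as a tiny Python sketch. The function name is hypothetical, and the critical value is simply the table value quoted in this lab (it is typed in, not computed).

```python
def reject_h0_two_tailed(t_obs, t_crit):
    """Reject H0 when the observed t lies beyond the critical value in either tail."""
    return abs(t_obs) >= t_crit

# Table value from this lab: 2-tailed, alpha = .05, df = 16.
T_CRIT = 2.12

print(reject_h0_two_tailed(5.22, T_CRIT))   # the heights example
print(reject_h0_two_tailed(-5.22, T_CRIT))  # same size of effect, opposite sign
print(reject_h0_two_tailed(1.50, T_CRIT))   # hypothetical smaller t
```

Taking the absolute value is what makes the test two-tailed: a large difference in either direction leads to rejecting the H0.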

        (1) A psychologist is interested in studying the effects of fatigue on mental alertness. She decides to study this question using an independent samples design. She randomly assigns individuals to two groups.

          Group 1 stays awake for 24 hours

          Group 2 gets to go to sleep

        After this period, each subject is tested to see how well they detect a light on a screen. The dependent variable is the subject's number of mistakes, which reflects their mental alertness: the higher the number, the less alert they are.

          Here are the results from the two groups:

            n1 = 5

            n2 = 10

            x̄1 = 35

            x̄2 = 24
            SS1 = 120
            SS2 = 270

        Using an independent samples t-test, answer the question of whether fatigue adversely affects mental alertness (α = 0.05).


Using SPSS to compute independent-samples t-tests

The data entry for this test is different from that for related samples. There we had two measures taken on the same or related persons, so we had two columns of measures. Here we have only one dependent measure ("readingbefore" in the screenshots below) taken on two independent groups (two clubs, "Latin" and "Sports"), so there is only one column of measures. We also need a second column that identifies which group each person belongs to. So we code group membership with numbers; for example, we could code club as 1 = Latin and 2 = Sports.
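As a rough illustration of this layout, the Python sketch below turns two separate lists of scores into one row per person, with a group code and a single dependent measure. The scores and the column name "club" are made up for illustration; only "readingbefore" and the 1 = Latin, 2 = Sports coding come from the lab.

```python
# Hypothetical scores, for illustration only (not the lab's data file).
latin_scores = [12, 15, 11]
sports_scores = [9, 14, 10]

# Numeric codes for group membership, as described above.
GROUP_CODES = {"Latin": 1, "Sports": 2}

# One row per person: (group code, score on the dependent measure).
rows = [(GROUP_CODES["Latin"], score) for score in latin_scores]
rows += [(GROUP_CODES["Sports"], score) for score in sports_scores]

print("club\treadingbefore")
for club, score in rows:
    print(f"{club}\t{score}")
```

Each printed row corresponds to one row in the SPSS data window: the first column says which group the person is in, and the second holds that person's score.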

    The usual practice is to indicate the independent variable (group) in the first column and the dependent variable in the second column.
    Go to Analyze, then Compare Means. The option that we're interested in today is Independent-Samples T Test.

     

    In the next window, select the variables that you are testing and then enter your independent variable into Grouping Variable.

    Click the button to Define Groups and enter the values that you used to define the groups (e.g., 1 for group 1 and 2 for group 2). Then click Continue, and when you return to the previous window, click OK.

     

    Here is what the output will look like.


     

    The output includes statistics for each group, tobs, degrees of freedom, p-value (Sig. 2-tailed), the mean difference (sample mean1 - sample mean2), and SE of differences.

    Note that SPSS doesn't tell you to reject or fail to reject the H0, nor does it give you the tcrit. To make your decision about the H0 you must compare the 2-tailed p-value with your α-level. If the p-value (or 1/2 of it for a 1-tailed test) is equal to or smaller than your α-level, then you should reject the H0; otherwise you should fail to reject the H0.
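This decision rule can be sketched in Python as follows. The function and parameter names are our own. The sketch adds one assumption that the rule above leaves implicit: halving the 2-tailed p-value for a 1-tailed test only makes sense when the sample difference is in the predicted direction.

```python
def reject_h0(p_two_tailed, alpha=0.05, tails=2, effect_in_predicted_direction=True):
    """Decide about H0 from the 2-tailed p-value that SPSS reports (Sig. 2-tailed)."""
    if tails == 2:
        return p_two_tailed <= alpha
    if tails == 1:
        # Assumption: a difference in the wrong direction can never support H1.
        if not effect_in_predicted_direction:
            return False
        return p_two_tailed / 2 <= alpha
    raise ValueError("tails must be 1 or 2")

# Hypothetical p-values, for illustration only.
print(reject_h0(0.03))            # 2-tailed
print(reject_h0(0.08))            # 2-tailed
print(reject_h0(0.08, tails=1))   # 1-tailed, difference in the predicted direction
```

Note how the same p-value of 0.08 leads to different decisions for a 2-tailed and a 1-tailed test, which is why the direction of the test has to be chosen before looking at the data.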

    Notice that there are two rows of numbers in the t-test output. These correspond to the two t-tests we ran in Excel. Equal variances are assumed in row 1 and not assumed in row 2. The output also presents Levene's Test for Equality (or homogeneity) of Variances. If Levene's test is not significant (Sig. > 0.05), then we can assume equal variances and use the values in row 1. If Levene's test is significant (Sig. < 0.05), we must use the values in row 2, which adjusts the test to estimate the t-value and its degrees of freedom more accurately in such a case.
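For a feel of what the row 2 adjustment does, here is a hedged Python sketch. It assumes that the "equal variances not assumed" row is the Welch version of the t-test, which is how that row is usually described; the function names are hypothetical. The standard error uses each sample's own variance instead of a pooled variance, and the degrees of freedom are adjusted downward when the variances or sample sizes differ.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Unpooled (Welch) t and adjusted df for two independent samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def row_to_read(levene_sig, alpha=0.05):
    """Pick the output row based on the Sig. value of Levene's test."""
    if levene_sig < alpha:
        return "equal variances not assumed"
    return "equal variances assumed"

# Heights from the worked example earlier in this lab.
men = [67, 73, 74, 70, 70, 75, 73, 68, 69]
women = [69, 63, 67, 64, 61, 66, 60, 63, 63]

t, df = welch_t(men, women)
print(f"Welch t = {t:.2f}, adjusted df = {df:.2f}")
print(row_to_read(0.20))  # hypothetical Levene Sig. value
```

For the heights data the two samples have the same size and nearly equal variances, so the adjusted values come out almost identical to the pooled test with df = 16. The two rows diverge more when one group is both larger and less variable than the other.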

  (2) A psychology instructor at a large university teaches statistics. Because there are 22 students in the class, he has broken them into 2 groups. Each group has a different graduate assistant (GA) who is responsible for running separate breakout lecture and lab sections of the course. One GA has lots of experience teaching, while the other has more limited experience. The instructor wants to check for comparable learning across the two GAs, hoping to find no difference. The data below are the scores (out of 100) of the students on the first midterm. Assume an α = 0.05 level. The data are as follows (notice that one group has more students than the other):

  Group 1 (less experienced GA)    Group 2 (more experienced GA)
  Student    Score                 Student    Score
  1          60                    11         70
  2          65                    12         85
  3          69                    13         72
  4          58                    14         83
  5          57                    15         81
  6          59                    16         69
  7          52                    17         65
  8          72                    18         75
  9          70                    19         79
  10         65                    20         71
                                   21         89
                                   22         80

Enter the data into SPSS. Test your H0 using an independent-samples t-test. Create a bar graph to show the results.


Lab Exercises 3-5 show how different research designs affect the analysis of the following set of data. Both tests look for any difference between treatments with α = .05. (These data can represent either 10 different participants or 5 participants tested in each condition.)

Treatment 1    Treatment 2
10             11
2              5
1              2
15             18
7              9

(3) Assume that the data are from an independent-samples experiment using two separate samples, each with 5 subjects. Use SPSS to test whether the data indicate a significant difference between the two treatments (assume α = .05). List each step of the hypothesis testing procedure.

(4) Now assume that the data are from a repeated-measures design using one sample of 5 subjects, each of whom has been tested twice. Use SPSS to test whether the data indicate a significant difference between the two treatments (again assume α = .05). Remember that you'll have to change the way the data are entered in the data window.

(5) You should find that the repeated-measures design and the independent-samples design reach different conclusions. How do you explain the difference? (Hint: think about how sampling error is estimated for the two tests.)