Psychology 340 Syllabus
Statistics for the Social Sciences

Illinois State University
J. Cooper Cutting
Fall 2002



ANOVA: Analysis of Variance

  • Why Analysis of Variance?
  • Hypothesis testing with ANOVA
  • The F-ratio
  • Computing 1-way between factors ANOVA
  • Reporting and interpreting ANOVA results
  • Using SPSS to do 1-way between factors ANOVA
  • Planned and post-hoc comparisons
  • Using SPSS to do planned and post-hoc comparisons


    The statistical tests that we have covered so far are limited in that they can only be used with designs that have one (one sample t and z tests) or two groups (related samples and independent samples t-tests). Often, however, researchers are interested in hypotheses that require more than the comparison of two groups.

    t-tests and z-tests are useful, but they are limited: they can't be used to compare more than 2 groups. What we need instead is a new inferential statistical procedure: Analysis of Variance (ANOVA).

    Why do we have to use variance?

    We'll start by talking about the overall logic of ANOVA and introduce some new notation. Then we'll work through the process and some examples. We'll end the chapter with a discussion of post-hoc tests (don't worry about what these are yet).


    Hypothesis testing with ANOVA

    We'll be using the same hypothesis testing logic that we used in previous chapters, but the details will change (as they did from chapter to chapter).

    Okay, let's start by considering a new example that is a single-factor, independent-measures research design (more complex designs will follow in future weeks).

    Example research project


    Some new notation

    The experimental situations that we're dealing with are more complex than those we've dealt with so far. We aren't going to need to use any new kinds of math (still just adding, subtracting, multiplying, and dividing), but we do need some new notation. The computations that we'll go through are pretty much the same things that we've done in the past (e.g. computing sums of squares, means, etc.), but there are more of them.

    $K$ = # of treatment conditions (or groups), each of which is called a level of the factor.
    $n_i$ = # of observations in the $i$th group (if they are not equal)
    $N = \sum_i n_i$ = total sample size
    $T_i = \sum_j X_{ij}$ (where $X_{ij}$ is the $j$th observation in the $i$th group)
    $G = \sum_i \sum_j X_{ij}$ = the sum of all the X's = grand sum
    $\bar{X} = G / N$ = grand mean
    $SS_i$ = the sum of squares for each group = $\sum_j (X_{ij} - \bar{X}_i)^2$, where $\bar{X}_i$ is the mean of the $i$th group

    Basically, the new notation takes into account the need to know the means and sums of squares for each group alone and for all of the data together (collapsed across the different groups).

    Example:

    So let's consider some data for our proposed experiment.

    Study method

    Method A          Method B          Method C
    (book alone)      (taking notes)    (borrowing notes)
    0                 4                 1
    1                 3                 2
    3                 6                 2
    1                 3                 0
    0                 4                 0
    $T_1 = 5$         $T_2 = 20$        $T_3 = 5$
    $SS_1 = 6$        $SS_2 = 6$        $SS_3 = 4$
    $n_1 = 5$         $n_2 = 5$         $n_3 = 5$
    $\bar{X}_1 = 1$   $\bar{X}_2 = 4$   $\bar{X}_3 = 1$

    $\sum X^2 = 106$
    $G = 30$ = grand sum
    $N = 15$ = total sample size
    $\bar{X} = 30/15 = 2$ = grand mean
    $K = 3$ = # of treatment conditions (or groups), each of which is called a level of the factor
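
    The summary values for the study-method scores can be checked with a few lines of code. This is only a minimal sketch in Python (the notes themselves use hand calculation and SPSS); the names `groups`, `summarize`, and `summaries` are invented here for illustration.

```python
# Scores from the study-method example, one list per treatment condition.
groups = {
    "A (book alone)": [0, 1, 3, 1, 0],
    "B (taking notes)": [4, 3, 6, 3, 4],
    "C (borrowing notes)": [1, 2, 2, 0, 0],
}


def summarize(scores):
    """Return n, T (sum), mean, and SS (sum of squared deviations) for one group."""
    n = len(scores)
    total = sum(scores)
    mean = total / n
    ss = sum((x - mean) ** 2 for x in scores)
    return {"n": n, "T": total, "mean": mean, "SS": ss}


summaries = {name: summarize(scores) for name, scores in groups.items()}

K = len(groups)                                  # number of treatment conditions
N = sum(s["n"] for s in summaries.values())      # total sample size
G = sum(s["T"] for s in summaries.values())      # grand sum
grand_mean = G / N
sum_x_squared = sum(x ** 2 for scores in groups.values() for x in scores)

for name, s in summaries.items():
    print(name, s)
print("K =", K, "N =", N, "G =", G, "grand mean =", grand_mean, "sum of X^2 =", sum_x_squared)
```

    The printed values should agree with the group totals, sums of squares, and means listed for the three study methods.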


    Okay, now that we have all of the new pieces of information that we need, let's start doing the Analysis of Variance.

    Recall that what we're after is:

    $F\text{-ratio} = \dfrac{\text{variance between treatments}}{\text{variance within treatments}}$

    So what we need to do is figure out how to get these two variances.

    In the past we've used $s^2 = SS/df$. That's essentially what we're going to be doing here, but things will look a bit more complex.

    But it isn't $SS_{total}$ that we really need. Remember that what we really need are the within-groups and between-groups variabilities, so the total variability has to be split into those two parts.
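
    As a worked illustration, here is how the split comes out for the study-method data ($\sum X^2 = 106$, $G = 30$, $N = 15$, group sums of squares 6, 6, and 4, group means 1, 4, and 1). The computational formula $\sum X^2 - G^2/N$ for the total sum of squares is the standard shortcut; treat the layout below as a sketch of the arithmetic rather than the only way to organize it.

```latex
\begin{align*}
SS_{total}   &= \sum X^2 - \frac{G^2}{N} = 106 - \frac{30^2}{15} = 106 - 60 = 46 \\
SS_{within}  &= \sum_i SS_i = 6 + 6 + 4 = 16 \\
SS_{between} &= SS_{total} - SS_{within} = 46 - 16 = 30 \\
             &= \sum_i n_i (\bar{X}_i - \bar{X})^2 = 5(1-2)^2 + 5(4-2)^2 + 5(1-2)^2 = 30 \\[4pt]
s^2_{between} &= \frac{SS_{between}}{K - 1} = \frac{30}{2} = 15 \\
s^2_{within}  &= \frac{SS_{within}}{N - K} = \frac{16}{12} \approx 1.33 \\
F &= \frac{15}{1.33} = 11.25
\end{align*}
```

    Both routes to the between-groups sum of squares give 30, which is a useful check on the arithmetic.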


    Using the ANOVA table (the F-table)

    So what's the next step?


    Using SPSS to do single factor ANOVA

    Recall that to set up a paired samples t-test you needed two columns of data, one for each sample (related samples) or one for each measurement (repeated measures). The setup for a one-way ANOVA is different.

    Note: To do One-way ANOVA you'll need to have two variables (columns) in your data file (this is just like with 2 independent samples t-test except now your independent variable will have more than two categories). One column will contain the data (your dependent measure). The other column will be an independent variable that specifies which group the subject belongs to (e.g., 1 for group 1, 2 for group 2).
    Go to the Analyze menu and select the submenu Compare Means. In this submenu you'll see several tests. The one that we're interested in today is One-way ANOVA.
    After selecting One-way ANOVA you'll get a window that looks like this. Here you should select the variables that you are testing. Your test variable is your dependent variable. Your group variable is the independent variable that assigns each subject to a group.
    Here is what the output will look like.
    Notice that the output is given in the standard ANOVA table format. SPSS doesn't tell you to reject or fail to reject $H_0$, nor does it give you $F_{crit}$. To make your decision about $H_0$ you must compare the p-value with your $\alpha$-level. If the p-value is equal to or smaller than your $\alpha$-level, then you should reject $H_0$; otherwise you should fail to reject $H_0$.
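
    The decision rule can be sketched in code. This is an illustration in Python, not SPSS output. It assumes the F-ratio hand-computed from the study-method scores ($SS_{between} = 30$, $SS_{within} = 16$, with 2 and 12 degrees of freedom) and an $\alpha$-level of .05. The p-value function uses a closed form that holds only when $df_{between}$ is exactly 2 (that is, with three groups); for other designs you would read the p-value from SPSS or a statistics library. The function names are invented for this example.

```python
def p_value_df_between_2(f_ratio, df_within):
    """Upper-tail p-value of an F-ratio when df_between is exactly 2.

    With 2 numerator degrees of freedom the F distribution has the closed form
    P(F > f) = (1 + 2 * f / df_within) ** (-df_within / 2).
    This shortcut does NOT apply to other values of df_between.
    """
    return (1 + 2 * f_ratio / df_within) ** (-df_within / 2)


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p is equal to or smaller than alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Values hand-computed from the study-method example (3 groups of 5 scores).
ms_between = 30 / 2    # SS_between / df_between
ms_within = 16 / 12    # SS_within / df_within
f_ratio = ms_between / ms_within
df_within = 12

p = p_value_df_between_2(f_ratio, df_within)
print(f"F(2, {df_within}) = {f_ratio:.2f}, p = {p:.4f} -> {decide(p)}")
```

    With these numbers the p-value is well below .05, so the rule says to reject $H_0$.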



    One factor independent samples ANOVA

    
    $F\text{-ratio} = \dfrac{\text{variance (differences) between sample means}}{\text{variance (differences) expected by chance (error)}}$

    $F\text{-ratio} = \dfrac{\text{treatment effect} + \text{individual differences} + \text{random error}}{\text{individual differences} + \text{random error}}$
    
    

    Relevant formulas

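    $K$ = # of treatment conditions (or groups), each of which is called a level of the factor.

    $n$ = # of observations in each group (if they are equal)

    $n_i$ = # of observations in the $i$th group (if they are not equal)

    $N = \sum_i n_i$ = total sample size

    $T_i = \sum_j X_{ij}$ = sum of the scores in the $i$th group

    $G = \sum_i T_i = \sum_i \sum_j X_{ij}$ = grand sum

    $SS_i = \sum_j (X_{ij} - \bar{X}_i)^2$ = sum of squares for the $i$th group

    $\bar{X} = G / N$ = grand mean

    $SS_{total} = \sum_i \sum_j (X_{ij} - \bar{X})^2 = \sum X^2 - \dfrac{G^2}{N}$

    $SS_{within} = \sum SS_{\text{inside each treatment}} = \sum_i SS_i$

    $SS_{total} = SS_{within} + SS_{between}$

    $df_{total} = df_{within} + df_{between}$

    $SS_{between} = SS_{total} - SS_{within} = \sum_i n_i (\bar{X}_i - \bar{X})^2 = \sum_i \dfrac{T_i^2}{n_i} - \dfrac{G^2}{N}$

    $df_{total} = N - 1$

    $df_{within} = \sum_i (n_i - 1) = N - K$

    $df_{between} = K - 1$

    $MS_{between} = \dfrac{SS_{between}}{df_{between}}$

    $MS_{within} = MS_{error} = \dfrac{SS_{within}}{df_{within}}$

    $F\text{-ratio} = \dfrac{MS_{between}}{MS_{within}}$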
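
    The chain from sums of squares to the F-ratio can be put together in one short function. The following is a minimal sketch in Python, assuming the scores arrive as one list per group. The function name `one_way_anova` and the dictionary keys are invented for illustration, and the between-groups sum of squares uses the computational form (group totals squared over group sizes, minus $G^2/N$).

```python
def one_way_anova(groups):
    """Build a one-way between-groups ANOVA table from a list of score lists."""
    K = len(groups)                           # number of treatment conditions
    n = [len(g) for g in groups]              # observations per group
    N = sum(n)                                # total sample size
    T = [sum(g) for g in groups]              # group totals
    G = sum(T)                                # grand sum
    sum_x2 = sum(x ** 2 for g in groups for x in g)

    # Sums of squares
    ss_total = sum_x2 - G ** 2 / N
    ss_between = sum(t ** 2 / n_i for t, n_i in zip(T, n)) - G ** 2 / N
    ss_within = sum(
        sum((x - t / n_i) ** 2 for x in g) for g, t, n_i in zip(groups, T, n)
    )

    # Degrees of freedom
    df_between = K - 1
    df_within = N - K

    # Mean squares and F-ratio
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "SS_between": ss_between, "SS_within": ss_within, "SS_total": ss_total,
        "df_between": df_between, "df_within": df_within, "df_total": N - 1,
        "MS_between": ms_between, "MS_within": ms_within,
        "F": ms_between / ms_within,
    }


# Study-method scores: book alone, taking notes, borrowing notes.
study_groups = [[0, 1, 3, 1, 0], [4, 3, 6, 3, 4], [1, 2, 2, 0, 0]]
table = one_way_anova(study_groups)
for key, value in table.items():
    print(f"{key:>10}: {value:.2f}")
```

    Computing the within-groups and between-groups sums of squares separately, rather than by subtraction, makes it easy to confirm that they add up to the total.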



    Single variable – one Factor

    ·      Two levels (t-test)

           o      Basically you want to compare two groups

           o      The statistics are pretty easy: a t-test

           Disadvantages:

           o      The “true” shape of the function is hard to see

           o      Interpolation and extrapolation are not a good idea

           o      More complex theories typically need more complex designs (more than two levels of one IV)

    ·      More than two levels (ANOVA)

           o      Gives a better picture of the relationship (function)

           o      Requires more complex statistical analysis (analysis of variance and pairwise comparisons)

           o      Needs more resources (participants and/or stimuli)



    Let's finish up with the situation that we started with in the first two ANOVA lectures.


    An example

    A drug company is developing several new painkillers. It wants to test the effectiveness of the drugs compared to a placebo. Each of 4 groups of participants receives one of 4 treatments (Drug A, Drug B, Drug C, or a placebo), and their pain tolerance is then measured. Consider the following set of data. Use SPSS to perform the One-way ANOVA.

    Drug type

    Placebo     Drug A      Drug B      Drug C
    0           0           3           8
    0           1           4           5
    3           2           5           5



    If you have any questions, please feel free to contact me at jccutti@mail.ilstu.edu.