Your textbook: Consider the following scenario, with two factors:

Reading session duration | 5 mins | 15 mins | 30 mins
Age | 3 yrs | 8 yrs | 14 yrs
T-tests and z-scores are great, but they are limited: they can't be used for more than 2 groups. Instead, what we need is a new inferential statistical procedure: Analysis of Variance (ANOVA).
The individual treatment conditions that make up a factor are called levels of the factor.
So the study described above is a factorial design, with two between groups factors, and each factor has 3 levels (sometimes described as a 3 by 3 between groups design).
We'll start by talking about the overall logic of ANOVA, and discuss some new notation. Then we'll work through the process and some examples. We'll end the chapter with a discussion of post-hoc tests (don't worry about what these are, yet).
We'll be using the same hypothesis testing logic that we used in chapters 8-11, but the details will change (as they did from chapter to chapter).
Okay, let's consider a new example that is a single-factor, independent-measures research design.
H1: one of the groups is different from one or more of the other groups, so there are really lots of possible alternative hypotheses.
Often, people will just give the null hypothesis, because there are just too many alternatives (imagine how many we could have for our original 3x3 design).
step 3: figure out the df for your test. We'll save this for a little later, when we start using a concrete example. One thing to note is that we're going to have several dfs to consider (or worry about, if that's the way you feel).
step 4: find the critical F-statistic from the table (new table starting on pg 695)
df in the numerator
df in denominator | 1 | 2 | 3 | 4 | 5
1 | 161 / 4052 | 200 / 4999 | 216 / 5403 | 225 / 5625 | 230 / 5764
2 | 18.51 / 98.49 | 19.00 / 99.00 | 19.16 / 99.17 | 19.25 / 99.25 | 19.30 / 99.30
3 | 10.13 / 34.12 | 9.55 / 30.92 | 9.28 / 29.46 | 9.12 / 28.71 | 9.01 / 28.24
: | : | : | : | : | :
Recall that we'll have two dfs: we use one to find the correct row and the other to find the correct column. You'll also note that there are two numbers per cell. The first (lighter) number corresponds to α = 0.05, the second (bold) number corresponds to α = 0.01.
In many ways the ANOVA is very similar to a two independent samples t-test (in fact, later we'll talk about how they are nearly identical under certain circumstances).
For the independent samples t-test:

t_obs = (obtained difference between sample means) / (difference expected by chance)

For ANOVA the test statistic (called the F-ratio) is similar:

F = (variance (differences) between sample means) / (variance (differences) expected by chance (error))
Why do we have to use variance?
How would we compute a single score that would describe the difference between these distributions? Difference just doesn't cut it, but variance does (recall it is a measure of how much these three distributions differ from one another).
Notice that this is how ANOVA gets its name. Analysis of Variance.
The variance within treatments reflects individual differences and random error, BUT NOT treatment/group effects - this is the key.
So the F-ratio can be expressed as:
F-ratio = (variance (differences) between sample means) / (variance (differences) expected by chance (error))

F-ratio = (variance between treatments) / (variance within treatments)

F-ratio = (treatment effect + individual differences + random error) / (individual differences + random error)
If H0 is true, then what should the value of the treatment effect be?

H0: μ1 = μ2 = μ3

So there are no differences, so the treatment-effect variance should be 0.

If that variance = 0, then what is the value of the F-ratio?

F-ratio = (0 + individual differences + random error) / (individual differences + random error) = 1.0

If H0 is false, then the F-ratio will be greater than 1.
K = # of treatment conditions (or groups), each of which is called a level of the factor
n = # of observations in each group (if they are equal)
ni = # of observations in the ith group (if they are not equal)
N = Σni = total sample size
Ti = ΣXij, summed over the ith group (where Xij is the jth observation in the ith group)
G = ΣXij, summed over all groups = the sum of all the X's = grand sum
G-bar = G/N = grand mean
SSi = the sum of squares for each group = Σ(Xij - X-bari)^2
So let's consider some data for our proposed experiment.
Study method:

Method A (book alone) | Method B (taking notes) | Method C (borrowing notes)
0 | 4 | 1
1 | 3 | 2
3 | 6 | 2
1 | 3 | 0
0 | 4 | 0
T1 = 5 | T2 = 20 | T3 = 5
SS1 = 6 | SS2 = 6 | SS3 = 4
n1 = 5 | n2 = 5 | n3 = 5
X-bar1 = 1 | X-bar2 = 4 | X-bar3 = 1
ΣX^2 = 106
G = 30 = grand sum
N = 15 = total sample size
G-bar = 30/15 = 2 = grand mean
K = 3 = # of treatment conditions (or groups), each of which is called a
level of the factor
recall that what we're after is:
F-ratio = (variance between treatments) / (variance within treatments)

So what we need to do is figure out how to get these two variances.
In the past we've used s^2 = SS/df. That's essentially what we're going to be doing here, but things will look a bit more complex.
SStotal = ΣX^2 - (G^2/N)
SStotal = 106 - (30^2/15) = 106 - 60 = 46
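As a quick check, these totals can be reproduced in a few lines of Python (a sketch; the scores are the ones from the study-method table above):

```python
# Scores for the three study-method groups (from the table above).
data = {
    "A (book alone)":      [0, 1, 3, 1, 0],
    "B (taking notes)":    [4, 3, 6, 3, 4],
    "C (borrowing notes)": [1, 2, 2, 0, 0],
}

scores = [x for group in data.values() for x in group]
sum_x2 = sum(x * x for x in scores)  # sum of squared scores = 106
G = sum(scores)                      # grand sum = 30
N = len(scores)                      # total sample size = 15
ss_total = sum_x2 - G**2 / N         # 106 - 60 = 46

print(sum_x2, G, N, ss_total)
```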
SStotal = SSwithin + SSbetween
we add together all of the SS estimates for each group
SSwithin = Σ(SS inside each treatment) = ΣSSi
= 6 + 6 + 4 = 16
But there are two drawbacks of doing it this way:
definitional:
SSbetween = Σ[ni(X-bari - G-bar)^2]
= 5(1 - 2)^2 + 5(4 - 2)^2 + 5(1 - 2)^2
= 5 + 20 + 5 = 30

computational:
SSbetween = Σ(Ti^2/ni) - G^2/N
= 5^2/5 + 20^2/5 + 5^2/5 - 30^2/15
= 5 + 80 + 5 - 60 = 30
Let's check our math.
SStotal = SSwithin + SSbetween = 16 + 30 = 46 (and that's the number we got before)
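The partition of SStotal into within and between pieces can be verified directly (a sketch in Python; the group scores are from the table above):

```python
groups = [[0, 1, 3, 1, 0], [4, 3, 6, 3, 4], [1, 2, 2, 0, 0]]  # methods A, B, C

def group_ss(xs):
    """Sum of squared deviations from the group mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

ss_within = sum(group_ss(g) for g in groups)  # 6 + 6 + 4 = 16
G = sum(sum(g) for g in groups)               # grand sum = 30
N = sum(len(g) for g in groups)               # total sample size = 15
ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - G**2 / N  # 30

print(ss_within + ss_between)  # prints 46.0, matching SStotal
```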
Okay, let's return to what we're interested in.
s^2 = SS/df.
We've got our SS, now we need to figure out our df. There are going to be two (or three depending on how you look at it) degrees of freedom, one for the between group variance and one for the within groups variance (and a third for total df).
dfwithin = N - K
dfbetween = K - 1
dftotal = dfwithin + dfbetween = N - 1

For our example:
dfwithin = 15 - 3 = 12
dfbetween = 3 - 1 = 2
dftotal = 15 - 1 = 14, which is also = 12 + 2
MSbetween = SSbetween/dfbetween
for our example = 30/2 = 15
MSwithin = MSerror = Mean Square Error = SSwithin/dfwithin
--> for our example = 16/12 = 1.33
Almost done. Let's look back at what we're after:
F-ratio = (variance between treatments) / (variance within treatments) = MSbetween / MSwithin
So the F-ratio for our example is: 15/1.33 = 11.28
Source | SS | df | MS
Between treatments | 30 | 2 | 15.0 (F = 11.28)
Within treatments | 16 | 12 | 1.33
Total | 46 | 14
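The whole computation, from raw scores to the F-ratio, can be sketched as one small Python function. Note that the text's F = 11.28 comes from rounding MSwithin to 1.33 before dividing; carrying full precision gives 11.25:

```python
def one_way_anova(groups):
    """One-factor, between-groups ANOVA from raw scores; returns the F-ratio."""
    N = sum(len(g) for g in groups)
    K = len(groups)
    G = sum(sum(g) for g in groups)
    sum_x2 = sum(x * x for g in groups for x in g)
    ss_total = sum_x2 - G**2 / N
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - G**2 / N
    ss_within = ss_total - ss_between
    df_between, df_within = K - 1, N - K
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within

F = one_way_anova([[0, 1, 3, 1, 0], [4, 3, 6, 3, 4], [1, 2, 2, 0, 0]])
print(round(F, 2))  # prints 11.25 (the text's 11.28 reflects rounding MSwithin to 1.33)
```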
So what's the next step?
dfwithin = 12
dfbetween = 2
df in the numerator
df in denominator | 1 | 2 | 3 | 4 | 5
1 | 161 / 4052 | 200 / 4999 | 216 / 5403 | 225 / 5625 | 230 / 5764
2 | 18.51 / 98.49 | 19.00 / 99.00 | 19.16 / 99.17 | 19.25 / 99.25 | 19.30 / 99.30
3 | 10.13 / 34.12 | 9.55 / 30.92 | 9.28 / 29.46 | 9.12 / 28.71 | 9.01 / 28.24
: | : | : | : | : | :
12 | 4.75 / 9.33 | 3.88 / 6.93 | 3.49 / 5.95 | 3.26 / 5.41 | 3.11 / 5.06
13 | 4.67 / 9.07 | 3.80 / 6.70 | 3.41 / 5.74 | 3.18 / 5.20 | 3.02 / 4.86
: | : | : | : | : | :
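Once the critical values are read off the table for df = (2, 12), the decision step is just a comparison. A sketch (the two critical values are taken from the table above):

```python
# Critical F values for df = (2, 12), taken from the F-table above.
f_crit = {0.05: 3.88, 0.01: 6.93}

F_obs = 11.28  # from our ANOVA source table

for alpha, crit in f_crit.items():
    decision = "reject H0" if F_obs > crit else "fail to reject H0"
    print(f"alpha = {alpha}: F_obs = {F_obs} vs F_crit = {crit} -> {decision}")
# F_obs exceeds both critical values, so the effect is significant even at alpha = .01
```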
How would one report this (this is what you'll want to know for the Holcomb exercise)?
"A one-way ANOVA yielded a significant effect of study method, F(2,12) = 11.28, p < 0.01."
Note: in this day and age, computers actually have the family of F distributions built in, so your stats output may give you an actual p-value. Remember that the logic of the test is such that you must specify your α level ahead of time. If you select 0.01, then that's the level you are using for all of your tests. So if you do two experiments and your computer stats program tells you that in Experiment 1 your p-value = .001, and in Experiment 2 your p-value = .01, they are both equally statistically significant. The H0 decision is a yes/no decision. In this example the answer to both is YES. The results in Experiment 1 are NOT "more significant" than those in Experiment 2.
The possible alternative hypotheses include:

μ1 ≠ μ2 = μ3
μ1 = μ3 ≠ μ2
μ1 = μ2 ≠ μ3
μ1 ≠ μ2 ≠ μ3
So typically, after getting a significant result from your ANOVA (rejecting the H0), one would then perform some post hoc tests. Post hoc tests allow us to compare the groups to one another, to see which are different from which.

Basically, what the post hoc tests allow you to do is go back and compare each treatment group to each other treatment group, two at a time. This is called making pairwise comparisons.

So in our above example, we could go back and compare μ1 to μ2, μ1 to μ3, and μ2 to μ3.
Anybody see a potential problem with doing this?

Each of these comparisons is a separate hypothesis test, and each one has a risk of making a Type I error. So the more comparisons that you make, the higher the risk of concluding that there is a difference when there really isn't one. This accumulated risk is called the experimentwise alpha level (or familywise error).
αEW = 1 - (1 - α)^c, where c = # of comparisons

So for our example, if we chose α = 0.05 and make 3 comparisons:

αEW = 1 - (1 - α)^c = 1 - (.95)^3 = 1 - .857 = .143

Our chance of making a Type I error somewhere among the comparisons is now about 1 in 7 rather than 1 in 20.
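The experimentwise alpha formula is easy to check numerically (a minimal sketch):

```python
def alpha_experimentwise(alpha, c):
    """Probability of at least one Type I error across c comparisons,
    each run at level alpha (assuming independent comparisons)."""
    return 1 - (1 - alpha) ** c

print(round(alpha_experimentwise(0.05, 3), 3))   # prints 0.143, about 1 in 7
print(round(alpha_experimentwise(0.05, 10), 3))  # the risk grows quickly with more comparisons
```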
Most post hoc tests have been designed to control the experimentwise error. We'll talk about two such tests: Tukey's HSD test (honestly significant difference) and the Scheffé test.

Tukey's HSD test allows us to compute a single value that determines the minimum difference between treatment means that we must have to consider the difference statistically significant. This test requires that the groups have equal sample sizes.
HSD = q * sqrt(MSwithin / n)

The value for q is found in Table B.5 (in the Appendix, p. A-32). To figure out q you must know K, dfwithin, and what αEW you want to use.

So for our study methods example (pick αEW = .05):

HSD = q * sqrt(MSwithin / n) = (3.77) * sqrt(1.33/5) = (3.77)(.516) = 1.94
Recall: X-bar1 = 1, X-bar2 = 4, X-bar3 = 1
Comparison 1: H0: μ1 = μ2
X-bar2 - X-bar1 = 4.0 - 1.0 = 3.0
HSD = 1.94 < 3.0, so we reject H0

Comparison 2: H0: μ1 = μ3
X-bar3 - X-bar1 = 1.0 - 1.0 = 0.0
HSD = 1.94 > 0.0, so we fail to reject H0

Comparison 3: H0: μ2 = μ3
X-bar2 - X-bar3 = 4.0 - 1.0 = 3.0
HSD = 1.94 < 3.0, so we reject H0

So group B is different from A and C, but A and C don't differ from one another.
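The HSD computation and the three pairwise decisions can be sketched in Python. The value q = 3.77 is the Table B.5 lookup given in the text; everything else follows from our example:

```python
import math

q = 3.77          # from Table B.5: K = 3, df_within = 12, alpha_EW = .05
ms_within = 1.33  # from our ANOVA source table
n = 5             # per-group sample size (HSD requires equal n)

hsd = q * math.sqrt(ms_within / n)  # about 1.94

means = {"A": 1.0, "B": 4.0, "C": 1.0}
for g1, g2 in [("A", "B"), ("A", "C"), ("B", "C")]:
    diff = abs(means[g1] - means[g2])
    verdict = "reject H0" if diff > hsd else "fail to reject H0"
    print(f"{g1} vs {g2}: |diff| = {diff} vs HSD = {hsd:.2f} -> {verdict}")
```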
Scheffé test

Uses the F-ratio to test the differences. This is an extremely conservative test (it reduces the risk of Type I error, but increases the risk of Type II error). You CAN use this test with unequal n's.

We will recompute MSbetween to reflect only the comparison that we test each time. Note: we use the overall dfbetween and the overall MSwithin.
Recall:
Study method:

Method A (book alone) | Method B (taking notes) | Method C (borrowing notes)
0 | 4 | 1
1 | 3 | 2
3 | 6 | 2
1 | 3 | 0
0 | 4 | 0
T1 = 5 | T2 = 20 | T3 = 5
SS1 = 6 | SS2 = 6 | SS3 = 4
n1 = 5 | n2 = 5 | n3 = 5
X-bar1 = 1 | X-bar2 = 4 | X-bar3 = 1
Source | SS | df | MS
Between treatments | 30 | 2 | 15.0 (F = 11.28)
Within treatments | 16 | 12 | 1.33
Total | 46 | 14
Comparison 1: H0: μ1 = μ2

SSbetween = T1^2/n1 + T2^2/n2 - (T1 + T2)^2/(n1 + n2) = 5 + 80 - 62.5 = 22.5
MSbetween = SSbetween/dfbetween = 22.5/2 = 11.25
MSwithin = 16/12 = 1.33
F-ratio = MSbetween/MSwithin = 11.25/1.33 = 8.46

Now go look at the F-table. At α = .05, Fcrit(2,12) = 3.88.
8.46 > 3.88, so we reject H0
Comparison 2: H0: μ1 = μ3

SSbetween = T1^2/n1 + T3^2/n3 - (T1 + T3)^2/(n1 + n3) = 5 + 5 - 10 = 0
MSbetween = 0/2 = 0
MSwithin = 16/12 = 1.33
F-ratio = MSbetween/MSwithin = 0/1.33 = 0

Now go look at the F-table. At α = .05, Fcrit(2,12) = 3.88.
0 < 3.88, so we fail to reject H0
Comparison 3: H0: μ2 = μ3

SSbetween = T2^2/n2 + T3^2/n3 - (T2 + T3)^2/(n2 + n3) = 80 + 5 - 62.5 = 22.5
MSbetween = 22.5/2 = 11.25
MSwithin = 16/12 = 1.33
F-ratio = MSbetween/MSwithin = 11.25/1.33 = 8.46

Now go look at the F-table. At α = .05, Fcrit(2,12) = 3.88.
8.46 > 3.88, so we reject H0
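The three Scheffé comparisons follow one pattern, so they can be sketched as a single helper. The SSbetween here is the computational formula applied to just the two groups being compared, and we use the overall dfbetween and MSwithin, as the text notes. Carrying full precision gives F = 8.44 where the text's rounding of MSwithin to 1.33 gives 8.46:

```python
def scheffe_f(t_i, t_j, n_i, n_j, df_between, ms_within):
    """F-ratio for a pairwise Scheffe comparison of two treatment totals."""
    ss = t_i**2 / n_i + t_j**2 / n_j - (t_i + t_j) ** 2 / (n_i + n_j)
    return (ss / df_between) / ms_within

ms_within = 16 / 12  # overall MSwithin (the text rounds this to 1.33)
f_crit = 3.88        # Fcrit(2, 12) at alpha = .05

for label, ti, tj in [("A vs B", 5, 20), ("A vs C", 5, 5), ("B vs C", 20, 5)]:
    F = scheffe_f(ti, tj, 5, 5, df_between=2, ms_within=ms_within)
    verdict = "reject H0" if F > f_crit else "fail to reject H0"
    print(f"{label}: F = {F:.2f} -> {verdict}")
```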
One final note: the relation to the t-test.

What is the difference between an independent samples t-test and a one-factor between-groups ANOVA with only two levels? Not much: F = t^2.

Think about it. The difference between t-tests and ANOVA is that t-tests look at differences between two means, and ANOVAs look at variance. But when there are only two groups, variance basically boils down to squared differences. So square the t statistic and you get the F statistic.
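This identity is easy to verify with groups A and B from our example (a pure-Python sketch; t here is the standard pooled-variance independent-samples t):

```python
import math

a = [0, 1, 3, 1, 0]  # Method A
b = [4, 3, 6, 3, 4]  # Method B

def mean(xs):
    return sum(xs) / len(xs)

def ss(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

# Independent-samples t with pooled variance.
sp2 = (ss(a) + ss(b)) / (len(a) + len(b) - 2)  # (6 + 6) / 8 = 1.5
t = (mean(b) - mean(a)) / math.sqrt(sp2 * (1 / len(a) + 1 / len(b)))

# One-way ANOVA on the same two groups.
G, N = sum(a) + sum(b), len(a) + len(b)
ss_between = sum(a)**2 / len(a) + sum(b)**2 / len(b) - G**2 / N  # 22.5
ms_between = ss_between / 1                                      # df_between = 2 - 1 = 1
ms_within = (ss(a) + ss(b)) / (N - 2)                            # 1.5
F = ms_between / ms_within

print(round(t**2, 6), round(F, 6))  # the two values match: F = t^2
```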
Post hoc tests

Recall that the ANOVA is a test of H0: μ1 = μ2 = μ3. This is a binary (reject/fail to reject) decision. It does not tell us which alternative hypothesis is supported. In other words, we know that some groups are different from other groups, but we don't know which ones they are.