Your textbook:
Correlation is a statistical technique that measures and describes the relationship between two variables.
Consider the following example:
Data Set:

  Person   X   Y
  A        1   1
  B        1   3
  C        3   2
  D        4   5
  E        6   4
  F        7   5

[Scatterplot: Y plotted against X for the six people above]
1) The direction of the relationship
positive correlation (a positive number) means that the two variables tend to move in the same direction. That is, as one gets larger, so does the other.
negative correlation (a negative number) means that the two variables tend to move in opposite directions. That is, as one gets larger, the other gets smaller.
2) The form of the relationship
we will focus on linear correlations (straight lines), but there are also other forms that the relationship can take.
[Example scatterplots: linear (e.g., height and weight) vs. non-linear (e.g., age and height)]
Why (and When) do we use correlations?
Prediction - if we know that two variables are strongly related, then we may be able to predict the value of one, based on the value of the other.
e.g., if you know that ultrasound measurements of a baby's head are positively correlated with birth weight, then you can make an educated guess of the baby's birth weight by measuring the baby's head from an ultrasound
Validity - if you develop a new test (TEST A) for X, and you want to know whether it is truly measuring X, then you can see if TEST A correlates with things that you already know correlate with X.
e.g., if you discover a new formula for predicting birth weight (imagine some magic formula that includes the height and weight of the mother and father combined), then this formula should also correlate with the ultrasound estimates of birthweight.
Reliability - if you use the same test twice on the same individuals, you can correlate the two sets of scores. If the test is reliable, then it should give similar results both times, giving you a high correlation
Theory Verification - many theories will predict that a relationship exists between different variables. So you can then go out, collect some data, and see if such a relationship exists.
Okay, so how do we quantify the idea of correlation? There are a number of different correlation measures; we will focus on the most common one, the Pearson product-moment correlation.
r = (degree to which X and Y vary together) / (degree to which X and Y vary separately)
  = (covariability of X and Y) / (variability of X and Y separately)
remember that a "perfect correlation" is r = 1.0 (or -1.0). This means that the number in the numerator equals the number in the denominator. On the bottom we have two things: how much X changes and how much Y changes. On the top we have how much X and Y change together. If the covariability on top matches the separate variability on the bottom, then we have an r = 1.0.
now let's consider how we actually compute r.
need to introduce a new concept: sum of products of deviations (SP)
Consider the following:
  X     Y     X − X̄   Y − Ȳ   (devX)(devY)
  0     1     -6       -1       6
  10    3     +4       +1       4
  4     1     -2       -1       2
  8     2     +2        0       0
  8     3     +2       +1       2
  sum:  30    10                14
  mean: 6.0   2.0

So: SP = Σ(X − X̄)(Y − Ȳ) = 14

There is also a computational formula:

  X     Y     XY
  0     1     0
  10    3     30
  4     1     4
  8     2     16
  8     3     24
  sum:  30    10    74

SP = ΣXY − (ΣX)(ΣY)/n = 74 − (30)(10)/5 = 74 − 60 = 14
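Both ways of computing SP can be checked with a short script (not from the notes; just a sketch using the example data):

```python
# Data from the example above.
X = [0, 10, 4, 8, 8]
Y = [1, 3, 1, 2, 3]
n = len(X)

mean_x = sum(X) / n   # 6.0
mean_y = sum(Y) / n   # 2.0

# Definitional formula: sum of the products of deviations.
SP_def = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))

# Computational formula: SP = sum(XY) - sum(X)*sum(Y)/n.
SP_comp = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n

print(SP_def, SP_comp)  # both 14.0
```

The two formulas always agree; the computational one just avoids working with the deviations directly.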
Hopefully, SP reminds you of SS (Sum of Squares). The concepts are very similar. The basic difference is that with SS, we just had one variable (X), however with SP we have two variables (X & Y).
                  Sum of Squares (SS)        Sum of Products (SP)
  definitional:   SS = Σ(X − X̄)²             SP = Σ(X − X̄)(Y − Ȳ)
  computational:  SS = ΣX² − (ΣX)²/n         SP = ΣXY − (ΣX)(ΣY)/n
Okay, now let's compute the Pearson correlation (r).
r = (degree to which X and Y vary together) / (degree to which X and Y vary separately)
  = (covariability of X and Y) / (variability of X and Y separately)

r = SP / √(SSX · SSY)
in other words, we've got SP on top, which is our measure of covariability of X and Y. On the bottom we've got our measure of variability of X alone and Y alone
so let's return to our example:
  X     Y     X − X̄   Y − Ȳ   (devX)(devY)   (X − X̄)²   (Y − Ȳ)²
  0     1     -6       -1       6              36          1
  10    3     +4       +1       4              16          1
  4     1     -2       -1       2               4          1
  8     2     +2        0       0               4          0
  8     3     +2       +1       2               4          1
  sum:  30    10                14              64          4
  mean: 6.0   2.0

So SP = 14; SSX = 64; SSY = 4

r = SP / √(SSX · SSY) = 14 / √(64 · 4) = 14 / 16 = 0.875
So there is a fairly strong positive correlation, as X goes up we can predict that Y will too.
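The whole r computation can be sketched in Python (same five data points as above):

```python
import math

# Data from the running example.
X = [0, 10, 4, 8, 8]
Y = [1, 3, 1, 2, 3]
n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

SP  = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))  # covariability
SSX = sum((x - mean_x) ** 2 for x in X)                       # variability of X alone
SSY = sum((y - mean_y) ** 2 for y in Y)                       # variability of Y alone

# Pearson r: covariability over the separate variabilities.
r = SP / math.sqrt(SSX * SSY)
print(r)  # 0.875
```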
But there are some additional things that we need to consider.
Let's look at each point in a little more depth
4) Correlations describe a relationship between two variables, but DO NOT explain why the variables are related
e.g.,
a) Suppose that Dr. Steward finds that rates of spilled coffee and severity of airplane turbulence are strongly positively correlated.
correlationally speaking, one might argue that spilling coffee causes turbulence
b) Suppose that Dr. Cranium finds a positive correlation between head size and digit span (the number of digits a person can repeat back from memory).
correlationally speaking, one might argue that people with bigger heads have bigger digit spans (instead of something like, head size and digit span increase with age)
c) Suppose that Dr. Ruth finds a positive correlation between the number of babies born and the rate of stork sightings (I believe that such a correlation has been reported)
correlationally speaking, one might interpret this as support for the hypothesis that storks bring babies to homes
Suppose that in one study we look for a correlation between age and height, but we only test 0 to 10 yr olds. But in a second study we look for the same relationship but only test adults (say, 25 yr olds and up). In the first case we will probably find a strong positive correlation, but in the latter case we may find a near 0 correlation.
Which correlation is correct? Both are, if considered with respect to the range represented in the data. We should conclude that the strong positive correlation exists for a restricted range. That is, from years 0 to 10, there is a strong positive correlation between age and height. (note: a non-linear function is appropriate for this relationship)
[Two scatterplots: one with r = -0.05, one with r = +0.76]
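The restricted-range point can be demonstrated with made-up age/height numbers (the data below are invented purely for illustration; they are not from the notes):

```python
import math

def pearson_r(X, Y):
    """Pearson r computed as SP / sqrt(SSX * SSY)."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    sp  = sum((x - mx) * (y - my) for x, y in zip(X, Y))
    ssx = sum((x - mx) ** 2 for x in X)
    ssy = sum((y - my) ** 2 for y in Y)
    return sp / math.sqrt(ssx * ssy)

# Invented data: height climbs steadily through childhood...
ages_child   = [0, 2, 4, 6, 8, 10]
height_child = [50, 85, 100, 115, 128, 138]     # cm

# ...but is essentially flat (just individual variation) in adulthood.
ages_adult   = [25, 27, 29, 31, 33, 35]
height_adult = [172, 165, 180, 168, 175, 171]   # cm

r_child = pearson_r(ages_child, height_child)   # strong positive
r_adult = pearson_r(ages_adult, height_adult)   # near zero
print(round(r_child, 2), round(r_adult, 2))
```

Same variables, same formula; only the range of X changes, and the correlation changes dramatically.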
7) When considering "how good" a relationship is, we really should consider r², not just r.
r² is called the coefficient of determination
we'll talk more about this towards the end of this chapter. What it basically measures is how much of the variability in one variable can be determined by the other variable.
In other words, suppose that we find that the correlation (r) between height and weight is 0.76. We can use this information to predict a person's weight, if we know their height. But, notice that the correlation is not perfect, so we know that we may be off by a bit.
But we also know that we'll be close. The r² for this relationship is (0.76)² = .578. What we can conclude from this is that 57.8% of the variability in weight can be accounted for from the relationship that it has with height.
notice that if we do have a perfect correlation (r = ± 1.0), then r² = (1.0)² = 1.0. So 100% of the variance in Y can be accounted for by X.
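The arithmetic above, as a quick check (r = 0.76 is the hypothetical height/weight value from the text):

```python
# Coefficient of determination for the height/weight example.
r = 0.76
r_squared = r ** 2   # proportion of variance in weight accounted for by height
print(round(r_squared, 3))  # 0.578, i.e. 57.8%
```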
Yes. We can test predictions about whether or not there is a relationship and even about what direction the relationship has. At the population level, a relationship is represented by the Greek letter rho (ρ), and at the sample level by our familiar r.
What are the hypotheses?

Two-tailed:
  H0: ρ = 0
  H1: ρ ≠ 0

One-tailed:
  no positive rel.    no negative rel.
  H0: ρ ≤ 0           H0: ρ ≥ 0
  H1: ρ > 0           H1: ρ < 0
Why subtract 2? Because we know two values, X & Y, so we lose two degrees of freedom.
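The test statistic itself did not survive in the text above, but the standard t test for a Pearson r is t = r·√(n − 2) / √(1 − r²) with df = n − 2. A sketch using the running example (r = 0.875, n = 5):

```python
import math

# t test for a Pearson correlation: t = r * sqrt(n-2) / sqrt(1 - r^2).
# Values from the running example; the formula is the standard one,
# not quoted from the surviving text.
r, n = 0.875, 5
df = n - 2
t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
print(df, round(t, 2))
```

The resulting t is compared against the t distribution with n − 2 degrees of freedom.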
Linear Regression - a brief introduction
Let's start by talking about lines and graphs. Consider the following graph.
at X = 0, Y = 1.0
at X = 1, Y = 1.5
at X = 2, Y = 2.0
at X = 3, Y = 2.5
at X = 4, Y = 3.0

So as X goes up by 1, Y goes up by 0.5. This is called the slope (b). This is a constant. The intercept (a) is the value of Y when X = 0. This is also a constant. We can describe the line in the following linear equation:

Y = bX + a  --->  Y = (0.5)X + 1.0

in other words, using the linear equation, we can determine the value of Y, if we know the values of X, b, & a - recall that predicting Y based on X is one of the main things that this chapter is all about
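The line from the graph, written as a tiny function (just a sketch of the Y = bX + a idea):

```python
# Slope b = 0.5 and intercept a = 1.0, from the graph above.
def line(x, b=0.5, a=1.0):
    """Return Y for a given X on the line Y = bX + a."""
    return b * x + a

for x in range(5):
    print(x, line(x))  # reproduces the (X, Y) pairs listed above
```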
Okay, now let's return to our scatter plots. Let's start with the case of r = 1.0.
When we do a regression analysis, what we are doing is trying to find the line (and linear equation) that best fits the data points. For this example it is pretty easy. There is only one possible line that makes sense to fit to this set of data.
Now let's look at a case when the correlation is not perfect.
Now it isn't as easy. Clearly no single straight line will fit each data point (that is, you can't draw a single line through all of the data points). In fact it is not too hard to imagine several different possible lines fitting this data. What we want is the line (and linear equation) that fits the best.
What does it mean to be the line that best fits the data? "Best fit" is defined in terms of the distance between each actual data point (Y) and the corresponding point on the line (Ŷ):

distance = Y − Ŷ

SSerror = total squared error = Σ(Y − Ŷ)²

We get the Ŷ values from the line, and the Y values from the actual data points. The best-fitting line is the one with the smallest SSerror across all possible values of a and b.
  X     Y     X − X̄   Y − Ȳ   (devX)(devY)   (X − X̄)²   (Y − Ȳ)²
  0     1     -6       -1       6              36          1
  10    3     +4       +1       4              16          1
  4     1     -2       -1       2               4          1
  8     2     +2        0       0               4          0
  8     3     +2       +1       2               4          1
  sum:  30    10                14              64          4
  mean: 6.0   2.0

So SP = 14; SSX = 64; SSY = 4
slope = b = SP/SSX = 14/64 = .22
intercept = a = Ȳ − bX̄ = 2.0 - (.22)(6.0) = .68
So the regression equation is:
Ŷ = .22(X) + .68
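The slope and intercept calculations above, sketched in code (note that .22 and .68 in the text are rounded; the unrounded values are 0.21875 and 0.6875):

```python
# Data from the running example.
X = [0, 10, 4, 8, 8]
Y = [1, 3, 1, 2, 3]
n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

SP  = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))  # 14.0
SSX = sum((x - mean_x) ** 2 for x in X)                       # 64.0

b = SP / SSX             # slope: 14/64 = 0.21875  (~ .22)
a = mean_y - b * mean_x  # intercept: 2.0 - 0.21875*6.0 = 0.6875  (~ .68)

def predict(x):
    """Predicted Y for a given X from the regression equation."""
    return b * x + a

print(b, a, predict(10))
```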
So now we have our regression equation for these data. We can use this equation to predict Y, given values of X. However, there are some precautions that we will need to consider when interpreting the regression.
2) Regression should not be used to make predictions beyond the range of values of X included in the data set. We discussed this last time when talking about correlations. The reasons are the same.
SSerror = Σ(Y − Ŷ)²
Then we'll divide that by our degrees of freedom (which gives us a measure of variance, or mean squared error)
remember that df = n - 2
So in the end we end up with:
  X     Y     Ŷ      (Y − Ŷ)   (Y − Ŷ)²
  0     1     0.68    .32       .102
  10    3     2.88    .12       .014
  4     1     1.56   -.56       .314
  8     2     2.44   -.44       .193
  8     3     2.44    .56       .314
  sum:  30    10     10          0        .937
  mean: 6.0   2.0

So SP = 14; SSX = 64; SSY = 4; r = 0.875
Ŷ = .22(X) + .68
Serror = √(SSerror / df) = √(.9375 / 3) = √(.3125) = .559
An easier way to compute Serror is to use the correlational information.
SSerror = (1 − r²)SSY = (1 − (0.875)²)(4) = (1 − .766)(4) = .9375
Serror = √(SSerror / df) = √(.9375 / 3) = .559
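Both routes to the standard error of estimate can be checked in a few lines (values from the running example):

```python
import math

# Standard error of estimate via the correlational shortcut:
# SSerror = (1 - r^2) * SSY, then divide by df = n - 2 and take the root.
SSY, r, n = 4.0, 0.875, 5
df = n - 2

SS_error = (1 - r ** 2) * SSY       # (1 - 0.765625)(4) = 0.9375
s_error = math.sqrt(SS_error / df)  # sqrt(0.9375 / 3) = sqrt(0.3125)
print(round(s_error, 3))  # 0.559
```

This matches the value obtained by summing the squared residuals directly, which is the point of the shortcut.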