
Lab 20

Hypothesis testing with correlation


In this lab, we're going to revisit some of the descriptive statistics we've already talked about that help us describe relationships between measured variables. However, today we're going to see how these statistics can be used to do hypothesis testing.


Hypothesis Testing with Pearson r

Recall that the Pearson r statistic tells us how much and in what way two measured variables are related. We can also use this statistic to conduct hypothesis tests about population correlation values.

    The population correlation value is indicated by ρ (the Greek letter rho, the population counterpart of the sample r). This is the correlation we would find if the entire population provided scores on the two measured variables we are interested in.

This means that we can state a null and alternative hypothesis for the population correlation ρ based on our predictions for a correlation. Let's look at how this works in an example.

    Suppose that we wanted to know if students who live near campus have higher GPAs than students who live farther away and commute to campus. We could measure students' GPAs and also measure how far away they live by measuring the distance to their residence from the middle of the quad. These are the two measured variables we're interested in.

Now let's go through our hypothesis testing steps:

Step 1: State hypotheses and choose α level

    Remember we're going to state hypotheses in terms of our population correlation ρ. In this example, we expect GPA to decrease as distance from campus increases. This means that we are making a directional hypothesis and using a 1-tailed test. It also means we expect to find a negative value of ρ, because that would indicate a negative relationship between GPA and distance from campus. So here are our hypotheses:

      H0: ρ ≥ 0

      HA: ρ < 0

    We're making our predictions as a comparison with 0, because 0 would indicate no relationship. Note that if we were conducting a 2-tailed test, our hypotheses would be ρ = 0 for the null hypothesis and ρ not equal to 0 for the alternative hypothesis.

    We'll use our conventional α = .05.

Step 2: Collect the sample

    Here are our sample data:

      Subject    GPA     Distance from campus (in miles)
      A          3.45    1.3
      B          3.03    .8
      C          2.67    5.7
      D          2.50    .5
      E          3.16    2.9

Step 3: Calculate test statistic

For this example, we're going to calculate a Pearson r statistic. Recall the formula for Pearson r:

      r = SP / √(SSX × SSY)

The bottom of the formula requires us to calculate the sum of squares (SS) for each measure individually and the top of the formula requires calculation of the sum of products of the two variables (SP).

We'll start with the SS terms. Remember the formula for SS is:

    SS = Σ(X - xbar)²

We'll calculate this for both GPA and Distance. If you need a review of how to calculate SS, review Lab 9. For our example, we get:

    SSGPA = .58 and SSdistance = 18.39
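The SS values above can be checked with a short Python sketch. The function and variable names are illustrative choices rather than part of the lab materials; the data come from the table in Step 2.

```python
# Sample data from the table in Step 2.
gpa = [3.45, 3.03, 2.67, 2.50, 3.16]
distance = [1.3, 0.8, 5.7, 0.5, 2.9]


def sum_of_squares(scores):
    """Return SS: the sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


ss_gpa = sum_of_squares(gpa)
ss_distance = sum_of_squares(distance)
print(f"SS GPA = {ss_gpa:.2f}, SS distance = {ss_distance:.2f}")
# prints: SS GPA = 0.58, SS distance = 18.39
```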

Now we need to calculate the SP term. Remember the formula for SP is

    SP = Σ(X - xbar)(Y - ybar)

If you need to review how to calculate the SP term, go to Lab 12. For our example, we get

    SP = -.63
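As a check on the SP value, here is a minimal Python sketch of the definitional formula. The name sum_of_products is an illustrative choice; the data are the sample from Step 2.

```python
# Sample data from the table in Step 2.
gpa = [3.45, 3.03, 2.67, 2.50, 3.16]
distance = [1.3, 0.8, 5.7, 0.5, 2.9]


def sum_of_products(xs, ys):
    """Return SP: the sum of the products of paired deviations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


sp = sum_of_products(gpa, distance)
print(f"SP = {sp:.2f}")
# prints: SP = -0.63
```

A negative SP means that above-average GPAs tend to be paired with below-average distances, which is why r comes out negative.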

Plugging these SS and SP values into our r equation gives us

    r = -.19
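The arithmetic behind this value can be sketched in Python by dividing SP by the square root of the product of the two SS terms. The sketch uses the rounded SS and SP values reported above, and the variable names are illustrative.

```python
import math

# Rounded values reported in this lab for the GPA/distance example.
ss_gpa = 0.58
ss_distance = 18.39
sp = -0.63

# Pearson r: SP divided by the square root of the product of the SS terms.
r = sp / math.sqrt(ss_gpa * ss_distance)
print(f"r = {r:.2f}")
# prints: r = -0.19
```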

Now we need to find our critical value of r using a table like we did for our z and t-tests. We'll need to know our degrees of freedom, because like t, the r distribution changes depending on the sample size. For r,

    df = n - 2

So for our example, we have df = 5 - 2 = 3. Now, with df = 3, α = .05, and a one-tailed test, we can find rcritical in the Table of Pearson r values. This table is organized and used in the same way that the t-table is used.

    Our rcrit = .805. We write rcrit(3) = -.805 (negative because we are doing a 1-tailed test looking for a negative relationship).
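The Pearson r table is closely tied to the t table: a critical r can be obtained from a critical t with the same df using r = t / √(t² + df). The Python sketch below illustrates this relationship, assuming a one-tailed critical t of 2.353 for df = 3 and α = .05 taken from a standard t table. It is a cross-check on the table, not the procedure this lab uses, and the function name is an illustrative choice.

```python
import math

# Assumption: one-tailed critical t for df = 3, alpha = .05, from a t table.
T_CRIT = 2.353
DF = 3


def r_critical(t_crit, df):
    """Convert a critical t value into the matching critical Pearson r."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


r_crit = r_critical(T_CRIT, DF)
print(f"r_crit({DF}) = {r_crit:.3f}")
# prints: r_crit(3) = 0.805
```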

Step 4: Compare observed test statistic to critical test statistic and make a decision about H0

    Our robs(3) = -.19 and rcrit(3) = -.805

    Since -.19 is not in the critical region that begins at -.805, we cannot reject the null. We must retain the null hypothesis and conclude that we have no evidence of a relationship between GPA and distance from campus.
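The decision rule for this lower-tailed test can be written as a small Python sketch. The function name is hypothetical, and it assumes the critical value is entered as a negative number, as in this example.

```python
def reject_null_lower_tail(r_obs, r_crit):
    """Return True if r_obs falls in the lower-tail critical region.

    r_crit is expected to be negative (e.g., -.805): the critical
    region contains every r at or below it.
    """
    return r_obs <= r_crit


# Values from this example: r_obs(3) = -.19, r_crit(3) = -.805.
decision = reject_null_lower_tail(-0.19, -0.805)
print("Reject H0" if decision else "Retain H0")
# prints: Retain H0
```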


    Now try a few of these types of problems on your own. Show all four steps of hypothesis testing in your answer (some questions will require more for each step than others) and be sure to state hypotheses in terms of ρ.

    (1) A high school counselor would like to know if there is a relationship between mathematical skill and verbal skill. A sample of n = 25 students is selected, and the counselor records achievement test scores in mathematics and English for each student. The Pearson correlation for this sample is r = +0.50. Do these data provide sufficient evidence for a real relationship in the population? Test at the .05 α level, two tails.

    (2) It is well known that similarity in attitudes, beliefs, and interests plays an important role in interpersonal attraction. Thus, correlations for attitudes between married couples should be strong and positive. Suppose a researcher developed a questionnaire that measures how liberal or conservative one's attitudes are. Low scores indicate that the person has liberal attitudes, while high scores indicate conservatism. Here are the data from the study:

      Couple A: Husband - 14, Wife - 11

      Couple B: Husband - 7, Wife - 6

      Couple C: Husband - 15, Wife - 18

      Couple D: Husband - 7, Wife - 4

      Couple E: Husband - 3, Wife - 1

      Couple F: Husband - 9, Wife - 10

      Couple G: Husband - 9, Wife - 5

      Couple H: Husband - 3, Wife - 3

    Test the researcher's hypothesis with α set at .05.

    (3) A researcher believes that a person's belief in supernatural events (e.g., ghosts, ESP, etc.) is related to their education level. For a sample of n = 30 people, he administers a questionnaire that measures their belief in supernatural events (where a high score means they believe in more of these events) and asks them how many years of schooling they've had. He finds that SSbeliefs = 10, SSschooling = 10, and SP = -8. With α = .01, test the researcher's hypothesis.



Using SPSS for Hypothesis Testing with Pearson r

We can also use SPSS to conduct a hypothesis test with Pearson r. We calculate the Pearson r with SPSS and then look at the output to make our decision about H0. The output gives us a p value for our Pearson r (listed under Sig. in the output). We compare this p value with α: if p is less than or equal to α, the observed r falls in the critical region and we reject H0.

Remember from Lab 12, to calculate a Pearson r using SPSS:

Under the Analyze menu you will find the Correlate submenu.

From the Correlate submenu you want to select Bivariate.

In the bivariate correlation window, select the variables that you want correlated (you can have more than two at a time). Make sure that Pearson is selected as the Correlation coefficient you are testing. Notice that you can select a 1- or 2-tailed test and have significant findings flagged.

The output that you get is a correlation matrix. It correlates each variable against each variable (including itself). You should notice that the table has redundant information in it (e.g., you'll find an r for height correlated with weight, and an r for weight correlated with height; these two values are identical).
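To see why a correlation matrix is redundant, here is a rough Python sketch that builds one by hand. It reuses the GPA and distance sample from earlier in this lab rather than height and weight, and it does not reproduce SPSS output; the function and variable names are illustrative.

```python
import math

# GPA/distance sample from the worked example earlier in this lab.
variables = {
    "GPA": [3.45, 3.03, 2.67, 2.50, 3.16],
    "Distance": [1.3, 0.8, 5.7, 0.5, 2.9],
}


def pearson_r(xs, ys):
    """Pearson r computed as SP / sqrt(SS_x * SS_y)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


# Correlate every variable with every variable, including itself.
matrix = {
    row: {col: pearson_r(variables[row], variables[col]) for col in variables}
    for row in variables
}

for row, cells in matrix.items():
    print(row, {col: f"{value:.2f}" for col, value in cells.items()})
```

Each variable correlates 1.00 with itself, and the two off-diagonal cells hold the same r, so only one cell in the matrix carries new information when there are two variables.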

The information about significance is in the table row "Sig. (2-tailed)." It provides the p value we're looking for to compare with α. In this case, the given p is .000 (meaning p < .001). Since this value is lower than any conventional α, we can reject H0. Note that the significant correlation is flagged (**), and the footnote also provides the information about significance.


(4) To measure the relationship between anxiety and test performance, a researcher asked his students to come to the lab 15 minutes before they were to take an exam in his class. The researcher measured the students' heart rates and then matched these scores with their exam performance after they had taken the exam. Use the data below and SPSS to conduct a hypothesis test for the correlation between anxiety and test performance in the population. Use α = .05.

                Student         Heart rate          Exam score
                     A                 76                  78
                     B                 81                  68
                     C                 60                  88
                     D                 65                  80
                     E                 80                  90
                     F                 66                  68
                     G                 82                  60
                     H                 71                  95
                     I                 66                  84
                     J                 75                  75
                     K                 80                  62
                     L                 76                  51
                     M                 77                  63
                     N                 79                  71