Outline

  • Finding measures of center by hand and with SPSS
Lab 8

Central Tendency and Recoding in SPSS

Finding the Center (or central tendency) of a Distribution

We’ll use the same students.sav file that you used in the last lab.

A measure of central tendency is a statistic that identifies a single value as representative of an entire distribution. The goal of central tendency is to find the single value that is most typical or most representative of the entire group.

  • The mean is the arithmetic average of all of the scores in the distribution.
  • The median is the score in the middle of the rest of the scores. That is, half of the scores are to the right of the median and half are to the left.
  • The mode is the most frequently occurring score in the distribution.

For now let’s learn how to compute these three measures of center.

We’ll begin by looking at the distribution for quiz4. Look at the histogram.

If you had to pick a single value that is representative of the entire distribution, there are several reasonable options:

  1. Select 7.0 because it is the most frequent score. It is the mode.
  2. Select the value in the middle of the distribution, at which half of the scores are to the right and half are to the left. This will again be 7.0. This is the median.
  3. Select the arithmetic average. For this distribution the mean is 6.89.

Calculating by Hand

Calculating the Mode

This is easy. You simply find the number that occurs most often. For example, consider this distribution:

11, 3, 5, 3, 1, 7

Rearrange it in ascending order:

1, 3, 3, 5, 7, 11

Now it is easy to see that the most frequent number (i.e., the mode) is a 3.
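As a cross-check on the by-hand steps, here is a minimal Python sketch (Python is not part of this lab's SPSS workflow, so treat it as an optional illustration). It sorts the same six numbers, counts how often each occurs, and keeps every value that reaches the highest count. The variable names are made up for this example.

```python
from collections import Counter

scores = [11, 3, 5, 3, 1, 7]

# Sorting mirrors the by-hand step of rearranging in ascending order.
ordered = sorted(scores)

# Count how often each value occurs, then keep the value(s) with the top count.
counts = Counter(ordered)
highest_count = max(counts.values())
modes = [value for value, count in counts.items() if count == highest_count]

print(ordered)  # the distribution in ascending order
print(modes)    # every value tied for most frequent
```

Returning a list rather than a single number is a deliberate choice here: a distribution can have more than one mode, and a list makes ties visible.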

Calculating the Median

This is only slightly more complicated. After you have arranged a distribution of numbers in ascending order, simply count how many numbers there are. Let’s call this total N (which is also called the sample size because it is the number of participants in your study). Notice that N is italicized. This is proper for all letters (even Greek ones) that stand for statistics or parameters.

If N is an odd number, the median is the middle number in the ordered distribution. To find which number is the middle, find $\frac{N+1}{2}$. If N is 7, then

$$\frac{7+1}{2} = 4.$$

This tells us that the median number is the 4th number in the distribution. So if the distribution were:

45, 52, 76, 88, 99, 100, 100

then the median would be the 4th number, 88.

In the earlier example above (1, 3, 3, 5, 7, 11), N was 6. When N is even, the median is the average of both middle scores. Actually the formula from above still works. The median is the score that falls in slot number $\frac{N+1}{2}$. In this case, $\frac{6+1}{2} = 3.5$. What is the 3.5th number? By convention it is the average of the 3rd and 4th numbers. In this case, the 3rd and 4th numbers are 3 and 5 so the median is $\frac{3+5}{2} = 4$. More complicated formulas exist for the median but we will not concern ourselves with them in this class.
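The position rule can be sketched in a few lines of Python. This is only an illustration of the rule described above, not something you need for the lab; the function name `median_by_position` is invented for this example, and the sketch assumes the list is not empty.

```python
def median_by_position(scores):
    """Median via the (N + 1) / 2 position rule (assumes a non-empty list)."""
    if not scores:
        raise ValueError("need at least one score")
    ordered = sorted(scores)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based slot; ends in .5 when N is even

    if position.is_integer():
        # Odd N: the median is the score sitting in that slot.
        return ordered[int(position) - 1]

    # Even N: average the scores on either side of the ".5" slot.
    lower = ordered[int(position) - 1]  # e.g. the 3rd score when position is 3.5
    upper = ordered[int(position)]      # e.g. the 4th score
    return (lower + upper) / 2


odd_example = median_by_position([45, 52, 76, 88, 99, 100, 100])
even_example = median_by_position([1, 3, 3, 5, 7, 11])
print(odd_example, even_example)
```

The subtraction of 1 appears because Python counts list positions from 0, while the by-hand rule counts from 1.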

Calculating the Mean

This is something that is familiar to you. Simply add up all the numbers and divide by N (the number of numbers in the distribution). In the example above, the mean is $\frac{1+3+3+5+7+11}{6} = \frac{30}{6} = 5$. Technically, this is called the arithmetic mean. Note that there are other kinds of averages that won’t be covered in this course. Each kind of average has its specific purposes but the arithmetic mean will do for most purposes in this course.

Generally, unless it makes sense to do otherwise, we will calculate the mean and other kinds of statistics to 2 decimal places.
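For readers who want to check their arithmetic, a minimal Python sketch of the same calculation follows. The first list is the example from this section; the second list is made up purely to show a case where rounding to 2 decimal places changes what you report.

```python
scores = [1, 3, 3, 5, 7, 11]

total = sum(scores)   # add up all the numbers
n = len(scores)       # N, the number of scores
mean = total / n      # divide the total by N

# A made-up second dataset whose mean does not come out even.
other_scores = [2, 3, 5]
other_mean = sum(other_scores) / len(other_scores)
other_mean_rounded = round(other_mean, 2)  # report to 2 decimal places

print(total, n, mean)
print(other_mean_rounded)
```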

    (1) Answer the following questions about this dataset (Note: calculate these by hand not SPSS):

21, 22, 23, 24, 24, 24, 25, 25, 25, 25, 26, 27, 28

        (a) What is the mean of this distribution?
        (b) What is the mode of this distribution?
        (c) What is the median of this distribution?

Calculating the measures of central tendency in SPSS

The following section outlines how to compute the mean, median, and mode in SPSS.

We can get SPSS to compute all of these values from the same dialog. Go to the Analyze menu, select the Descriptive Statistics submenu, and then the Frequencies option.

This should open a window that looks like this:

Select quiz4 as your variable. And then click on the Statistics button.

This will open another window.

In this window select mean, median, and mode. Then click Continue. This will take you back to the previous window. Now click OK.

Now SPSS should open up an output window that includes a table that looks like this: 

That’s all there is to it.
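To make the contents of that output less of a black box, here is a rough Python sketch of the kind of numbers a frequencies procedure reports: the three measures of center plus a table of counts, percents, and cumulative percents. This is not SPSS code and not SPSS's exact output layout. The scores are invented for illustration and do not come from students.sav.

```python
import statistics
from collections import Counter

# Hypothetical quiz scores, invented for illustration (not from students.sav).
scores = [5, 6, 6, 7, 7, 7, 8, 8, 9, 10]
n = len(scores)

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Build a frequency table: (value, count, percent, cumulative percent).
table = []
cumulative = 0.0
for value, count in sorted(Counter(scores).items()):
    percent = 100 * count / n
    cumulative += percent
    table.append((value, count, percent, cumulative))

# Percent of scores strictly below the mode: add up the percents of smaller values.
percent_below_mode = sum(pct for value, _, pct, _ in table if value < mode)

print(f"mean={mean:.2f}  median={median:.2f}  mode={mode}")
for value, count, percent, cum in table:
    print(f"{value:>5} {count:>5} {percent:>7.1f} {cum:>7.1f}")
print(f"percent below mode: {percent_below_mode:.1f}")
```

Reading the cumulative percent on the row just before the mode gives the same answer as `percent_below_mode`, which is how you would read it off a frequency table by eye.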

Let’s try to do what I just outlined above but with a different variable. For the variable final in your students.sav file I’d like you to answer the following questions.

    (2) Using SPSS for your computations, answer the following questions.
        (a) What is the mean for the "final" variable?
        (b) What is the median for the "final" variable?
        (c) What is the mode for the "final" variable?
        (d) What percent of students scored lower than the mode on the final?
            (hints: Don't include the students who scored the mode exactly. Don't include "%" in your
                answer. Round to 1 decimal place. Thus, 22.2% would be entered as 22.2)

Properties of Central Tendency Measures

The mean, median, and mode are descriptive statistics that are designed to tell us something about the center of a distribution, that is, where most of the data are.

So how do you know which measure of central tendency should be used?

The answer depends on a number of factors, including the shape of the distribution and the scale of measurement that you use.

Use the mean if you can.

The mean is the preferred measure of central tendency for interval and ratio level data. It takes every item in the distribution into account, and it is more stable than the median and the mode. In this context, stable refers to the tendency to get similar results when you draw multiple samples from the same population and calculate a statistic. In one sample of numbers ranging from 0 to 24, the mean might be 10.4 and in another sample the mean might be 10.3. 10.4 and 10.3 are similar to each other. If we drew many samples from the same population and the mean was roughly the same each time, we would say that the sample mean is stable. In small samples, the mode is very unstable. That is, from one sample to the next, the mode can be very different each time. The median is more stable than the mode, but typically less stable than the mean, especially in smaller samples.
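The idea of stability can be explored with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be whole numbers from 0 to 24 that are all equally likely, each sample has 10 scores, and 2000 samples are drawn. None of those choices come from the lab itself. For each sample it records the mean, median, and mode, then measures how much each statistic varies from sample to sample.

```python
import random
import statistics
from collections import Counter

random.seed(8)  # fixed seed so the run is repeatable

def first_mode(sample):
    """Most frequent value; ties go to whichever value appeared first."""
    return Counter(sample).most_common(1)[0][0]

sample_size = 10   # assumed "small sample"
n_samples = 2000   # assumed number of repeated samples

means, medians, modes = [], [], []
for _ in range(n_samples):
    # Assumed population: integers 0..24, each equally likely.
    sample = [random.randint(0, 24) for _ in range(sample_size)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))
    modes.append(first_mode(sample))

# A smaller spread across samples means a more stable statistic.
spread_of_means = statistics.pstdev(means)
spread_of_medians = statistics.pstdev(medians)
spread_of_modes = statistics.pstdev(modes)

print(f"spread of sample means:   {spread_of_means:.2f}")
print(f"spread of sample medians: {spread_of_medians:.2f}")
print(f"spread of sample modes:   {spread_of_modes:.2f}")
```

Under these assumptions the means should vary least and the modes most, which matches the ordering described in the paragraph above. Other populations or sample sizes could give different gaps between the three.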

  • If you change a given score, add an observation, or delete an observation, then the mean will generally change. (The exception is adding or deleting a score that is exactly equal to the mean.)
  • If you add (or subtract) a constant to each score, then the mean will increase (or decrease) by that same constant.
  • If you multiply (or divide) each score by a constant, then the mean will be multiplied (or divided) by that same constant.
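These properties of the mean can be verified on the small distribution from earlier in the lab. The sketch below is a quick check in Python, offered as an optional illustration; the constants 10 and 2 and the replacement scores are arbitrary choices made for this example.

```python
def mean(values):
    """Arithmetic mean: the sum of the scores divided by how many there are."""
    return sum(values) / len(values)

scores = [1, 3, 3, 5, 7, 11]
original = mean(scores)                    # 30 / 6

# Changing, adding, or deleting a score moves the mean.
changed = mean([1, 3, 3, 5, 7, 17])        # last score changed from 11 to 17
with_extra = mean(scores + [12])           # one observation added
without_last = mean(scores[:-1])           # one observation deleted

# Adding a constant to every score adds that constant to the mean.
shifted = mean([x + 10 for x in scores])

# Multiplying every score by a constant multiplies the mean by that constant.
scaled = mean([x * 2 for x in scores])

print(original, changed, with_extra, without_last)
print(shifted, scaled)
```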

However, there are times when the mean is not the appropriate measure.

Use the median if you must.

  • The variable is ordinal.
  • The interval or ratio data have extreme values (outliers) that cause the distribution to be highly skewed.
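A quick way to see why outliers push us toward the median is to compare both statistics before and after one extreme value is introduced. The sketch below uses made-up numbers (think of them as incomes in thousands of dollars); they are an assumption for illustration only.

```python
import statistics

# Made-up values with no outlier.
typical = [30, 32, 35, 38, 40]

# The same values, except the largest is replaced by an extreme score.
with_outlier = [30, 32, 35, 38, 400]

mean_typical = statistics.mean(typical)
median_typical = statistics.median(typical)

mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print(mean_typical, median_typical)   # mean and median agree
print(mean_outlier, median_outlier)   # the mean is dragged upward, the median is not
```

In this example a single extreme score moves the mean well above four of the five values, while the median stays where it was.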

Use the mode as a last resort.

The mode is pretty much your only choice if you are looking at a nominal scale.

Distribution Shape and Central Tendencies

Okay, now let’s look at a few distributions to examine how these different measures of central tendency relate to one another.

In a distribution like this one (symmetric distribution), the mean, median, and mode will have similar values.

However, in a positively skewed distribution, the mean will typically be larger than the median, which will typically be larger than the mode.

The opposite is true for a negatively skewed distribution: the mean will typically be smaller than the median, which will typically be smaller than the mode.
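Both orderings can be checked on small made-up datasets. In the Python sketch below, the positively skewed scores are invented for illustration, and the negatively skewed scores are simply their mirror image (each score subtracted from 13, an arbitrary choice). Real data will not always follow the pattern this neatly.

```python
import statistics

# Made-up scores with a long tail to the right (positive skew).
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 9, 12]

# Mirror image of the same scores: long tail to the left (negative skew).
negative_skew = [13 - x for x in positive_skew]

def center(scores):
    """Return (mean, median, mode) for a list of scores."""
    return (statistics.mean(scores),
            statistics.median(scores),
            statistics.mode(scores))

pos_mean, pos_median, pos_mode = center(positive_skew)
neg_mean, neg_median, neg_mode = center(negative_skew)

print(pos_mean, pos_median, pos_mode)  # mean > median > mode
print(neg_mean, neg_median, neg_mode)  # mean < median < mode
```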

    (3) Look at the distribution in the histogram above. 
        (a) From lowest to highest, rank the mean, median, and mode.

        (b) Of the three measures of central tendency (mean, median, and mode), which would be the
            least representative number in this case?

        (c) Which kind of skewness is evident in this distribution?