Analysis of Species Data

This dataset includes brain and body weight, maximum life span, gestation time, and sleeping time of 30 mammals.

After obtaining the data from the Internet (see source below), I pasted it into an Excel worksheet and created a scatterplot of two of the variables: brain weight and maximum life span. The worksheet is available for download as an Excel 5.0 file of the species data. An Excel 3.0 file of the same data is also available.

After you have downloaded it, you can either print a copy or keep the file open on the desktop. Keeping the file open lets you work through the lesson more interactively: you can use Excel functions to calculate the mean, variance, etc. more quickly, and you can more easily see how those calculations change when you eliminate certain data or compare other pairs of variables. If you keep the Excel file open, just flip back and forth between it and the web browser to answer the following questions.

Lesson:

Comparison of brain weight (X) and maximum life span (Y):

1. Find the mean of X and of Y:

2. Find the variance and standard deviation of X and of Y:

3. Find the covariance of X and Y:

4. Find the slope of the regression line. (Recall that the slope is the covariance of X and Y divided by the variance of X.)

5. Find the Y-intercept of the regression line:

6. What is the final regression equation that best describes the linear relationship between brain weight and maximum life span?

7. What is the correlation coefficient (r) between brain weight and life span?

8. What does your answer to number 7 say about the strength and direction of the linear relationship between brain weight and maximum life span? (Verbally describe the meaning of the numerical value of r.)

9. If an animal had a brain weight of 500 g, then what would you predict its maximum life span to be?
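
If you would rather check your work outside of Excel, the calculations in questions 1 through 9 can be sketched in Python. This is only a minimal sketch: the numbers below are made-up placeholders, not the species data, and the function names are hypothetical. It also assumes the sample versions of variance and covariance (dividing by n - 1); the slope and r come out the same with the population versions, as long as the same divisor is used throughout.

```python
from math import sqrt

# Hypothetical placeholder values, NOT the species data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # stands in for brain weight (X)
y = [2.0, 4.0, 5.0, 4.0, 5.0]   # stands in for maximum life span (Y)

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    """Sample covariance of a and b (divides by n - 1)."""
    mean_a, mean_b = mean(a), mean(b)
    products = ((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    return sum(products) / (len(a) - 1)

def variance(a):
    """Sample variance, i.e. the covariance of a with itself."""
    return covariance(a, a)

def fit_line(a, b):
    """Slope = cov(X, Y) / var(X); intercept = mean(Y) - slope * mean(X)."""
    slope = covariance(a, b) / variance(a)
    intercept = mean(b) - slope * mean(a)
    return slope, intercept

def correlation(a, b):
    """r = cov(X, Y) / (sd(X) * sd(Y))."""
    return covariance(a, b) / (sqrt(variance(a)) * sqrt(variance(b)))

slope, intercept = fit_line(x, y)
r = correlation(x, y)
predicted = intercept + slope * 6.0   # prediction for a new X value of 6

print(f"Y = {intercept:.3f} + {slope:.3f} X")
print(f"r = {r:.3f}, predicted Y at X = 6: {predicted:.3f}")
```

Replacing the two placeholder lists with the brain weight and life span columns from the worksheet should reproduce the values Excel gives you.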

New comparison of brain weight (X) and maximum life span (Y):

Look at the scatterplot of these two variables. Do you see two points that are far away from the rest of the points? Such stray points are called outliers. Removing outliers from a set of data can often give you a much better picture of the remaining data points and their tendencies.

10. What two species are outliers in this dataset?

11. Eliminate those two outliers and repeat the following calculations:
(If using the Excel file, you can eliminate those two rows of data, and then the calculations and scatterplots will all adjust automatically!)

a. mean of X and of Y:

b. variance and standard deviation of X and of Y:

c. covariance of X and Y:

d. slope:

e. y-intercept:

f. regression equation:

12. Using this new regression equation, predict the maximum life span of an animal whose brain weight is 500 g:

13. Find the new correlation coefficient between brain weight and maximum life span. Compare this value to the old one found in number 7 above.

14. Is the correlation coefficient between brain weight and maximum life span large enough to claim a strong linear relationship between those two variables? Explain.
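
The effect of dropping outliers on the correlation coefficient can also be sketched in Python. The rows below are invented for illustration and are not the species data, and the labels and helper names are hypothetical. In this made-up example r rises once the two stray points are removed; the size and direction of the change in the real data is for you to find.

```python
from math import sqrt

def pearson_r(a, b):
    """Correlation coefficient r between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab / sqrt(s_aa * s_bb)

# Hypothetical rows of (label, X, Y), NOT the species data.
rows = [
    ("A", 1.0, 1.0),
    ("B", 2.0, 3.0),
    ("C", 3.0, 2.0),
    ("D", 4.0, 4.0),
    ("E", 5.0, 5.0),
    ("F", 40.0, 6.0),   # far away from the rest of the points
    ("G", 50.0, 4.0),   # far away from the rest of the points
]

def drop(rows, labels):
    """Return the rows whose label is not in the given set."""
    return [row for row in rows if row[0] not in labels]

def r_of(rows):
    """Correlation between the X and Y columns of the rows."""
    return pearson_r([row[1] for row in rows], [row[2] for row in rows])

r_all = r_of(rows)
r_trimmed = r_of(drop(rows, {"F", "G"}))

print(f"r with all rows:       {r_all:.3f}")
print(f"r without F and G:     {r_trimmed:.3f}")
```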

In this lesson you explored only the relationship between brain weight and life span. However, the variables provided in this dataset could be paired up in many other ways. Hypothesize a linear relationship between two other variables and then use statistical measures to explore whether such a relationship indeed exists.
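
One way to screen other pairs of variables is to compute r for every pair and look at the largest absolute values first. The sketch below assumes the data has been arranged as one list per variable. The column names are hypothetical and the values are contrived placeholders, not the species data.

```python
from itertools import combinations
from math import sqrt

# Hypothetical columns with contrived values, NOT the species data.
columns = {
    "body_weight": [1.0, 2.0, 3.0, 4.0],
    "gestation":   [2.0, 4.0, 6.0, 8.0],
    "sleep":       [3.0, 4.0, 1.0, 2.0],
}

def pearson_r(a, b):
    """Correlation coefficient r between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    s_aa = sum((ai - mean_a) ** 2 for ai in a)
    s_bb = sum((bi - mean_b) ** 2 for bi in b)
    return s_ab / sqrt(s_aa * s_bb)

# r for every unordered pair of variables.
pair_r = {
    (name_1, name_2): pearson_r(columns[name_1], columns[name_2])
    for name_1, name_2 in combinations(columns, 2)
}

# List the pairs from strongest to weakest linear relationship.
for (name_1, name_2), r in sorted(pair_r.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{name_1} vs {name_2}: r = {r:.3f}")
```

A large absolute value of r only suggests a pair worth examining; look at the scatterplot for outliers before drawing any conclusion.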

This set of data was downloaded through the Internet from the following source: http://www.einet.net/galaxy/Science/Mathematics.html. To get to the actual data you must go through the following folders: Statistics, StatLib Index (CMU), datasets, and then "sleep". This dataset was submitted by Roger Johnson (rjohnson@carleton.edu) and was originally drawn from the article "Sleep in Mammals: Ecological and Constitutional Correlates" by Allison, T. and Cicchetti, D. (1976), _Science_, November 12, vol. 194, pp. 732-734. The complete dataset includes brain and body weight, life span, gestation time, time sleeping, and predation and danger indices for 62 mammals.