What is the Best Fit?

Well, you have gotten to the point of looking at lines that fit the data fairly well. But how can you tell how well they fit?

Is there a measure for this?

Of course! Statistics has a measure for everything.

Think about this: after you put a line through the points in the last exercise, not all of them were on the line. Not only that, but each point's distance from the line can be measured.

For every x-value from our data set we have two y-values.

We have the y-value from the data set, and we have the y-value given by the equation y = mx + b (remember, m is the slope, x is the x-value, and b is the y-intercept). If we take the absolute value of the difference between the two y-values for each data point and then find the average (mean) of those values, we have a measure of how well the line fits, right?

This measure is called the mean error, because another word for difference is error, and mean is another word for average. (Since we take absolute values, it is also known as the mean absolute error.)
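If you would like to see the calculation step by step, here is a small sketch in Python. The function name mean_error and the four data points are made up for illustration; they are not the Birmingham data, and the line y = 2x + 1 is just an assumed example line.

```python
def mean_error(xs, ys, m, b):
    """Average absolute difference between each data y-value
    and the y-value the line y = m*x + b gives for the same x."""
    errors = [abs(y - (m * x + b)) for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Made-up points for illustration only (not the Birmingham data).
xs = [1, 2, 3, 4]
ys = [3, 5, 6, 10]

# Assumed example line: y = 2x + 1, which gives 3, 5, 7, 9.
# The absolute errors are 0, 0, 1, 1, so the mean error is 2 / 4.
print(mean_error(xs, ys, m=2, b=1))  # prints 0.5
```

Try changing m and b to see how the mean error changes as the line moves.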

This table shows how a mean error is found. It uses data on the heights and weights of boys in Birmingham, England.

The total of the absolute values of the errors is 15.79, and the mean of these is 1.75.

Let's look at this pictorially.

Discuss these questions in a small group:

1. Does this seem right?
2. What does a mean error of zero mean?
3. Why do we use the absolute value of the difference?
4. Can a line produce a larger mean error than another line and still be a better fit?
5. What would make this test more powerful? (Hint: instead of taking the absolute value ...)

Go on to see more power.