Part 1:
Making Sense of the Dynamic Environment and the Calculated Values
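Note: The formula we are exploring for the correlation coefficient is:

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}$$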
- Open the Fathom file called CorrelParts.ftm.
- Examine Collection 1. The first two columns are the numerical values of the data points shown in the scatterplot. If you change any of the numerical values in the table, the graph will update. Likewise, if you move one of the data points in the scatterplot, the numerical values in the table will change accordingly. The rest of the columns in Collection 1 are calculated from the X and Y values and are the intermediate calculations needed for the formula for r above.
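For readers who want to check the Collection 1 columns outside Fathom, the following Python sketch recomputes them. It assumes the activity's formula is the deviation form of Pearson's r, that is, the sum of the products of the X and Y deviations divided by the square root of the product of the summed squared deviations. The data values and the function name `correlation_parts` are hypothetical. The column names Xdeviations, Ydeviations, XdevSquared, and YdevSquared follow the activity, while DevProduct is an assumed name for the product column.

```python
import math

def correlation_parts(xs, ys):
    """Return the intermediate columns and r for paired data (hypothetical helper)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Signed distance of each value from its mean
    x_dev = [x - x_mean for x in xs]
    y_dev = [y - y_mean for y in ys]
    # Squared deviations are never negative
    x_dev_sq = [d * d for d in x_dev]
    y_dev_sq = [d * d for d in y_dev]
    # Product is positive when both deviations share a sign
    dev_product = [dx * dy for dx, dy in zip(x_dev, y_dev)]
    r = sum(dev_product) / math.sqrt(sum(x_dev_sq) * sum(y_dev_sq))
    return {
        "Xdeviations": x_dev,
        "Ydeviations": y_dev,
        "XdevSquared": x_dev_sq,
        "YdevSquared": y_dev_sq,
        "DevProduct": dev_product,  # assumed column name
        "r": r,
    }

# Hypothetical data; the values stored in CorrelParts.ftm are not reproduced here.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
parts = correlation_parts(xs, ys)
for name, column in parts.items():
    print(name, column)
```

Changing a value in `xs` or `ys` and rerunning the script mirrors editing a cell in the Fathom table.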
To Plot a Function:
- Select the graph. Under the Graph menu, select Plot Function. The Expression for Function pop-up box will appear.
- Type the right-hand expression for the function. In this case we are plotting the function y=mean(y). Thus you need to type in mean(y).
- Click on OK. The function should appear in the graph window.
To Plot a Value:
- Select the graph. Under the Graph menu, select Plot Value. The Expression for Value pop-up box will appear.
- Type the expression for the relation. In this case we are plotting the relation x=mean(x). Thus you need to type in mean(x).
- Click on OK. The vertical line should appear in the graph window.
- What data point is represented at the intersection of the vertical and horizontal lines?
- What is the significance of this data point?
- Is it possible for all of the points to be on one side of either x=mean(x) or y=mean(y)? Why or why not?
- Can nine of the 10 data points be on one side of either x=mean(x) or y=mean(y)? Why or why not?
- Can the data be changed in such a way that nine of the data points lie in the new “third quadrant” with the last point in the “first quadrant”? What would this result say about the mean?
- Double-click the Collection 1 icon to display the calculated measures. These measures are based on the calculated values in Collection 1 and correspond to parts of the formula for r. Make sense of what these measures represent before proceeding.
Part 2:
Making Sense of Correlation
Have students work in pairs to complete the following tasks and answer the questions.
- Move the points on the graph so they are approximately on a line with positive slope.
- What do you notice about the magnitude and sign of the Xdeviations and Ydeviations?
- What do you notice about the magnitude and sign of the XdevSquared and YdevSquared?
- How are these values influencing the value of r?
- Click on the graph. Then under the Graph menu, choose Least-Squares Line. Does the equation show a positive slope?
- Move the points so they are approximately on a line with negative slope.
- What do you notice about the magnitude and sign of the Xdeviations and Ydeviations?
- What do you notice about the magnitude and sign of the XdevSquared and YdevSquared?
- How are these values influencing the value of r?
- Move the points so they appear to have no association.
- What do you notice about the magnitude and sign of the Xdeviations and Ydeviations?
- What do you notice about the magnitude and sign of the XdevSquared and YdevSquared?
- How are these values influencing the value of r?
- Place the points in a positive linear trend. Drag one of the points on the graph so that it is clearly an outlier. Observe the effects on the regression line and the value of r.
- Based on the formula for r, describe why the value of r is affected so greatly by an outlier.
- Pick up the outlier point and drag it to different locations on the graph. Find three different locations of an outlier that cause the regression line to drastically change. Where did you have to place the outlier for this effect? Why does this make sense?
- Make sense of the effects of an outlier. Reason from the formula for r and slope as well as the calculated measures.
- In the lower left corner of the coordinate plane, place 9 of your points in a “cloud” that appears to have no trend. Then move one point to the upper right corner.
- Is this scatterplot linear?
- What is the value of r? Reason from the formula for r and the displayed measures to make sense of this value.
- Find two other sets of data points that give a high r value but show no linear trend.
- Does a high r value necessarily mean that the data are generally linear?
- Does a low r value necessarily mean that the data are NOT generally linear?
- Place your points in a nearly horizontal line on the graph. What is the value of r? Why?
- Place your points in a nearly vertical line on the graph. What is the value of r? Why?
- Based on how the formula for r is computed, why do you think the values of r are constrained between –1 and 1?
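The “cloud plus one distant point” task can also be checked numerically. The sketch below is only an illustration with made-up coordinates, not the data in the Fathom file: nine points sit on a trendless 3-by-3 grid, and a tenth point is then placed far to the upper right. The function name `pearson_r` is hypothetical, and the deviation form of r is assumed.

```python
import math

def pearson_r(xs, ys):
    """Deviation-form correlation coefficient (hypothetical helper)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sum_products = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sum_x_sq = sum((x - x_mean) ** 2 for x in xs)
    sum_y_sq = sum((y - y_mean) ** 2 for y in ys)
    return sum_products / math.sqrt(sum_x_sq * sum_y_sq)

# Nine made-up points in a trendless cloud in the lower left corner
cloud_x = [1, 2, 3, 1, 2, 3, 1, 2, 3]
cloud_y = [1, 1, 1, 2, 2, 2, 3, 3, 3]
r_cloud = pearson_r(cloud_x, cloud_y)

# Add one made-up point far away in the upper right corner
r_with_far_point = pearson_r(cloud_x + [20], cloud_y + [20])

print("r for the cloud alone:", r_cloud)
print("r with the distant point added:", r_with_far_point)
```

Comparing the two printed values shows how strongly a single point with large X and Y deviations can dominate the sums in the formula.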
Part 3:
Conceptualizing the Regression Line
- Throughout the investigation in Part 2, what did you notice about the relationship between the data point (mean(x), mean(y)) and the regression line?
- How is the correlation coefficient r related to the slope of the regression equation? Is the value of r the same as the slope of the line? Does an r value of 1 imply a y=x relationship? Why or why not?
- Recall that the formula for the slope of a regression line can be expressed as $b = r \cdot \frac{s_y}{s_x}$, where $s_x$ and $s_y$ are the standard deviations of the X and Y values. Thus, by calculating the means and standard deviations for the X values and Y values in a data set, as well as r, one can derive the line of best fit using algebraic techniques. Test this procedure with 10 data points.
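One way to test this procedure outside Fathom is sketched below in Python. It assumes the standard relationship that the slope equals r times the ratio of the standard deviation of Y to that of X, and it uses the fact that the least-squares line passes through the point of means to obtain the intercept. The ten data points and all variable names are made up for illustration.

```python
import math
import statistics

# Ten made-up data points for illustration only
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

x_mean = statistics.mean(xs)
y_mean = statistics.mean(ys)
s_x = statistics.stdev(xs)  # sample standard deviation of X
s_y = statistics.stdev(ys)  # sample standard deviation of Y

# Deviation-form sums used for r and for the direct least-squares slope
sum_products = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
sum_x_sq = sum((x - x_mean) ** 2 for x in xs)
sum_y_sq = sum((y - y_mean) ** 2 for y in ys)
r = sum_products / math.sqrt(sum_x_sq * sum_y_sq)

# Line of best fit derived from r, the means, and the standard deviations
slope_from_r = r * s_y / s_x
intercept_from_r = y_mean - slope_from_r * x_mean

# Direct least-squares slope for comparison
slope_direct = sum_products / sum_x_sq

print("slope from r:", slope_from_r)
print("direct least-squares slope:", slope_direct)
print("intercept:", intercept_from_r)
```

The two slopes should agree up to rounding error, which is a quick check that the algebraic derivation matches the least-squares line.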
Part 4:
Suggested Data Exploration
A biology student noticed that crickets seemed to chirp faster in the summer than in the spring or fall. Her grandmother had always told her that she could determine the temperature by listening to the crickets. Over the next season she counted the chirps per minute of a cricket and recorded the temperature. Her data is provided in the table below.
| Chirps (per minute) | Temperature (Fahrenheit) |
|---|---|
| 67 | 54 |
| 75 | 55 |
| 83 | 58 |
| 91 | 58 |
| 99 | 60 |
| 119 | 67 |
| 134 | 69 |
| 140 | 70 |
| 149 | 74 |
| 164 | 77 |
- Find a mathematical model that the student can use to estimate the temperature by listening to the crickets.
- Interpret the r-value for this data set.
- Interpret the slope and y-intercept in terms of the phenomenon.
- Explain how this model could be used to estimate the temperature quickly by counting chirps for only 15 seconds.
- If you wanted to describe mathematically the relationship between temperature and cricket chirps, which variable is more appropriate to consider as the dependent variable? Is this the same variable that you treated as the dependent variable? If not, find a new model. Interpret the slope and y-intercept.
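As one possible starting point, the Python sketch below fits a least-squares line to the table, treating chirps per minute as the input and temperature as the output, as in the first question. The choice of a linear model, the variable names, and the helper `estimate_from_15_seconds` are assumptions made for illustration. The helper simply multiplies a 15-second count by 4 to get chirps per minute before applying the model.

```python
import math

# Data from the table: chirps per minute and temperature in degrees Fahrenheit
chirps = [67, 75, 83, 91, 99, 119, 134, 140, 149, 164]
temps = [54, 55, 58, 58, 60, 67, 69, 70, 74, 77]

n = len(chirps)
chirp_mean = sum(chirps) / n
temp_mean = sum(temps) / n

sum_products = sum((c - chirp_mean) * (t - temp_mean) for c, t in zip(chirps, temps))
sum_chirp_sq = sum((c - chirp_mean) ** 2 for c in chirps)
sum_temp_sq = sum((t - temp_mean) ** 2 for t in temps)

# Least-squares line: temperature = intercept + slope * chirps_per_minute
slope = sum_products / sum_chirp_sq
intercept = temp_mean - slope * chirp_mean
r = sum_products / math.sqrt(sum_chirp_sq * sum_temp_sq)

def estimate_from_15_seconds(chirps_in_15_seconds):
    """Estimate temperature from a 15-second count (hypothetical helper)."""
    chirps_per_minute = 4 * chirps_in_15_seconds
    return intercept + slope * chirps_per_minute

print("slope:", slope)
print("intercept:", intercept)
print("r:", r)
print("estimate for 30 chirps in 15 seconds:", estimate_from_15_seconds(30))
```

To explore the last question, swap the roles of `chirps` and `temps` and refit; the new slope and intercept will generally not be the algebraic inverse of the first model.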