Regression Line Exercise
We know that if the two variables in a set of bivariate data are linearly correlated, then a least squares regression line can be used to model the data. Let's say that the equation of the regression line is
f(x) = mx + b. We also know that the least squares regression line gets its name because it minimizes the sum of the squares of the vertical distances from the line to the data points (xi, yi). We shall call each of these distances the error. Each error is then represented by f(xi) - yi. Therefore, the sum of the squared errors is given by $\sum_{i=1}^{n} (f(x_i) - y_i)^2$, where the n points (xi, yi) represent the data.
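To make the quantity being minimized concrete, here is a minimal Python sketch that evaluates the sum of the squared errors for a candidate line. The data points and the function name sum_squared_errors are made up for illustration; they do not come from your text.

```python
def sum_squared_errors(m, b, points):
    """Return the sum of (f(x_i) - y_i)^2 for the line f(x) = m*x + b."""
    return sum((m * x + b - y) ** 2 for x, y in points)


# Hypothetical data, chosen only for illustration.
points = [(1, 2), (2, 3), (3, 5)]

# Compare two candidate lines: the smaller total is the better fit.
print(sum_squared_errors(1.0, 1.0, points))  # errors are 0, 0, -1   -> 1.0
print(sum_squared_errors(1.5, 0.0, points))  # errors are -0.5, 0, -0.5 -> 0.5
```

Neither of these lines is necessarily the best one; the sketch only shows that different choices of m and b give different totals, which is why the total can be treated as a function to be minimized.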
From your work in Math Analysis and Physics last year, you know that finding where the derivative of a function is zero can help you find the maximum and minimum values of the function. Since the least squares regression line is defined by exactly this kind of problem, minimizing the sum of the squares of the errors, we can use the derivative to help us find this minimum. Let S(m, b) represent the sum of the squared errors. Notice that this is a function of two variables, m and b, where m represents the slope of the regression line and b represents the y-intercept. Therefore, for n data points (xi, yi),
$S(m, b) = \sum_{i=1}^{n} (m x_i + b - y_i)^2$
Using Derive on the problems listed below, find the derivative of S with respect to m and the derivative of S with respect to b (these are known as the partial derivatives of S). Set each of these derivatives equal to zero and solve the resulting system of two linear equations in m and b. The solution will be the slope and y-intercept of the least squares regression line. Test the above on problems 25 and 26 on page 648 of your text.
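As a way to check your Derive results, the following Python sketch solves the same system numerically. It assumes the two equations take the standard form obtained by setting the partial derivatives to zero, namely m*Σxi² + b*Σxi = Σxi*yi and m*Σxi + b*n = Σyi. The function name least_squares_line and the sample data are hypothetical and are not taken from problems 25 or 26.

```python
def least_squares_line(points):
    """Solve the two linear equations for slope m and y-intercept b."""
    n = len(points)
    sx = sum(x for x, _ in points)        # sum of x_i
    sy = sum(y for _, y in points)        # sum of y_i
    sxx = sum(x * x for x, _ in points)   # sum of x_i squared
    sxy = sum(x * y for x, y in points)   # sum of x_i * y_i

    # Determinant of the 2x2 system; zero when every x_i is the same.
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x-values are equal; the slope is undefined")

    m = (n * sxy - sx * sy) / det
    b = (sy - m * sx) / n
    return m, b


# Hypothetical data, chosen only for illustration.
m, b = least_squares_line([(1, 2), (2, 3), (3, 5)])
print(m, b)  # slope 1.5, intercept approximately 0.333
```

If the slope and intercept from Derive disagree with a check like this on the same data, the most likely cause is a sign or index slip when entering S(m, b).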