Regression

I. Midterm I

II. Least Squares Regression Line

Example: Suppose that you want to find a linear relationship between advertising and revenue. You experiment with three different levels of advertising and come up with the following data:
If you graph the data you will see that it does not lie on a line. What is the best linear fit? We define best to mean that the sum of the squares of the errors is minimized. If \((x_i, y_i)\) lies on the line and the true revenue for \(x_i\) spent on ads is \(y\), then the error is \(y - y_i\).

For the three data points (0, 1), (1, 5), and (3, 12), the sum of the squares of the errors for a line \(y = a + bx\) is

\[ f(a, b) = (1 - (a + b(0)))^2 + (5 - (a + b(1)))^2 + (12 - (a + b(3)))^2 \]

Since we want to minimize the error (over all possible choices of \(a\) and \(b\)), we set \(f_a = 0\) and \(f_b = 0\):

\[ 0 = f_a = -2(1 - a) - 2(5 - a - b) - 2(12 - a - 3b) = 6a + 8b - 36 \quad \text{or} \quad 3a + 4b - 18 = 0 \]

\[ 0 = f_b = -2(5 - a - b) - 6(12 - a - 3b) = 8a + 20b - 82 \quad \text{or} \quad 4a + 10b - 41 = 0 \]

Solving (multiply the first equation by 4 and the second by 3, then subtract the first from the second):

\[ 12a + 16b - 72 = 0 \]

\[ 12a + 30b - 123 = 0 \]

\[ 14b = 51, \qquad b = 51/14 \approx 3.64, \qquad a = (18 - 4b)/3 \approx 1.14 \]

The equation of the linear regression line is \(y = 1.14 + 3.64x\), so to forecast the revenue if $3,000 is spent on ads we compute

\[ y = 1.14 + 3.64(3) = 12.06 \]

or about $1.2 million.

Challenge: Suppose that you collect data on growth over time of a pine tree and come up with the following data:
You expect that the graph is parabolic. Find the best-fitting parabola.
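Returning to the advertising example, the arithmetic can be checked numerically. The sketch below is not part of the original notes: the function name `fit_line` is hypothetical, and the data points (0, 1), (1, 5), (3, 12) are read off from the squared-error terms in the example. It solves the same two equations, 3a + 4b = 18 and 4a + 10b = 41, using exact fractions so that no rounding enters.

```python
from fractions import Fraction


def fit_line(points):
    """Least-squares line y = a + b*x, found by solving f_a = 0 and f_b = 0.

    Hypothetical helper for illustration; `points` is a list of (x, y) pairs.
    """
    n = len(points)
    sx = sum(Fraction(x) for x, _ in points)        # sum of x
    sy = sum(Fraction(y) for _, y in points)        # sum of y
    sxx = sum(Fraction(x) * x for x, _ in points)   # sum of x^2
    sxy = sum(Fraction(x) * y for x, y in points)   # sum of x*y

    # Setting the partial derivatives to zero gives the linear system
    #   n*a  + sx*b  = sy
    #   sx*a + sxx*b = sxy
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not unique")

    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    return a, b


# Data assumed from the error terms in the advertising example.
points = [(0, 1), (1, 5), (3, 12)]
a, b = fit_line(points)
print(a, b)                # 8/7 51/14
print(float(a + 3 * b))    # forecast at x = 3
```

With exact coefficients the forecast at x = 3 is 169/14, roughly 12.07; the 12.06 in the example comes from rounding a and b to two decimals first. The same idea extends to the parabola challenge: a model y = a + bx + cx² has three unknowns, so setting three partial derivatives to zero yields three linear equations.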